Machine Learning algorithms and Panel data

Question

I have a large panel dataset composed of $N$ stocks, $T$ quarterly dates and $K$ features for each stock. The dataset looks like the following:
            symbol  stockPriceD numberOfShares  marketCapitalization   ...    label
2002-06-01  A       -4.91       1000000.0       -2.254640e+09          ...    0
2002-09-01  A       -9.08       1000000.0       -4.203510e+09          ...    1
2002-12-01  A       4.27        0.0             1.985550e+09           ...    1 
...
2009-06-01  BA      3.19        732600000.0     3.167762e+10           ...    1
...
2019-12-01  ZTS     10.43       -700000.0       4.896220e+09           ...    0 
2020-03-01  ZTS     -8.72       -2400000.0      -4.478504e+09          ...    0

I would like to do a forecasting task on this dataset, but I cannot assume independence among the features (almost all are autocorrelated), and splitting the dataset into $N$ separate datasets, one per stock, would leave me with very small samples (at most 72 rows each).
How can I handle this problem? Am I allowed to assume independence among the instances anyway and ignore the autocorrelation? Are there Machine Learning algorithms that can handle this kind of problem (panel data)?
I have read about using RNN and LSTM models to address these issues, but how should the data be treated in that case?
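
To make the last question concrete, here is a minimal sketch of one possible treatment, not a recommendation: pool all stocks, build fixed-length windows separately within each symbol so that no window mixes two stocks, and split train/test by date rather than at random. The function names `make_windows` and `chronological_split`, the window length, the cutoff date and the toy rows are all hypothetical, and the sketch assumes the `label` on a row is the target for the window ending at that row.

```python
from collections import defaultdict


def make_windows(rows, window):
    """Turn pooled panel rows into per-stock sliding windows.

    rows: iterable of (date, symbol, features, label) tuples, with ISO dates.
    Returns a list of (end_date, symbol, feature_window, label), where
    feature_window is a list of `window` consecutive feature vectors.
    """
    by_symbol = defaultdict(list)
    for date, symbol, features, label in rows:
        by_symbol[symbol].append((date, features, label))

    samples = []
    for symbol, series in by_symbol.items():
        series.sort(key=lambda r: r[0])  # chronological order within one stock
        for end in range(window - 1, len(series)):
            chunk = series[end - window + 1 : end + 1]
            x = [features for _, features, _ in chunk]
            end_date, _, label = series[end]
            samples.append((end_date, symbol, x, label))
    return samples


def chronological_split(samples, cutoff):
    """Windows ending before `cutoff` go to train, the rest to test."""
    train = [s for s in samples if s[0] < cutoff]
    test = [s for s in samples if s[0] >= cutoff]
    return train, test


# Toy rows (made-up values, two features per quarter, two hypothetical stocks).
rows = [
    ("2002-06-01", "A", [-4.91, 1.0], 0),
    ("2002-09-01", "A", [-9.08, 1.0], 1),
    ("2002-12-01", "A", [4.27, 0.0], 1),
    ("2003-03-01", "A", [1.10, 0.5], 0),
    ("2002-06-01", "B", [3.19, 7.3], 1),
    ("2002-09-01", "B", [2.00, 7.1], 0),
    ("2002-12-01", "B", [-1.50, 6.9], 0),
]

samples = make_windows(rows, window=2)
train, test = chronological_split(samples, cutoff="2002-12-01")
print(len(samples), len(train), len(test))
```

Stacking the `feature_window` entries would give an array of shape (samples, timesteps, features), which is the layout recurrent layers usually expect. Note that windows ending after the cutoff can still contain quarters from before it, so this split only guarantees that no test target precedes a training target.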
