Data Science Asked on January 29, 2021
I have a large panel dataset composed of $N$ stocks, $T$ quarterly dates and $K$ features for each stock. The dataset looks like the following:
           symbol  stockPriceD  numberOfShares  marketCapitalization  ...  label
2002-06-01      A        -4.91       1000000.0         -2.254640e+09  ...      0
2002-09-01      A        -9.08       1000000.0         -4.203510e+09  ...      1
2002-12-01      A         4.27             0.0          1.985550e+09  ...      1
...
2009-06-01     BA         3.19     732600000.0          3.167762e+10  ...      1
...
2019-12-01    ZTS        10.43       -700000.0          4.896220e+09  ...      0
2020-03-01    ZTS        -8.72      -2400000.0         -4.478504e+09  ...      0
I would like to run a forecasting task on this dataset, but I cannot assume the rows are independent (almost all features are autocorrelated), and splitting the dataset into $N$ separate per-stock datasets would leave me with very small samples (at most 72 instances/rows each).
How can I handle this problem? Am I allowed to assume independence among the instances anyway and ignore the autocorrelation? Are there machine learning algorithms that can handle this kind of problem (panel data)?
I read about using RNN and LSTM models to address these issues, but how should the data be prepared for them?
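To make the last point concrete, here is a minimal sketch of the kind of treatment I have in mind, not a confirmed recipe: each stock's rows are sorted by date and cut into overlapping windows, and the windows from all stocks are pooled into one 3D array of shape (samples, timesteps, features), which is the layout recurrent layers usually expect. The function name `make_windows`, the window length of 4 quarters, the toy data, and the choice to pair each window with the `label` of its last row are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def make_windows(df, feature_cols, label_col="label", window=4):
    """Cut a date-indexed panel into per-stock sliding windows.

    Hypothetical helper: assumes a 'symbol' column and that the label
    of the last row in a window is the target for that window.
    """
    X, y = [], []
    for _, g in df.groupby("symbol", sort=False):
        g = g.sort_index()  # keep time order within each stock
        feats = g[feature_cols].to_numpy(dtype=float)
        labels = g[label_col].to_numpy()
        # Windows never cross from one stock into another.
        for end in range(window, len(g) + 1):
            X.append(feats[end - window:end])
            y.append(labels[end - 1])
    if not X:
        return np.empty((0, window, len(feature_cols))), np.empty((0,))
    return np.stack(X), np.asarray(y)


# Toy panel (random values, not the real data): stock A has 6 quarters, B has 5.
rng = np.random.default_rng(0)
frames = []
for sym, n in [("A", 6), ("B", 5)]:
    idx = pd.date_range("2002-06-01", periods=n, freq="3MS")
    frames.append(pd.DataFrame({
        "symbol": sym,
        "stockPriceD": rng.normal(size=n),
        "numberOfShares": rng.normal(size=n),
        "label": rng.integers(0, 2, size=n),
    }, index=idx))
panel = pd.concat(frames)

X, y = make_windows(panel, ["stockPriceD", "numberOfShares"], window=4)
print(X.shape, y.shape)  # (5, 4, 2) (5,)
```

With this layout a stock with 72 quarters contributes 69 windows of length 4, and all stocks feed a single model. The windows overlap, so they are still not independent, and any train/test split would presumably have to be made by date rather than at random.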