TransWikia.com

Validation fraction parameter of sklearn's Gradient Boosted Classifier in use with time series data

Data Science Asked by Bach Pham on March 7, 2021

I’m looking at using sklearn’s GradientBoostingClassifier (GBC) to predict the sign of stock returns. My question concerns the "validation_fraction" parameter used for early stopping. Since I’m working with financial time series data, any bootstrapping or resampling in which later observations end up in the training set while earlier observations are held out for evaluation would be problematic, because it leaks future information into the model. For example, when I use RandomForestClassifier I have to set bootstrap to False.

So, with this parameter, does the GBC do any resampling or bootstrapping of any kind? If so, I would have to find a way to modify the source code so that the training/validation split preserves chronological order.
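To make the concern concrete, here is a minimal sketch of the behaviour I am after, without touching the source: split the data chronologically myself, leave the built-in early stopping off, and choose the number of boosting stages from the loss on the later block. It relies on documented defaults (subsample=1.0 means no per-stage row sampling, and "validation_fraction" is only used when n_iter_no_change is set to an integer). The synthetic data, the 80/20 cut, and the variable names are assumptions for illustration only, not my real setup.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import log_loss

# Synthetic stand-in for time-ordered features and return signs (assumption).
rng = np.random.default_rng(0)
n_samples = 600
X = rng.normal(size=(n_samples, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=n_samples) > 0).astype(int)

# Chronological split: the earliest 80% of rows train, the latest 20% validate.
split = int(n_samples * 0.8)
X_train, y_train = X[:split], y[:split]
X_val, y_val = X[split:], y[split:]

# subsample=1.0 disables per-stage row sampling; n_iter_no_change=None keeps
# the built-in early stopping (and hence validation_fraction) switched off.
model = GradientBoostingClassifier(
    n_estimators=200,
    subsample=1.0,
    n_iter_no_change=None,
    random_state=0,
)
model.fit(X_train, y_train)

# Validation loss after each boosting stage, computed on the later block only.
val_losses = [
    log_loss(y_val, proba, labels=[0, 1])
    for proba in model.staged_predict_proba(X_val)
]

# Stage count with the lowest validation loss (1-based).
best_n_estimators = int(np.argmin(val_losses)) + 1
print("best number of stages:", best_n_estimators)
```

The model could then be refit with n_estimators=best_n_estimators. This is a manual substitute for the built-in early stopping, not a description of what "validation_fraction" does internally.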

Thanks.
