Data Science Asked by Vega on December 10, 2020

I have a relatively small sample (330 rows with 45 features), and it is time series data.

I want to train a LightGBM regression model for the best generalized RMSE score, and I want to use repeated CV. I use hyperopt for hyperparameter optimization, minimizing RMSE.

The last 2 months of data are post-corona, so a standard k-fold CV would probably fail when testing on that period because there was quite a shift in the target variable y.

Standard ways to do repeated CV rely on resampling or reshuffling, which is not usable with time series data.
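To make the reshuffling problem concrete, here is a minimal sketch, assuming scikit-learn and rows sorted chronologically. The helper `leaky_folds` is a hypothetical name introduced only for illustration. It counts the folds in which the training set contains samples that come later in time than the earliest test sample.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold, TimeSeriesSplit


def leaky_folds(cv, n_samples):
    """Return (leaky, total): folds whose training data lies after the test start."""
    X_dummy = np.zeros((n_samples, 1))  # placeholder data; only the row count matters
    leaky = 0
    total = 0
    for train_idx, test_idx in cv.split(X_dummy):
        total += 1
        # Assumption: a larger index means a later point in time.
        if train_idx.max() > test_idx.min():
            leaky += 1
    return leaky, total


# Shuffled repeated k-fold: 5 folds x 3 repeats on 330 rows.
shuffled = leaky_folds(RepeatedKFold(n_splits=5, n_repeats=3, random_state=0), 330)
# Time-ordered splits: training data always precedes the test data.
ordered = leaky_folds(TimeSeriesSplit(n_splits=5), 330)
print("RepeatedKFold:", shuffled, "TimeSeriesSplit:", ordered)
```

With shuffled folds the model is typically trained on observations from after the test period, which is what makes the resulting score optimistic for time series.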

What is best practice in this case? How can I do repeated cross-validation with time series data?

At the moment I use the following as the objective function for fmin():

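```
import lightgbm as lgb
from numpy import mean
from sklearn.model_selection import TimeSeriesSplit, cross_val_score


def lightgbm_cv_repeated(params):
    # Cast integer-valued parameters, which may arrive as floats from the search space.
    params = {
        'n_estimators': int(params['n_estimators']),
        'max_depth': int(params['max_depth']),
        'learning_rate': params['learning_rate'],
        'min_child_samples': int(params['min_child_samples']),
        'min_child_weight': params['min_child_weight'],
        'feature_fraction': params['feature_fraction'],
        'bagging_fraction': params['bagging_fraction'],
        'bagging_freq': int(params['bagging_freq']),
        'num_leaves': int(params['num_leaves']),
        'max_bin': int(params['max_bin']),
        # Note: num_iterations is a LightGBM alias of n_estimators.
        'num_iterations': int(params['num_iterations']),
        'objective': 'rmse',
    }
    scores = []
    # "Repeat" the CV by varying the number of time-ordered splits (5 to 10).
    for i in range(5, 11):
        print(i)
        cvTSS = TimeSeriesSplit(max_train_size=None, n_splits=i)
        model = lgb.LGBMRegressor(random_state=i, **params)
        # X and y are the feature matrix and target, defined elsewhere.
        score = -cross_val_score(model, X=X, y=y, cv=cvTSS, scoring="neg_root_mean_squared_error", n_jobs=-1).mean()
        scores.append(score)
    # Average RMSE over all split configurations; this is the value fmin() minimizes.
    result = mean(scores)
    return result
```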
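To see what the loop above evaluates, the following sketch lists the fold sizes that `TimeSeriesSplit` produces for each `n_splits` from 5 to 10. It assumes 330 rows (the sample size stated in the question) sorted chronologically. `fold_sizes` is a hypothetical helper, not part of scikit-learn.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit


def fold_sizes(n_samples, n_splits):
    """Return a list of (train_size, test_size) tuples, one per fold."""
    X_dummy = np.zeros((n_samples, 1))  # placeholder data; only the row count matters
    cv = TimeSeriesSplit(n_splits=n_splits)
    return [(len(train), len(test)) for train, test in cv.split(X_dummy)]


# Same range of split counts as in the objective function above.
for n_splits in range(5, 11):
    sizes = fold_sizes(330, n_splits)
    print(n_splits, "first fold:", sizes[0], "last fold:", sizes[-1])
```

Under these assumptions the first training window holds only 30 to 55 rows, and for every `n_splits` the last test fold ends at the final row. The most recent (post-corona) observations therefore appear as test data only in the final fold or folds of each split.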
