Stratified Sampling for XGBoost

Data Science Asked by honeybadger on February 3, 2021

I have a multiclass classification dataset whose target (dependent) variable is highly imbalanced. With the randomForest package in R, I usually use the sampsize and strata parameters to account for the imbalance in the training data. Are there similar options in the xgboost package?

Summary of the number of data points available in each class:

Factor 1 :  667
Factor 2 :  676
Factor 3 : 7807
Factor 4 :  850

One Answer

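In R, this is an option of the cross-validation function xgb.cv: its stratified argument controls whether the folds are sampled so that each one keeps roughly the same class proportions as the labels. Note that this stratifies the cross-validation folds only; it does not resample the training data the way sampsize and strata do in randomForest. See the documentation here: https://www.rdocumentation.org/packages/xgboost/versions/0.4-4/topics/xgb.cv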
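
To make concrete what stratified fold assignment does with the class counts from the question, here is a minimal sketch in plain Python (standard library only). It is an illustration of the general idea, not the code xgb.cv runs internally; the helper name stratified_folds, the choice of 5 folds, and the round-robin dealing scheme are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, nfold, seed=0):
    """Hypothetical helper: split row indices into nfold folds so that
    every fold holds an (almost) equal share of each class."""
    rng = random.Random(seed)

    # Group row indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(nfold)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal this class's rows round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % nfold].append(idx)
    return folds


# Class counts taken from the question (Factor 1 to Factor 4).
class_counts = {1: 667, 2: 676, 3: 7807, 4: 850}
labels = [cls for cls, n in class_counts.items() for _ in range(n)]

folds = stratified_folds(labels, nfold=5)
for k, fold in enumerate(folds):
    per_class = Counter(labels[i] for i in fold)
    print(k, len(fold), sorted(per_class.items()))
```

Each fold ends up with close to one fifth of every class, so the rare classes (Factor 1, 2 and 4) are represented in every validation split instead of being left to chance. The class imbalance itself is unchanged within each fold.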

Answered by lcrmorin on February 3, 2021
