Data Science Asked by Ved Gupta on April 14, 2021
I am training and predicting on the same data set, but I want to perform 10-fold cross-validation, predicting on the left-out fold each time so that I end up with out-of-fold predictions for the whole data set. How can I do this?
The libraries I am using are:
from sklearn import model_selection
import xgboost as xgb
xgboost comes with its own cv method; see an example here.
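As a rough illustration, here is a minimal sketch of calling xgb.cv; the parameter values and the dtrain matrix are assumptions for the example, not part of the original answer:

import xgboost as xgb

# dtrain is assumed to be an xgb.DMatrix built from your features and labels.
params = {"objective": "binary:logistic", "eta": 0.1, "max_depth": 4}

# Runs 10-fold cross-validation internally and returns a DataFrame of
# per-round train/test metrics averaged across the folds.
cv_results = xgb.cv(params, dtrain, num_boost_round=100, nfold=10,
                    metrics="logloss", seed=0)
print(cv_results.tail())

Note that xgb.cv reports fold-averaged evaluation metrics rather than per-sample predictions, so to collect predictions on each left-out fold the manual loop described in the next answer is the more direct route.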
Answered by Regenschein on April 14, 2021
What you are doing is a typical example of k-fold cross-validation.
XGBoost (eXtreme Gradient Boosting) is a distributed gradient-boosting library; here it is simply the model being fit on each fold.
First, create the cross-validation splitter:
from sklearn.model_selection import KFold
kfld = KFold(n_splits=10)
Then loop over the train/test index pairs from kfld.split(X), using them to build the XGBoost DMatrix for each fold and to re-scale instance weights where needed, as in the sketch below.
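As a concrete sketch of that loop, assuming X and y are NumPy arrays of features and labels (the names and parameter values are illustrative, not from the original answer):

import numpy as np
import xgboost as xgb
from sklearn.model_selection import KFold

params = {"objective": "binary:logistic", "eta": 0.1, "max_depth": 4}
oof_preds = np.zeros(len(y))  # out-of-fold predictions, one per sample

kfld = KFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, test_idx in kfld.split(X):
    dtrain = xgb.DMatrix(X[train_idx], label=y[train_idx])
    dtest = xgb.DMatrix(X[test_idx])
    booster = xgb.train(params, dtrain, num_boost_round=100)
    # Predict only on the held-out fold; over all ten folds this fills
    # in a prediction for every row of the data set.
    oof_preds[test_idx] = booster.predict(dtest)

Because every sample lands in exactly one test fold, oof_preds ends up holding a prediction for the whole data set, which is what the question asks for.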
A very neat implementation has been given as a Kaggle example here.
So the cross-validation splitting is not done with the xgboost package; it is done with scikit-learn's model_selection module, and the gradient boosting is then fit and evaluated on the train/test indices produced by the k-fold splitter.
Answered by Dawny33 on April 14, 2021