Data Science Asked by spectre on August 6, 2021
I am comparing hyperparameter tuning with manually selecting the best parameter (max_leaf_nodes) on a decision tree model, using mean absolute error (MAE) as the scoring metric. In theory, both should give me the same MAE and max_leaf_nodes, but they give different MAEs. Also, if I change the value of cv in GridSearchCV, I get different results. So basically I have two questions:
Why am I getting different max_leaf_nodes and MAE in the two cases?
How do I determine the value of cv in GridSearchCV, given that I get different results for cv = 3, cv = 5, and cv = 10?
Your manual approach gives the MAE on the test set. Because you've set an integer for the parameter cv, GridSearchCV is doing k-fold cross-validation (see the parameter description in the grid search docs), and so the score .best_score_ is the average MAE over the multiple test folds.
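To make the difference concrete, here is a minimal sketch of both approaches side by side. The synthetic dataset, the candidate values, and the choice of scoring="neg_mean_absolute_error" are assumptions for illustration, not taken from the question. Note that with that scorer, .best_score_ is the negated mean MAE.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data stands in for the asker's dataset (assumption).
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

candidates = [5, 25, 50, 100]  # hypothetical grid for max_leaf_nodes

# Manual selection: fit on the training set, score on the single test set.
manual_mae = {}
for n in candidates:
    tree = DecisionTreeRegressor(max_leaf_nodes=n, random_state=0)
    tree.fit(X_train, y_train)
    manual_mae[n] = mean_absolute_error(y_test, tree.predict(X_test))
manual_best = min(manual_mae, key=manual_mae.get)

# Grid search: 5-fold cross-validation inside the training set only.
n_folds = 5
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"max_leaf_nodes": candidates},
    scoring="neg_mean_absolute_error",
    cv=n_folds,
)
search.fit(X_train, y_train)

# best_score_ is the mean of the per-fold scores of the best candidate.
best = search.best_index_
fold_scores = [
    search.cv_results_[f"split{k}_test_score"][best] for k in range(n_folds)
]
cv_mae = -search.best_score_  # undo the sign flip of the "neg_" scorer

print("manual:", manual_best, manual_mae[manual_best])
print("grid search:", search.best_params_, cv_mae)
```

The two numbers are computed on different data (one held-out test set versus an average over validation folds), so there is no reason for them to match. Changing cv changes how the training data are partitioned, which is why the averaged score, and sometimes the selected max_leaf_nodes, shifts as well.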
If you really want a single train/test split, you can do that in GridSearchCV; see e.g. this SO post.
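One way to do this (not necessarily the approach in the linked post) is to pass a PredefinedSplit as cv. The sketch below assumes synthetic data and a hypothetical grid; rows marked -1 are always used for training and rows marked 0 form the single validation fold. refit=False is set so that the search does not refit the best model on the combined train and test rows.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, PredefinedSplit, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data stands in for the asker's dataset (assumption).
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

candidates = [5, 25, 50, 100]  # hypothetical grid for max_leaf_nodes

# Stack train and test rows; -1 = training only, 0 = the one validation fold.
X_all = np.vstack([X_train, X_test])
y_all = np.concatenate([y_train, y_test])
test_fold = np.concatenate([
    np.full(len(X_train), -1),
    np.zeros(len(X_test), dtype=int),
])
single_split = PredefinedSplit(test_fold)

search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"max_leaf_nodes": candidates},
    scoring="neg_mean_absolute_error",
    cv=single_split,
    refit=False,  # avoid refitting on train + test combined
)
search.fit(X_all, y_all)

# Manual loop on the same split, for comparison.
manual_mae = {}
for n in candidates:
    tree = DecisionTreeRegressor(max_leaf_nodes=n, random_state=0)
    tree.fit(X_train, y_train)
    manual_mae[n] = mean_absolute_error(y_test, tree.predict(X_test))

print("grid search:", search.best_params_, -search.best_score_)
print("manual:", min(manual_mae, key=manual_mae.get), min(manual_mae.values()))
```

With the same split, the same candidates, and the same random_state, the grid search and the manual loop should agree. Keep in mind that a score from a single split is usually noisier than a cross-validated average.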
Answered by Ben Reiniger on August 6, 2021