Data Science Asked on February 3, 2021
I have a small data set (4000 records with 10 features), and I trained XGBoost in R as well as the Boosted Decision Tree model in Azure ML Studio. Unfortunately, the results are different. I would like to optimize recall; I can pick that as the measure in Azure, but I cannot do so in R.
I used the same parameters on both platforms. I know the seeds might be different, but I tried many of them. I always get a much better recall on my validation dataset with the Azure model than with the R one.
I wonder if there is a major difference in the methodology used by these two platforms that is causing this.
I also used cross-validation, which did not help. Any insight is appreciated.
Thanks
It's hard to say without knowing exactly what Azure is doing. Going by the Azure documentation (links below), set tree_method='hist' in xgb to be more similar there. Also set max_depth=0 and grow_policy='lossguide', since you want to use max_leaves instead for a direct comparison.
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/two-class-boosted-decision-tree#usage-tips
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/two-class-boosted-decision-tree#module-parameters
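As a rough sketch of the settings mentioned above, the parameters could be collected like this. It is written in Python for brevity; the parameter names are XGBoost's own, so the same keys should carry over to the params list in R. The helper names lossguide_params and recall, the binary:logistic objective, and the values 20 and 0.2 are illustrative assumptions, not something taken from the question; substitute whatever was configured in the Azure module.

```python
def lossguide_params(max_leaves, learning_rate):
    """Hypothetical helper: XGBoost parameters for leaf-wise, histogram-based growth."""
    return {
        "objective": "binary:logistic",  # assumed: two-class problem
        "tree_method": "hist",           # histogram-based split finding
        "grow_policy": "lossguide",      # grow the leaf with the largest loss reduction
        "max_depth": 0,                  # 0 = no depth limit, so max_leaves is the cap
        "max_leaves": max_leaves,        # match Azure's maximum leaves per tree
        "eta": learning_rate,            # match Azure's learning rate
    }


def recall(y_true, y_prob, threshold=0.5):
    """Hypothetical helper: recall = TP / (TP + FN) at a fixed probability threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Placeholder values; use the ones set in the Azure module.
params = lossguide_params(max_leaves=20, learning_rate=0.2)

# With xgboost installed, training would then look roughly like:
#   import xgboost as xgb
#   booster = xgb.train(params, dtrain, num_boost_round=n_trees)
```

Scoring both models with the same recall function and the same threshold on the same validation set helps separate a real modelling difference from a difference in how each platform reports the metric.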
Answered by Ben Reiniger on February 3, 2021