Data Science Asked on February 4, 2021
What do we take into consideration when deciding which technique to use on a particular dataset? I understand that there isn't a hard and fast rule for this. Do we use XGBoost only when there are a lot of features in the dataset and Random Forest otherwise? Or are we supposed to use trial and error and pick whichever gets us better results each time?
A Decision Tree is very useful if you want to be able to explain where your result comes from: you can often print the tree and see how the model arrived at its answer.
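As a minimal sketch of this, scikit-learn's `export_text` renders a fitted tree's decision rules as plain text (the iris dataset and the feature names below are just illustrative assumptions):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy setup: the iris dataset, purely for illustration.
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# export_text prints the learned decision rules as indented text,
# so you can trace exactly which thresholds lead to each prediction.
rules = export_text(
    clf, feature_names=["sepal_len", "sepal_wid", "petal_len", "petal_wid"]
)
print(rules)
```

Each printed branch shows a feature threshold, and each leaf shows the predicted class, which is exactly the "see how your model came to this answer" property.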
A Random Forest can also provide such information, but you would have to browse all the trees and aggregate statistics over them, which is not as easy. However, a Random Forest often gives better results than a single Decision Tree (except on small, easy datasets).
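One such aggregated statistic is already built in: `feature_importances_` averages the impurity decrease each feature produces across all trees in the forest. A sketch, again on the iris dataset as an assumed example:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# feature_importances_ is an average over all trees in the forest:
# a coarser, forest-wide "stat" rather than a single readable tree.
for name, imp in zip(
    ["sepal_len", "sepal_wid", "petal_len", "petal_wid"], rf.feature_importances_
):
    print(f"{name}: {imp:.3f}")
```

This tells you *which* features matter overall, but not the exact decision path for a given prediction, which is why it is a weaker form of explanation than printing one tree.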
Finally, XGBoost can give better results than Random Forest if it is well tuned, but its predictions are harder to explain. If explaining the results doesn't matter to you, I'd suggest trying both XGBoost and Random Forest, with a bit of tuning, to see which one fits your dataset best.
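"Try both and compare" can be sketched as a cross-validated comparison. Note an assumption here: scikit-learn's `GradientBoostingClassifier` stands in for XGBoost (both implement gradient boosting) so the example runs with scikit-learn alone; if you have the `xgboost` package installed, swap in `xgboost.XGBClassifier`:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Assumed example dataset; replace with your own X, y.
X, y = load_breast_cancer(return_X_y=True)

models = {
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=0),
    # Stand-in for XGBoost; use xgboost.XGBClassifier if available.
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
}
for name, model in models.items():
    # 5-fold cross-validation gives a more reliable comparison
    # than a single train/test split.
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

From there you would tune each model's hyperparameters (e.g. with `GridSearchCV`) before picking a winner, since the comparison is only fair when both are reasonably tuned.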
Answered by BeamsAdept on February 4, 2021