Cross Validated Asked on December 13, 2021
Could someone explain the main pruning techniques for decision trees? Something like the three most common techniques, with a short explanation of how each works.
I have looked online, but this, surprisingly, doesn't seem to be covered anywhere. I think a canonical answer for this would be good.
Before Random Forests and other Decision Tree ensemble methods became common, single decision trees were often over-grown, or grown to maximum depth, and then pruned back based on different criteria. As far as I'm aware, there are two main approaches.
Reduced error pruning works bottom-up: a subtree is collapsed into a single leaf (its children are fused at their parent node) whenever the replacement does not increase the error on a held-out validation set.
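As a minimal sketch of the idea, here is reduced error pruning on a toy binary tree represented as nested dicts; the node layout (`feature`, `threshold`, `left`, `right`, `label`) is an assumption for illustration, not a standard API:

```python
def predict(node, x):
    """Route example x down the tree until a leaf ('label') is reached."""
    while "label" not in node:
        node = node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]
    return node["label"]

def error(node, data):
    """Fraction of validation examples the (sub)tree misclassifies."""
    return sum(predict(node, x) != y for x, y in data) / len(data)

def prune(node, val_data):
    """Bottom-up reduced error pruning: collapse an internal node into a
    majority-label leaf whenever validation error does not increase."""
    if "label" in node or not val_data:
        return node
    # Split the validation examples the same way the node does.
    left = [(x, y) for x, y in val_data if x[node["feature"]] <= node["threshold"]]
    right = [(x, y) for x, y in val_data if x[node["feature"]] > node["threshold"]]
    node["left"] = prune(node["left"], left)
    node["right"] = prune(node["right"], right)
    # Candidate leaf: majority label among validation examples reaching here.
    labels = [y for _, y in val_data]
    leaf = {"label": max(set(labels), key=labels.count)}
    # Prefer the simpler (pruned) tree when it is no worse on held-out data.
    if error(leaf, val_data) <= error(node, val_data):
        return leaf
    return node
```

For example, a split whose two children predict the same class is redundant and gets fused, while a split that genuinely separates classes on the validation set is kept.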
Cost-complexity pruning removes subtrees based upon a cost-complexity function that balances the error rate and the complexity of the tree, typically of the form R_alpha(T) = R(T) + alpha * |T|, where R(T) is the tree's error, |T| is its number of leaves, and alpha controls the trade-off. (You might think of this as a sort of regularization.) One method of cost-complexity pruning is Minimum Description Length, an information-theoretic cost function that counts the number of bits necessary to encode the decision tree plus the number of bits necessary to encode the errors for that tree. This method was used by J. Ross Quinlan in C4.5.
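As a concrete sketch, scikit-learn implements CART-style minimal cost-complexity pruning through the `ccp_alpha` parameter (not the MDL variant mentioned above); the dataset choice and the model-selection step here are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Grow the full (unpruned) tree, then compute the effective alphas along
# its pruning path; larger alpha prunes more aggressively.
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
path = full.cost_complexity_pruning_path(X_train, y_train)

# Refit once per alpha and keep the tree scoring best on held-out data.
# (In practice you would select alpha by cross-validation, not the test set.)
best = max(
    (DecisionTreeClassifier(random_state=0, ccp_alpha=a).fit(X_train, y_train)
     for a in path.ccp_alphas),
    key=lambda t: t.score(X_test, y_test),
)
print(best.get_n_leaves(), best.score(X_test, y_test))
```

The largest alpha on the path always collapses the tree to its root, so the search spans everything from the full tree down to a single leaf.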
You can find a brief description of decision tree pruning, along with some additional references, here. If you do a Google search for "decision tree pruning", you will find many references that discuss the need for it, and with a bit of digging you can find more technical, methodological explanations as well.
Answered by KirkD_CO on December 13, 2021