Data Science Asked by Student of Statistics on June 27, 2021
I think I understand what pruning is (in concept) in a decision tree. What is not so clear to me (in reading) is what happens to the pruned observations. What do you do with them? Are they just simply not used and forgotten?
What bothers me is the "hole" left behind by the pruned node. What if a future $Y$ needs to be predicted at that node? Do you go to the parent node and treat that parent node as an end node and predict $Y$ by the data at that parent node (even though some data at the parent node continue to be divided in sub-nodes, other than the pruned node)?
Once you prune nodes, they get removed from the tree altogether, so you wouldn't be able to predict based on them.
Typically, hyperparameter tuning is used to decide how much to prune, so that the right nodes are removed. There is always a trade-off: a more heavily pruned tree is simpler but fits the training data less closely.
Answered by Valentin Calomme on June 27, 2021
Pruning doesn't operate on individual leaves; it operates on splits. You decide not to make a split at a node after all, so every observation that would have passed through that node in the original tree stays at that node, which is now a leaf. No "hole" is left behind: a future $Y$ that reaches that node is predicted from all the training data at that node.
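To make this concrete, here is a minimal sketch using a tiny hand-built tree rather than any real library. The names `Node`, `predict`, and `prune_split` are hypothetical and exist only for this illustration, and the labels and thresholds are made up. It shows that undoing a split turns the parent into a leaf that still holds every training observation that reached it.

```python
from collections import Counter


class Node:
    """Toy tree node (hypothetical, for illustration only)."""

    def __init__(self, labels, feature=None, threshold=None, left=None, right=None):
        self.labels = labels        # training labels that reached this node
        self.feature = feature      # index of the feature used for the split
        self.threshold = threshold  # go left if x[feature] <= threshold
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None and self.right is None

    def prediction(self):
        # Majority class among the training labels at this node.
        return Counter(self.labels).most_common(1)[0][0]


def predict(node, x):
    # Walk down from the root until a leaf is reached.
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.prediction()


def prune_split(node):
    # Undo the split: drop both children, keep the node's own training labels.
    node.feature = None
    node.threshold = None
    node.left = None
    node.right = None


# Made-up tree: the root splits on feature 0, its right child splits on feature 1.
right = Node(
    labels=["b", "b", "b", "a"],
    feature=1,
    threshold=2,
    left=Node(labels=["a"]),
    right=Node(labels=["b", "b", "b"]),
)
root = Node(
    labels=["a", "a", "a", "b", "b", "b", "a"],
    feature=0,
    threshold=5,
    left=Node(labels=["a", "a", "a"]),
    right=right,
)

x_new = [7, 1]
before = predict(root, x_new)  # reaches the small leaf holding one "a"
prune_split(right)             # the split at `right` is removed
after = predict(root, x_new)   # now stops at `right`, which holds all 4 labels
print(before, after)           # a b
```

After pruning, the four observations at `right` are neither discarded nor moved up to the root. They are simply no longer divided further, and the prediction for any new point that reaches `right` is based on all of them.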
Answered by Ben Reiniger on June 27, 2021