Data Science Asked on February 26, 2021
I'm reading about how variants of boosting combine weak learners into a final prediction. The case I'm considering is regression.
In the paper Improving Regressors using Boosting Techniques, the final prediction is the weighted median.
For a particular input $x_{i}$, each of the $T$ machines makes a prediction $h_{t}$, $t=1, \ldots, T$. Obtain the cumulative prediction $h_{f}$ using the $T$ predictors: $$h_{f}=\inf \left\{y \in Y: \sum_{t: h_{t} \leq y} \log \left(1 / \beta_{t}\right) \geq \frac{1}{2} \sum_{t} \log \left(1 / \beta_{t}\right)\right\}$$ This is the weighted median. Equivalently, each machine $h_{t}$ has a prediction $y_{i}^{(t)}$ on the $i$'th pattern and an associated $\beta_{t}$. The predictions are relabeled such that for pattern $i$ we have: $$
y_{i}^{(1)}<y_{i}^{(2)}<\ldots<y_{i}^{(T)} $$ (retain the association of the $\beta_{t}$ with its $y_{i}^{(t)}$). Then sum the $\log \left(1 / \beta_{t}\right)$ until we reach the smallest $t$ so that the inequality is satisfied. The prediction from that machine $t$ we take as the ensemble prediction. If the $\beta_{t}$ were all equal, this would be the median.
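The procedure quoted above can be sketched in a few lines of Python. This is only an illustration of the sort-and-accumulate rule, not code from the paper: the function name `weighted_median` and its arguments are hypothetical, and it assumes every $\beta_{t}$ lies strictly between 0 and 1 so that each weight $\log(1/\beta_{t})$ is positive.

```python
import math


def weighted_median(predictions, betas):
    """Weighted median of predictions, with weight log(1/beta_t) per machine.

    Assumes 0 < beta_t < 1 for every machine (hypothetical helper, not from the paper).
    """
    if len(predictions) != len(betas) or not predictions:
        raise ValueError("need one beta per prediction, and at least one prediction")

    weights = [math.log(1.0 / b) for b in betas]

    # Sort predictions in ascending order, keeping each weight attached
    # to its own prediction (the "retain the association" step).
    pairs = sorted(zip(predictions, weights))

    half_total = 0.5 * sum(weights)
    running = 0.0
    for y, w in pairs:
        running += w
        # Smallest prediction whose cumulative weight reaches half the total.
        if running >= half_total:
            return y
    return pairs[-1][0]  # guard against floating-point round-off


if __name__ == "__main__":
    # Equal betas: the result is the ordinary median.
    print(weighted_median([3.0, 1.0, 2.0], [0.5, 0.5, 0.5]))
    # One very confident machine (small beta) dominates the vote.
    print(weighted_median([1.0, 2.0, 10.0], [0.9, 0.9, 0.01]))
```

With equal $\beta_{t}$ the function returns the plain median, and a machine with a very small $\beta_{t}$ can pull the result to its own prediction, which is the behaviour the quoted passage describes.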
In An Introduction to Statistical Learning: with Applications in R, the final prediction is the weighted average.
As such, I would like to ask whether the way of aggregation is mathematically motivated, or chosen because the researchers feel it is reasonable.
Thank you so much!
The ISL description is of gradient boosting (regression, with MSE as the loss function), not of AdaBoost. There, $\lambda$ is a constant (the shrinkage parameter), not a separate weight for each tree. Since each tree is fitted to the residuals left by the previous trees, we need to add the results to better approximate the true values, not average them.
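To make the "add, don't average" point concrete, here is a minimal sketch of boosting for regression with squared error, using one-split stumps on 1-D inputs in place of full trees. The names `fit_stump`, `boost`, `n_trees` and `shrinkage` are made up for this illustration, and the stump fitter is a simplification rather than the tree-growing procedure ISL uses.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to 1-D inputs by least squares."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # All inputs identical: no split possible, predict the mean residual.
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees=100, shrinkage=0.1):
    """Return a predictor that is the SUM of shrunken stumps."""
    residuals = list(ys)  # start from f(x) = 0, so residuals equal the targets
    stumps = []
    for _ in range(n_trees):
        stump = fit_stump(xs, residuals)  # fit to current residuals, not to ys
        stumps.append(stump)
        # Remove the shrunken contribution of the new stump from the residuals.
        residuals = [r - shrinkage * stump(x) for x, r in zip(xs, residuals)]

    def predict(x):
        # Same constant shrinkage for every stump; contributions are added.
        return sum(shrinkage * s(x) for s in stumps)

    return predict


if __name__ == "__main__":
    predict = boost([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 3.0, 3.0])
    print(predict(1.0), predict(4.0))
```

Because each stump only explains what the earlier ones left over, averaging their outputs would shrink the prediction toward zero; summing them is what recovers the target.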
However, the title question is still an interesting one. The choice of aggregation does seem mostly arbitrary, but at least some testing has been done; see e.g. "Experiments with AdaBoost.RT, an Improved Boosting Scheme for Regression" by Shrestha and Solomatine.
Correct answer by Ben Reiniger on February 26, 2021