Why does AdaBoost SAMME need f to be estimable?

Question

I am trying to understand the mathematics behind SAMME AdaBoost:
https://web.stanford.edu/~hastie/Papers/samme.pdf
At some stage, the paper adds a constraint so that $f$ is estimable:
$$f_1 + \dots + f_K = 0$$

I do not understand why this is required. Can someone explain why this restriction is needed?
Also, would it be possible to use a different constraint from the one in the paper that would still make $f$ estimable?

Nikos M. · Accepted Answer

Think about it for a moment: without the constraint, $f$ can have any constant added to all of its components with no effect on the result of the process, and that is exactly what makes it not estimable.
This means that if no other constraint is imposed, $f$ is not uniquely defined and in fact represents a whole class of functions. This needs to be fixed, so some natural constraint has to be added to pin $f$ down uniquely.
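The invariance can be checked numerically. This is a minimal sketch, assuming the multi-class exponential loss $\exp(-\frac{1}{K} y^\top f)$ and the label coding used in the SAMME paper (1 for the true class, $-1/(K-1)$ for the others). The names `encode_label` and `samme_loss` are illustrative and do not come from the paper or from any library.

```python
import numpy as np

def encode_label(c, K):
    """Code class index c as a K-vector: 1 for the true class, -1/(K-1) elsewhere."""
    y = np.full(K, -1.0 / (K - 1))
    y[c] = 1.0
    return y

def samme_loss(y, f):
    """Multi-class exponential loss exp(-(1/K) * y^T f) for one example."""
    K = len(y)
    return np.exp(-np.dot(y, f) / K)

K = 4
y = encode_label(2, K)                 # true class is index 2
f = np.array([0.3, -1.2, 2.0, 0.5])    # arbitrary scores, one per class
shifted = f + 7.5                      # same scores with a constant added to every component

loss_f = samme_loss(y, f)
loss_shifted = samme_loss(y, shifted)
print(np.isclose(loss_f, loss_shifted))
```

Because the coded label sums to zero, a shift $a$ contributes $a \sum_k y_k = 0$ to the exponent, so the loss cannot distinguish $f$ from $f + a$.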
Alternatively, the additional constraint can be viewed as a free parameter of the procedure, which must nevertheless be fixed in any concrete run of the algorithm.
By analogy with physics, the algorithm is "gauge-invariant", but to solve any concrete problem some "gauge" needs to be chosen and fixed.
The authors choose to impose the symmetric constraint, which reduces to the usual AdaBoost in the 2-class case.
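As a quick check of the 2-class reduction, assume the SAMME loss $\exp(-\frac{1}{K} y^\top f)$ with the paper's label coding. For $K = 2$ the coded label is $(y_1, y_2)$ with $y_2 = -y_1 \in \{-1, +1\}$, and the symmetric constraint $f_1 + f_2 = 0$ gives $f_2 = -f_1$, so
$$\exp\left(-\tfrac{1}{2}\,(y_1 f_1 + y_2 f_2)\right) = \exp\left(-\tfrac{1}{2}\,(y_1 f_1 + y_1 f_1)\right) = \exp(-y_1 f_1),$$
which is the exponential loss of binary AdaBoost with score $f_1$.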
One can impose a constraint on $f$ other than the symmetric one, chosen to satisfy other criteria if desired, as long as it fixes $f$ uniquely.
For example, the general non-symmetric constraint is also valid:
$$f_1 + \dots + f_K = c$$
for an arbitrary constant $c$. This also fixes $f$ uniquely, and it can introduce bias favoring certain classes over others (e.g. for an imbalanced problem). Additionally, unless $c = 0$, it does not reduce to the (symmetric) AdaBoost in the 2-class case (which may or may not be desirable).
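To illustrate how such a constraint picks one representative from each class of equivalent score vectors, here is a minimal sketch. The helper `fix_gauge` is hypothetical (it is not part of the paper or of any library); it simply shifts all components by the same amount so that they sum to a chosen $c$.

```python
import numpy as np

def fix_gauge(f, c=0.0):
    """Shift every component of f by the same constant so that sum(f) == c."""
    f = np.asarray(f, dtype=float)
    return f - (f.sum() - c) / f.size

f = np.array([0.3, -1.2, 2.0, 0.5])
f_sym = fix_gauge(f, 0.0)   # symmetric constraint: components sum to 0
f_c = fix_gauge(f, 3.0)     # non-symmetric constraint: components sum to 3

# Any shifted copy of f collapses to the same representative.
same_rep = np.allclose(fix_gauge(f + 5.0), f_sym)

# 2-class case: only c = 0 yields f_1 = -f_2, as in binary AdaBoost.
g = np.array([1.0, 0.2])
g_sym = fix_gauge(g, 0.0)
g_c = fix_gauge(g, 1.0)
print(same_rep, g_sym, g_c)
```

Under these assumptions every vector of the form $f + a$ maps to a single constrained $f$, which is what "uniquely fixing" means here. The 2-class example shows why a nonzero $c$ breaks the $f_1 = -f_2$ form.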
