What does this paragraph about Stockfish's regression tests mean?

Question

Source
Point 5:

Elo estimates of single patches (SPRT runs) typically come with large error bars. Take this into account when adding Elo estimates. Furthermore, Elo estimates of passing patches are biased. The SPRT Elo estimates are only unbiased if one takes all patches into account, both passed and non-passed ones. As a result, the Elo gain measured by a regression test will typically be less than the sum of the estimated Elo gains of the individual patches since the previous regression test.

I don't understand the highlighted sentences. Can someone explain?

Oscar Smith · Answer

Every test result contains some noise. If you look only at the tests that turned out well, part of the reason they turned out well is likely that the noise happened to be in their favor. So if you re-tested only the passing patches, you would expect regression to the mean: they would appear less good, because "passing patches" is a biased sample. The regression test is effectively that re-test, which is why it typically shows less Elo than the sum of the individual estimates.
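A small simulation can make the effect concrete. This is only a sketch under assumptions that are mine, not Stockfish's: each patch has a true Elo drawn from a normal distribution, each test adds independent normal noise, and a patch "passes" when its measured Elo exceeds a fixed threshold. A real SPRT is a sequential test rather than a fixed cutoff, and the function name `simulate` and all parameter values are made up for illustration.

```python
import random


def simulate(n_patches=20000, true_sd=1.0, noise_sd=2.0, threshold=2.0, seed=0):
    """Compare measured and true Elo for all patches vs. passing patches only.

    Assumed toy model (not the real SPRT): measured = true + Gaussian noise,
    and a patch passes when measured > threshold.
    """
    rng = random.Random(seed)
    all_true, all_measured = [], []
    passed_true, passed_measured = [], []

    for _ in range(n_patches):
        true_elo = rng.gauss(0.0, true_sd)               # real strength change
        measured = true_elo + rng.gauss(0.0, noise_sd)   # noisy test estimate
        all_true.append(true_elo)
        all_measured.append(measured)
        if measured > threshold:                         # keep only "passing" patches
            passed_true.append(true_elo)
            passed_measured.append(measured)

    def mean(xs):
        return sum(xs) / len(xs)

    return {
        # Average overestimate across every patch: close to zero (unbiased).
        "bias_all": mean(all_measured) - mean(all_true),
        # Average overestimate among passing patches only: clearly positive.
        "bias_passed": mean(passed_measured) - mean(passed_true),
        "n_passed": len(passed_true),
    }


if __name__ == "__main__":
    result = simulate()
    print(f"passing patches: {result['n_passed']}")
    print(f"bias over all patches:     {result['bias_all']:+.3f} Elo")
    print(f"bias over passing patches: {result['bias_passed']:+.3f} Elo")
```

Under this toy model the estimates are unbiased when averaged over all patches, but the passing ones are overestimated on average, so summing their measured gains overstates what a later re-measurement would find.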

Answered by Oscar Smith on December 22, 2021
