TransWikia.com

Relation between Cross Validation and Confidence Intervals

Data Science Asked on December 7, 2020

I read somewhere (I have forgotten the source) that ‘in cross-validation, the model with the best scores at a 95% confidence interval is picked’.
But as far as I know from statistics, for a CI (confidence interval) to work, you need a normality assumption about the sampling distribution of the statistic.

Yet that source seems to simply use the results from each fold to construct the sample mean and the confidence interval, without checking whether the central limit theorem (CLT) applies at all. It seems to me this is what people are doing in practice as well:
i) automatically assuming normality of the sample MEANS (instead of checking the sampling distribution)
ii) assuming the CLT is automatically satisfied.
May I know whether this is my misunderstanding, or whether the industry has adopted a norm that is too loose? Thanks.

One Answer

It depends on how the confidence interval (CI) is generated. The most common method builds the CI around a sample mean, under the assumption that the samples are drawn from a normal distribution. However, a CI can be generated for any statistic computed from observed data. An alternative method is bootstrapping: resampling the observed data with replacement and recomputing the statistic each time, which does not require the normality assumption.
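
To make the two approaches concrete, here is a minimal sketch using NumPy. The fold scores are made-up numbers for illustration, not results from any real model, and the function names `t_interval_95` and `bootstrap_percentile_ci` are my own. The first function is the usual mean-based interval, which leans on the normality assumption; the second is a percentile bootstrap, which does not. The critical value 2.262 is the two-sided 95% Student-t value for 9 degrees of freedom, hard-coded here because the example has 10 folds.

```python
import numpy as np


def t_interval_95(scores, t_crit):
    """Mean-based 95% CI; assumes the scores are roughly normal."""
    scores = np.asarray(scores, dtype=float)
    mean = scores.mean()
    # Standard error of the mean, using the sample standard deviation.
    sem = scores.std(ddof=1) / np.sqrt(scores.size)
    return mean - t_crit * sem, mean + t_crit * sem


def bootstrap_percentile_ci(scores, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean; no normality assumption."""
    scores = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row is one resample of the scores, drawn with replacement.
    idx = rng.integers(0, scores.size, size=(n_resamples, scores.size))
    boot_means = scores[idx].mean(axis=1)
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


# Hypothetical accuracy scores from a 10-fold cross-validation run.
fold_scores = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.85, 0.80, 0.81]
T_CRIT_DF9 = 2.262  # two-sided 95% t critical value for n - 1 = 9

t_ci = t_interval_95(fold_scores, T_CRIT_DF9)
boot_ci = bootstrap_percentile_ci(fold_scores)
print("t interval:        ", t_ci)
print("bootstrap interval:", boot_ci)
```

Swapping `mean` for another statistic (for example the median) inside the bootstrap function gives a CI for that statistic instead. Note that scores from k-fold cross-validation are not independent, because the training sets overlap, so either interval should be read as a rough guide rather than an exact 95% statement.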

Answered by Brian Spiering on December 7, 2020
