TransWikia.com

What are the true error and the sample error?

Data Science Asked on April 20, 2021

I am a student studying machine learning, and I am focusing on the concept of evaluating a hypothesis.

What I have seen is that there are two types of error: true error and sample error.

The true error of a hypothesis $h$ with respect to a target function $f$ and a distribution $D$ is the probability that $h$ misclassifies an instance $x$ drawn according to $D$, and it is computed as:

$error_D(h)=\Pr_{x\in D}[f(x)\neq h(x)]$

while the sample error of a hypothesis $h$ with respect to a target function $f$ and a data sample $S$ is the proportion of examples that $h$ misclassifies:

$error_S(h)=\frac{1}{n}\sum_{x\in S}\delta(f(x)\neq h(x))$

where

$\delta(f(x)\neq h(x))=1$ if $f(x)\neq h(x)$ and $0$ otherwise, and $n$ is the number of examples in $S$.
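To make the sample-error definition concrete, here is a minimal sketch in Python. The target `f`, the hypothesis `h`, and the sample `S` are hypothetical toy choices made up for illustration; only the formula itself comes from the definition above.

```python
def sample_error(f, h, sample):
    """Fraction of instances in `sample` on which h disagrees with f."""
    if not sample:
        raise ValueError("sample must be non-empty")
    # delta(f(x) != h(x)) is 1 for a misclassified instance and 0 otherwise
    mistakes = sum(1 for x in sample if f(x) != h(x))
    return mistakes / len(sample)


# Hypothetical target function and hypothesis on integers (assumptions).
def f(x):
    return x >= 5


def h(x):
    return x >= 7


# Hypothetical sample of n = 10 instances.
S = [1, 3, 5, 6, 8, 9, 2, 7, 4, 10]

err = sample_error(f, h, S)
print(err)  # h and f disagree only on x = 5 and x = 6, so this prints 0.2
```

Here `h` and `f` disagree on 2 of the 10 instances, so the sample error is $2/10 = 0.2$. Nothing in this computation requires knowing $D$; it only needs the labelled sample.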

I ask this question because I have not clear what these errors are.

Moreover, I have seen that the true error cannot be computed, while we can compute only the sample error. I don’t understand why.

Can somebody please help me understand?

One Answer

The true error is the probability that an instance drawn at random from the entire distribution is misclassified, while the sample error is the fraction of the sample that is misclassified.

Because the true error is defined over the entire population, it usually cannot be computed directly: we do not know the distribution $D$ and cannot observe every instance. Instead, we evaluate the hypothesis on a sample and use statistical evaluation methods to attach a confidence level to the result. A sample might not be a true representation of the population, so the sample error can differ from the true error. We use sampling methods such as randomised and stratified sampling so that no bias is introduced when choosing the sample.
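The following is a rough simulation sketch of this idea, not a definitive evaluation procedure. It assumes a synthetic setting chosen purely for illustration: $x$ is uniform on $[0, 1)$, the target is `f(x) = x >= 0.5`, and the hypothesis is `h(x) = x >= 0.6`, so the true error is exactly $0.1$. In practice $D$ and $f$ are unknown, which is why only the sample error is available. The interval uses the common normal approximation $error_S(h) \pm 1.96\sqrt{error_S(h)(1-error_S(h))/n}$, which assumes the sample is drawn independently of $h$ and that $n$ is reasonably large (about 30 or more).

```python
import math
import random

# Known only because D, f and h are chosen by hand in this toy setting.
TRUE_ERROR = 0.1


def f(x):
    """Hypothetical target function (assumption)."""
    return x >= 0.5


def h(x):
    """Hypothetical hypothesis; disagrees with f on [0.5, 0.6)."""
    return x >= 0.6


def sample_error(sample):
    """Fraction of the sample on which h disagrees with f."""
    return sum(f(x) != h(x) for x in sample) / len(sample)


def approx_95_interval(err, n):
    """Normal-approximation 95% interval around a sample error."""
    half_width = 1.96 * math.sqrt(err * (1 - err) / n)
    return max(0.0, err - half_width), min(1.0, err + half_width)


rng = random.Random(0)  # fixed seed so the run is repeatable
results = []
for n in (30, 300, 3000, 30000):
    sample = [rng.random() for _ in range(n)]  # n draws from D
    err = sample_error(sample)
    low, high = approx_95_interval(err, n)
    results.append((n, err, low, high))
    print(f"n={n:>5}  sample error={err:.4f}  "
          f"approx. 95% interval=[{low:.4f}, {high:.4f}]")

print(f"true error = {TRUE_ERROR}")
```

Each sample gives a slightly different sample error, and larger samples tend to land closer to the true error. This is why the sample error is treated as an estimate of the true error rather than the true error itself.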

For more, you can refer to this

Correct answer by prashant0598 on April 20, 2021
