TransWikia.com

Statistical significance test in deep learning for regression problems

Data Science Asked by Jorge Amaral on February 10, 2021

I was reading the tutorial “Statistical Significance Test for comparing ML algorithms”, which suggests using k-fold cross-validation and then applying an appropriate statistical test.

Suppose I have a training set, a test set, and two deep neural networks for a regression problem. Since training deep networks takes a long time, a k-fold test procedure would be very costly in terms of computational resources.

I was wondering if it is possible to apply the statistical tests only to the results on the test set.

For example, if the test set has 1000 samples, the two neural networks would be trained on the same training set, and the Wilcoxon test would then be applied to the results of the two networks over the 1000 test points. Is that correct, or do I always need to perform k-fold? Also, can I compute the squared error at each test point (the per-sample term of the MSE) and compare those results?

One Answer

You could use either k-fold cross-validation or the test-set-only approach.

K-fold gives you more robustness because each model is trained and evaluated on several different train/test splits, but a single train/test partition still guards against overfitting, since the models are evaluated on data they were not trained on.

Just make sure you use the same dataset in all the comparisons you make.
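As a rough illustration of the test-set-only approach, here is a minimal sketch that runs a paired Wilcoxon signed-rank test (scipy.stats.wilcoxon) on the per-sample squared errors of two models over the same 1000 test points. The function name compare_on_test_set, the alpha threshold, and the synthetic predictions are assumptions made for the example, not part of the question; in practice pred_a and pred_b would be your two networks' predictions on the shared test set.

```python
import numpy as np
from scipy.stats import wilcoxon


def compare_on_test_set(y_true, pred_a, pred_b, alpha=0.05):
    """Paired Wilcoxon signed-rank test on per-sample squared errors.

    Hypothetical helper: the name and the alpha default are illustrative.
    """
    y_true = np.asarray(y_true, dtype=float)
    # Per-sample squared errors; their means are the two models' MSEs.
    err_a = (np.asarray(pred_a, dtype=float) - y_true) ** 2
    err_b = (np.asarray(pred_b, dtype=float) - y_true) ** 2
    # The test is paired: both error vectors refer to the same test points.
    stat, p_value = wilcoxon(err_a, err_b)
    return {
        "mse_a": float(err_a.mean()),
        "mse_b": float(err_b.mean()),
        "statistic": float(stat),
        "p_value": float(p_value),
        "significant": bool(p_value < alpha),
    }


# Synthetic stand-in for a 1000-sample test set (illustrative data only).
rng = np.random.default_rng(0)
y_true = rng.normal(size=1000)
pred_a = y_true + rng.normal(scale=0.5, size=1000)  # hypothetical model A
pred_b = y_true + rng.normal(scale=0.8, size=1000)  # hypothetical model B (noisier)

result = compare_on_test_set(y_true, pred_a, pred_b)
print(result)
```

Note that a test like this only accounts for variability across test samples. It says nothing about variability from retraining (for example, different random initializations), which is part of what k-fold would capture.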

Answered by Juan Esteban de la Calle on February 10, 2021
