TransWikia.com

What could be the problem leading to the result that a model can never perfectly overfit?

Data Science Asked by CrazyMageQi on May 23, 2021

I tried to fit my model on a small batch of 128 samples for binary classification. The model should be powerful enough, as it has hundreds of thousands of parameters, so it should be able to overfit to 100% accuracy. However, it only fits to 96% at best, about the same as when I train it on 30,000 samples. So I tried the following, but all of it failed:

using a smaller batch of 16 samples: it still cannot overfit;

using different optimizers, including Adam, SGD, and Adagrad, and even resetting the optimizer every 1,000 epochs: not working;

training each epoch only on the samples that are misclassified: not working.

The problem should be with this network, since another, more basic neural network can fit to 100%; this one only reaches 99.2%. The output layer is indeed a sigmoid.
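As a point of comparison (a hypothetical NumPy sketch, not the asker's actual model or data), a small two-layer network with many more parameters than samples will normally memorize a tiny batch of even random labels under plain gradient descent, which is the behavior the question expects:

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, h = 16, 8, 64                        # 16 samples, 8 features, 64 hidden units
X = rng.standard_normal((n, d))
y = rng.integers(0, 2, size=(n, 1)).astype(float)  # random binary labels

W1 = rng.standard_normal((d, h)) * 0.5
b1 = np.zeros(h)
W2 = rng.standard_normal((h, 1)) * 0.1
b2 = np.zeros(1)

lr = 0.5
for step in range(10000):
    # forward pass
    hid = np.tanh(X @ W1 + b1)                     # (n, h)
    p = 1.0 / (1.0 + np.exp(-(hid @ W2 + b2)))     # sigmoid output, (n, 1)

    acc = float(((p > 0.5) == (y > 0.5)).mean())
    if acc == 1.0:                                 # memorized the batch
        break

    # backward pass for binary cross-entropy
    dlogits = (p - y) / n
    dW2 = hid.T @ dlogits
    db2 = dlogits.sum(axis=0)
    dhid = dlogits @ W2.T
    dz1 = dhid * (1.0 - hid ** 2)                  # tanh derivative
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)

    # plain gradient descent: no momentum, no regularization
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(f"training accuracy: {acc:.2f}")
```

If a run on some seed stalls below 100%, lowering the learning rate or adding steps usually fixes it, which mirrors the standard advice for this kind of sanity check.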

Anyone got any idea what could be the problem?

2 Answers

I would guess there are a few samples that aren't necessarily from the same distribution as the rest. I would try identifying outliers and removing them.

Answered by ARandomName on May 23, 2021

Although it is true that you have more parameters than samples, a DNN trains all those parameters at the same time, which makes it harder to overfit. Try reducing the learning rate and using SGD with momentum = 0. In addition, don't forget to remove any kind of regularisation (weight decay, dropout, and the like).

I am assuming you want to keep using 128 samples and the network you have designed, but you can always reduce the number of parameters or test with a standard network (ResNet, Inception, VGG). When I want to overfit a network, I usually take just enough samples for one batch.

Anyhow, if the network already achieves 96%, I would start by reducing the learning rate.
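The learning-rate advice can be seen on a toy problem (a hypothetical sketch, unrelated to the asker's model): plain gradient descent on a one-dimensional quadratic converges only when the step size is small enough, and overshoots indefinitely otherwise.

```python
# Gradient descent on f(w) = w**2, whose gradient is 2*w.
# Each update multiplies w by (1 - 2*lr), so |1 - 2*lr| < 1 is
# required for convergence.
def gd(lr, steps=50, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w
    return w

w_small = gd(lr=0.4)   # |1 - 0.8| = 0.2 < 1: w shrinks toward 0
w_large = gd(lr=1.1)   # |1 - 2.2| = 1.2 > 1: w grows without bound
print(abs(w_small), abs(w_large))
```

A loss that plateaus at 96% can be exactly this kind of overshoot: the iterates bounce around the minimum instead of settling into it, and a smaller step lets them descend the rest of the way.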

Good luck

Answered by Isaac.casm on May 23, 2021
