TransWikia.com

Problem with overfitting

Data Science Asked by paul95 on December 10, 2020

I made a small CNN from scratch to classify barcodes. I have two classes: one for images containing barcodes and one for everything that isn't a barcode (items, animals, landscapes, furniture, people). I get good training accuracy (80%+ after 50 epochs), but my validation accuracy stays constant around 50%, and that is my main problem. I am aiming for around 80% validation accuracy. I have already tried techniques such as dropout, reducing the number of parameters, and data augmentation. My training set is 480 images (240 barcodes, 240 non-barcodes) and my validation set is 120 images (60/60). What should I do to reach about 80% accuracy on the validation set? I use the Keras and TensorFlow frameworks. Here is my CNN:

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 1), padding='same'),
    tf.keras.layers.MaxPool2D(3, 3),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(0.2),

    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPool2D(2, 2),
    tf.keras.layers.BatchNormalization(),

    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPool2D(2, 2),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(0.2),

    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPool2D(2, 2),
    tf.keras.layers.BatchNormalization(),

    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPool2D(2, 2),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(0.2),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
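The data augmentation mentioned in the question can be sketched with Keras' `ImageDataGenerator`; the asker does not list the exact parameters used, so the values below are illustrative assumptions chosen for a barcode task:

```python
import numpy as np
import tensorflow as tf

# Illustrative augmentation settings -- the question does not give the
# exact parameters, so these values are assumptions.
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=15,       # barcodes may appear at slight angles
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
)

# Dummy grayscale batch matching the model's 224x224x1 input.
x = np.random.rand(8, 224, 224, 1).astype("float32")
y = np.zeros(8)

batch_x, batch_y = next(datagen.flow(x, y, batch_size=8))
print(batch_x.shape)  # (8, 224, 224, 1)
```

Augmentation does not change the tensor shape, only the pixel content, so the generator can be dropped into `model.fit` in place of the raw arrays.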

3 Answers

  • If your training accuracy is significantly higher than your validation accuracy, you are overfitting. You might want to try increasing the amount of dropout or using batch normalization.
  • Your test set is very small, leading to high variance in your results. Your model might actually be better than it seems due to this variance. Try cross validation so you can test on more examples.
  • Make sure your train and test set are randomly split. Otherwise your train and test examples might look different, leading to lower performance on the test set.
  • Do error analysis to try to identify what mistakes the model tends to make. Does it have trouble finding bar codes at certain angles, at a certain size, or on a certain background color? Error analysis by itself won't help but it might help you identify why your performance is worse than you want it to be.
  • Use a more powerful model (more parameters). Counterintuitively, bigger neural nets that have more potential to overfit sometimes generalize better than smaller neural nets. We don't yet understand why this is the case but the phenomenon is well known so you might want to see if it's the case here.
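The cross-validation suggestion above can be sketched with scikit-learn's `StratifiedKFold` (scikit-learn is an assumption here; any stratified splitter works). With the 480 balanced training images from the question, five folds each hold out 96 images, 48 per class:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# 480 images, 240 per class, as in the question.
labels = np.array([0] * 240 + [1] * 240)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_sizes = []
for train_idx, val_idx in skf.split(np.zeros(len(labels)), labels):
    # Each validation fold stays balanced: 48 barcodes, 48 non-barcodes.
    fold_sizes.append((len(val_idx), int(labels[val_idx].sum())))

print(fold_sizes)  # five folds of 96 images, 48 positives each
```

Training one model per fold and averaging the five validation accuracies gives a far less noisy estimate than a single 120-image validation set.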

Answered by Reinstate Monica on December 10, 2020

Adding a few things to what @Solomonoff'sSecret has mentioned:

  1. Instead of Dropout, try L2 regularization:

model.add(Dense(64, input_dim=64, kernel_regularizer=regularizers.l2(0.01), activity_regularizer=regularizers.l1(0.01)))

  2. Remove the final dense layers and try global average pooling instead:

keras.layers.GlobalAveragePooling2D(data_format=None)

  3. Try a different optimizer, such as SGD.

  4. Try removing the last dropout layer, the one just before the sigmoid output. I think it is forcing your network to generalize to too large an extent.

  5. A wild guess: a validation accuracy of 50% suggests the model is predicting almost everything as a single class. Check the quality of the images in the validation set and compare them with the training images. Use cross-validation here instead.
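The L2-regularization and global-average-pooling suggestions above can be combined into a rough sketch of a modified classifier head (an illustration only, not the asker's actual model; the layer sizes are assumptions):

```python
import tensorflow as tf
from tensorflow.keras import regularizers

# Sketch of the suggested head: GlobalAveragePooling2D replaces
# Flatten + large Dense layers, and L2 regularization replaces Dropout.
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                           input_shape=(224, 224, 1),
                           kernel_regularizer=regularizers.l2(0.01)),
    tf.keras.layers.MaxPool2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu',
                           kernel_regularizer=regularizers.l2(0.01)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='sgd',
              metrics=['accuracy'])
print(model.output_shape)  # (None, 1)
```

Global average pooling removes most of the parameters that the flattened dense head contributes, which is often the single biggest lever against overfitting on a dataset this small.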

Answered by shivam shah on December 10, 2020

I used the transfer learning technique, choosing InceptionV3 as the pre-trained network. It was a good idea. In addition, I gathered a larger dataset (800 training / 200 validation images). My validation accuracy is above 80% now.
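A transfer-learning setup like the one described might look as follows (a sketch, not the asker's exact code; note that InceptionV3 expects 3-channel input, so grayscale images would need to be stacked to RGB):

```python
import tensorflow as tf

# Pre-trained InceptionV3 base. In practice you would pass
# weights='imagenet'; weights=None here just avoids the download
# in this sketch.
base = tf.keras.applications.InceptionV3(
    include_top=False, weights=None, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained features

model = tf.keras.models.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])
print(model.output_shape)  # (None, 1)
```

With the base frozen, only the small sigmoid head is trained, which is why transfer learning works well even with a few hundred images.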

Answered by paul95 on December 10, 2020
