Low accuracy from a CNN implementation on MNIST data

Data Science Asked by bishopqpalzm on May 6, 2021

I’m trying to implement VGG11 (Model A of Table 1 from this article) on the MNIST dataset, but I’m getting ~10% train and test accuracy (as bad as random guessing). I had to resize the MNIST images from 28×28 to 32×32 to fit the CNN architecture. This is what I did:

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPooling2D, Flatten
from keras import optimizers, utils
from PIL import Image, ImageFilter
import numpy as np
import tensorflow as tf

# Preprocessing

x_size = 6000 # Changed to reduce training time 
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train_ = np.ndarray((x_size, 32, 32))
x_test_ = np.ndarray((x_test.shape[0], 32, 32))

# Resizing inputs to 32x32
for i in [0, x_size-1]:
    im = Image.fromarray(x_train[i], mode=None)
    im = im.resize((32, 32))
    x_train_[i] = np.array(im)
for i in [0,x_test.shape[0]-1]:
    im = Image.fromarray(x_test[i], mode=None)
    im = im.resize((32, 32))
    x_test_[i] = np.array(im)

x_train_ = x_train_.reshape(x_train_.shape[0], 32, 32, 1)
x_test_ = x_test_.reshape(x_test_.shape[0], 32, 32, 1)

y_train = utils.to_categorical(y_train,10)
y_test = utils.to_categorical(y_test,10)
y_train_ = y_train[:x_size]


# Model A (VGG11) of Table 1: ConvNet configurations from paper arXiv:1409.1556v6

model = Sequential()
model.add(Conv2D(64, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same', input_shape=(32, 32, 1), data_format='channels_last'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2,2)))
model.add(Conv2D(128, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2,2)))
model.add(Conv2D(256, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'))
model.add(Conv2D(256, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2,2)))
model.add(Conv2D(512, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'))
model.add(Conv2D(512, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2,2)))
model.add(Conv2D(512, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'))
model.add(Conv2D(512, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2,2)))
model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(Dense(4096, activation='relu'))
model.add(Dense(1000, activation='relu'))
model.add(Dense(10, activation='softmax'))

# Model compilation

model.compile(loss='categorical_crossentropy', optimizer=optimizers.SGD(lr=0.001, momentum=0.9, nesterov=True, clipnorm=1.), metrics=['accuracy'])

# Model fitting

model.fit(x_train_, y_train_, epochs=1, batch_size=32)

# Model evaluation

score = model.evaluate(x_train_, y_train_)
print('Train loss after 1 epoch:', score[0])
print('Train accuracy after 1 epoch:', score[1]) 

I’ve tried normalizing the input, changing the training set size, increasing the number of epochs, changing the FC/filter sizes, and changing optimizers (and the learning rate). Train accuracy is equally low in both the evaluation report and TensorFlow’s History report. I’m expecting >95% accuracy. What am I doing wrong?

2 Answers

Try adding dropout to the network to avoid overfitting. Read the docs for more information: https://keras.io/layers/core/

Try these things as well.

Since the targets are integers, it's better to use sparse_categorical_crossentropy than categorical_crossentropy, with Adam as the optimizer. Note that this loss expects the integer labels themselves, so the utils.to_categorical calls in the question would have to be dropped.

model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizers.Adam(lr=0.001), metrics=['accuracy'])
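
To make the label-format point concrete, here is a minimal NumPy sketch (not Keras code) of what the two losses compute. The helper functions and the probability values are made up for illustration; they only mirror the definitions of the Keras losses of the same name.

```python
import numpy as np

def categorical_crossentropy(one_hot, probs):
    # Expects one-hot targets of shape (n_samples, n_classes).
    return -np.sum(one_hot * np.log(probs), axis=1)

def sparse_categorical_crossentropy(labels, probs):
    # Expects integer class ids of shape (n_samples,).
    return -np.log(probs[np.arange(len(labels)), labels])

# Integer targets, the format mnist.load_data() returns.
labels = np.array([2, 0, 1])
# One-hot version, the format utils.to_categorical produces.
one_hot = np.eye(3)[labels]
# Made-up softmax outputs for three samples and three classes.
probs = np.array([[0.1, 0.2, 0.7],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.4, 0.3]])

dense_loss = categorical_crossentropy(one_hot, probs)
sparse_loss = sparse_categorical_crossentropy(labels, probs)
print(np.allclose(dense_loss, sparse_loss))
```

Both functions give the same per-sample loss, so the choice is about label format (and memory) rather than about the value being optimized. What matters is that the loss and the label format match.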

Also try using a sigmoid activation function for the output layer:

model.add(Dense(10, activation='sigmoid'))
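
For reference, the two candidate output activations normalize their inputs differently. The sketch below is plain NumPy with made-up logits, not the Keras layers themselves: softmax produces one distribution over the classes, while sigmoid scores each class independently.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up pre-activation values for three classes.
logits = np.array([2.0, 1.0, -1.0])

soft = softmax(logits)
sig = sigmoid(logits)

print(soft, soft.sum())  # entries sum to 1 (up to float rounding)
print(sig, sig.sum())    # entries are independent and need not sum to 1
```

Both activations rank the classes in the same order here, so the predicted class is the same. The difference is in how the outputs interact with the loss during training.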

Answered by Rajith Thennakoon on May 6, 2021

What did you increase the epochs to? You are trying to train VGG11 from scratch, and it has over 30 million parameters, so it is expected to take a long time. Or are you trying to use transfer learning, taking the pre-trained weights and freezing all the layers but the last one to use for your classification problem? In that case you are right to expect over 95% accuracy after a few epochs.

I don't know where to find the pre-trained VGG11 for TensorFlow, but here is the one for PyTorch.
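
The parameter count can be checked by hand for the model exactly as written in the question (32×32×1 input, including the extra Dense(1000) layer). This is a rough sketch in plain Python; the helper names are made up for illustration, and it assumes 3×3 kernels with biases throughout, as in the question's code.

```python
def conv_params(in_ch, out_ch, k=3):
    # k*k weights per input/output channel pair, plus one bias per filter.
    return k * k * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    # Full weight matrix plus one bias per output unit.
    return n_in * n_out + n_out

# Output channels of the eight Conv2D layers in the question's model.
conv_channels = [64, 128, 256, 256, 512, 512, 512, 512]
in_ch = 1  # grayscale input
conv_total = 0
for out_ch in conv_channels:
    conv_total += conv_params(in_ch, out_ch)
    in_ch = out_ch

# Five 2x2 max-poolings shrink 32x32 to 1x1, so Flatten yields 512 values.
side = 32 // 2 ** 5
flat = side * side * conv_channels[-1]

# Units of the four Dense layers in the question's model.
dense_sizes = [4096, 4096, 1000, 10]
n_in = flat
dense_total = 0
for n_out in dense_sizes:
    dense_total += dense_params(n_in, n_out)
    n_in = n_out

total = conv_total + dense_total
print(conv_total, dense_total, total)
```

For this variant the total comes to about 32 million, with the larger share in the dense layers, which is consistent with the "over 30 million" figure above.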

Answered by serali on May 6, 2021
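
One more thing worth checking, which neither answer mentions: the resizing loops in the question iterate over the two-element list [0, x_size-1], not over range(x_size). Only the first and last images get resized, and every other row of x_train_ keeps whatever np.ndarray left in that uninitialized memory. That alone could explain chance-level accuracy. The sketch below shows the difference with NumPy only. The random array is a stand-in for the MNIST images, and zero-padding to 32×32 is a technique swapped in here to avoid the PIL dependency; with the original PIL code the fix is simply to loop over range(x_size) and range(x_test.shape[0]).

```python
import numpy as np

x_size = 6000

# The loop in the question visits only two indices.
visited_buggy = [i for i in [0, x_size - 1]]
# range() visits every index.
visited_fixed = list(range(x_size))
print(len(visited_buggy), len(visited_fixed))

# Stand-in for the MNIST training images (assumption: random uint8 data).
rng = np.random.default_rng(0)
x_train = rng.integers(0, 256, size=(x_size, 28, 28), dtype=np.uint8)

# Pad 2 pixels on each side: 28x28 -> 32x32 for every image at once.
# This replaces the PIL resize, so no per-image loop is needed.
x_train_ = np.pad(x_train, ((0, 0), (2, 2), (2, 2)), mode="constant")

# Scale to [0, 1] and add the channel axis expected by Conv2D.
x_train_ = x_train_.astype("float32") / 255.0
x_train_ = x_train_[..., np.newaxis]
print(x_train_.shape)
```

If the loops are kept, allocating with np.zeros instead of np.ndarray at least makes unfilled rows easy to spot.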
