
CNN Multi-Class Model Only Predicts 1 class for all test images

Data Science Asked on May 17, 2021

I am trying to build a CNN model to predict 42 classes. I used pre-trained models for this. I used Xception.

This is how I have imported my dataset:

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1.0/255.0,
                                   zoom_range=0.2,
                                   shear_range=0.2,
                                   horizontal_flip=True,
                                   validation_split=0.2)
training_data = train_datagen.flow_from_directory(train_path,
                                                  target_size=(299, 299),
                                                  batch_size=32,
                                                  class_mode='categorical',
                                                  subset='training')
validation_data = train_datagen.flow_from_directory(train_path,
                                                    target_size=(299, 299),
                                                    batch_size=32,
                                                    class_mode='categorical',
                                                    subset='validation')

I then built my model:

import keras
from keras.models import Sequential
from keras.layers import Flatten, Dense, Dropout

prior = keras.applications.Xception(include_top=False, weights='imagenet', input_shape=(299, 299, 3))
model = Sequential()
model.add(prior)
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(42, activation='softmax', name='Output'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

[Screenshot: model summary showing about 73 million trainable parameters]

My model performs quite well during training, but when I predict on the test set it assigns the same class to every image, and I do not know why.

This is the training log:

Epoch 1/3
2636/2636 [==============================] - ETA: 0s - loss: 1.9625 - accuracy: 0.4816
Epoch 00001: val_accuracy improved from -inf to 0.56412, saving model to xception.hdf5
2636/2636 [==============================] - 5041s 2s/step - loss: 1.9625 - accuracy: 0.4816 - val_loss: 1.6792 - val_accuracy: 0.5641
Epoch 2/3
2636/2636 [==============================] - ETA: 0s - loss: 1.4015 - accuracy: 0.6224
Epoch 00002: val_accuracy improved from 0.56412 to 0.64584, saving model to xception.hdf5
2636/2636 [==============================] - 5101s 2s/step - loss: 1.4015 - accuracy: 0.6224 - val_loss: 1.3240 - val_accuracy: 0.6458
Epoch 3/3
2636/2636 [==============================] - ETA: 0s - loss: 1.2084 - accuracy: 0.6747
Epoch 00003: val_accuracy did not improve from 0.64584
2636/2636 [==============================] - 4968s 2s/step - loss: 1.2084 - accuracy: 0.6747 - val_loss: 1.4000 - val_accuracy: 0.6373

However, when I predict, every image gets the same category:

['39',
 '39',
 '39',
 '39',
 '39',
 '39',
 '39',
 '39',
 '39',
 '39',
 '39',
 '39',
 '39',
 '39',
 '39',
 '39',

Any help is appreciated. I would also like to know how to use pre-trained models properly for this.

2 Answers

Your model has about 73 million trainable parameters.

- When using transfer learning, first freeze the base model (see the sketch after this list)
- Train until you reach good accuracy
- Then unfreeze it and train for just a few epochs, keeping the learning rate small
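
As a minimal sketch, the freezing step applied to the question's model (reusing the prior variable defined there) would look like this; note that trainable must be set before compiling for it to take effect:

prior.trainable = False   # freezes every layer inside the Xception base; do this before model.compile

To fine-tune afterwards, set prior.trainable = True again, recompile with a small learning rate, and train for a few more epochs.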

Other probable issues:
- If your labels are not one-hot encoded, use sparse_categorical_crossentropy (see the snippet after this list)
- Add validation_split in the fit method
- I suggest adding a keras.layers.GlobalAveragePooling2D after the base model instead of flattening it
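
As a quick illustration of the loss choice, assuming the labels come from flow_from_directory as in the question:

# class_mode='categorical' yields one-hot labels -> categorical_crossentropy
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# class_mode='sparse' yields integer labels -> sparse_categorical_crossentropy
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])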

"setting include_top=False: this excludes the global average pooling layer and the dense output layer"


You can use this code as guidance:

from tensorflow import keras

base_model = keras.applications.xception.Xception(weights="imagenet", include_top=False)
avg = keras.layers.GlobalAveragePooling2D()(base_model.output)
output = keras.layers.Dense(n_classes, activation="softmax")(avg)
model = keras.Model(inputs=base_model.input, outputs=output)

# Stage 1: freeze the base model and train only the new head
for layer in base_model.layers:
    layer.trainable = False

optimizer = keras.optimizers.SGD(lr=0.2, momentum=0.9, decay=0.01)
model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer, metrics=["accuracy"])
history = model.fit(train_set, epochs=15, validation_data=valid_set)

# Stage 2: unfreeze the base model and fine-tune everything with a smaller learning rate
for layer in base_model.layers:
    layer.trainable = True

optimizer = keras.optimizers.SGD(lr=0.01, momentum=0.9, decay=0.001)
model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer, metrics=["accuracy"])
history = model.fit(train_set, epochs=5, validation_data=valid_set)

Code Ref - Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition

Answered by 10xAI on May 17, 2021

I hope you are applying the same preprocessing steps to the dataset that you use for prediction. I remember getting this kind of prediction a long time back, and I think it came down to either not applying the same preprocessing pipeline or building the label map incorrectly.
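
For example, a minimal sketch of a matching test pipeline (the test_path directory is a hypothetical stand-in), applying only the training-time rescaling and keeping shuffle=False so predictions stay aligned with test_data.filenames:

test_datagen = ImageDataGenerator(rescale=1.0/255.0)   # same rescaling as training, but no augmentation
test_data = test_datagen.flow_from_directory(test_path,
                                             target_size=(299, 299),
                                             batch_size=32,
                                             class_mode=None,   # no labels, prediction only
                                             shuffle=False)     # preserve file order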

Since you are using an image generator, the label mapping should be easy:

label_map = train_generator.class_indices
label_map = dict((v, k) for k, v in label_map.items())  # flip k, v
predictions = [label_map[k] for k in yFit]

Here yFit is an array generated by model.predict().
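
Note that model.predict() on a softmax output returns per-class probabilities, so (as an assumption about how yFit above was produced) you would typically take the argmax first:

import numpy as np

probs = model.predict(test_data)     # shape (num_samples, 42): one probability per class
yFit = np.argmax(probs, axis=1)      # integer index of the most probable class per image
predictions = [label_map[k] for k in yFit]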

Answered by vivek on May 17, 2021
