How can I find out what class each of the columns in the probabilities output corresponds to using Keras for a multi-class classification problem?

Question

I'm using transfer learning to build an image recognition model with a pre-trained VGG network in Keras, excluding the final fully-connected layer to get the output features. I then feed these features into my new model, which has a few layers along with a new fully-connected layer of my own that I'm training. The fully-connected layer maps to the number of output classes that I'm trying to predict.

Everything is working fine.  However, when I run:

results = model.predict(img_tensor)

I get output probabilities corresponding to each class, similar to the below:

print(results)

[[0.1426621  0.6193871  0.23795079]
 [0.11187755 0.6208466  0.2672758 ]
 [0.10050113 0.3768951  0.52260375]
 [0.1338948  0.59470254 0.27140263]
 [0.06612041 0.69726    0.2366195 ]
 [0.12080433 0.495977   0.38321865]]

My question is: How can I find out which class each of the columns in the probabilities output corresponds to?

Does Keras have anything built-in to identify which column of the output probabilities corresponds to which class?  I would be shocked if nothing is provided...

What have others done to create a work-around?

Ryan Chase · Answer

The iterator returned by Keras' flow_from_directory(directory) has a class_indices attribute: a dictionary that maps each class name to its index in the output array. From the documentation of the classes argument:

classes: optional list of class subdirectories (e.g. ['dogs', 'cats']). Default: None. If not provided, the list of classes will be
automatically inferred from the subdirectory names/structure under
directory, where each subdirectory will be treated as a different
class (and the order of the classes, which will map to the label
indices, will be alphanumeric). The dictionary containing the mapping
from class names to class indices can be obtained via the attribute
class_indices.

Source
I would be interested to see how people have incorporated this into their workflow/scripts though...
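One way to fold this into a workflow, as a minimal sketch: save the class_indices dictionary next to the model at training time, then load and invert it in the prediction script. The dictionary contents and the file name below are hypothetical stand-ins; in a real script the dictionary would come from the generator's class_indices attribute.

```python
import json
import os
import tempfile

# Hypothetical mapping, standing in for train_generator.class_indices
class_indices = {"cats": 0, "dogs": 1, "horses": 2}

# Training time: save the mapping (the file name is an arbitrary choice)
path = os.path.join(tempfile.mkdtemp(), "class_indices.json")
with open(path, "w") as f:
    json.dump(class_indices, f)

# Inference time: load the mapping and invert it to column index -> class name
with open(path) as f:
    loaded = json.load(f)
index_to_class = {index: name for name, index in loaded.items()}

print(index_to_class)
```

Keeping the mapping in a file avoids having to rebuild the generator (and keep the training directory around) just to interpret predictions.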

Aditya · Answer

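You can select the most probable class for each row with np.argmax. (Keras 1.x shipped a probas_to_classes() utility in keras.utils.np_utils for this, but it was removed in Keras 2.) Example:

import numpy as np

y_proba = model.predict(x)
y_classes = np.argmax(y_proba, axis=-1)  # index of the highest-probability column in each row

This is equivalent to what model.predict_classes(x) returned on the Sequential model in older Keras versions.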
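As a worked illustration, here is the column selection applied to the probabilities printed in the question. It only uses NumPy, so it runs without a model; the result is a column index per row, which still has to be mapped to a class name.

```python
import numpy as np

# The output probabilities printed in the question
results = np.array([
    [0.1426621, 0.6193871, 0.23795079],
    [0.11187755, 0.6208466, 0.2672758],
    [0.10050113, 0.3768951, 0.52260375],
    [0.1338948, 0.59470254, 0.27140263],
    [0.06612041, 0.69726, 0.2366195],
    [0.12080433, 0.495977, 0.38321865],
])

# Index of the largest probability in each row
predicted_columns = np.argmax(results, axis=-1)
print(predicted_columns)  # [1 1 2 1 1 1]
```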

When using generators:

generator = train_datagen.flow_from_directory("train", batch_size=batch_size)
label_map = generator.class_indices

The label_map variable is a dictionary that maps each class name to its column index, so the relation between the probability scores and the class names can be read directly from it.
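A minimal sketch of deriving that relation, assuming a label_map like the one a generator would return. The class names here are hypothetical, and the two probability rows are taken from the output in the question.

```python
import numpy as np

# Hypothetical mapping, standing in for generator.class_indices
label_map = {"cats": 0, "dogs": 1, "horses": 2}

# Class names ordered by their column index
class_names = [name for name, _ in sorted(label_map.items(), key=lambda kv: kv[1])]

# Two rows from the output shown in the question
results = np.array([
    [0.1426621, 0.6193871, 0.23795079],
    [0.10050113, 0.3768951, 0.52260375],
])

predicted = []
for row in results:
    scores = dict(zip(class_names, row.tolist()))  # class name -> probability
    best = class_names[int(np.argmax(row))]        # most probable class name
    predicted.append(best)
    print(best, scores)
```

Sorting by the index value, rather than relying on dictionary order, keeps the name list aligned with the output columns even if the mapping was loaded from elsewhere.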

Basically, you can also create this dictionary yourself from the names of the class subfolders:

import os

data_dir = "train"  # directory that contains one subfolder per class
# Keep only subfolders, sorted by name
class_names = sorted(d for d in os.listdir(data_dir) if os.path.isdir(os.path.join(data_dir, d)))
name_id_map = dict(zip(class_names, range(len(class_names))))
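To see what a hand-built mapping looks like, here is a self-contained sketch that creates a throwaway directory tree and builds the dictionary from it. The folder names are hypothetical, and the sketch assumes classes are ordered by sorted subfolder name, as the quoted documentation describes; when a generator is available, it is worth comparing the result against its class_indices.

```python
import os
import tempfile

# Throwaway directory tree; the class folder names are hypothetical
data_dir = tempfile.mkdtemp()
for name in ["dogs", "cats", "horses"]:
    os.mkdir(os.path.join(data_dir, name))

# A stray file that should not be treated as a class
open(os.path.join(data_dir, "README.txt"), "w").close()

# Keep only subfolders and sort them by name
class_names = sorted(
    entry for entry in os.listdir(data_dir)
    if os.path.isdir(os.path.join(data_dir, entry))
)
name_id_map = {name: index for index, name in enumerate(class_names)}
print(name_id_map)  # {'cats': 0, 'dogs': 1, 'horses': 2}
```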

Hope this helps..
