Data Science Asked by Raghuram on March 27, 2021
In the Keras blog on training convnets from scratch, the code shows only the network running on training and validation data. What about test data? Is the validation data the same as test data (I think not). If there was a separate test folder on similar lines as the train and validation folders, how do we get a confusion matrix for the test data. I know that we have to use scikit learn or some other package to do this, but how do I get something along the lines of class wise probabilities for test data? I am hoping to use this for the confusion matrix.
To get a confusion matrix from the test data you should go througt two steps:
For example, use model.predict_generator
to predict the first 2000 probabilities from the test generator.
generator = datagen.flow_from_directory(
'data/test',
target_size=(150, 150),
batch_size=16,
class_mode=None, # only data, no labels
shuffle=False) # keep data in same order as labels
probabilities = model.predict_generator(generator, 2000)
For example, compare the probabilities with the case that there are 1000 cats and 1000 dogs respectively.
from sklearn.metrics import confusion_matrix
y_true = np.array([0] * 1000 + [1] * 1000)
y_pred = probabilities > 0.5
confusion_matrix(y_true, y_pred)
Additional note on test and validation data
The Keras documentation uses three different sets of data: training data, validation data and test data. Training data is used to optimize the model parameters. The validation data is used to make choices about the meta-parameters, e.g. the number of epochs. After optimizing a model with optimal meta-parameters the test data is used to get a fair estimate of the model performance.
Correct answer by Pieter on March 27, 2021
For confusion matrix you have to use sklearn package. I don't think Keras can provide a confusion matrix. For predicting values on the test set, simply call the model.predict() method to generate predictions for the test set. The type of output values depends on your model type i.e. either discrete or probabilities.
Answered by enterML on March 27, 2021
Here is some code I tried and worked for me:
pred= model.predict_generator(validation_generator, nb_validation_samples // batch_size)
predicted_class_indices=np.argmax(pred,axis=1)
labels=(validation_generator.class_indices)
labels2=dict((v,k) for k,v in labels.items())
predictions=[labels2[k] for k in predicted_class_indices]
print(predicted_class_indices)
print(labels)
print(predictions)
You can then use:
print (confusion matrix(predicted_class_indices,labels)
Make sure you use shuffle=False
in your test generator (in my case it's validation generator) and reset it using validation_generator.reset()
before you make your predictions.
Answered by ranahd on March 27, 2021
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP