Data Science Asked on June 14, 2021
I am fairly new to ML and CNNs, and this is my first attempt. I have managed to get my model to run, and now I am trying to produce a confusion matrix and classification report, but I am receiving an error. I have put the code below for you to see. For context, my dataset, Malimg, contains 9,339 images, and I am classifying each image into one of 25 families (classes).
The code:
import numpy as np
import graphviz
import keras
from sklearn.utils import class_weight
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Activation, Conv2D, MaxPooling2D, BatchNormalization
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from keras.preprocessing.image import ImageDataGenerator
from ann_visualizer.visualize import ann_viz
from matplotlib import pyplot as plt
from sklearn.metrics import multilabel_confusion_matrix, classification_report, confusion_matrix
path_to_data = "malimg_dataset"
train_test_generate_batches = ImageDataGenerator()
batch = train_test_generate_batches.flow_from_directory(directory=path_to_data, target_size=(64,64), batch_size=10000)
imgs, labels = next(batch) #generates batches of data from the path of the directory
X_train, X_test, y_train, y_test = train_test_split(imgs/255.,labels, train_size=0.7, test_size=0.3) #splits the dataset into training and testing samples with 30% of the overall samples being test data.
num_classes = 25
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(64,64,3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(16, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.25))
model.add(Dropout(0.5))
model.add(Dense(50, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy'])
y_train_new = np.argmax(y_train, axis=1)
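balanced = class_weight.compute_class_weight(class_weight='balanced', classes=np.unique(y_train_new), y=y_train_new) #keyword arguments are required by newer scikit-learn releases
class_weights = dict(zip(np.unique(y_train_new), balanced)) #Keras expects a {class_index: weight} mapping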
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, class_weight=class_weights)
scores = model.evaluate(X_test, y_test)
model.summary()
Y_pred = np.argmax(model.predict(X_test),axis=1)
print('Confusion Matrix')
print(multilabel_confusion_matrix(y_test, Y_pred))
print('Classification Report')
The error I receive is:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-25-f1724cc7d7f9> in <module>
1 Y_pred = np.argmax(model.predict(X_test),axis=1)
2 print('Confusion Matrix')
----> 3 print(multilabel_confusion_matrix(y_test, Y_pred))
~/opt/anaconda3/envs/projectnew/lib/python3.6/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
70 FutureWarning)
71 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72 return f(**kwargs)
73 return inner_f
74
~/opt/anaconda3/envs/projectnew/lib/python3.6/site-packages/sklearn/metrics/_classification.py in multilabel_confusion_matrix(y_true, y_pred, sample_weight, labels, samplewise)
436
437 """
--> 438 y_type, y_true, y_pred = _check_targets(y_true, y_pred)
439 if sample_weight is not None:
440 sample_weight = column_or_1d(sample_weight)
~/opt/anaconda3/envs/projectnew/lib/python3.6/site-packages/sklearn/metrics/_classification.py in _check_targets(y_true, y_pred)
89 if len(y_type) > 1:
90 raise ValueError("Classification metrics can't handle a mix of {0} "
---> 91 "and {1} targets".format(type_true, type_pred))
92
93 # We can't have more than one value on y_type => The set is no more needed
ValueError: Classification metrics can't handle a mix of multilabel-indicator and multiclass targets
Could anyone explain what is going on, and perhaps suggest what the code should look like? This is my first attempt, but I suspect it has something to do with having 25 different classes. I was initially using confusion_matrix and then tried multilabel_confusion_matrix, thinking it might solve the issue, but it did not.
I figured it out. You have to convert the test labels from one-hot encoding to integer class indices, so that they are in the same format as the predictions.
To achieve this I changed the confusion matrix code from:
Y_pred = np.argmax(model.predict(X_test),axis=1)
print('Confusion Matrix')
print(multilabel_confusion_matrix(y_test, Y_pred))
print('Classification Report')
To:
y_test_arg=np.argmax(y_test,axis=1)
Y_pred = np.argmax(model.predict(X_test),axis=1)
print('Confusion Matrix')
print(confusion_matrix(y_test_arg, Y_pred))
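For anyone who wants to see the mismatch in isolation, here is a minimal sketch on made-up data. The arrays below are toy stand-ins (3 classes, 6 samples) and `probs` plays the role of `model.predict(X_test)`; none of the values come from Malimg. It uses scikit-learn's `type_of_target` to show why the two arrays are treated differently, and it assumes `classification_report` accepts the same integer labels as `confusion_matrix`, which is how I understand it to work.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.utils.multiclass import type_of_target

# Toy one-hot test labels (assumed values): 3 classes, 6 samples.
y_test = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
])

# Toy softmax outputs standing in for model.predict(X_test).
probs = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.3, 0.6],
    [0.3, 0.6, 0.1],  # true class 0, predicted class 1
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
])

Y_pred = np.argmax(probs, axis=1)

# scikit-learn infers a different target type for each array.
true_type = type_of_target(y_test)   # one-hot matrix
pred_type = type_of_target(Y_pred)   # 1-D integer labels

# Mixing the two formats is rejected with a ValueError.
try:
    confusion_matrix(y_test, Y_pred)
    mixed_raises = False
except ValueError:
    mixed_raises = True

# Collapse the one-hot labels to class indices so both sides match.
y_test_arg = np.argmax(y_test, axis=1)
cm = confusion_matrix(y_test_arg, Y_pred)
report = classification_report(y_test_arg, Y_pred)

print(true_type, pred_type, mixed_raises)
print(cm)
print(report)
```

The same `y_test_arg` and `Y_pred` can then be passed to `classification_report` in the original script to fill in the part after `print('Classification Report')`.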
Answered by Jack on June 14, 2021