TransWikia.com

ValueError: Classification metrics can't handle a mix of multilabel-indicator and multiclass targets in CNN

Data Science Asked on June 14, 2021

I am fairly new to ML and CNNs, and this is my first attempt. I have managed to get my model to run, and now I am trying to produce a confusion matrix and a classification report, but I am receiving an error. I have put the code below. For context, my dataset, Malimg, contains 9,339 images, and I am classifying them by the family they belong to; there are 25 families (classes) in total.

The code:

import numpy as np
import graphviz
import keras
from sklearn.utils import class_weight
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Activation, Conv2D, MaxPooling2D, BatchNormalization
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import ImageDataGenerator
from ann_visualizer.visualize import ann_viz
from matplotlib import pyplot as plt
from sklearn.metrics import multilabel_confusion_matrix, classification_report, confusion_matrix


path_to_data = "malimg_dataset"


train_test_generate_batches = ImageDataGenerator()
batch = train_test_generate_batches.flow_from_directory(directory=path_to_data, target_size=(64,64), batch_size=10000)
imgs, labels = next(batch) #batch_size exceeds the 9,339 images, so one call loads the whole dataset; labels are one-hot encoded (class_mode defaults to 'categorical')
X_train, X_test, y_train, y_test = train_test_split(imgs/255.,labels, train_size=0.7, test_size=0.3) #splits the dataset into training and testing samples with 30% of the overall samples being test data.


num_classes = 25


model = Sequential()


model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(64,64,3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(16, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.25))
model.add(Dropout(0.5))
model.add(Dense(50, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy'])


y_train_new = np.argmax(y_train, axis=1)
class_weights = class_weight.compute_class_weight('balanced', np.unique(y_train_new), y_train_new)


history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10,  class_weight=class_weights)
scores = model.evaluate(X_test, y_test)

model.summary()

Y_pred = np.argmax(model.predict(X_test),axis=1)
print('Confusion Matrix')
print(multilabel_confusion_matrix(y_test, Y_pred))
print('Classification Report')

The error I receive is:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-25-f1724cc7d7f9> in <module>
      1 Y_pred = np.argmax(model.predict(X_test),axis=1)
      2 print('Confusion Matrix')
----> 3 print(multilabel_confusion_matrix(y_test, Y_pred))

~/opt/anaconda3/envs/projectnew/lib/python3.6/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74 

~/opt/anaconda3/envs/projectnew/lib/python3.6/site-packages/sklearn/metrics/_classification.py in multilabel_confusion_matrix(y_true, y_pred, sample_weight, labels, samplewise)
    436 
    437     """
--> 438     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    439     if sample_weight is not None:
    440         sample_weight = column_or_1d(sample_weight)

~/opt/anaconda3/envs/projectnew/lib/python3.6/site-packages/sklearn/metrics/_classification.py in _check_targets(y_true, y_pred)
     89     if len(y_type) > 1:
     90         raise ValueError("Classification metrics can't handle a mix of {0} "
---> 91                          "and {1} targets".format(type_true, type_pred))
     92 
     93     # We can't have more than one value on y_type => The set is no more needed

ValueError: Classification metrics can't handle a mix of multilabel-indicator and multiclass targets

Could anyone explain what is going on, and perhaps suggest what the code should look like? This is my first attempt, and I suspect the error has something to do with my having 25 different classes. I was initially using confusion_matrix and then tried multilabel_confusion_matrix, thinking it might solve the issue, but it did not.

One Answer

I figured it out. You have to convert the test labels from one-hot encoding to integer class indices, so that they have the same format as the predictions.
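As a minimal sketch of what that conversion does, the snippet below applies np.argmax to a small one-hot array. The array is made up for illustration (4 samples, 3 classes) and is not taken from the Malimg data.

```python
import numpy as np

# Hypothetical one-hot labels for 4 samples and 3 classes (illustrative only).
y_one_hot = np.array([
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
])

# argmax along axis 1 returns, for each row, the column index holding the 1,
# which is the integer class label of that sample.
y_indices = np.argmax(y_one_hot, axis=1)

print(y_one_hot.shape)  # (4, 3): one row per sample, one column per class
print(y_indices)        # [0 2 1 2]: one integer per sample
```

The same call on y_test should turn the (n_samples, 25) one-hot matrix into a 1-D array of class indices, matching the shape of Y_pred.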

To achieve this I changed the confusion matrix code from:

Y_pred = np.argmax(model.predict(X_test),axis=1)
print('Confusion Matrix')
print(multilabel_confusion_matrix(y_test, Y_pred))
print('Classification Report')

To:

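# Convert the one-hot test labels to integer class indices
y_test_arg = np.argmax(y_test, axis=1)
# Convert the predicted class probabilities to integer class indices
Y_pred = np.argmax(model.predict(X_test), axis=1)
print('Confusion Matrix')
print(confusion_matrix(y_test_arg, Y_pred))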
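To see why the original call failed, it may help to ask scikit-learn how it classifies each array. The sketch below uses small made-up arrays in place of y_test and model.predict(X_test); the names y_test_demo and probs_demo are hypothetical, and three classes stand in for the 25 families. It assumes numpy and scikit-learn are installed.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-ins for y_test (one-hot) and model.predict(X_test) (probabilities).
y_test_demo = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
])
probs_demo = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.3, 0.4, 0.3],
    [0.2, 0.5, 0.3],  # misclassified: true class 2, predicted class 1
])

y_pred_demo = np.argmax(probs_demo, axis=1)

# scikit-learn infers a "target type" for each array and refuses to mix them.
true_type = type_of_target(y_test_demo)  # 2-D array of 0/1 values
pred_type = type_of_target(y_pred_demo)  # 1-D array of integer labels
print(true_type, pred_type)

# Mixing the two formats reproduces the error from the question.
try:
    confusion_matrix(y_test_demo, y_pred_demo)
    mix_error = None
except ValueError as exc:
    mix_error = str(exc)
print(mix_error)

# Converting the one-hot labels makes both arrays the same target type.
y_true_demo = np.argmax(y_test_demo, axis=1)
cm = confusion_matrix(y_true_demo, y_pred_demo)
print(cm)
print(classification_report(y_true_demo, y_pred_demo))
```

With both arrays as integer class indices, classification_report accepts the same two arguments, which would cover the report that the original code printed a header for but never produced.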

Answered by Jack on June 14, 2021
