TransWikia.com

What does the classification report interpret? Class 1 indicates abnormal data

Data Science Asked by prnai on February 4, 2021

How to interpret the report and How is precision, recall values are calculated for individual class labels. What is the significance of macro avg ? Does this report signify a good predictions by the model?

enter image description here

2 Answers

Classification report gives a perspective of your model performance. The 1st row shows the scores for class 0. The column 'support' displays how many object of class 0 were in the test set. The 2nd row provides info on the model performance for class 1.

Recall for each class is calculated as follows: True Positives/(True Positives + False Negatives) E.g., recall for class 1: 15/(15+27) = 0.56

Precision for each class is calculated as follows: True Positives/(True Positives + False Positives) E.g., precision for class 1: 15/(15+22) = 0.41

This particular classification report shows that the performance of the model is poor. Accuracy as a metric may be misleading. Although the model shows 99% accuracy, it hardly detects the object of class 1 (I suppose these are the object of interest, the abnormal data you want to detect).

One of the reasons for that may be the imbalance of the data, i.e. there are many objects of class 0 and few samples of class 1. As a result, the classifier is overfitting to class 0 to minimize error during training. Consider reading about some techniques to overcome the imbalance problem.

Answered by maya-ami on February 4, 2021

It looks like you are using an imbalanced dataset in your model. The results look poor with high false negative values/low sensitivity ratio. One metric you may want to use in addition is the AUC/ROC. This works well for comparing results for imbalanced data. There are many options to improve your model for a better sensitivity result, such as

  1. Using different weights for predicting a majority vs predicting a minority label
  2. Up-sample or down-sample the training data to help balance the prediction across minority and majority, or use SMOTE for datasets with few features.
  3. Choose a modeling algorithm that is better at handling imbalanced data, such as machine learning/neural networks, or a classical algorithm such as bagging or boosting algorithms and/or decision trees

Resources:

https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/

https://www.analyticsvidhya.com/blog/2017/03/imbalanced-data-classification/

Answered by Donald S on February 4, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP