TransWikia.com

How to calculate the different metrics for multi class classification

Data Science Asked on February 16, 2021

My confusion matrix has the following structure:

                  (Predicted)
C = (actual)   [ TN  FP
                 FN  TP ]

How can I calculate the Matthews Correlation Coefficient (MCC) for multi-class classification? The formula I have is

MCC = (TP .* TN - FP .* FN) ./ ...
      sqrt( (TP + FP) .* (TP + FN) .* (TN + FP) .* (TN + FN) );

Also, I have some doubts about how to calculate the following measures for multi-class. Please correct me where I am wrong.

for i=1:nClasses
    TN(i)=C(i,i);
    FP(i)=sum(C(i,:))-C(i,i);
    FN(i)=sum(C(:,i))-C(i,i);
    TP(i)=sum(C(:))-TP(i)-FP(i)-FN(i);
end

One Answer

The MCC formula you quote is defined for binary classification, so it can only be applied to your data by treating the problem as binary.
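For reference, the formula from the question can be written in standard notation as follows (a transcription of the same expression, with nothing added):

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$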

[edited to clarify OP's confusion] What is a confusion matrix? It shows for every true class $X$ as a row and every predicted class $Y$ as a column how many instances have true class $X$ and are predicted as $Y$. If there are only two classes (binary classification), the only possibilities are

  • $X$ positive and $Y$ positive -> TP
  • $X$ positive and $Y$ negative -> FN
  • $X$ negative and $Y$ positive -> FP
  • $X$ negative and $Y$ negative -> TN
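As a minimal sketch, the four binary cases can be counted directly from lists of true and predicted labels. Here a false positive is a negative instance predicted as positive, and a false negative is a positive instance predicted as negative. The function name `binary_counts` and the example labels are made up for illustration.

```python
def binary_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN with `positive` treated as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # actually positive, predicted positive
        elif truth != positive and pred == positive:
            fp += 1  # actually negative, predicted positive
        elif truth == positive and pred != positive:
            fn += 1  # actually positive, predicted negative
        else:
            tn += 1  # actually negative, predicted negative
    return tp, fp, fn, tn


# Hypothetical labels, for illustration only.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "pos", "neg", "pos"]
print(binary_counts(y_true, y_pred, positive="pos"))  # (2, 1, 1, 1)
```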

However, when there are more than two classes (multiclass classification), the positive/negative distinction cannot be applied directly, so there is no single set of TP, FP, FN, TN counts for the whole matrix.

With multiple classes one can calculate binary classification metrics for every class. This is done by considering the target class as positive and all the other classes as negative (as if they are merged into one big negative class).
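A rough sketch of this one-vs-rest counting, assuming the confusion matrix is a list of rows with true classes as rows and predicted classes as columns. The function name `one_vs_rest_counts` and the example matrix are invented for illustration.

```python
def one_vs_rest_counts(C, i):
    """Return (TP, FP, FN, TN) for class i.

    Assumes C[r][c] is the number of instances with true class r
    that were predicted as class c.
    """
    total = sum(sum(row) for row in C)
    tp = C[i][i]
    fn = sum(C[i]) - tp                 # row i: true class i, predicted as another class
    fp = sum(row[i] for row in C) - tp  # column i: predicted i, true class is another class
    tn = total - tp - fp - fn           # everything that involves class i in neither role
    return tp, fp, fn, tn


# Hypothetical 3-class confusion matrix (classes A, B, C).
C = [[5, 1, 0],
     [2, 6, 1],
     [0, 3, 7]]
for i, name in enumerate("ABC"):
    print(name, one_vs_rest_counts(C, i))
```

For each class the four counts add up to the total number of instances, which is a quick sanity check on the row/column convention.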

Example: suppose we have classes A, B, C. If we focus on class A, the confusion matrix is like this:

    A   B   C
A   TP  FN  FN
B   FP  TN  TN
C   FP  TN  TN

To present it another way, with B and C merged into a single negative class:

         A    B or C
A        TP    FN 
B or C   FP    TN  

Now if we focus on class B the confusion matrix becomes:

    A   B   C
A   TN  FP  TN
B   FN  TP  FN
C   TN  FP  TN
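The labelling follows the same rule for any target class, so it can be generated rather than written out by hand. The sketch below assumes rows are true classes and columns are predicted classes, as in the tables above; `label_cells` is a hypothetical helper name.

```python
def label_cells(classes, target):
    """Label every confusion-matrix cell (row = true, column = predicted)
    as TP, FN, FP or TN when `target` is treated as the positive class."""
    grid = []
    for true_cls in classes:
        row = []
        for pred_cls in classes:
            if true_cls == target:
                # The instance really is the target class.
                row.append("TP" if pred_cls == target else "FN")
            else:
                # The instance belongs to one of the merged negative classes.
                row.append("FP" if pred_cls == target else "TN")
        grid.append(row)
    return grid


# Class C, the case not written out above.
for row in label_cells(["A", "B", "C"], target="C"):
    print("  ".join(row))
```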

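In your code the TP and TN categories are swapped. The diagonal entry is the true positive count for class i, and the true negatives are whatever remains:

TP(i)=C(i,i);
...
TN(i)=sum(C(:))-TP(i)-FP(i)-FN(i);

Note also that with rows as actual classes and columns as predicted classes (as in the matrices above), FP(i) is the sum of column i minus C(i,i) and FN(i) is the sum of row i minus C(i,i), so the FP and FN lines in your loop are swapped as well.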
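Putting the pieces together, a per-class (one-vs-rest) MCC could be sketched as follows. This is an illustration in Python rather than the MATLAB of the question. It assumes rows are actual classes and columns are predicted classes, and it returns 0.0 when the denominator is zero, which is a convention chosen here and not something the formula itself specifies. The name `mcc_per_class` and the example matrix are made up.

```python
import math


def mcc_per_class(C):
    """One-vs-rest MCC for each class.

    Assumes C[r][c] is the number of instances with actual class r
    that were predicted as class c.
    """
    n = len(C)
    total = sum(sum(row) for row in C)
    scores = []
    for i in range(n):
        tp = C[i][i]
        fn = sum(C[i]) - tp                       # row i minus the diagonal
        fp = sum(C[r][i] for r in range(n)) - tp  # column i minus the diagonal
        tn = total - tp - fp - fn
        denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        # Assumption: report 0.0 when the denominator is zero.
        scores.append((tp * tn - fp * fn) / denom if denom else 0.0)
    return scores


# Hypothetical 3-class confusion matrix.
C = [[5, 1, 0],
     [2, 6, 1],
     [0, 3, 7]]
print([round(score, 3) for score in mcc_per_class(C)])
```

This yields one binary MCC value per class. How to combine them into a single number (for example by averaging) is a separate choice that the formula does not settle.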

Correct answer by Erwan on February 16, 2021
