Comparing Multiclass classifiers with "No Answer"-Class

Question

I have three classifiers to classify some words into four classes. Every word that does not fit into any of these four classes gets classified as "No Answer". I would like to compare the classifiers with Precision, Recall, and F1-Score. Do I have to ignore the "No Answer" class to calculate the average Precision and so on or is it important to include it?

Paul · Answer

Precision, recall, and F1 score are defined for the binary case (two classes), so to apply them to a multiclass problem you need a trick. A typical one is to average recall over the classes: for each class, calculate the fraction of the words actually in that class that are classified correctly, then take the mean of those fractions. balanced_accuracy_score() in scikit-learn does this for you automatically.
Recall does not take false positives into account, so if there are words where the model should say "No Answer" (i.e. words whose ground truth is "No Answer"), you should include that class in the average. Otherwise it would benefit the model to just take a shot at every word it is given and never classify any as "No Answer".
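
To make this concrete, here is a minimal sketch in plain Python. The labels "A" to "D", the toy data, and the two models ("eager", which never abstains, and "cautious", which sometimes does) are all made up for illustration; they are not from the question. The helper names per_class_recall and macro_recall are hypothetical too. On the same inputs, balanced_accuracy_score() should give the same number as macro_recall over all five classes.

```python
from collections import Counter

NO_ANSWER = "No Answer"
REAL_CLASSES = ["A", "B", "C", "D"]          # hypothetical class names
ALL_CLASSES = REAL_CLASSES + [NO_ANSWER]


def per_class_recall(y_true, y_pred, labels):
    """Recall per label: correct predictions / words truly in that label.

    Assumes every label occurs at least once in y_true (no zero division).
    """
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in labels}


def macro_recall(y_true, y_pred, labels):
    """Unweighted mean of the per-class recalls over the given labels."""
    recalls = per_class_recall(y_true, y_pred, labels)
    return sum(recalls.values()) / len(labels)


# Toy ground truth: two words per real class, four "No Answer" words.
y_true = ["A", "A", "B", "B", "C", "C", "D", "D"] + [NO_ANSWER] * 4

# "eager" always guesses one of the four classes, never "No Answer".
eager = ["A", "A", "B", "B", "C", "C", "D", "D", "A", "B", "C", "D"]

# "cautious" abstains on most "No Answer" words but also misses one "D".
cautious = ["A", "A", "B", "B", "C", "C", "D", NO_ANSWER,
            NO_ANSWER, NO_ANSWER, NO_ANSWER, "A"]

for name, y_pred in [("eager", eager), ("cautious", cautious)]:
    without_na = macro_recall(y_true, y_pred, REAL_CLASSES)
    with_na = macro_recall(y_true, y_pred, ALL_CLASSES)
    print(f"{name:9s} ignoring No Answer: {without_na:.3f}  "
          f"including No Answer: {with_na:.3f}")
```

On this toy data, ignoring "No Answer" ranks the eager model first (1.000 vs. 0.875), while including it ranks the cautious model first (0.850 vs. 0.800). That flip is exactly the effect described above: leaving the class out rewards a model for guessing on everything.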
