What's the most effective way to measure the accuracy of my multi-class classification NN?

Question

I'm relatively new to data science, and am confused about how to measure the accuracy of a multi-class neural network. The model I'm building attempts to predict the outcome of an event with 20 possible outcomes. There is some imbalance in the labels: the most frequent label is around 15% of the total, and the least frequent is around 1%. The event is difficult to predict, and I'll be happy with around 20% accuracy over all events. However, I'm worried that when I achieve that, my model will simply be predicting the most frequent class all the time and getting lucky on enough of the others to bring overall accuracy up to 20%. I'm using Keras' 'categorical cross-entropy' loss function and 'accuracy' metric. Are these the most appropriate for the problem? Which other metrics are best suited to a problem like this? I'm very happy to be referred to literature to help build my understanding of this problem.
I understand the confusion matrix for binary classification; however, I'm at a loss (no pun intended) as to how to apply these principles to a multi-class problem whilst ensuring my model isn't just predicting the most popular class and getting lucky on enough others to meet my target.
Thanks in advance for any assistance!

wacax · Accepted Answer

Keras has some accuracy metrics for multiclass neural networks.

CategoricalAccuracy
TopKCategoricalAccuracy
SparseTopKCategoricalAccuracy
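
To make the top-k idea concrete, here is a minimal plain-Python sketch of what a top-k categorical accuracy computes. It is an illustration only, not Keras's implementation; the function name `top_k_accuracy` and the toy probabilities are made up for this example. In Keras itself you would pass something like `tf.keras.metrics.TopKCategoricalAccuracy(k=3)` in the `metrics` list of `model.compile`.

```python
def top_k_accuracy(y_true, y_prob, k=5):
    """Fraction of samples whose true class is among the k highest-scoring classes."""
    hits = 0
    for label, probs in zip(y_true, y_prob):
        # Class indices sorted by predicted probability, highest first, cut to k.
        top_k = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(y_true)


# Toy data (hypothetical): 3 samples, 4 classes, integer class labels.
y_true = [0, 2, 3]
y_prob = [
    [0.30, 0.50, 0.15, 0.05],  # true class 0 is ranked second
    [0.20, 0.30, 0.40, 0.10],  # true class 2 is ranked first
    [0.40, 0.30, 0.20, 0.10],  # true class 3 is ranked last
]

top1 = top_k_accuracy(y_true, y_prob, k=1)  # ordinary accuracy
top2 = top_k_accuracy(y_true, y_prob, k=2)
print(top1, top2)
```

With 20 classes, comparing top-1 against top-3 or top-5 shows whether the true outcome is at least ranked near the top when the single best guess is wrong.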

There are some others if you want to implement them from scratch, taken from: https://www.sciencedirect.com/science/article/abs/pii/S0306457309000259
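
One example of a measure that is easy to write from scratch, chosen here for illustration and not necessarily one the linked article recommends, is per-class recall and its macro average, both read off a multi-class confusion matrix. The sketch below uses hypothetical helper names and toy labels. It shows that a model which always predicts the most frequent class can reach a respectable accuracy while its macro-averaged recall stays at 1 / (number of classes).

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_recall(matrix):
    """Recall of each class: correct predictions / number of samples of that class."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


def macro_recall(matrix):
    """Unweighted mean of the per-class recalls, so rare classes count as much as common ones."""
    recalls = per_class_recall(matrix)
    return sum(recalls) / len(recalls)


# Toy labels (hypothetical): class 0 is the most frequent of 4 classes.
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3]

# A "model" that always predicts the most frequent class.
majority = Counter(y_true).most_common(1)[0][0]
y_majority = [majority] * len(y_true)

cm = confusion_matrix(y_true, y_majority, n_classes=4)
accuracy = sum(cm[i][i] for i in range(4)) / len(y_true)
balanced = macro_recall(cm)
print(accuracy, per_class_recall(cm), balanced)
```

If the per-class recalls are zero for every class except the most frequent one, the model has collapsed to the majority baseline, whatever the overall accuracy says.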

These work rather nicely for most problems when combined with a good cross-validation approach and leak prevention.
Having an imbalanced dataset always makes evaluation challenging, but picking a metric that doesn't punish results too much, like Top K Categorical Accuracy, combined with n-fold cross-validation like this article suggests, is often enough to give you a solid evaluation procedure to work from.
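
A minimal sketch of the n-fold idea follows. It assumes you want each fold to keep roughly the same class proportions as the full dataset; stratification is an assumption added here because of the label imbalance, and the function name `stratified_folds` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign sample indices to folds so each fold keeps roughly the overall class mix."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin across the folds.
        # Note: a class with fewer samples than folds will be missing from some folds.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


# Toy labels (hypothetical): 3 imbalanced classes, 30 samples.
labels = [0] * 15 + [1] * 10 + [2] * 5
folds = stratified_folds(labels, n_folds=5)

splits = []
for held_out in range(len(folds)):
    test_idx = folds[held_out]
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # Train on train_idx, evaluate on test_idx, then average the metric over all folds.
    splits.append((train_idx, test_idx))
```

If scikit-learn is available, its `StratifiedKFold` is a maintained implementation of the same idea.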
