TransWikia.com

How to deal with broad and narrow variance within classes in classification tasks

Data Science Asked on April 22, 2021

Let’s say I’m doing an animal image classification task (it doesn’t have to be image classification – this is just my example), and the training and test data is balanced across classes. The classes might be ['gorilla', 'giraffe', 'dog', 'donkey']. Now we all know that there is relatively a lot of variance within the 'dog' class compared to the other three classes.

So, is there any way one would treat this problem vs another problem where all classes have about the same amount of variance (where I might replace 'dog' with 'sheep' for instance)?

One Answer

You will want to have many diverse examples of the high variance classes and pick a model that has a high learning capacity. In other words, a large number of samples and a big model to capture the properties of the data.

After training the model, look at the confusion matrix by class for the hold-out dataset to see if the model is learning to generalize for all classes, including the high variance classes.

Answered by Brian Spiering on April 22, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP