How does the number of classes in an object detection model affect accuracy?

Question

If I have, say, a YOLO or RetinaNet object detection model and I train it with 10 vs. 50 classes (assuming 3,000 training images per class), will the 10-class model perform similarly to the 50-class model? Is there a 'soft limit' on the number of classes a model can successfully hold in 'weight memory'?
I notice that in most COCO examples the number of classes is set to 80. Does performance deteriorate when that number is pushed up into the 200s?
Are there any benchmarks on this type of question? I would assume it's a well-discussed problem: splitting detection of many classes across multiple trained models versus packing them all into a single model.

Soumya Kundu · Answer

The general consensus in machine learning is that high accuracy gets harder to reach as the data is split into more classes. The simplest example is CIFAR-10 versus CIFAR-100: the two datasets are practically the same in size and image format, yet the same models tend to perform very differently on them.
As soon as more classes are added, there is more variability in the final output. I am not aware of any soft limit, though there may be a few papers that focus on this. If you take a model A and train it once on data with 10 classes and once on data with 50 classes, it is bound to perform better on the 10 classes, provided the set-up is ideal in both cases.
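To make the "more variability in the final output" point a bit more concrete, here is a minimal sketch of how the prediction layer grows with the class count. It assumes a YOLOv3-style head, where each anchor predicts 4 box offsets, 1 objectness score and one score per class, with 3 anchors per scale. The function name is hypothetical, and other detectors (RetinaNet included) lay out their outputs differently.

```python
def yolo_v3_head_channels(num_classes: int, anchors_per_scale: int = 3) -> int:
    """Output channels of one YOLOv3-style detection layer (assumed layout)."""
    # Per anchor: 4 box offsets + 1 objectness score + one score per class.
    return anchors_per_scale * (4 + 1 + num_classes)


# Compare the class counts mentioned in the question.
for k in (10, 50, 80, 200):
    print(f"{k} classes -> {yolo_v3_head_channels(k)} output channels per scale")
```

With 80 classes this gives the 255 channels familiar from COCO configs. The layer itself grows only linearly, so the extra difficulty comes mostly from having more classes to tell apart, not from running out of room in the head.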
I have not come across any benchmark that identifies an ideal number of classes. Loading many models takes a lot of computational power, so the choice between many models and one model depends mostly on cost rather than on any soft limit on performance.
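As a rough illustration of the cost argument, the sketch below compares one 50-class detector against five 10-class detectors. Everything in it is an assumption made for illustration: the function name, the idea that cost splits into a fixed backbone part plus a small per-class head part, and both cost constants, which are made-up relative units and not measurements.

```python
def inference_cost(num_models: int, classes_per_model: int,
                   backbone_cost: float = 100.0,
                   head_cost_per_class: float = 0.1) -> float:
    """Relative cost of running num_models detectors on one image.

    Both cost constants are hypothetical units chosen for illustration.
    """
    # Each model pays for its own backbone plus a head that scales with classes.
    per_model = backbone_cost + head_cost_per_class * classes_per_model
    return num_models * per_model


single = inference_cost(num_models=1, classes_per_model=50)
split = inference_cost(num_models=5, classes_per_model=10)
print(f"one 50-class model: {single:.1f}, five 10-class models: {split:.1f}")
```

Under these assumptions the split set-up costs several times more, because the backbone is run once per model while the per-class part of the head stays small.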
