TransWikia.com

How many instances should be synthesized for each class when using over-sampling techniques?

Data Science Asked by Zihe Liu on March 2, 2021

As for an imbalanced multi-class dataset, how many instances should be synthesized for each class if we use over-sampling techniques such as SMOTE?

For example, there is 4 class including ‘A’, ‘B’, ‘C’ and ‘D’. The number of samples for each class is as follows:

  • A: 5
  • B: 50
  • C: 100
  • D: 500

Are the following methods meaningful?

  1. Make all class contain 500 instances.
  2. Set a threshold and over-sample the class number of whose instances is smaller than it. For example, set threshold = 100, and make A and B to 100.

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP