Data Science Asked by Code Now on July 31, 2021
I would like to optimize the hyperparameters C and gamma of an SVC using grid search on an imbalanced data set. So far I have used class_weight='balanced' and selected the best hyperparameters based on the average of the per-class F1 scores.
However, the data set is very imbalanced, i.e. if I choose GridSearchCV with cv=10, then some minority classes are not represented in the validation data.
I’m thinking of using SMOTE, but I see the problem here that I would have to set k_neighbors=1 because in some minority classes there are often only 1-2 samples.
Does anyone have a tip on how to optimize the hyperparameters in this case? Are there any alternatives?
Many thanks for any hints.
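Scikit-learn's GridSearchCV uses StratifiedKFold by default when the estimator is a classifier and cv is an integer, so class proportions are preserved in each split as far as possible. A class with fewer samples than the number of folds cannot appear in every validation fold, however, and scikit-learn warns when the least populated class has fewer members than n_splits. Keeping cv at or below the size of the smallest class avoids this, and GridSearchCV can then be used for the hyperparameter search as before.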
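As a minimal sketch of that setup, the snippet below passes an explicit StratifiedKFold to GridSearchCV and caps the number of folds at the size of the smallest class. The synthetic data, the cap of 5 folds, and the parameter grid are assumptions chosen for illustration, not values from the question.

```python
from collections import Counter

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Synthetic imbalanced data (an assumption, for illustration only).
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=5,
    n_classes=3,
    weights=[0.85, 0.10, 0.05],
    random_state=0,
)

# A class can only appear in every validation fold if it has at least
# n_splits members, so cap the number of folds at the smallest class size.
smallest_class = min(Counter(y).values())
n_splits = max(2, min(5, smallest_class))
cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)

# Hypothetical grid; widen or refine it for real data.
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]}

search = GridSearchCV(
    SVC(kernel="rbf", class_weight="balanced"),
    param_grid,
    scoring="f1_macro",  # unweighted mean of the per-class F1 scores
    cv=cv,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

If the smallest class has only one or two samples, even two folds cannot give it both training and validation coverage, so merging or dropping such classes may be worth considering before tuning.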
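Imbalanced-learn's SMOTE can also be used, but k_neighbors has to be smaller than the number of samples in the smallest class being oversampled. If a class has fewer samples than k_neighbors + 1, the nearest-neighbor search raises a ValueError rather than falling back to the available samples, so with only two samples per class k_neighbors=1 is the only option, and a class with a single sample cannot be oversampled by SMOTE at all.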
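To illustrate why the usable number of neighbors is bounded by the class size minus one, here is a simplified, standard-library-only sketch of SMOTE's interpolation step. It is not imbalanced-learn's implementation; the function name `smote_like_samples` and the decision to raise a ValueError when too few neighbors exist are choices made for this sketch.

```python
import math
import random


def smote_like_samples(points, k_neighbors, n_new, seed=0):
    """Create n_new synthetic points by interpolating between a point
    and one of its k nearest neighbours from the same class."""
    if not 1 <= k_neighbors <= len(points) - 1:
        raise ValueError(
            f"k_neighbors={k_neighbors}, but each point has only "
            f"{len(points) - 1} other sample(s) in this class"
        )
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(points)
        # Other points of the class, closest first.
        others = sorted(
            (p for p in points if p is not base),
            key=lambda p: math.dist(base, p),
        )
        neighbour = rng.choice(others[:k_neighbors])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


minority = [(0.0, 0.0), (1.0, 2.0)]  # a class with only two samples
new_points = smote_like_samples(minority, k_neighbors=1, n_new=3)
print(new_points)
```

With two samples, every synthetic point lies on the single line segment between them, which shows how little new information oversampling can add for such tiny classes.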
Answered by Brian Spiering on July 31, 2021