Is there any library available for balancing imbalanced text dataset?

Data Science Asked by Ananthakrishnan M A on March 12, 2021

I have a text dataset similar to the newsgroup dataset. The problem is that it is highly imbalanced. Is there a ready-made library that will do upsampling or downsampling with a single function call?

2 Answers

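from imblearn.over_sampling import ADASYN, SMOTE, RandomOverSampler
from imblearn.under_sampling import NearMiss, RandomUnderSampler

# Random oversampling: duplicate minority-class rows until the classes are balanced.
# fit_resample is the current method name; the older fit_sample was removed
# in newer imbalanced-learn releases.
ros = RandomOverSampler(random_state=777)
X_ROS, y_ROS = ros.fit_resample(testing_tfidf, testing_target)

# SMOTE and random undersampling are applied the same way, via fit_resample.
smt = SMOTE(random_state=777, k_neighbors=1)
rus = RandomUnderSampler(random_state=777)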
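
To make it clearer what the random samplers above do, here is a minimal standard-library sketch that works directly on raw documents rather than on a TF-IDF matrix. It is not imbalanced-learn code: the function names random_oversample and random_undersample, the toy texts, and the labels are all hypothetical and only illustrate the idea.

```python
import random
from collections import Counter, defaultdict


def _group_by_class(texts, labels):
    """Collect documents into a dict keyed by class label."""
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)
    return by_class


def random_oversample(texts, labels, seed=777):
    """Duplicate randomly chosen documents until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = _group_by_class(texts, labels)
    target = max(len(docs) for docs in by_class.values())
    out_texts, out_labels = [], []
    for label, docs in by_class.items():
        # Draw with replacement to fill the gap up to the majority size.
        extra = [rng.choice(docs) for _ in range(target - len(docs))]
        out_texts.extend(docs + extra)
        out_labels.extend([label] * target)
    return out_texts, out_labels


def random_undersample(texts, labels, seed=777):
    """Keep a random subset of each class, sized to match the smallest one."""
    rng = random.Random(seed)
    by_class = _group_by_class(texts, labels)
    target = min(len(docs) for docs in by_class.values())
    out_texts, out_labels = [], []
    for label, docs in by_class.items():
        # Draw without replacement so no document is repeated.
        out_texts.extend(rng.sample(docs, target))
        out_labels.extend([label] * target)
    return out_texts, out_labels


# Hypothetical toy corpus: 6 "sport" documents and 2 "politics" documents.
texts = [
    "the team won the match", "a late goal decided the game",
    "the striker signed a new contract", "the coach praised the defence",
    "fans filled the stadium", "the season starts next week",
    "parliament passed the bill", "the minister resigned today",
]
labels = ["sport"] * 6 + ["politics"] * 2

X_over, y_over = random_oversample(texts, labels)
X_under, y_under = random_undersample(texts, labels)
print(Counter(y_over))
print(Counter(y_under))
```

After oversampling both classes have 6 documents, and after undersampling both have 2. Oversampling repeats minority documents verbatim, so it should be applied to the training split only; otherwise copies of the same document can end up in both training and test data.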

Good article for reference

Correct answer by Adhira Deogade on March 12, 2021

Here is a good blog post on handling this in R: Class imbalance

Answered by oaksandbrooms on March 12, 2021
