Data Science Asked on July 25, 2021
I’m using a multiclass dataset (CIC-IDS-2017), which is very imbalanced. I have already encoded the categorical feature (which is the target) using OneHotEncoder.
I tried to use the SMOTE oversampling method to balance the data with a pipeline:
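from sklearn.preprocessing import OneHotEncoder
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline  # imblearn's Pipeline accepts samplers and exposes fit_resample
X = df.drop(columns=['Label'])  # the positional axis argument was removed in pandas 2.0
y = df.Label
steps = [('onehot', OneHotEncoder()), ('smt', SMOTE())]
pipeline = Pipeline(steps=steps)
X, y = pipeline.fit_resample(X, y)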
When I used pd.get_dummies instead of OneHotEncoder, I could not use the pipeline (because get_dummies is a plain function and cannot be used as a pipeline step).
How can I balance the dataset using SMOTE (and use get_dummies for one-hot encoding)?