Data Science Asked by Marni on July 3, 2021
I have a question about data simulation in Python. I work on the classification of imbalanced data and want to test the effectiveness of different methods on simulated data. I have seen in various articles and books that the make_classification function is used to generate data. The data is then drawn from a normal distribution, so it is continuous rather than discrete. Is such data appropriate for classification research (SVM, decision trees)?
There is no obstacle to doing this. For example, you can generate data with make_classification and compare different algorithms by fitting a model of each kind on it. You can also pass a random_state value to obtain the same data each time you call the function. Both SVMs and decision trees work with continuous features.
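A minimal sketch of that workflow, assuming scikit-learn is installed. The sample count, the 90/10 class split passed through `weights`, the seed, and the use of balanced accuracy as the metric are illustrative assumptions rather than anything prescribed above, and the helper names `make_data` and `compare_models` are made up for this example.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_data(seed=42):
    """Generate a continuous, imbalanced two-class dataset (hypothetical helper)."""
    X, y = make_classification(
        n_samples=2000,
        n_features=10,
        n_informative=5,
        n_redundant=2,
        weights=[0.9, 0.1],   # assumed imbalance: roughly 90% class 0, 10% class 1
        random_state=seed,    # same seed -> same data on every call
    )
    return X, y


def compare_models(seed=42):
    """Fit an SVM and a decision tree on the same split and score both."""
    X, y = make_data(seed)
    # Stratify so the train and test sets keep the same class ratio.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        "svm": SVC(random_state=seed),
        "tree": DecisionTreeClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Balanced accuracy averages per-class recall, so the majority
        # class cannot dominate the score.
        scores[name] = balanced_accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, score in compare_models().items():
        print(f"{name}: balanced accuracy = {score:.3f}")
```

Plain accuracy can look good on imbalanced data even when the minority class is mostly missed, which is why a class-aware metric is used here. Any other metric suited to imbalance could be swapped in.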
Answered by tkarahan on July 3, 2021