Data Science Asked on March 10, 2021
Say I want to do feature selection on a sparse matrix, e.g., 10,000 rows x 1,500 features, where most entries are zero. Let's say the features are all numeric and the target is binary.
What's the correct and efficient way to apply feature selection? In particular, I'm interested in applying mutual information to it.
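For the mutual information part, here is a minimal sketch, assuming scikit-learn and SciPy are available (neither is named in the question). It uses a smaller synthetic count matrix as a stand-in for the real data, and the variable names are only illustrative. One caveat: scikit-learn's `mutual_info_classif` treats the columns of a sparse matrix as discrete, so this fits counts or binary indicators; continuous values would probably need to be binned or binarized first.

```python
from functools import partial

import numpy as np
from scipy import sparse
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
n_rows, n_cols = 2000, 300  # smaller than 10,000 x 1,500 to keep the demo fast

# Synthetic sparse counts: about 2% of the entries are non-zero (assumed data).
counts = rng.poisson(0.02, size=(n_rows, n_cols))
# Make the first three columns denser and let the binary target depend on them.
counts[:, :3] = rng.poisson(0.3, size=(n_rows, 3))
y = (counts[:, :3].sum(axis=1) > 0).astype(int)

X = sparse.csr_matrix(counts)  # the selector works on the sparse matrix directly

# discrete_features=True is stated explicitly; sparse input requires it.
score_func = partial(mutual_info_classif, discrete_features=True, random_state=0)
selector = SelectKBest(score_func=score_func, k=3).fit(X, y)

top_features = selector.get_support(indices=True)  # column indices that were kept
X_selected = selector.transform(X)                 # still a sparse matrix
```

Here `k=3` is chosen only because the toy data has three informative columns; on real data `k` is something to tune, for example with cross-validation.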
You can do dimensionality reduction, since your matrix is sparse. I would suggest PCA. PCA will reduce your 1,500 inputs to a k-dimensional input of your choice with as much variance retained as possible. Here k is a hyperparameter that you need to tune to find the best value. Keep in mind that PCA builds new features as linear combinations of the original ones rather than picking a subset of them.
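A minimal sketch of this reduction, assuming scikit-learn. Note that it swaps in `TruncatedSVD` rather than `PCA`: PCA centers the data, which makes a sparse matrix dense, while truncated SVD skips the centering and works on sparse input directly. The matrix below is random synthetic data and `n_components=20` is an arbitrary starting value for k.

```python
from scipy import sparse
from sklearn.decomposition import TruncatedSVD

# Synthetic sparse matrix standing in for the real data (about 2% non-zero).
X = sparse.random(2000, 300, density=0.02, format="csr", random_state=0)

# k components; TruncatedSVD does not center X, so X is never densified.
svd = TruncatedSVD(n_components=20, random_state=0)
X_reduced = svd.fit_transform(X)  # dense array with one column per component

# Fraction of the variance kept by the k components, useful when tuning k.
explained = svd.explained_variance_ratio_.sum()
```

One way to tune k is to refit with several values and compare `explained` or, better, the downstream classifier's cross-validated score.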
Another approach is a LASSO-style classifier, i.e., a linear model with L1 regularization. This model performs automatic feature selection by driving the weights of unneeded features to zero. It works best when your input columns are not strongly correlated, because L1 tends to keep one feature from a correlated group more or less arbitrarily.
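As a rough sketch of the L1 approach, assuming scikit-learn and using L1-penalized logistic regression as the "LASSO classifier" (the answer does not name a specific estimator). The data is synthetic and `C=0.1` is an arbitrary regularization strength, not a recommendation.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_rows, n_cols = 2000, 300

# Synthetic sparse counts; only the first three columns drive the target.
counts = rng.poisson(0.02, size=(n_rows, n_cols))
counts[:, :3] = rng.poisson(0.3, size=(n_rows, 3))
y = (counts[:, :3].sum(axis=1) > 0).astype(int)
X = sparse.csr_matrix(counts)

# Smaller C means stronger L1 regularization and more weights pushed to zero.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1, random_state=0)
clf.fit(X, y)

# Features whose weight survived the penalty are the selected ones.
selected = np.flatnonzero(clf.coef_.ravel())
X_selected = X[:, selected]
```

The number of surviving features depends on `C`, so it is worth trying a few values and checking how the selection changes.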
Answered by SrJ on March 10, 2021