TransWikia.com

How to do feature selection on a sparse matrix?

Data Science Asked on March 10, 2021

Say I want to do feature selection on a sparse matrix, e.g., 10,000 rows x 1,500 features, where most entries are zero. Let’s say the features are all numeric and the target is binary.

What’s the correct and efficient way to apply feature selection? In particular, I’m interested in applying mutual information to it.

One Answer

You can apply dimensionality reduction, since your matrix is sparse. I would suggest PCA, which reduces your 1,500 input features to k dimensions of your choice while retaining as much of the variance as possible. Here k is a hyperparameter that you need to tune to find the best value. Note that PCA builds new features as linear combinations of the originals rather than selecting a subset of them.
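As a rough sketch of the dimensionality-reduction idea, the snippet below uses scikit-learn's `TruncatedSVD` instead of plain PCA. This is a swapped-in technique: classic PCA centers the data, which turns a sparse matrix dense, whereas `TruncatedSVD` skips centering and works on SciPy sparse input directly. The synthetic matrix, its 1% density, and `k = 50` are assumptions made only for illustration.

```python
from scipy import sparse
from sklearn.decomposition import TruncatedSVD

# Synthetic stand-in for the 10,000 x 1,500 matrix from the question.
# The 1% density is an assumption, not a property of the real data.
X = sparse.random(10_000, 1_500, density=0.01, format="csr", random_state=0)

k = 50  # hypothetical starting value; tune it like any other hyperparameter

# TruncatedSVD does not center the data, so X is never densified.
svd = TruncatedSVD(n_components=k, random_state=0)
X_reduced = svd.fit_transform(X)  # dense array of shape (n_samples, k)

# Fraction of variance captured by the k components, useful when tuning k.
retained = svd.explained_variance_ratio_.sum()
print(X_reduced.shape, retained)
```

One way to choose k is to rerun this for several values and pick the smallest one whose retained variance, or downstream validation score, is acceptable.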

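Another approach is a LASSO-style classifier, i.e. a linear model with L1 regularization. This model performs automatic feature selection by driving the weights of unneeded features to exactly zero. Be aware that when input columns are strongly correlated, L1 tends to keep one of them somewhat arbitrarily, so the selection is most stable when the columns are close to independent.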
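A minimal sketch of the L1 approach for a binary target, assuming scikit-learn: an L1-penalized `LogisticRegression` wrapped in `SelectFromModel`, which keeps the columns whose coefficients are not shrunk to (near) zero. The synthetic data, the toy target built from the first five columns, and `C=1.0` are all illustrative assumptions.

```python
import numpy as np
from scipy import sparse
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

# Small synthetic sparse matrix (assumed sizes, chosen to run quickly).
X = sparse.random(2_000, 300, density=0.05, format="csr", random_state=0)

# Hypothetical binary target: 1 if any of the first five columns is non-zero.
signal = np.asarray(X[:, :5].sum(axis=1)).ravel()
y = (signal > 0).astype(int)

# L1-penalized logistic regression; liblinear accepts sparse input.
# Smaller C means stronger regularization and fewer surviving features.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)

# SelectFromModel drops features whose coefficient is (near) zero.
selector = SelectFromModel(clf).fit(X, y)
X_selected = selector.transform(X)          # still a sparse matrix
kept = selector.get_support(indices=True)   # indices of retained columns

print(X_selected.shape, kept[:10])
```

Unlike the dimensionality-reduction route, the retained columns here are original features, so `kept` can be mapped straight back to feature names.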

Answered by SrJ on March 10, 2021
