Data Science Asked on July 21, 2021
I have some dummy variables that indicate the occurrence of an event. There are so many of them that I ran PCA on them, and it appears that some of them are rather strongly correlated with each other.
Would it be a good idea to use the PCA dimensions as input to a model?
Yes, this is a very good idea, and it is often done this way when you have a lot of features. PCA is used to get fewer but still useful features so that the model trains more efficiently (this is dimensionality reduction). Note that PCA builds new features as combinations of the ones you pass to it, which makes it different from feature selection. With fewer, decorrelated inputs your model will train faster and will often generalize better. You still have to decide how many components to keep to feed the model. Keep the meaningful ones (for example, those that explain most of the variance), and/or start by keeping 50 components and then try fewer.
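To make the "how many components to keep" step concrete, here is a minimal sketch using only NumPy. Everything in it is an assumption for illustration and not taken from the question: the synthetic dummy matrix `X`, the 5 hidden "drivers" that make the dummies correlated, and the 90% explained-variance threshold. It computes PCA through an SVD of the centered data, picks the smallest number of components that reaches the threshold, and projects the dummies onto them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 rows, 5 hidden "event drivers", 40 dummy columns.
# Each dummy copies one driver and flips about 5% of its values, so columns
# that share a driver are strongly correlated.
n_rows, n_drivers, n_dummies = 500, 5, 40
drivers = rng.integers(0, 2, size=(n_rows, n_drivers))
source = rng.integers(0, n_drivers, size=n_dummies)
flips = rng.random((n_rows, n_dummies)) < 0.05
X = np.where(flips, 1 - drivers[:, source], drivers[:, source]).astype(float)

# PCA via SVD of the centered data matrix.
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
explained_ratio = S**2 / np.sum(S**2)

# Keep the smallest number of components whose cumulative explained
# variance reaches 90% (an arbitrary threshold chosen for this sketch).
cumulative = np.cumsum(explained_ratio)
k = int(np.searchsorted(cumulative, 0.90) + 1)

components = Vt[:k]                # shape (k, n_dummies)
Z = (X - mean) @ components.T      # shape (n_rows, k): inputs for the model

print(f"kept {k} of {n_dummies} components")
```

`Z` would then replace `X` as the model input. In practice the mean and components should be computed on the training data only and reused unchanged on validation or test data. If you use scikit-learn, `sklearn.decomposition.PCA(n_components=0.90)` performs a comparable variance-based selection.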
Correct answer by Malo on July 21, 2021