Data Science Asked on May 31, 2021
I wanted to know if I can use t-SNE or PCA to reduce the number of classes based on the similarity between them. For example, if I have 100 classes for 100 different animals, I would like to put all the cats in one group, all the dogs in another, and so on (to get a few groups out of these 100 classes).
No. t-Distributed Stochastic Neighbor Embedding (t-SNE) and Principal Component Analysis (PCA) are dimensionality reduction techniques: they reduce the number of features, i.e. fewer columns of a tidy dataframe.
Clustering groups similar observations together, so the many rows of a tidy dataframe are summarized by a smaller number of clusters. In particular, you might be looking for hierarchical clustering.
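As a rough illustration of the hierarchical clustering idea, here is a minimal sketch that merges classes by clustering one mean feature vector (centroid) per class. The data, the class names, the choice of average linkage, and the cut into 2 groups are all assumptions made for the example, not something from the question.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Hypothetical data: 6 classes, 20 samples each, 5 features.
# Classes 0-2 are drawn near one centre, classes 3-5 near another.
class_names = ["poodle", "greyhound", "beagle", "siamese", "persian", "sphynx"]
group_centres = np.array([[0.0] * 5, [10.0] * 5])
X_parts, y_parts = [], []
for class_id in range(6):
    centre = group_centres[class_id // 3] + rng.normal(scale=0.5, size=5)
    X_parts.append(centre + rng.normal(scale=0.3, size=(20, 5)))
    y_parts.append(np.full(20, class_id))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)

# One row per class: the mean feature vector of that class.
centroids = np.vstack([X[y == c].mean(axis=0) for c in range(6)])

# Agglomerative (hierarchical) clustering of the class centroids,
# then cut the tree so that at most 2 groups remain.
Z = linkage(centroids, method="average", metric="euclidean")
group_of_class = fcluster(Z, t=2, criterion="maxclust")

# Mapping from original class name to merged group id.
class_to_group = dict(zip(class_names, group_of_class.tolist()))
print(class_to_group)
```

With real data, the number of groups (or the distance threshold used to cut the tree) is something you would have to choose, for example by inspecting the dendrogram.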
Correct answer by Brian Spiering on May 31, 2021
If you want to reduce the number of classes you are predicting over, you could manually map them to a simpler set (e.g. map poodle and greyhound to dog). Or, if you don't have the domain knowledge, you can cluster your data and predict the cluster instead of the original labels.
You could use PCA or t-SNE to reduce the number of dimensions before clustering. This is generally good practice if you have many features (more than roughly 100), as you will otherwise often run into the curse of dimensionality.
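A minimal sketch of "reduce dimensions, then cluster" with scikit-learn could look like the following. The synthetic data, the 10 principal components, the 3 clusters, and the choice of k-means are assumptions for illustration only; any clustering method could take the place of k-means.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: 300 samples, 200 features, three well-separated groups.
n_per_group, n_features = 100, 200
centres = rng.normal(scale=5.0, size=(3, n_features))
X = np.vstack([c + rng.normal(size=(n_per_group, n_features)) for c in centres])

# Scale the features, project onto 10 principal components, then cluster.
reduce_then_cluster = make_pipeline(
    StandardScaler(),
    PCA(n_components=10, random_state=0),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
cluster_labels = reduce_then_cluster.fit_predict(X)

# Number of samples assigned to each cluster.
print(np.bincount(cluster_labels))
```

The resulting `cluster_labels` can then serve as the coarser targets to predict in place of the original labels.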
Answered by Michael Higgins on May 31, 2021