How to specify K clusters in hierarchical clustering with noisy data?

Cross Validated Asked by farzad on December 18, 2020

I’m new to data mining and clustering, and I wonder how to cut a hierarchical clustering dendrogram to obtain a specific number of clusters. The problem is that the data is noisy, and the SLINK algorithm treats the noise points as clusters of their own. When I cut the dendrogram to obtain exactly K clusters, some of them are noise clusters, so some or all of the K expected clusters are missed. I think there should be some technique for cutting the dendrogram without counting the noise clusters.

Note that I already know the value of K; choosing this number is not the problem!


One Answer

Choose the height h such that the cut produces k non-trivial (non-noise) clusters, ignoring the tiny clusters that consist only of noise points.

Then cut the tree at this height.

Note that such a height may not exist. If your data is a single Gaussian, it may not be possible to find k non-trivial clusters.
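As a rough illustration of this idea, here is a minimal sketch using SciPy. It assumes that "non-trivial" means "contains at least `min_size` points"; that threshold, the function name `cut_for_k_nontrivial`, and the choice to scan the merge heights from the top down are my own assumptions, not part of the answer above. It returns `None` when no cut gives exactly k such clusters.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cut_for_k_nontrivial(X, k, min_size=5, method="single"):
    """Find a cut height with exactly k clusters of >= min_size points.

    Returns (height, labels), where points in smaller clusters are
    labelled -1 (noise), or None if no merge height satisfies this.
    `min_size` is an assumed definition of "non-trivial".
    """
    Z = linkage(X, method=method)  # single linkage, as SLINK computes
    # Candidate heights are the merge distances, tried from highest to lowest.
    for h in np.unique(Z[:, 2])[::-1]:
        labels = fcluster(Z, t=h, criterion="distance")
        sizes = np.bincount(labels)  # labels start at 1, so index 0 is empty
        big = np.flatnonzero(sizes >= min_size)
        if len(big) == k:
            # Keep the k large clusters, mark everything else as noise.
            return h, np.where(np.isin(labels, big), labels, -1)
    return None


# Example: three tight blobs plus three far-away outliers.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
blobs = np.vstack([c + 0.3 * rng.standard_normal((30, 2)) for c in centers])
outliers = np.array([[30.0, 30.0], [-30.0, -30.0], [30.0, -30.0]])
X = np.vstack([blobs, outliers])

result = cut_for_k_nontrivial(X, k=3, min_size=5)
```

The number of large clusters is not monotone in the cut height, so more than one height may qualify; this sketch simply returns the highest one it finds. Trying every merge height is slow for large data sets, but it keeps the logic easy to follow.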

Answered by Has QUIT--Anony-Mousse on December 18, 2020
