Data Science Asked by Noppawee Apichonpongpan on May 3, 2021
I have a dendrogram represented in a format I don’t understand:
(K_5:1.000030e+00,((K_1:2.000000e-05,(K_2:1.000000e-05,K_3:1.000000e-05):1.000000e-05):1.000000e-05,K_4:3.000000e-05)0.806:1.000000e+00):0.000000e+00;
I am not sure how to interpret the above.
It is an output of hierarchical clustering.
K_1, K_2, K_3, K_4, K_5 are the data points.
I have other dendrograms represented in the following format:
[x_1,x_2,x_3,x_4,x_5] (we start with one big cluster and split a cluster at each step)
[x_1,x_2][x_3,x_4,x_5]
[x_1,x_2][x_3,x_5][x_4]
[x_1][x_2][x_3,x_5][x_4]
[x_1][x_2][x_3][x_5][x_4]
I want a way to convert between these two representations.
This output represents the dendrogram as a tree, written in Newick format. Each pair of parentheses encloses a subtree, so the innermost parentheses are the deepest parts of the tree. For instance, the root of the tree joins K_5 with a subtree; that subtree in turn joins another subtree with K_4, and so on.
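If we ignore the numerical values (in this notation the number after each colon is a branch length, i.e. a distance, and 0.806 is a label on an internal node, often a support value), we have this: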
(K_5,
  (
    (K_1,
      (
        K_2,K_3
      )
    ),
    K_4
  )
)
Which represents this tree:
 --------------------
 |                  |
 |            -------------
K_5           |           |
           -------       K_4
           |     |
          K_1  -----
               |   |
              K_2 K_3
Then it can be converted to the desired format:
[K_1 , K_2 , K_3 , K_4 , K_5]
[K_1 , K_2 , K_3 , K_4] [K_5]
[K_1 , K_2 , K_3] [K_4] [K_5]
[K_1] [K_2 , K_3] [K_4] [K_5]
[K_1] [K_2] [K_3] [K_4] [K_5]
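The conversion above can also be sketched in code. The following is a minimal, dependency-free sketch rather than a definitive implementation: the function names (parse_newick, leaves, divisive_steps, format_step) are hypothetical, the parser only handles simple unquoted Newick strings like the one in the question, and it discards branch lengths and internal labels. When several clusters could be split next, it assumes the one with the most leaves is split first; a different rule (for example one based on the branch lengths) may be more appropriate for your data.

```python
import re

def parse_newick(text):
    """Parse a simple Newick string into nested lists of leaf names.

    Branch lengths (":1.0e-05") and internal labels ("0.806") are dropped.
    """
    tokens = re.findall(r"[(),;]|[^(),;\s]+", text)
    structural = {"(", ")", ",", ";"}
    pos = 0

    def parse_node():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1  # consume "("
            children = [parse_node()]
            while tokens[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            if tokens[pos] != ")":
                raise ValueError("expected ')' in Newick string")
            pos += 1  # consume ")"
            # Skip the optional label / branch length after the ")".
            if pos < len(tokens) and tokens[pos] not in structural:
                pos += 1
            return children
        # Leaf: keep the name, drop the ":length" suffix.
        name = tokens[pos].split(":")[0]
        pos += 1
        return name

    return parse_node()

def leaves(node):
    """Return the leaf names under a node, in Newick order."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]

def divisive_steps(tree):
    """List the partitions from one big cluster down to singletons."""
    clusters = [tree]
    steps = [[leaves(c) for c in clusters]]
    while any(isinstance(c, list) for c in clusters):
        # Assumption: split the unsplit cluster with the most leaves first.
        i = max(
            (k for k, c in enumerate(clusters) if isinstance(c, list)),
            key=lambda k: len(leaves(clusters[k])),
        )
        clusters[i:i + 1] = clusters[i]  # replace the cluster by its children
        steps.append([leaves(c) for c in clusters])
    return steps

def format_step(step):
    """Format one partition as "[a,b][c]"."""
    return "".join("[" + ",".join(cluster) + "]" for cluster in step)

newick = ("(K_5:1.000030e+00,((K_1:2.000000e-05,(K_2:1.000000e-05,"
          "K_3:1.000000e-05):1.000000e-05):1.000000e-05,"
          "K_4:3.000000e-05)0.806:1.000000e+00):0.000000e+00;")
for step in divisive_steps(parse_newick(newick)):
    print(format_step(step))
```

For the string in the question this yields the same five partitions as listed above, except that the clusters appear in the order they occur in the Newick string (K_5 first) rather than sorted by name.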
Correct answer by Erwan on May 3, 2021