Data Science Asked by Raydar on March 28, 2021
I have to cluster a dataset of houses and their water consumption, given in this form:
$$
\text{House}_1 = (x_{1}, x_{2}, \dots, x_{n});
\text{House}_2 = (y_{1}, y_{2}, \dots, y_{n});
\text{House}_3 = (z_{1}, z_{2}, \dots, z_{n});
$$
where $x_{i}$ is the daily consumption in liters while $n$ is a fixed parameter (length of dataset).
I need to group these houses into $k$ clusters based on their water consumption.
My question is: how should I handle data in this form so that I can feed it into a clustering algorithm?
Do I need to aggregate each vector into a single real value?
1. You just have to represent each house's features as a numeric vector, e.g. [2, 4, 8, 10].
2. It is good practice to normalize each vector. I simply divided each element by the sum of the vector's elements, which puts the values between 0 and 1. For example, [1, 2, 3, 4, 5] (sum 15) becomes [0.06666666666666667, 0.13333333333333333, 0.2, 0.26666666666666666, 0.3333333333333333].
3. Feed the normalized vectors into a clustering algorithm (k-means is a reasonable first try).
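The three steps above could be sketched roughly as follows. This is only an illustration, not a definitive implementation: the house names and consumption values are made up, `normalize_by_sum` and `kmeans` are hypothetical helper names, and the k-means here is a bare-bones Lloyd's loop with a deterministic farthest-point start, chosen so the example runs with the standard library alone.

```python
def normalize_by_sum(vec):
    """Divide each element by the vector's sum (step 2 above)."""
    total = sum(vec)
    if total == 0:
        # Assumption: a house with zero total consumption stays all zeros.
        return [0.0] * len(vec)
    return [v / total for v in vec]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    # Deterministic farthest-point initialisation (an assumption made here
    # for reproducibility; libraries usually use k-means++ or random starts).
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up daily consumption in liters, n = 4 days per house.
houses = {
    "House1": [100, 110, 105, 300],  # spike on the last day
    "House2": [210, 215, 205, 610],  # similar shape, about twice the volume
    "House3": [150, 150, 150, 150],  # flat usage
    "House4": [80, 85, 78, 82],      # flat usage, lower volume
}

names = list(houses)
points = [normalize_by_sum(houses[name]) for name in names]  # steps 1 and 2
labels = kmeans(points, k=2)                                 # step 3

for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

Note that dividing by the sum removes the overall volume, so houses are grouped by the shape of their consumption rather than by how much they use; with this toy data the two spiky houses and the two flat houses should end up in separate clusters. If total volume matters, a different scaling would be needed. In practice a library implementation such as scikit-learn's `KMeans` would normally replace the hand-written loop.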
Answered by Aj_MLstater on March 28, 2021