Cross Validated Asked by TheUndecided on November 29, 2020
After running K-means, I am iterating over many files, each containing a table of cluster counts, mostly 3x3. Some of them have "0" in one or more cells, but most of them don't. I read that chi2_contingency can't be used if a zero appears in one or two of the cells. What method should I use if, for example, I am trying to find out whether df1 is distributed differently from df3, or df2 from df1 (using Python)? Examples of the tables:
Cluster |df1 |df2 |df3
0 |14 |20 |100
1 |1 |3 |75
2 |0 |1 |12
3 |2 |2 |48
Cluster |df1 |df2 |df3
0 |9 |2 |32
1 |3 |3 |4
2 |5 |21 |199
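Part of the code related to the chi-square test (it runs without errors but isn't the right calculation):
from scipy.stats import chi2_contingency
# frame is the table loaded from one file; this tests all three columns jointly, not one pair
e2 = frame[['df1', 'df2', 'df3']]
stat, p, dof, expected = chi2_contingency(e2)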
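For comparing one pair of columns at a time (say df1 against df3), a minimal sketch follows. It is not from the original post and rests on some assumptions: the helper name `pairwise_chi2` is hypothetical, the counts are copied from the first example table, and the Monte Carlo permutation p-value is a swapped-in technique for tables with small expected counts, not something the question itself proposes. As far as I know, `chi2_contingency` only raises a `ValueError` when an *expected* frequency is zero (an entire row or column sums to zero), so the sketch drops clusters that are empty in both columns before testing.

```python
import numpy as np
from scipy.stats import chi2_contingency

def pairwise_chi2(counts_a, counts_b, n_resamples=2000, seed=0):
    """Hypothetical helper: chi-square test of two count columns.

    Returns (statistic, asymptotic p, Monte Carlo p, smallest expected count).
    """
    table = np.column_stack([counts_a, counts_b]).astype(int)
    # Clusters empty in both columns would produce zero expected counts.
    table = table[table.sum(axis=1) > 0]
    stat, p_asym, dof, expected = chi2_contingency(table, correction=False)

    # Permutation null: shuffle which observation belongs to which column.
    # Row and column totals stay fixed, so `expected` is the same each time.
    rng = np.random.default_rng(seed)
    n_rows = table.shape[0]
    clusters = np.repeat(np.arange(n_rows), table.sum(axis=1))
    n_a = table[:, 0].sum()
    hits = 0
    for _ in range(n_resamples):
        perm = rng.permutation(clusters)
        sim = np.column_stack([
            np.bincount(perm[:n_a], minlength=n_rows),
            np.bincount(perm[n_a:], minlength=n_rows),
        ])
        sim_stat = ((sim - expected) ** 2 / expected).sum()
        hits += sim_stat >= stat - 1e-12  # tolerance for float ties
    p_mc = (hits + 1) / (n_resamples + 1)
    return stat, p_asym, p_mc, expected.min()

# Counts from the first example table in the question.
df1 = [14, 1, 0, 2]
df3 = [100, 75, 12, 48]
stat, p_asym, p_mc, min_expected = pairwise_chi2(df1, df3)
print(f"chi2={stat:.2f}  asymptotic p={p_asym:.4f}  "
      f"Monte Carlo p={p_mc:.4f}  min expected={min_expected:.2f}")
```

When the smallest expected count is well below 5, the asymptotic p-value is usually considered unreliable, and the Monte Carlo value is the one to look at. Running several pairwise comparisons per file would also call for some multiple-testing adjustment.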