Random distribution: which statistical method?

Cross Validated Asked by TheUndecided on November 29, 2020

After running K-means, I am iterating over many files, each consisting of a table; most of the tables are 3x3. Some of them have cells containing 0, but most do not. I read that chi2_contingency cannot be used if a zero appears in one or two of the cells. Which method should I use if, for example, I want to find out whether df1 is distributed differently from df3 (or df2 from df1)? I am using Python. Example tables:

Cluster |df1  |df2  |df3
0       |14   |20   |100
1       |1    |3    |75
2       |0    |1    |12
3       |2    |2    |48

Cluster |df1  |df2  |df3
0       |9    |2    |32
1       |3    |3    |4
2       |5    |21   |199

Part of the chi-square-related code (it runs without errors, but it is not the right calculation):

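from scipy.stats import chi2_contingency
# frame holds one table: one row per cluster, one column per group
e2 = frame[['df1', 'df2', 'df3']]
stat, p, dof, expected = chi2_contingency(e2)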
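
One way to make the df1-versus-df3 comparison concrete is to pass only those two columns to chi2_contingency instead of the whole table. The sketch below rebuilds the first example table by hand; the helper name compare_columns and the returned dictionary are illustrative assumptions, not part of the original code. As far as SciPy's behaviour goes, an observed count of 0 is accepted; the function raises an error only when an expected count is 0, which happens when a whole row or column sums to zero, so the helper drops clusters that are empty in both columns.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# First example table from the question (rows are clusters).
frame = pd.DataFrame(
    {"df1": [14, 1, 0, 2], "df2": [20, 3, 1, 2], "df3": [100, 75, 12, 48]},
    index=pd.Index([0, 1, 2, 3], name="Cluster"),
)


def compare_columns(table, a, b):
    """Chi-square test of homogeneity between two count columns.

    Hypothetical helper written for this sketch.
    """
    pair = table[[a, b]]
    # Clusters with no observations in either column would produce
    # zero expected counts, so they are removed before testing.
    pair = pair.loc[pair.sum(axis=1) > 0]
    stat, p, dof, expected = chi2_contingency(pair.to_numpy())
    return {
        "stat": float(stat),
        "p": float(p),
        "dof": int(dof),
        # Smallest expected count; small values mean the chi-square
        # approximation may be unreliable.
        "min_expected": float(expected.min()),
    }


result = compare_columns(frame, "df1", "df3")
print(result)
```

The min_expected value is returned because the chi-square approximation is usually considered shaky when expected counts are small (a common rule of thumb is below 5). In that situation an exact or simulation-based test is often preferred, and the p-value from this sketch should be read with caution.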

Tags: chi squared, contingency tables, fishers exact, python
