Data Science Asked by quanty on December 21, 2020
I am looking at two documents to help me learn about constructing coincidence matrices in order to gain a better understanding of Krippendorff’s alpha. I am using these two:
There seems to me to be a discrepancy between the two. There probably isn’t, but I’m looking for some help in figuring out whether my understanding is wrong, or if there is indeed a discrepancy.
In link 1, I am looking at section B (“Nominal data, 2 observers, no missing data”), where the coincidence matrix is presented. In link 2, I am looking at the section “Coincidence matrices”.
Consider the reliability matrix presented in link 1:
In order to calculate the elements of the coincidence matrix, we have the following definition in link 2:
$$o_{vv'}=\sum_{u=1}^{N}\frac{\sum_{i\neq i'}^{m}I(v_{iu}=v)\,I(v_{i'u}=v')}{m_u-1}=o_{v'v},$$
where $u$ indexes the columns (units) of the reliability matrix, $m_u$ is the number of labels actually present in column $u$, and $I(\cdot)$ is the indicator function.
This seems simple enough. For element $o_{aa}$ (or $o_{11}$) we should have:
$$
o_{aa}=\frac{I(a=a)I(b=a)}{2-1}+\frac{I(a=a)I(a=a)}{2-1}+\frac{I(b=a)I(b=a)}{2-1}+\frac{I(b=a)I(b=a)}{2-1}
$$
$$
+\frac{I(d=a)I(b=a)}{2-1}+\frac{I(c=a)I(c=a)}{2-1}+\dots
$$
and so on. Clearly, only one of the summation elements is non-zero, namely the second element. Hence $$o_{aa}=1.$$
Using the same formula/logic, we arrive at $$o_{bb}=2.$$
But link 1 reports double these values for the coincidence matrix elements, and I don’t understand why. I also don’t follow the link’s explanation of how it arrives at those values.
Can somebody help?
The formula in the Wikipedia article seems to be limited to two coders. Look at the sum over the values in unit $u$ for $m$ coders, with $i \neq i'$ (call the second index $j$): each value in $u$ is to be matched with one other value in $u$. In fact, this sum pairs the values of $i$ with the values of $j$ and also the values of $j$ with the values of $i$. So $o_{aa}$ would be 2, not 1, and $o_{bb}$ would be 4, not 2.
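To make the doubling concrete, here is a sketch of the contribution from a single unit in which both observers chose $a$ (the second term in the question's expansion). It assumes the inner sum runs over ordered pairs, so with two observers it covers both $(i,i')=(1,2)$ and $(i,i')=(2,1)$:

$$
\frac{I(v_{1u}=a)\,I(v_{2u}=a)+I(v_{2u}=a)\,I(v_{1u}=a)}{m_u-1}=\frac{1+1}{2-1}=2.
$$

Counting the pair only once, as the question does, gives 1 instead.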
However, I don’t see how Wikipedia’s formulation could yield coincidence matrices for more than two coders, as this would require two sums which pair each of the $m$ values with each of the $m-1$ other values.
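As a rough illustration of the ordered-pair counting described above, the following Python sketch pairs each value in a unit with each of the other values in that unit and weights every pair by $1/(m_u-1)$. The function name `coincidence_matrix` and the `units` list are hypothetical. The data are only the six pairs written out in the question's expansion, not the full reliability matrix, and the three-coder unit at the end is invented purely to show the pairing for $m_u=3$.

```python
from collections import Counter
from itertools import permutations


def coincidence_matrix(units):
    """Return a Counter mapping (v, v') to o_vv'.

    `units` is a list of columns of the reliability matrix; each column
    holds the labels assigned to that unit (None marks a missing label).
    """
    counts = Counter()
    for labels in units:
        present = [x for x in labels if x is not None]
        m_u = len(present)
        if m_u < 2:
            continue  # a unit with fewer than two labels forms no pairs
        # permutations(..., 2) yields every ordered pair of distinct
        # positions, so (i, j) and (j, i) are both counted.
        for v, w in permutations(present, 2):
            counts[(v, w)] += 1 / (m_u - 1)
    return counts


# The six two-observer units visible in the question's expansion.
units = [("a", "b"), ("a", "a"), ("b", "b"), ("b", "b"), ("d", "b"), ("c", "c")]
o = coincidence_matrix(units)
print(o[("a", "a")], o[("b", "b")])  # 2.0 4.0

# Hypothetical unit with three coders: each value meets the two others.
o3 = coincidence_matrix([("a", "a", "b")])
print(o3[("a", "a")], o3[("a", "b")], o3[("b", "a")])  # 1.0 1.0 1.0
```

With this counting, each unit contributes exactly $m_u$ to the total of the matrix, which is why the two-observer diagonal entries come out as 2 and 4 rather than 1 and 2.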
Answered by Klaus Krippendorff on December 21, 2020