Department of Communication Studies
For half a century, Cohen’s κ has been the most widely used general indicator of reliability. According to the Social Science Citation Index, it was cited in more than three thousand journal articles between 1994 and 2009.
This article presents 14 paradoxes to show that κ is not a general indicator. An analysis of κ’s mathematics and underlying logic uncovers three assumptions: (1) each assessor predetermines a quota and faithfully enforces it; (2) assessors maximize chance diagnosing as the second priority; and (3) assessors conduct honest diagnosing as the last priority. These assumptions have three implications: (1) assessors perform a constrained task of assigning objects to categories predetermined by the quota; (2) assessors fix the distribution before diagnosing; and (3) assessors apply a variable benchmark that depends on the predetermined distribution.
These assumptions constitute boundaries beyond which κ should not be used. We show that the 14 paradoxes emerge because κ is used beyond its boundaries – when at least one of the three assumptions is violated.
We conclude that κ does not apply when the assumptions do not hold. As the assumptions rarely hold, κ should rarely, if ever, be used.
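For readers who want the arithmetic behind κ, the standard definition is κ = (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from the two assessors' marginal distributions. The sketch below is not taken from the paper: the function name and both 2×2 tables are hypothetical, chosen only to illustrate how κ's benchmark shifts with the category distribution. Both tables have the same 90% raw agreement, yet κ differs sharply. Whether this particular case matches any of the 14 paradoxes is not claimed here.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of objects that assessor A placed in
    category i and assessor B placed in category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Observed agreement: share of objects on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n

    # Marginal distributions of each assessor.
    row_marg = [sum(table[i]) / n for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / n for j in range(k)]

    # Chance agreement implied by the marginals.
    p_e = sum(row_marg[i] * col_marg[i] for i in range(k))

    return (p_o - p_e) / (1 - p_e)


# Hypothetical data: two tables with identical 90% raw agreement.
balanced = [[45, 5],
            [5, 45]]   # categories split 50/50
skewed = [[85, 5],
          [5, 5]]      # categories split 90/10

print(round(cohens_kappa(balanced), 3))  # 0.8
print(round(cohens_kappa(skewed), 3))    # 0.444
```

In the skewed table the marginals raise p_e from 0.50 to 0.82, so the same observed agreement is judged against a much higher chance benchmark.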
The 61st Annual Conference of International Communication Association
Zhao, Xinshu (2011). “When to use Cohen’s κ, if ever?” Paper presented at the 61st annual conference of International Communication Association, Boston, USA, May