The full text of this item is not available at this time because the student has placed this item under an embargo for a period of time. The Libraries are not authorized to provide a copy of this work during the embargo period, even for Texas A&M users with NetID.
Interrater Agreement and Reliability of Observed Behaviors: Comparing Percentage Agreement, Kappa, Correlation Coefficient, ICC and G Theory
The study of interrater agreement and interrater reliability attracts extensive attention because judgments from multiple raters are subjective and may vary from rater to rater. To evaluate interrater agreement and interrater reliability, five methods or indices have been proposed: percentage agreement, the kappa coefficient, the correlation coefficient, the intraclass correlation coefficient (ICC), and generalizability (G) theory. In this study, we introduce these methods and discuss their advantages and disadvantages for evaluating interrater agreement and reliability. We then review how frequently each of the five indices has been used in practice over the past five years and rank them accordingly. Finally, we illustrate how to use the five methods under different circumstances and provide SPSS and SAS code for analyzing interrater agreement and reliability. We apply these methods to data from the Parent-Child Interaction System of global ratings (PARCHISY) and draw the following conclusions: (1) The ICC has been the most frequently used method for evaluating interrater reliability over the past five years, while generalizability theory has been the least used. Judged against the respective criteria, the G coefficients indicate interrater reliability similar to that of weighted kappa and the ICC on most items. (2) When reliability is high, the different methods give consistent indications of interrater reliability even though they rely on different criteria. When the methods disagree, the ICC and the G coefficient indicate better interrater reliability relative to their criteria, and the two remain consistent with each other.
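To make the relationship among the five indices concrete, the following is a minimal Python sketch; it is not the SPSS or SAS code provided in the thesis, and the ratings are hypothetical 7-point PARCHISY-style scores invented for illustration. The formulas are the standard ones: observed percentage agreement, Cohen's kappa, Pearson's r, the Shrout-Fleiss ICC(2,1) from a two-way ANOVA decomposition, and the relative G coefficient for a one-facet items-by-raters design.

# Minimal sketch (not the thesis's SPSS/SAS code): computing the five
# interrater indices for two raters scoring the same ten items.
# The ratings below are hypothetical 7-point PARCHISY-style scores.
import numpy as np

rater1 = np.array([4, 5, 3, 6, 4, 5, 2, 6, 5, 4])
rater2 = np.array([4, 5, 4, 6, 3, 5, 2, 6, 4, 4])
n = len(rater1)

# 1. Percentage agreement: proportion of items with identical scores.
pct_agree = np.mean(rater1 == rater2)

# 2. Cohen's kappa: chance-corrected agreement, kappa = (po - pe) / (1 - pe),
#    where pe is expected agreement from the raters' marginal proportions.
cats = np.union1d(rater1, rater2)
po = pct_agree
pe = sum(np.mean(rater1 == c) * np.mean(rater2 == c) for c in cats)
kappa = (po - pe) / (1 - pe)

# 3. Pearson correlation between the two raters' scores.
r = np.corrcoef(rater1, rater2)[0, 1]

# 4. ICC(2,1): two-way random-effects, single-rater agreement
#    (Shrout & Fleiss), built from the ANOVA mean squares.
scores = np.stack([rater1, rater2], axis=1)        # items x raters
k = scores.shape[1]
grand = scores.mean()
ms_rows = k * np.sum((scores.mean(axis=1) - grand) ** 2) / (n - 1)
ms_cols = n * np.sum((scores.mean(axis=0) - grand) ** 2) / (k - 1)
ss_err = np.sum((scores - scores.mean(axis=1, keepdims=True)
                 - scores.mean(axis=0, keepdims=True) + grand) ** 2)
ms_err = ss_err / ((n - 1) * (k - 1))
icc21 = (ms_rows - ms_err) / (
    ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# 5. Relative G coefficient for a one-facet items-x-raters design with
#    k raters: item variance over item variance plus error averaged
#    over raters, using variance components from the same mean squares.
var_item = (ms_rows - ms_err) / k
var_resid = ms_err
g_coef = var_item / (var_item + var_resid / k)

print(f"percent agreement = {pct_agree:.3f}")
print(f"Cohen's kappa     = {kappa:.3f}")
print(f"Pearson r         = {r:.3f}")
print(f"ICC(2,1)          = {icc21:.3f}")
print(f"G coefficient     = {g_coef:.3f}")

Because the ICC and the relative G coefficient are built from the same variance components, they tend to track each other closely on data like these, which mirrors the consistency between the two reported in the conclusions above.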
Cao, Qian (2013). Interrater Agreement and Reliability of Observed Behaviors: Comparing Percentage Agreement, Kappa, Correlation Coefficient, ICC and G Theory. Master's thesis, Texas A&M University. Available electronically from