Journal of Southern Medical University, 2021, Vol. 41, Issue 9: 1374-1380. doi: 10.12122/j.issn.1673-4254.2021.09.12


Application of the Coefficient for Evaluating Agreement (CEA) to unordered multi-category data: a performance evaluation

LIANG Qihong, CHEN Zhaoyu, ZHANG Zheng, HUANG Shuang, AN Shengli   

  1. Department of Biostatistics, School of Public Health, Southern Medical University, Guangzhou 510515, China; 2. Guangzhou Blood Center, Guangzhou 510095, China
  • Online: 2021-09-20    Published: 2021-09-30

Abstract: Objective To evaluate the performance of the Coefficient for Evaluating Agreement (CEA), constructed following the idea of the AC1 coefficient to avoid the known defects of the Kappa coefficient, in assessing agreement between two raters on unordered multi-category outcomes. Methods Diagnostic-test-type data were generated by random sampling, and Monte Carlo simulation was used to resample under different combinations of parameters (sample size, proportion of the specified event in the population, accidental evaluation rate, and number of categories) to compare the mean square error, variance, and expectation of the variance of the Kappa, AC1 and CEA coefficients. The distribution of CEA was described based on 1000 random samples drawn from the population. Results Inconsistent accidental evaluation rates caused substantial fluctuation of the mean square error of CEA. Compared with the Kappa coefficient, AC1 and CEA were more stable when the proportion of the specified event in the population took extreme values. With small samples and inconsistent accidental evaluation rates, the variance and the expectation of the variance of the Kappa coefficient increased markedly, whereas those of CEA changed only slightly. With large samples, CEA was approximately normally distributed. Conclusion Kappa, AC1 and CEA are all affected most strongly by the accidental evaluation rate, followed by sample size. For unordered multi-category outcomes, CEA is more robust to variations in sample size and accidental evaluation rate.
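For reference, the AC1 construction that CEA builds on differs from Kappa only in its chance-agreement term; the standard two-rater definitions are restated below (the CEA formula itself is given in the full paper and is not reproduced here). With Q categories, observed agreement $p_o$, and marginal proportions $p_{q+}$ and $p_{+q}$ for the two raters,

\[
\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad p_e = \sum_{q=1}^{Q} p_{q+}\, p_{+q};
\]
\[
AC_1 = \frac{p_o - p_e^{(1)}}{1 - p_e^{(1)}}, \qquad
p_e^{(1)} = \frac{1}{Q-1} \sum_{q=1}^{Q} \pi_q \left(1 - \pi_q\right), \qquad
\pi_q = \frac{p_{q+} + p_{+q}}{2}.
\]

When one category dominates, $p_e$ approaches 1 and the denominator of Kappa collapses, while each $\pi_q(1-\pi_q)$ approaches 0 and $p_e^{(1)}$ stays small; this is the known mechanism behind the extreme-proportion stability of AC1 (and of a coefficient built on it) reported in the Results.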
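The Methods can be illustrated with a minimal Monte Carlo sketch. The code below is an illustrative assumption, not the authors' simulation program: it assumes the accidental evaluation rate acts as a per-subject probability that a rater picks a category uniformly at random instead of reporting the true category, and it estimates the variance and mean square error of Kappa and AC1 only (CEA is omitted because its formula is defined in the full paper; it would be computed alongside the other two). The function names `coefficients`, `rate`, and `simulate` and the data-generating mechanism are hypothetical.

# A minimal, self-contained sketch (not the authors' program) of the
# Monte Carlo comparison described in Methods, for two raters and q
# unordered categories.
import numpy as np

rng = np.random.default_rng(2021)

def coefficients(r1, r2, q):
    """Cohen's Kappa and Gwet's AC1 for two raters over q nominal categories."""
    n = r1.size
    table = np.zeros((q, q))
    np.add.at(table, (r1, r2), 1)               # q x q joint rating table
    p = table / n
    po = np.trace(p)                            # observed agreement
    row, col = p.sum(axis=1), p.sum(axis=0)     # marginal proportions
    pe_kappa = np.sum(row * col)                # chance agreement, Kappa
    pi = (row + col) / 2
    pe_ac1 = np.sum(pi * (1 - pi)) / (q - 1)    # chance agreement, AC1
    return (po - pe_kappa) / (1 - pe_kappa), (po - pe_ac1) / (1 - pe_ac1)

def rate(truth, q, chance):
    """Assumed 'accidental evaluation' mechanism: with probability `chance`
    a rater picks a category uniformly at random, otherwise reports the
    true category."""
    random_pick = rng.integers(0, q, truth.size)
    return np.where(rng.random(truth.size) < chance, random_pick, truth)

def simulate(n=50, q=3, p_event=0.1, chance1=0.2, chance2=0.2, reps=1000):
    """Estimate variance and MSE of Kappa and AC1 over `reps` resamples."""
    # the specified event has proportion p_event; the rest is split evenly
    probs = np.r_[p_event, np.full(q - 1, (1 - p_event) / (q - 1))]
    # population ("true") coefficient values from one very large sample
    big = rng.choice(q, size=200_000, p=probs)
    true_vals = np.array(coefficients(rate(big, q, chance1),
                                      rate(big, q, chance2), q))
    est = np.empty((reps, 2))
    for i in range(reps):
        truth = rng.choice(q, size=n, p=probs)
        est[i] = coefficients(rate(truth, q, chance1),
                              rate(truth, q, chance2), q)
    return {"variance": est.var(axis=0),                  # [Kappa, AC1]
            "mse": ((est - true_vals) ** 2).mean(axis=0)}

# equal vs. unequal accidental evaluation rates, cf. Results
print(simulate())
print(simulate(chance1=0.1, chance2=0.4))

Comparing the two calls at the bottom mirrors the Results: estimates under unequal accidental evaluation rates show inflated dispersion relative to the equal-rate case, especially at small n.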

Key words: agreement evaluation; Coefficient for Evaluating Agreement (CEA); AC1 coefficient; Kappa; diagnostic test