Journal of Southern Medical University, 2026, Vol. 46, Issue (2): 466-472. doi: 10.12122/j.issn.1673-4254.2026.02.24

Comparison of missing data handling methods for AC1 coefficient estimation

Keke LI, Lishan XU, Milai YU, Shengli AN

  1. Department of Biostatistics, School of Public Health, Southern Medical University, Guangzhou 510515, China
  • Received: 2025-07-22 Online: 2026-02-20 Published: 2026-03-10
  • Contact: Shengli AN E-mail: kk20001205@163.com; 1069766473@qq.com
  • About the author: Keke LI, master's degree candidate, E-mail: kk20001205@163.com
  • Supported by:
    Guangdong Basic and Applied Basic Research Foundation (2022A1515012152)

Abstract:

Objective To examine, through simulation, how different missing data handling methods affect estimation of the AC1 coefficient (the first-order agreement coefficient), and to provide a reference for practical application. Methods Monte Carlo simulation was used to generate nominal rating data under different missing data mechanisms. The simulation parameters were the number of raters, the number of categories, sample size, disease prevalence, random rating probability, and missing proportion. Four missing data handling methods (excluding subjects with zero ratings, excluding subjects with incomplete ratings, rater mode imputation, and subject mode imputation) were compared using bias and mean squared error (MSE) as performance metrics. Results When disease prevalence was balanced or the missing data mechanism was missing completely at random (MCAR) or missing at random (MAR), excluding subjects with zero ratings performed best, with bias and MSE close to zero when the missing proportion was below 30%. When prevalence was skewed and data were missing not at random (MNAR), subject mode imputation performed better, with bias within ±0.10 and MSE below 0.09; with a sufficient sample size and a missing proportion of no more than 30%, its MSE was nearly zero. Rater mode imputation performed worst in all scenarios. Excluding subjects with incomplete ratings gave small errors only with two raters, two categories, and a low missing proportion under MCAR/MAR, and was unstable in other scenarios. Conclusion No universally optimal method exists for handling missing data in AC1 estimation. We recommend excluding subjects with zero ratings when prevalence is balanced or the missing data mechanism can be assumed to be MCAR or MAR, and subject mode imputation when prevalence is skewed and MNAR is suspected. We also suggest that researchers report AC1 estimates from several methods so that the sensitivity of the results can be assessed.

Key words: agreement evaluation, nominal ratings, AC1 coefficient, missing data
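
The four handling methods and the two performance metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation code. It assumes the standard multi-rater form of Gwet's AC1, codes a missing rating as `None`, and reads "rater mode imputation" as filling a gap with the mode of that rater's own ratings across subjects and "subject mode imputation" as filling it with the mode of the other raters' ratings for the same subject; the abstract does not spell out either definition. All function names are hypothetical, and ties in the mode are broken arbitrarily (first value seen).

```python
from collections import Counter


def ac1(ratings, categories):
    """Gwet's AC1 for nominal ratings; None marks a missing rating.

    ratings: one list per subject, one entry per rater.
    Subjects with no ratings are ignored; observed agreement uses only
    subjects rated by at least two raters.
    """
    q = len(categories)
    counts = [Counter(r for r in row if r is not None) for row in ratings]
    rated = [c for c in counts if sum(c.values()) >= 1]
    paired = [c for c in rated if sum(c.values()) >= 2]
    if not paired:
        raise ValueError("need at least one subject with two or more ratings")
    # Observed agreement: share of agreeing rater pairs within each subject.
    pa = sum(
        sum(c[k] * (c[k] - 1) for k in categories)
        / (sum(c.values()) * (sum(c.values()) - 1))
        for c in paired
    ) / len(paired)
    # Chance agreement from the average classification probabilities.
    pi = [sum(c[k] / sum(c.values()) for c in rated) / len(rated) for k in categories]
    pe = sum(p * (1 - p) for p in pi) / (q - 1)
    return (pa - pe) / (1 - pe)


def _mode(values):
    """Most frequent non-missing value, or None if every value is missing."""
    present = [v for v in values if v is not None]
    return Counter(present).most_common(1)[0][0] if present else None


def drop_zero_rated(ratings):
    """Exclude subjects that received no rating at all."""
    return [row for row in ratings if any(r is not None for r in row)]


def drop_incomplete(ratings):
    """Exclude subjects with at least one missing rating."""
    return [row for row in ratings if all(r is not None for r in row)]


def impute_rater_mode(ratings):
    """Assumed definition: fill each gap with that rater's modal category."""
    modes = [_mode(col) for col in zip(*ratings)]
    return [[m if r is None else r for r, m in zip(row, modes)] for row in ratings]


def impute_subject_mode(ratings):
    """Assumed definition: fill each gap with that subject's modal category."""
    return [[_mode(row) if r is None else r for r in row] for row in ratings]


def bias_and_mse(estimates, truth):
    """Bias and mean squared error of repeated estimates against a known value."""
    errors = [e - truth for e in estimates]
    return sum(errors) / len(errors), sum(e * e for e in errors) / len(errors)


# Illustrative data only: 5 subjects, 3 raters, 2 categories.
example = [
    [1, 1, 1],
    [0, 0, None],
    [1, None, 1],
    [None, None, None],
    [1, 1, 0],
]

if __name__ == "__main__":
    methods = [drop_zero_rated, drop_incomplete, impute_rater_mode, impute_subject_mode]
    for method in methods:
        print(method.__name__, ac1(method(example), [0, 1]))
```

In a simulation of the kind the abstract describes, each method would be applied to many generated data sets with a known true AC1, and `bias_and_mse` would then summarize the resulting estimates. Reporting the estimate from each method side by side, as the loop at the end does, is one simple way to carry out the sensitivity check the authors recommend.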