Journal of Southern Medical University ›› 2023, Vol. 43 ›› Issue (1): 76-84. doi: 10.12122/j.issn.1673-4254.2023.01.10

Comparison of the predictive ability of two extended Cox models in nonlinear survival data analysis

CHEN Yuxuan (陈雨轩), WEI Hongxia (韦红霞), PAN Jianhong (潘建红), AN Shengli (安胜利)

  1. Department of Biostatistics, School of Public Health, Southern Medical University, Guangzhou 510515, China; Center for Drug Evaluation, National Medical Products Administration, Beijing 100022, China
  • Online: 2023-01-20 Published: 2023-02-22

Abstract: Objective To systematically compare the predictive ability of two extended Cox models when applied to nonlinear survival data. Methods Using Monte Carlo simulation and a real-data study, we compared the restricted cubic spline Cox model (Cox_RCS) and the DeepSurv deep neural network Cox model (Cox_DNN) in terms of predictive ability, with the conventional Cox proportional hazards model (Cox) and random survival forests (RSF) as reference models. Discrimination was evaluated with the concordance index (C-index), where a larger value indicates better predictive ability; calibration was evaluated with the integrated Brier score (IBS), where a smaller value indicates better predictive ability. Results When the data satisfied the proportional hazards assumption, Cox_RCS had the best predictive ability regardless of sample size or censoring rate. When the data violated the proportional hazards assumption, Cox_DNN performed best with a large sample size (≥500 in this study) and a low censoring rate (<40% in this study), and Cox_RCS outperformed the other models in all remaining scenarios. In the real-data example, Cox_RCS performed best. Conclusion For low-dimensional survival data containing nonlinear relationships, Cox_RCS and Cox_DNN each have advantages and disadvantages in predictive ability. An appropriate method can therefore be chosen according to the characteristics of the data at hand; conventional survival analysis methods are not inferior to machine learning or deep learning methods under certain conditions.

Key words: survival analysis; nonlinear association; Cox model; restricted cubic spline; deep neural network
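
The abstract uses the concordance index as its discrimination measure. As an illustration only, the sketch below computes Harrell's C-index in plain Python for right-censored data. It is not the authors' implementation: the function name `concordance_index`, its arguments, and the toy data are assumptions made for this example, and pairs with tied observed times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- predicted risk; higher means earlier expected event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times. A pair is also not comparable
        # when the subject with the shorter time was censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk failed first: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why a larger C-index indicates better discrimination.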