Journal of Southern Medical University, 2023, Vol. 43, Issue (2): 271-279. doi: 10.12122/j.issn.1673-4254.2023.02.16


Construction and evaluation of an artificial intelligence-based risk prediction model for nasopharyngeal carcinoma

张浩轩,陆 进,蒋成义,方美芳   

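  1. Department of Human Anatomy, Anhui Provincial Key Laboratory of Digital Medicine and Smart Health, and First Affiliated Hospital, Bengbu Medical College; Bengbu 233030, Anhui
  • Online: 2023-02-20  Published: 2023-03-16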

Construction and evaluation of an artificial intelligence-based risk prediction model for death in patients with nasopharyngeal cancer

ZHANG Haoxuan, LU Jin, JIANG Chengyi, FANG Meifang   

  1. Department of Human Anatomy, Anhui Provincial Key Laboratory of Digital Medicine and Smart Health, Bengbu Medical College, Bengbu 233030, China; First Affiliated Hospital of Bengbu Medical College, Bengbu 233030, China
  • Online: 2023-02-20  Published: 2023-03-16

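Abstract: Objective To use artificial intelligence (AI) technology to screen the risk factors for death in patients with nasopharyngeal carcinoma (NPC) and to construct a risk prediction model. Methods Clinical data of NPC patients were taken from the SEER database (1973-2015). The data were processed with SPSS 25.0 and randomly divided into a modeling group and a validation group at a 7:3 ratio. In R software, four AI algorithms, namely eXtreme Gradient Boosting (XGBoost), decision tree (DT), least absolute shrinkage and selection operator (LASSO), and random forest (RF), were applied to the modeling-group data to screen the risk factors for death in NPC patients, and a risk prediction model was constructed. The model was evaluated internally with the C-index, decision curve analysis (DCA), the receiver operating characteristic (ROC) curve, and the calibration curve (CC); the validation-group data and separately collected clinical data were used for internal and external validation of the model. Results Clinical data of 2116 NPC patients were included (1484 in the modeling group and 632 in the validation group). The risk factors for death in NPC patients screened from the modeling-group data were age, race, gender, Stage_M, Stage_T, and Stage_N. On internal evaluation, the NPC risk prediction model built from these factors had a C-index of 0.76, an area under the ROC curve (AUC) of 0.74, and a DCA net benefit rate of 9%-93%. On internal validation, the C-index was 0.740, the AUC was 0.749, and the DCA net benefit rate was 3%-89%, and the calibration curves were highly consistent. On external validation, the C-index was 0.943, the DCA net benefit rate was 3%-97%, and the AUC was 0.851, and the calibration curve showed good agreement between predicted and actual values. Conclusion Gender, age, race, and TNM stage are risk factors for death in NPC patients, and the NPC risk prediction model has value in terms of accuracy, consistency, discrimination, and practicality.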

Keywords: nasopharyngeal carcinoma; prediction model; nomogram; risk factors; artificial intelligence

Abstract: Objective To screen the risk factors for death in patients with nasopharyngeal carcinoma (NPC) using artificial intelligence (AI) technology and to establish a risk prediction model. Methods Clinical data of NPC patients were obtained from the SEER database (1973-2015). The patients were randomly divided into a model building group and a validation group at a 7:3 ratio. Using the data in the model building group, four AI algorithms, namely eXtreme Gradient Boosting (XGBoost), decision tree (DT), least absolute shrinkage and selection operator (LASSO), and random forest (RF), were run in R software to identify the risk factors for death in NPC patients, and a risk prediction model was constructed from the identified factors. The C-index, decision curve analysis (DCA), receiver operating characteristic (ROC) curve, and calibration curve (CC) were used for internal evaluation of the model; the data in the validation group and the clinical data of 96 NPC patients (collected from First Affiliated Hospital of Bengbu Medical College) were used for internal and external validation of the model, respectively. Results Clinical data of 2116 NPC patients were included (1484 in the model building group and 632 in the validation group). Risk factor screening showed that age, race, gender, stage M, stage T, and stage N were all risk factors for death in NPC patients. On internal evaluation, the risk prediction model for NPC-related death constructed from these factors had a C-index of 0.76, an AUC of 0.74, and a DCA net benefit rate of 9%-93%. On internal validation, the C-index was 0.740, the AUC was 0.749, and the DCA net benefit rate was 3%-89%, and the calibration curves were highly consistent. On external validation, the C-index was 0.943, the DCA net benefit rate was 3%-97%, and the AUC was 0.851, and the predicted values agreed well with the actual values. Conclusions Gender, age, race, and TNM stage are risk factors for death in NPC patients, and the risk prediction model based on these factors can accurately predict the risk of death in NPC patients.

Key words: nasopharyngeal carcinoma; predictive model; nomogram; risk factors; artificial intelligence
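
The evaluation quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names and the toy data are hypothetical, and the outcome is assumed to be a plain binary death indicator with no censoring. Under that simplification the pairwise concordance is the same number as the AUC; the study reports separate C-index and AUC values, so its own calculations were evidently not identical to this one. The net benefit uses the standard decision-curve formula TP/n - (FP/n) * pt / (1 - pt) at a threshold probability pt. The split helper takes exactly 70% of the records, so it would not reproduce the 1484/632 counts reported above.

```python
import random


def split_7_3(records, seed=0):
    """Shuffle the records and split them 70/30 (hypothetical helper)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def concordance_auc(risks, died):
    """Share of (died, survived) pairs in which the patient who died has
    the higher predicted risk; tied risks count as half a pair."""
    pos = [r for r, d in zip(risks, died) if d]
    neg = [r for r, d in zip(risks, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    score = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                score += 1.0
            elif p == n:
                score += 0.5
    return score / (len(pos) * len(neg))


def net_benefit(risks, died, threshold):
    """Decision-curve net benefit at one threshold probability (0 < pt < 1)."""
    n = len(risks)
    tp = sum(1 for r, d in zip(risks, died) if r >= threshold and d)
    fp = sum(1 for r, d in zip(risks, died) if r >= threshold and not d)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy data (assumed for illustration only, not taken from the study).
risks = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]  # predicted probability of death
died = [1, 1, 1, 0, 0, 0]                # observed outcome

train, test = split_7_3(range(10))
print(len(train), len(test))                      # 7 3
print(round(concordance_auc(risks, died), 3))     # 0.889
print(round(net_benefit(risks, died, 0.5), 3))    # 0.167
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing it with the treat-all and treat-none strategies; the percentage ranges quoted in the abstract are presumably the threshold ranges over which the model showed a positive net benefit.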