Journal of Southern Medical University, 2023, Vol. 43, Issue (7): 1241-1247. doi: 10.12122/j.issn.1673-4254.2023.07.21

An interpretable machine learning-based prediction model for risk of death for patients with ischemic stroke in intensive care unit

LUO Xiao, CHENG Yi, WU Cheng, HE Jia

  1. Department of Military Health Statistics, Naval Medical University, Shanghai 200433, China
  • Online: 2023-07-20 Published: 2023-07-20

Abstract: Objective To construct an inherently interpretable machine learning model, the explainable boosting machine (EBM), for predicting the one-year risk of death in patients with severe ischemic stroke. Methods Data of 2369 eligible patients with severe ischemic stroke admitted to the ICU between 2008 and 2019 were obtained from the MIMIC-IV (2.0) database and randomly divided into a training dataset (80%) and a test dataset (20%). An EBM model was built to assess patient prognosis. Prediction performance was evaluated using the area under the receiver operating characteristic curve (AUC), calibration was evaluated using the calibration curve and the Brier score, and a decision curve was plotted to assess the net clinical benefit. Results The EBM model showed good discrimination, calibration and net benefit, with an AUC of 0.857 (95% CI: 0.831-0.887) for predicting poor prognosis of severe ischemic stroke. Calibration curve analysis showed that the calibration curve of the EBM model was the closest to the ideal curve. Decision curve analysis showed that the model had the greatest net benefit at prediction probability thresholds of 0.10 to 0.80. The top 5 independent predictors in the EBM model were age, SOFA score, mean heart rate, mechanical ventilation, and mean respiratory rate, with importance scores ranging from 0.179 to 0.370. Conclusion The EBM model performs well in predicting the risk of death within one year in patients with severe ischemic stroke, and its interpretability helps clinicians better understand the factors behind the predicted outcomes.

Key words: severe ischemic stroke; inherently interpretable machine learning; explainable boosting machine; mortality prediction
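
The abstract evaluates the model with three quantities: AUC for discrimination, the Brier score for calibration, and net benefit for the decision curve. The following is a minimal sketch of how these are commonly computed, not the authors' code. The function names (`auc`, `brier_score`, `net_benefit`) are hypothetical, and the eight labels and predicted probabilities are made-up toy values, not data from the study.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case is
    scored higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one probability threshold, as used in decision curve
    analysis: TP/n - FP/n * threshold / (1 - threshold)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy example (assumed values for illustration only): 1 = died within one year.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]

toy_auc = auc(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)
toy_nb = net_benefit(y_true, y_prob, threshold=0.5)

print(f"AUC = {toy_auc:.3f}")
print(f"Brier score = {toy_brier:.4f}")
print(f"Net benefit at threshold 0.5 = {toy_nb:.3f}")
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds (for example 0.10 to 0.80, the range reported in the abstract) and comparing it with the treat-all and treat-none strategies. A confidence interval for the AUC is often estimated by bootstrap resampling, although the abstract does not state which method was used.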