南方医科大学学报 ›› 2024, Vol. 44 ›› Issue (6): 1141-1148. doi: 10.12122/j.issn.1673-4254.2024.06.15


慢性心力衰竭合并肺部感染患者院内死亡风险预测:基于可解释性机器学习方法

申采玉1, 王帅2, 周锐盈1, 汪雨贺3, 高琴4, 陈兴智4, 杨枢1

  1. 蚌埠医科大学,卫生管理学院,安徽 蚌埠 233030
  2. 蚌埠医科大学,公共卫生学院,安徽 蚌埠 233030
  3. 蚌埠医科大学附属蚌埠第三人民医院急诊内科,安徽 蚌埠 233030
  4. 蚌埠医科大学,基础医学院,安徽 蚌埠 233030
  • 收稿日期:2024-01-15 出版日期:2024-06-20 发布日期:2024-07-01
  • 通讯作者: 杨枢 E-mail:1224911076@qq.com;yangshu@bbmc.edu.cn
  • 作者简介:申采玉,在读硕士研究生,E-mail: 1224911076@qq.com
  • 基金资助:
    国家自然科学基金(81770297);安徽省临床医学研究转化专项(202304295107020079);蚌埠医科大学自然科学重点项目(2020byzd018);蚌埠医科大学研究生科研创新计划项目(Byycx23038)

Prediction of risk of in-hospital death in patients with chronic heart failure complicated by lung infections using interpretable machine learning

Caiyu SHEN1, Shuai WANG2, Ruiying ZHOU1, Yuhe WANG3, Qin GAO4, Xingzhi CHEN4, Shu YANG1

  1. School of Health Management, Bengbu Medical University, Bengbu 233030, China
  2. School of Public Health, Bengbu Medical University, Bengbu 233030, China
  3. Department of Emergency Medicine, Bengbu Third People's Hospital, Bengbu Medical University, Bengbu 233030, China
  4. School of Basic Medical Sciences, Bengbu Medical University, Bengbu 233030, China
  • Received: 2024-01-15 Online: 2024-06-20 Published: 2024-07-01
  • Contact: Shu YANG E-mail: 1224911076@qq.com; yangshu@bbmc.edu.cn
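  • Supported by:
    National Natural Science Foundation of China (81770297); Anhui Provincial Clinical Medical Research Translation Project (202304295107020079); Natural Science Key Project of Bengbu Medical University (2020byzd018); Postgraduate Research Innovation Program of Bengbu Medical University (Byycx23038)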

摘要:

目的 使用可解释性机器学习方法预测慢性心力衰竭(CHF)合并肺部感染患者的院内死亡风险。 方法 回顾性分析MIMIC-IV数据库中诊断为CHF合并肺部感染的1415例患者病历信息。按病原体种类将患者划分为合并细菌性肺炎(841例)、合并非细菌性肺炎(574例)两个亚组,采用Kaplan-Meier生存曲线描述不同亚组的死亡风险差异。基于单因素分析和LASSO回归筛选特征。分别构建LR、AdaBoost、XGBoost、LightGBM模型,通过准确性、精确度、F1值、AUC等指标比较模型性能,使用eICU-CRD数据库进行外部验证。应用SHAP算法对XGBoost模型进行解释性分析。 结果 内部测试集中XGBoost模型预测CHF合并肺部感染患者院内死亡风险的准确性高于其他模型。外部测试集显示,合并细菌性肺炎、合并非细菌性肺炎两亚组中XGBoost模型的AUC值分别为0.691(95%CI:0.654~0.720)、0.725(95%CI:0.577~0.782)。相较于其他模型,XGBoost模型表现出了更好的预测能力和稳定性。 结论 在预测CHF合并肺部感染患者的院内死亡风险方面,XGBoost模型的综合表现优于其他3种模型。SHAP算法为模型提供了明确解释,有助于临床医生进行决策。

关键词: 慢性心力衰竭, 肺部感染, 预测模型, SHAP算法, 机器学习

Abstract:

Objective To predict the risk of in-hospital death in patients with chronic heart failure (CHF) complicated by lung infections using interpretable machine learning. Methods We retrospectively analyzed the clinical records of 1415 patients diagnosed with CHF complicated by lung infections in the MIMIC-IV database. According to pathogen type, the patients were divided into a bacterial pneumonia group (n=841) and a non-bacterial pneumonia group (n=574), and the difference in death risk between the two groups was described using Kaplan-Meier survival curves. Features were selected using univariate analysis and LASSO regression. Logistic regression (LR), AdaBoost, XGBoost, and LightGBM models were then constructed, and their performance was compared in terms of accuracy, precision, F1 score, and AUC. The models were externally validated using data from the eICU-CRD database. The SHAP algorithm was applied to interpret the XGBoost model. Results Among the 4 models, the XGBoost model showed the highest accuracy and F1 score in the internal test set for predicting the risk of in-hospital death in CHF patients with lung infections. In the external test set, the XGBoost model had an AUC of 0.691 (95% CI: 0.654-0.720) in the bacterial pneumonia group and 0.725 (95% CI: 0.577-0.782) in the non-bacterial pneumonia group, and showed better predictive ability and stability than the other models. Conclusion The overall performance of the XGBoost model is superior to that of the other 3 models for predicting the risk of in-hospital death in CHF patients with lung infections. The SHAP algorithm provides a clear interpretation of the model, which can support clinical decision-making.

Key words: chronic heart failure, lung infection, predictive modeling, SHAP algorithm, machine learning
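
The abstract reports each external-validation AUC together with a 95% CI but does not state how the intervals were computed. As a rough illustration only, the sketch below computes an AUC from its rank-based definition and attaches a percentile bootstrap interval, which is one common choice and is assumed here rather than taken from the paper. The function names (`auc`, `bootstrap_auc_ci`), the resampling settings, and the synthetic labels and scores are all hypothetical; none of the numbers relate to MIMIC-IV or eICU-CRD.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Point AUC plus a percentile bootstrap CI (assumed method, not the paper's)."""
    point = auc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that lost one of the classes
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lo, hi


if __name__ == "__main__":
    # Synthetic data only: about 20% "deaths", with slightly higher risk scores.
    rng = random.Random(42)
    labels = [1 if rng.random() < 0.2 else 0 for _ in range(200)]
    scores = [rng.gauss(0.6 if y else 0.4, 0.2) for y in labels]
    point, lo, hi = bootstrap_auc_ci(labels, scores)
    print(f"AUC = {point:.3f} (95% CI: {lo:.3f}-{hi:.3f})")
```

With few events, a percentile interval of this kind can be wide, which is consistent with the broader interval the abstract reports for the non-bacterial pneumonia group, although the paper's actual procedure may differ.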