Journal of Southern Medical University, 2026, Vol. 46, Issue (4): 770-784. doi: 10.12122/j.issn.1673-4254.2026.04.06


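Key determinants of the global burden of non-alcoholic fatty liver disease: machine learning combined with Mendelian randomization validation based on GBD data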

Hao CHEN1,2,3, Zhenhan LI4,5, Mengjia JI6, Xincheng WANG7, Bofeng CHEN6, Qian GUAN1, Man WU4, Linming LU1

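  1.Department of Pathological Anatomy, Wannan Medical University, Wuhu, Anhui 241002
    2.First Clinical Medical College, Jinan University, Guangzhou, Guangdong 510632
    3.School of Basic Medical Sciences, Youjiang Medical University for Nationalities, Baise, Guangxi 533000
    4.School of Clinical Medicine, Wannan Medical University, Wuhu, Anhui 241002
    5.Second Affiliated Hospital of Guangzhou Medical University, Guangzhou, Guangdong 510260
    6.School of Public Health, Wannan Medical University, Wuhu, Anhui 241002
    7.First School of Clinical Medicine, Anhui Medical University, Hefei, Anhui 230031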
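  • Received: 2025-10-23 Online: 2026-04-20 Published: 2026-04-24
  • Corresponding author: Linming LU E-mail: ha0chen@wnmc.edu.cn; llm7172@sina.com
  • About the author: Hao CHEN, postdoctoral fellow, associate professor, E-mail: ha0chen@wnmc.edu.cn
  • Supported by:
    National Natural Science Foundation of China (82500322); Guangxi Natural Science Foundation (2025GXNSFHA069262); National Undergraduate Innovation and Entrepreneurship Training Program (202510368012); Anhui Provincial Undergraduate Innovation and Entrepreneurship Training Program (S202410368116)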

Key determinants of global burden of non-alcoholic fatty liver disease: machine learning combined with Mendelian randomization analysis based on GBD data

Hao CHEN1,2,3, Zhenhan LI4,5, Mengjia JI6, Xincheng WANG7, Bofeng CHEN6, Qian GUAN1, Man WU4, Linming LU1

  1.Department of Pathology, Wannan Medical University, Wuhu 241002, China
    2.First Affiliated Hospital of Jinan University, Guangzhou 510632, China
    3.School of Basic Medical Sciences, Youjiang Medical University for Nationalities, Baise 533000, China
    4.School of Clinical Medicine, Wannan Medical University, Wuhu 241002, China
    5.Second Affiliated Hospital of Guangzhou Medical University, Guangzhou 510260, China
    6.School of Public Health, Wannan Medical University, Wuhu 241002, China
    7.First School of Clinical Medicine, Anhui Medical University, Hefei 230031, China
  • Received: 2025-10-23 Online: 2026-04-20 Published: 2026-04-24
  • Contact: Linming LU E-mail: ha0chen@wnmc.edu.cn; llm7172@sina.com
  • Supported by:
    National Natural Science Foundation of China (82500322)

Abstract:

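Objective To analyze global trends, drivers, and health inequalities in the burden of non-alcoholic fatty liver disease (NAFLD), identify key mortality risk factors using interpretable machine learning, and validate their potential causal associations through Mendelian randomization. Methods Based on Global Burden of Disease (GBD) 2021 data, indicators including the incidence, prevalence, mortality, and disability-adjusted life years (DALYs) of NAFLD from 1990 to 2021 were extracted. Joinpoint regression was used to analyze trends, decomposition analysis to quantify the contributions of population growth, aging, and epidemiological changes, the concentration index to assess health inequality, and XGBoost-SHAP machine learning to identify mortality predictors; two-sample Mendelian randomization was used for causal validation of the key factors. Analyses were stratified by sex and socio-demographic index (SDI). Results The global age-standardized DALY rate of NAFLD showed an increasing trend in both males (average annual percentage change [AAPC]=+0.34%) and females (AAPC=+0.05%). Decomposition analysis showed that population growth was the main driver of the global increase in DALYs, while in high-SDI regions population aging contributed 52.37% to male deaths. Health inequality analysis showed that the concentration index of DALYs in 2021 was -0.05, with the burden concentrated among low-SDI populations. Machine learning identified smoking (relative importance=100%) and advanced age (70-74 years: 60%) as the most critical mortality predictors, and the model showed good fit on the test set (R²=0.98). SDI-stratified analysis showed that smoking and aging ranked in the top two across SDI regions. Mendelian randomization further validated positive causal associations of smoking initiation (OR=1.35, P<0.05) and aging (proxied by the frailty index, OR=2.01, P<0.05) with NAFLD risk. Conclusion The burden of NAFLD is heavy, with sex and socioeconomic inequalities. Smoking and advanced age are key risk factors, and targeted intervention strategies integrating tobacco control, geriatric health management, and health equity promotion are needed.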

Key words: non-alcoholic fatty liver disease, Global Burden of Disease, machine learning, Mendelian randomization, socio-demographic index

Abstract:

Objective To analyze the global trends, drivers, and health inequalities of the non-alcoholic fatty liver disease (NAFLD) burden, to identify key predictors of NAFLD-related mortality, and to test their potential causal associations using Mendelian randomization. Methods Using data from the Global Burden of Disease (GBD) Study 2021, we extracted global measures of NAFLD from 1990 to 2021 and analyzed the trends using joinpoint regression. Decomposition analysis was used to quantify the contributions of population growth, aging, and epidemiological changes. Health inequality was assessed using the concentration index. Mortality predictors were identified using XGBoost-SHAP machine learning, and two-sample Mendelian randomization was employed to test causality for the key factors. All analyses were stratified by sex and socio-demographic index (SDI). Results The global age-standardized disability-adjusted life years (DALYs) rate showed an increasing trend in both males (average annual percentage change [AAPC]=+0.34%) and females (AAPC=+0.05%). Decomposition analysis revealed that population growth was the primary driver of the global increase in DALYs, while in high-SDI regions population aging contributed 52.37% to male deaths. Health inequality analysis showed a concentration index of -0.05 for DALYs in 2021, indicating a concentration of burden among low-SDI populations. Machine learning identified smoking (relative importance=100%) and advanced age (70-74 years: 60%) as the most critical predictors of mortality, and the model demonstrated good fit on the test set (R²=0.98). SDI-stratified analysis showed that smoking and aging were the top two predictors across all SDI regions. Mendelian randomization further supported positive causal associations of smoking initiation (OR=1.35, P<0.05) and aging (proxied by the frailty index, OR=2.01, P<0.05) with NAFLD risk. Conclusion The global NAFLD burden is heavy, with significant sex and socioeconomic inequalities. Smoking and advanced age are key risk factors for NAFLD, calling for integrated interventions for tobacco control, geriatric health management, and health equity promotion.

Key words: non-alcoholic fatty liver disease, Global Burden of Disease, machine learning, Mendelian randomization, socio-demographic index
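
The abstract reports a concentration index of -0.05 for DALYs and reads the negative sign as burden concentrated among low-SDI populations. The sketch below illustrates how such an index might be computed. It uses the standard covariance form, C = 2·cov(h, r)/mean(h), with r the population-weighted fractional rank by SDI. The abstract does not state the authors' exact estimator, so the function name `concentration_index`, its arguments, and the toy values are all hypothetical and are not GBD data.

```python
def concentration_index(burden, sdi, population=None):
    """Concentration index of a burden measure ranked by SDI.

    Assumption: standard covariance form with population-weighted
    fractional ranks; not necessarily the estimator used in the paper.
    """
    n = len(burden)
    if population is None:
        population = [1.0] * n  # unweighted case: every unit counts equally

    # Rank units from lowest to highest SDI.
    rows = sorted(zip(sdi, burden, population), key=lambda row: row[0])
    total_pop = sum(p for _, _, p in rows)
    mean_burden = sum(b * p for _, b, p in rows) / total_pop

    cumulative_share = 0.0
    weighted_cov = 0.0
    for _, b, p in rows:
        w = p / total_pop
        # Fractional rank: midpoint of this unit's population share.
        rank = cumulative_share + 0.5 * w
        weighted_cov += w * (b - mean_burden) * (rank - 0.5)
        cumulative_share += w

    return 2.0 * weighted_cov / mean_burden


# Hypothetical toy values: burden falls as SDI rises.
toy_burden = [4, 3, 2, 1]
toy_sdi = [0.2, 0.4, 0.6, 0.8]
print(concentration_index(toy_burden, toy_sdi))  # -0.25
```

A value below zero means the burden falls disproportionately on lower-ranked (low-SDI) units, a value above zero means the opposite, and zero means the burden is spread evenly across the SDI ranking.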