Journal of Southern Medical University ›› 2026, Vol. 46 ›› Issue (3): 693-706. doi: 10.12122/j.issn.1673-4254.2026.03.23

Identification of efferocytosis-related genes in osteoarthritis and prediction of traditional Chinese medicines based on bioinformatics and machine learning

Kelin XIANG1, Xiaoyu ZHANG2, Zhengpeng LI1, Zhiwei XU2, Sujie LIU1, Yuan CHAI2,3

  1. Guangxi University of Chinese Medicine, Nanning 530001, China
  2. Ruikang Hospital Affiliated to Guangxi University of Chinese Medicine, Nanning 530011, China
  3. Nanjing University of Chinese Medicine, Nanjing 210023, China
  • Received: 2025-08-14 Online: 2026-03-20 Published: 2026-03-26
  • Contact: Yuan CHAI E-mail: xiangkelin1024@163.com; chaizxy@163.com

Abstract:

Objective: To screen key efferocytosis-related genes in osteoarthritis (OA) using bioinformatics and machine learning, and to explore their diagnostic value, immune microenvironment characteristics, and potential as therapeutic targets of traditional Chinese medicines (TCMs).

Methods: The OA-related datasets GSE55235, GSE55457, and GSE117999 were obtained from the GEO database, and an efferocytosis-related gene set was retrieved from GeneCards. Differential expression analysis was performed to identify OA-related differentially expressed genes (DEGs) and their intersection with the efferocytosis-related genes, followed by GO and KEGG enrichment analyses. Three machine learning algorithms (Random Forest, LASSO regression, and support vector machine [SVM]) were used to screen feature genes, and their diagnostic efficacy was evaluated using ROC curves. qRT-PCR was used to validate expression of the feature genes in a rat OA model. Immune cell infiltration was analyzed using CIBERSORT, GSEA was used to explore the related pathways, and the Coremine database was used to predict TCMs associated with the feature genes.

Results: A total of 959 OA-related DEGs were identified, including 15 efferocytosis-related genes, which were enriched in leukocyte migration, extracellular matrix, and inflammatory pathways. Machine learning identified 3 feature genes, namely UCP2, EGLN3, and IL1B, which showed good diagnostic performance in both the training set (GSE55235) and the validation sets (GSE55457 and GSE117999) and showed varying expression patterns in the animal model. Immune infiltration analysis showed significant differences in resting mast cells, resting memory CD4⁺ T cells, and activated mast cells between OA patients and healthy controls. The feature genes were closely associated with the adipocytokine signaling pathway, sulfur metabolism, and the spliceosome pathway. A total of 100 TCMs were predicted, primarily herbs for tonifying deficiency, clearing heat, and promoting blood circulation, such as Lycium barbarum, Epimedium brevicornu, Rehmannia glutinosa, Sophora flavescens, Ligusticum chuanxiong, and Achyranthes bidentata.

Conclusion: Efferocytosis-related genes play important roles in OA pathogenesis. UCP2, EGLN3, and IL1B have diagnostic value for OA. The predicted TCMs may serve as potential agents for OA prevention and treatment.

Key words: osteoarthritis, efferocytosis, bioinformatics, machine learning, prediction of traditional Chinese medicines
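
The screening logic summarized in the Methods can be illustrated with a minimal sketch. The abstract does not state how the outputs of the three algorithms were combined, so taking their intersection is an assumption here, as are the helper names `consensus_genes` and `roc_auc`. All gene labels (`G1`, `G2`, ...) and expression values are hypothetical placeholders and are not data from the study. The AUC is computed with the rank-based (Mann-Whitney) formulation rather than with any particular library.

```python
from typing import Dict, Sequence, Set


def consensus_genes(selections: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes chosen by every selection method (assumed rule)."""
    sets = list(selections.values())
    return set.intersection(*sets) if sets else set()


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case and a control count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical placeholder gene sets (not from the study).
degs = {"G1", "G2", "G3", "G4", "G5"}
efferocytosis = {"G2", "G3", "G4", "G5", "G9"}

# Step 1: candidate genes are DEGs that are also efferocytosis-related.
candidates = degs & efferocytosis

# Step 2: hypothetical picks from each algorithm, then their consensus.
selections = {
    "random_forest": {"G2", "G3", "G4"},
    "lasso": {"G2", "G3", "G5"},
    "svm": {"G2", "G3", "G4", "G5"},
}
features = consensus_genes(selections) & candidates

# Step 3: diagnostic value of one feature gene on made-up expression data
# (label 1 = OA sample, label 0 = healthy control).
expression = [2.1, 2.8, 3.0, 1.9, 1.2, 1.5, 2.0, 0.9]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
auc = roc_auc(expression, labels)

print(sorted(features))  # ['G2', 'G3']
print(f"{auc:.4f}")      # 0.9375
```

In practice the same AUC calculation would be repeated for each feature gene on the training set and on each validation set separately, so that performance on the validation sets is not influenced by the data used for selection.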