Journal of Southern Medical University ›› 2026, Vol. 46 ›› Issue (1): 141-149. doi: 10.12122/j.issn.1673-4254.2026.01.15

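Undersampling-based radiomics machine learning models for preoperative prediction of high-intensity focused ultrasound ablation efficacy in uterine fibroids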

Yunneng CUI1,2, Minqing FENG3,4,5, Liangfeng YAO2, Jiewen YAN2, Wenhan LI6, Yanping HUANG6

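  1. Department of Radiology, First Affiliated Hospital of Jinan University, Guangzhou 510630, Guangdong, China
    2. Department of Radiology, Foshan Women and Children Hospital, Foshan 528000, Guangdong, China
    3. Department of Gynecology, First Affiliated Hospital of Jinan University, Guangzhou 510630, Guangdong, China
    4. Department of Gynecology, Guangzhou First People's Hospital, Guangzhou 510180, Guangdong, China
    5. Department of Gynecology, Foshan Women and Children Hospital, Foshan 528000, Guangdong, China
    6. School of Physics and Optoelectronic Engineering, Foshan University, Foshan 528231, Guangdong, China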
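  • Received: 2025-06-20 Online: 2026-01-20 Published: 2026-01-16
  • Corresponding author: Yanping HUANG E-mail: letitb@163.com; yale.huangyp@fosu.edu.cn
  • About the first author: Yunneng CUI, doctoral candidate, E-mail: letitb@163.com
  • Funding:
    Guangdong Medical Science Research Foundation (B2019161); Foshan Engineering Technology Research Center for Precision Diagnosis in Medical Imaging (FS0AA-KJ819-4901-0049)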

Enhancement of radiomics-based machine learning models for predicting efficacy of high-intensity focused ultrasound ablation of uterine fibroids using undersampling methods

Yunneng CUI1,2, Minqing FENG3,4,5, Liangfeng YAO2, Jiewen YAN2, Wenhan LI6, Yanping HUANG6

  1. Department of Radiology, First Affiliated Hospital of Jinan University, Guangzhou 510630, China
    2. Department of Radiology, Foshan Women and Children Hospital, Foshan 528000, China
    3. Department of Gynecology, First Affiliated Hospital of Jinan University, Guangzhou 510630, China
    4. Department of Gynecology, Guangzhou First People's Hospital, Guangzhou 510180, China
    5. Department of Gynecology, Foshan Women and Children Hospital, Foshan 528000, China
    6. School of Physics and Optoelectronic Engineering, Foshan University, Foshan 528231, China
  • Received: 2025-06-20 Online: 2026-01-20 Published: 2026-01-16
  • Contact: Yanping HUANG E-mail: letitb@163.com; yale.huangyp@fosu.edu.cn

Abstract (translated from the Chinese):

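Objective To explore how different undersampling methods can address class imbalance in small-sample data, so as to improve the accuracy of machine learning models for preoperative prediction of high-intensity focused ultrasound (HIFU) ablation efficacy for uterine fibroids. Methods Clinical and imaging data were collected from 140 patients with uterine fibroids treated with HIFU at Foshan Women and Children Hospital, comprising 104 in the high ablation rate group and 36 in the low ablation rate group. Radiomic features were extracted from the patients' MRI T2-weighted images (T2WI) to build machine learning models predicting HIFU treatment outcome. Seven undersampling methods, namely random undersampling (RUS), repeated edited nearest neighbors (RENN), All k-nearest neighbors (AllKNN), NearMiss-3 (NM), condensed nearest neighbor (CNN), neighborhood cleaning rule (NCR), and instance hardness threshold (IHT), were combined with 4 machine learning models, namely k-nearest neighbors (KNN), random forest (RF), support vector machine (SVM), and multilayer perceptron (MLP), to construct a total of 28 prediction models for handling class-imbalanced data. Model performance was evaluated by 5-fold cross-validation using the area under the receiver operating characteristic curve (AUC), accuracy, recall, and specificity. Results Among the combinations of undersampling methods and machine learning models, the 4 best combinations had AUCs of 0.772 (95% CI: 0.566-0.942) for CNN-RF, 0.797 (95% CI: 0.600-0.950) for NM-SVM, and 0.822 for both CNN-KNN and NM-MLP (95% CI: 0.635-0.964 and 0.632-0.960, respectively). The AUC of every machine learning model increased significantly after undersampling, with the MLP model improving the most; recall also increased significantly, by 0.389 for CNN-RF, 0.836 for NM-SVM, 0.532 for CNN-KNN, and 0.372 for NM-MLP. Conclusion Undersampling methods can effectively address class imbalance in small samples and offer a new approach to building machine learning models that predict HIFU ablation efficacy for uterine fibroids.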

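Key words: uterine fibroid, magnetic resonance imaging, high-intensity focused ultrasound, machine learning, prediction model, class imbalance, radiomics, undersampling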
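To illustrate why the abstract reports recall and specificity alongside accuracy, the following minimal sketch works through the arithmetic for the reported class counts (104 high ablation, 36 low ablation). It assumes, purely for illustration, that the minority low-ablation group is the "positive" class; the abstract does not state which class was treated as positive. The function name `binary_metrics` is hypothetical.

```python
# Class counts reported in the abstract.
N_HIGH = 104  # high-ablation-rate group (majority)
N_LOW = 36    # low-ablation-rate group (minority)


def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, recall (sensitivity), and specificity from confusion counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, recall, specificity


# Assumption: the minority (low-ablation) class is the "positive" class.
# A trivial model that predicts "high ablation" for every patient never
# flags a positive case: tp = 0, fn = 36, tn = 104, fp = 0.
acc, rec, spec = binary_metrics(tp=0, fn=N_LOW, tn=N_HIGH, fp=0)
print(f"accuracy={acc:.3f} recall={rec:.3f} specificity={spec:.3f}")
# prints: accuracy=0.743 recall=0.000 specificity=1.000
```

Under this assumption, a model that ignores the minority class entirely still reaches about 74% accuracy, which is why gains in recall are the more informative signal on data this imbalanced.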

Abstract:

Objective To improve the accuracy of machine learning models for preoperative prediction of high-intensity focused ultrasound (HIFU) ablation efficacy for uterine fibroids by correcting class imbalance in small sample datasets using undersampling methods.
Methods Clinical and imaging data were collected from 140 patients with uterine fibroids undergoing HIFU treatment at Foshan Women and Children Hospital, including 104 with high ablation rates and 36 with low ablation rates. Radiomic features were extracted from MRI T2-weighted images (T2WI) of the patients, and machine learning models were constructed to predict HIFU treatment outcomes. Four machine learning algorithms, namely k-Nearest Neighbors (KNN), Random Forest (RF), Support Vector Machine (SVM), and Multilayer Perceptron (MLP), were coupled with 7 undersampling methods, namely Random Undersampling (RUS), Repeated Edited Nearest Neighbors (RENN), All k-Nearest Neighbors (AllKNN), NearMiss-3 (NM), Condensed Nearest Neighbor (CNN), Neighborhood Cleaning Rule (NCR), and Instance Hardness Threshold (IHT), for handling class imbalance in the datasets. The resulting 28 prediction models were evaluated using 5-fold cross-validation in terms of area under the receiver operating characteristic curve (AUC), accuracy, recall, and specificity.
Results The best combinations of undersampling methods and machine learning models, CNN-RF, NM-SVM, CNN-KNN, and NM-MLP, had AUCs of 0.772 (95% CI: 0.566-0.942), 0.797 (95% CI: 0.600-0.950), 0.822 (95% CI: 0.635-0.964), and 0.822 (95% CI: 0.632-0.960), respectively. The AUCs of the machine learning models significantly increased after coupling with undersampling methods, with the MLP model showing the most pronounced improvement. The recall of the 4 combined models also increased significantly (by 0.389 for CNN-RF, 0.836 for NM-SVM, 0.532 for CNN-KNN, and 0.372 for NM-MLP).
Conclusion The use of undersampling methods can effectively correct class imbalance in small sample datasets to improve the accuracy of machine learning models for predicting the efficacy of HIFU ablation for uterine fibroids.

Key words: uterine fibroid, magnetic resonance imaging, high-intensity focused ultrasound, machine learning, prediction, class imbalance, radiomics, undersampling
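As a rough illustration of the simplest of the seven methods, random undersampling (RUS), combined with 5-fold cross-validation, the sketch below uses only the Python standard library. It is not the authors' pipeline: the label encoding, the seed, the helper names `stratified_folds` and `random_undersample`, and the choice to undersample only the training portion of each fold are all assumptions made for illustration. The abstract does not say at which stage undersampling was applied. Libraries such as imbalanced-learn provide ready-made samplers; none is used here so that the example stays self-contained.

```python
import random


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds that roughly preserve the class ratio."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def random_undersample(indices, labels, rng):
    """Randomly drop samples from larger classes until all classes match the smallest."""
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    n_min = min(len(v) for v in by_class.values())
    kept = []
    for cls in sorted(by_class):
        kept.extend(rng.sample(by_class[cls], n_min))
    return sorted(kept)


rng = random.Random(0)  # arbitrary fixed seed, for reproducibility only
labels = [0] * 104 + [1] * 36  # assumed encoding: 0 = high ablation, 1 = low ablation
folds = stratified_folds(labels, k=5, rng=rng)

fold_summaries = []
for f, test_idx in enumerate(folds):
    # Training data = every fold except the held-out one.
    train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
    # Assumption: balance the training data only; the test fold keeps its
    # original class ratio so that metrics reflect the real distribution.
    balanced_train = random_undersample(train_idx, labels, rng)
    n_train_minority = sum(labels[i] for i in balanced_train)
    n_test_minority = sum(labels[i] for i in test_idx)
    fold_summaries.append(
        (len(train_idx), len(balanced_train), n_train_minority,
         len(test_idx), n_test_minority)
    )
    print(f"fold {f}: train {len(train_idx)} -> {len(balanced_train)} "
          f"({n_train_minority} per class), test {len(test_idx)} "
          f"({n_test_minority} minority)")
```

A classifier would then be fitted on `balanced_train` and scored on the untouched `test_idx`. Resampling the test fold as well would make recall and specificity look better than they would on patients drawn from the original 104:36 distribution.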