Journal of Southern Medical University ›› 2026, Vol. 46 ›› Issue (4): 939-945. doi: 10.12122/j.issn.1673-4254.2026.04.22


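A Transformer-based model using laboratory indicators efficiently differentiates ovarian cancer

Shunqian TAN1, Li ZHUO1, Min ZENG1, Fangjun HUANG2, Jun ZHU1, Guangyao CAI3, Xin ZHEN1

  1. School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
  2. Department of Radiation Oncology, Guangdong Provincial People's Hospital, Guangzhou 519041, China
  3. Department of Gynecology, Sun Yat-sen University Cancer Center, Guangzhou 510080, China
  • Received: 2025-09-03 Online: 2026-04-20 Published: 2026-04-24
  • Corresponding author: Xin ZHEN. E-mail: tann66643@gmail.com; xinzhen@smu.edu.cn
  • About the author: Shunqian TAN, master's student. E-mail: tann66643@gmail.com
  • Supported by:
    National Natural Science Foundation of China (82572381, 82404078); Natural Science Foundation of Guangdong Province (2024A1515012100); Guangdong Basic and Applied Basic Research Foundation, Regional Joint Fund, Youth Fund Project (2023A1515110701)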



Abstract:

Objective To evaluate the diagnostic performance of a Transformer-based deep learning model that uses real-world laboratory test indicators for the differential diagnosis of ovarian cancer. Methods Clinical data and 99 laboratory test indicators were retrospectively collected from patients with ovarian cancer or benign ovarian lesions admitted to the Department of Obstetrics and Gynecology of Tongji Hospital between January 1, 2012 and April 4, 2021. An ANOVA F-test-based feature selection algorithm was applied to the training set to identify 20 key features. Each case was then converted into a unified embedding vector using a tabular data Transformer, and an improved stacked Transformer was used to encode these vectors and train the model. The proposed model was compared with multiple traditional machine learning methods using the area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE) as evaluation metrics. Five-fold cross-validation was performed to assess the generalization ability and robustness of the model. Results In five-fold cross-validation, the Transformer-based deep learning model achieved the best performance in predicting ovarian cancer, with an AUC of 0.931, an accuracy of 0.813, a sensitivity of 0.833, and a specificity of 0.865. Conclusion The proposed Transformer-based model demonstrates high accuracy and generalization capability in predicting ovarian cancer and may thus assist the clinical diagnosis of ovarian tumors.

Key words: ovarian cancer, laboratory test indicators, Transformer, embedded vector, deep learning
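
The feature selection and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`anova_f`, `select_top_k`, `binary_metrics`, `auc`) and the toy data are hypothetical, and only the Python standard library is used. The sketch ranks features by the one-way ANOVA F statistic computed on training data only, then computes ACC, SEN, SPE, and a rank-based AUC for a set of predictions.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F statistic of a single feature across label groups."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_k(X_train, y_train, k):
    """Indices of the k features with the largest F statistic.

    Scores are computed from the training rows only, so that held-out
    folds do not influence which features are kept.
    """
    n_features = len(X_train[0])
    scores = [anova_f([row[j] for row in X_train], y_train)
              for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:k]


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for 0/1 labels (1 = cancer)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "ACC": (tp + tn) / len(y_true),
        "SEN": tp / (tp + fn),
        "SPE": tn / (tn + fp),
    }


def auc(y_true, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ordered correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy training data (hypothetical): 6 patients, 3 laboratory indicators.
X_train = [
    [1.0, 5.0, 0.3],
    [1.2, 4.0, 0.1],
    [0.9, 6.0, 0.2],
    [3.0, 5.5, 0.4],
    [3.1, 4.5, 0.3],
    [2.9, 5.0, 0.5],
]
y_train = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    print(select_top_k(X_train, y_train, k=2))
    print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1]))
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

In a five-fold setting, `select_top_k` would be called once per fold on that fold's training portion (with `k=20` for the 99 indicators described above), and the metrics would be averaged over the five held-out portions.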