Journal of Southern Medical University (南方医科大学学报) ›› 2025, Vol. 45 ›› Issue (6): 1327-1335. doi: 10.12122/j.issn.1673-4254.2025.06.22


Incomplete multimodal bone tumor image classification based on feature decoupling and fusion

Qinghai ZENG1,2(), Chuanpu LI1,2, Wei YANG1,2, Liwen SONG3, Yinghua ZHAO3, Yi YANG1,2()   

  1. School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
    2. Guangdong Provincial Key Laboratory of Medical Image Processing, Guangzhou 510515, China
    3. Department of Radiology, Third Affiliated Hospital of Southern Medical University, Guangzhou 510630, China
  • Received: 2024-09-20 Online: 2025-06-20 Published: 2025-06-27
  • Contact: Yi YANG E-mail: qinghaizeng982@163.com; yiyang20110130@163.com
  • About the author: Qinghai ZENG, master's degree candidate, E-mail: qinghaizeng982@163.com
  • Supported by:
    National Natural Science Foundation of China (82172020)


Abstract:

Objective To propose a bone tumor classification model based on feature decoupling and fusion that handles missing modalities appropriately and fuses multimodal information to improve classification accuracy. Methods A decoupling-and-completion module was designed to first extract bone tumor image features carrying both local and global information from the available modalities, and then decompose these features into shared features and modality-specific features. The shared features serve as completion representations for the features of missing modalities, reducing the completion bias caused by inter-modality differences. Because such differences can also hinder the fusion of multimodal information, a fusion module based on a cross-attention mechanism was adopted to strengthen the model's ability to learn cross-modal information and to fully fuse the modality-specific features, thereby improving the accuracy of bone tumor classification. Results The model was trained and tested on a bone tumor dataset collected at the Third Affiliated Hospital of Southern Medical University. Across the 7 available modality combinations, the proposed method achieved an average AUC, accuracy, and specificity of 0.766, 0.621, and 0.793 for bone tumor classification, improvements of 2.6%, 3.5%, and 1.7%, respectively, over existing methods for handling missing modalities. Classification was best with all modalities available (AUC of 0.837) and remained strong with MRI alone (AUC of 0.826). Conclusion The proposed method handles missing modalities appropriately, fuses multimodal information effectively, and shows good bone tumor classification performance under a variety of complex missing-modality scenarios.

Key words: bone tumor classification, multimodal imaging, missing modality, feature decoupling, attention fusion

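The decoupling-and-completion step described in the Methods can be illustrated with a small sketch. This is not the paper's implementation: the actual model learns the shared/specific split with neural projections, whereas here the split is a fixed slice and the completion is a simple mean of the shared parts, purely to show the data flow; all names (`decouple`, `complete_missing`, `shared_dim`) are hypothetical.

```python
def decouple(feature, shared_dim):
    """Toy decoupling: split a feature vector into a shared part and a
    modality-specific part. The paper learns this decomposition; a fixed
    slice is used here only for illustration."""
    return feature[:shared_dim], feature[shared_dim:]

def complete_missing(available, shared_dim):
    """Build a completion representation for a missing modality as the
    element-wise mean of the shared parts of the available modalities,
    mirroring the idea that shared features stand in for missing ones."""
    shared_parts = [decouple(f, shared_dim)[0] for f in available.values()]
    n = len(shared_parts)
    return [sum(col) / n for col in zip(*shared_parts)]

# Example: CT and MRI features are available, a third modality is missing.
available = {"ct": [1.0, 2.0, 9.0], "mri": [3.0, 4.0, 7.0]}
completion = complete_missing(available, shared_dim=2)  # -> [2.0, 3.0]
```

Substituting a shared representation rather than zeros or random noise is what reduces the completion bias the abstract mentions, since the shared part is, by construction, the component common to all modalities.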
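The cross-attention fusion module can likewise be sketched. The fragment below is a minimal, single-head, plain-Python illustration of scaled dot-product cross-attention (no learned projections or multi-head structure; the function names are hypothetical and the paper's actual module is not specified here), in which query vectors from one modality attend over key/value vectors from another:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: each query vector (e.g. CT
    features) attends over key/value vectors from another modality
    (e.g. MRI features) and returns the attention-weighted values."""
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(scores)  # attention weights over the other modality
        fused.append([sum(wj * v[i] for wj, v in zip(w, values))
                      for i in range(len(values[0]))])
    return fused
```

In the paper's setting the fused modality-specific features would then feed the classifier; this fragment only demonstrates the attention arithmetic that lets one modality selectively draw information from another.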