Journal of Southern Medical University (南方医科大学学报) ›› 2022, Vol. 42 ›› Issue (7): 1075-1081. doi: 10.12122/j.issn.1673-4254.2022.07.17

A protein complex recognition method based on spatial-temporal graph convolution neural network

SHENG Jiangming (盛江明), XUE Juan (薛娟), LI Peng (李鹏), YI Na (伊娜)

  1. Clinical Nursing Teaching and Research Office, The Second Xiangya Hospital of Central South University, Changsha 410011, China; Department of Ultrasound Diagnosis, The Second Xiangya Hospital of Central South University, Changsha 410011, China; Operation Center, The Third Xiangya Hospital of Central South University, Changsha 410013, China; School of Informatics, Hunan University of Chinese Medicine, Changsha 410208, China
  • Online: 2022-07-20 Published: 2022-07-15

Abstract: Objective To propose a new method for mining protein complexes in dynamic protein networks using a spatial-temporal graph convolutional neural network. Methods Edge strength, node strength and edge existence probability were first defined to model the dynamic protein network. Combining the time series information and the structural information on the graph, two convolution operators were then designed using the Hilbert-Huang transform, an attention mechanism and residual connections to learn feature representations of the proteins in the network, from which a feature map of the dynamic protein network was constructed. Finally, spectral clustering was used to identify the protein complexes. Results Simulation experiments on several public biological datasets showed that the F value of the proposed algorithm exceeded 90% on both the DIP and MIPS datasets. Compared with 4 other recognition algorithms (DPCMNE, GE-CFI, VGAE and NOCD), the proposed algorithm improved recognition efficiency by approximately 34.5%, 28.7%, 25.4% and 17.6% on average, respectively. Conclusion Applying deep learning technology to dynamic protein networks performs well and improves the efficiency of their analysis, and the approach is broadly applicable.

Key words: dynamic protein network; protein complex; graph convolution neural network; convolution operator; spectral clustering
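
The abstract reports results as an F value but does not define it. As an illustration only, the sketch below uses a definition that is common in the protein complex detection literature: a predicted complex and a reference complex are considered matched when their neighborhood affinity |P ∩ R|² / (|P| · |R|) reaches a threshold, and F is the harmonic mean of the resulting precision and recall. The threshold of 0.2, the function names and the toy protein identifiers are assumptions made for this example and are not taken from the paper.

```python
from typing import Iterable, List, Set


def neighborhood_affinity(a: Set[str], b: Set[str]) -> float:
    """Overlap score |a ∩ b|^2 / (|a| * |b|) between two protein sets."""
    if not a or not b:
        return 0.0
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def f_measure(
    predicted: Iterable[Iterable[str]],
    reference: Iterable[Iterable[str]],
    threshold: float = 0.2,  # assumed matching threshold, not from the paper
) -> float:
    """Harmonic mean of precision and recall over matched complexes."""
    pred: List[Set[str]] = [set(c) for c in predicted]
    ref: List[Set[str]] = [set(c) for c in reference]
    if not pred or not ref:
        return 0.0

    # A predicted complex counts as correct if it matches any reference complex.
    matched_pred = sum(
        any(neighborhood_affinity(p, r) >= threshold for r in ref) for p in pred
    )
    # A reference complex counts as recovered if any predicted complex matches it.
    matched_ref = sum(
        any(neighborhood_affinity(p, r) >= threshold for p in pred) for r in ref
    )

    precision = matched_pred / len(pred)
    recall = matched_ref / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data with placeholder protein identifiers (hypothetical).
predicted = [{"A", "B", "C"}, {"D", "E"}, {"X", "Y"}]
reference = [{"A", "B", "C", "F"}, {"D", "E"}, {"G", "H", "I"}]

print(f"F = {f_measure(predicted, reference):.3f}")
```

In this toy case two of the three predicted complexes match a reference complex and two of the three reference complexes are recovered, so precision and recall are both 2/3. A stricter threshold would require larger overlaps before a predicted complex counts as a match.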