Journal of Southern Medical University ›› 2021, Vol. 41 ›› Issue (3): 439-446. doi: 10.12122/j.issn.1673-4254.2021.03.18


Role of a prediction model built from 6 proteins in predicting the prognosis of colorectal cancer: based on the TCPA database

WEN Hexin (温贺新), SHI Weijun (史维俊), GE Sitang (葛思堂), LI Jing (李静), ZUO Lugen (左芦根), LIU Mulin (刘牧林)

  • Publication date: 2021-03-20 Release date: 2021-04-06

Value of prediction models for prognosis prediction of colorectal cancer: an analysis based on TCPA database

  • Online: 2021-03-20 Published: 2021-04-06

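Abstract: Objective To analyze, using bioinformatics methods, the value of combining multiple proteins for predicting the prognosis of colorectal cancer (CRC) and the potential molecular mechanisms involved. Methods CRC protein expression data and clinical data were downloaded from The Cancer Proteome Atlas (TCPA) database. After the data were processed with Perl and R, prognosis-related proteins were screened; multivariate Cox analysis was then used to identify proteins that could serve as independent prognostic risk factors for CRC, and a prediction model was built from them. Survival analysis was performed for each protein in the model and for the model risk score, and risk curves relating the risk score to patient survival status were plotted to verify the prognostic performance of the model. Independent prognostic analysis and ROC analysis were used to assess the value and advantages of the model in prognosis prediction. Interactions between the model proteins and all other CRC-related proteins were analyzed, and the differential expression of the genes related to the key proteins was analyzed at the mRNA level. Results Six proteins were selected by univariate and multivariate Cox analysis to build the prediction model; survival analysis showed that the model had greater prognostic value than any single gene. Both univariate and multivariate independent prognostic analyses indicated that the model risk score was significantly associated with prognosis (P<0.001) and that the model could serve as an independent risk factor for assessing patient prognosis; ROC analysis showed that the model had more stable specificity and sensitivity in prognosis prediction (AUC=0.734). Protein interaction analysis showed that BID, SLC1A5 and SRC_pY527 were clearly correlated with other proteins (P<0.001), and that SLC1A5 and SRC_pY527 had the most significant interactions with other CRC-related proteins (SLC1A5 was significantly correlated with 11 proteins and SRC_pY527 with 12 proteins, P<0.001). Except for INPP4B, the genes related to each protein were differentially expressed at the mRNA level (P<0.05). Conclusion The prediction model built from 6 proteins predicts the prognosis of CRC well. The proteins SLC1A5 and SRC_pY527 play key roles in CRC prognosis; in particular, SRC_pY527 may regulate the occurrence and progression of CRC through the SRC/AKT/MAPK signaling axis and may provide a new target for CRC treatment.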

Key words: colorectal cancer; prediction model; prognosis; TCPA; bioinformatics analysis

Abstract: Objective To assess the value of combining multiple proteins in predicting the prognosis of colorectal cancer (CRC) through bioinformatics analysis. Methods Protein expression and clinical data were downloaded from the TCPA database. Perl and R were used to screen for prognosis-related proteins, and Cox analysis was used to identify the proteins that served as independent prognostic factors of CRC, from which the prediction model was built. Survival analyses were conducted for each protein included in the prediction model and for the risk score of the model, and risk curves were drawn for the risk score and the patients' survival status to verify the performance of the model. Independent prognostic analysis and ROC analysis were used to assess the value and advantages of the model in prognosis prediction. The interactions between the proteins included in the model and other CRC-related proteins, and the differential expression of the key genes related to the proteins, were analyzed. Results Six proteins were screened for model construction. Compared with a single gene, the model showed much greater prognostic value for CRC. Independent prognostic analysis showed that the risk score of the prediction model was significantly associated with prognosis (P<0.001), and the model could be used as an independent risk factor for prognostic assessment of the patients. ROC analysis showed that the model had good specificity and sensitivity for prognostic prediction (AUC=0.734). Protein interaction analysis showed that BID, SLC1A5 and SRC_pY527 were significantly correlated with other proteins (P<0.001), and SLC1A5 and SRC_pY527 had the most significant interactions with other proteins (P<0.001). Except for those of INPP4B, the key genes related to the proteins in the prediction model showed significant differential expression at the mRNA level in CRC (P<0.05). Conclusion The prediction model constructed from 6 proteins has good prognostic value for CRC. The proteins SLC1A5 and SRC_pY527 play key roles in the prognosis of CRC, and SRC_pY527 may regulate the occurrence and progression of CRC through the SRC/AKT/MAPK signaling axis and thus may serve as a new therapeutic target for CRC.

Key words: colorectal cancer; predictive model; prognosis; TCPA; bioinformatics analysis
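
The risk score and ROC steps described in the abstract can be illustrated with a minimal sketch. It assumes the usual form of a Cox-based risk score (a sum of coefficient times expression over the model proteins) and a median split into high-risk and low-risk groups; the abstract does not state the coefficients or the cut-off, so the protein names, coefficients and patient values below are hypothetical toy values. The AUC here is the simple binary-outcome rank statistic, not necessarily the ROC method the authors used.

```python
from statistics import median

# Hypothetical Cox coefficients for three placeholder proteins.
# The abstract does not report the fitted values for the six model proteins.
COEFFICIENTS = {"protein_a": 0.8, "protein_b": -0.5, "protein_c": 0.3}


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over model proteins."""
    return sum(coef * expression[name] for name, coef in coefficients.items())


def split_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


def auc(scores, events):
    """Probability that a patient with the event outscores one without it.

    Ties count as half a win. This is a binary-outcome AUC and ignores
    follow-up time and censoring.
    """
    positives = [s for s, e in zip(scores, events) if e == 1]
    negatives = [s for s, e in zip(scores, events) if e == 0]
    if not positives or not negatives:
        raise ValueError("need at least one patient in each outcome class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy patients (hypothetical expression values, not TCPA data).
patients = [
    {"protein_a": 1.0, "protein_b": 0.0, "protein_c": 0.0},
    {"protein_a": 0.0, "protein_b": 1.0, "protein_c": 0.0},
    {"protein_a": 0.5, "protein_b": 0.5, "protein_c": 1.0},
    {"protein_a": 0.0, "protein_b": 0.0, "protein_c": 1.0},
]
events = [1, 0, 0, 1]  # 1 = died during follow-up, 0 = alive (toy labels)

scores = [risk_score(p, COEFFICIENTS) for p in patients]
groups = split_by_median(scores)
model_auc = auc(scores, events)
```

In practice the coefficients would come from a fitted multivariate Cox model and the two risk groups would be compared with a survival analysis; this sketch only shows how a single score per patient is derived and evaluated.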