Journal of Southern Medical University, 2020, Vol. 40, Issue (04): 475-482. doi: 10.12122/j.issn.1673-4254.2020.04.05


Application of conditional inference forest in survival analysis

Liu Yingxin (刘颖欣), Kang Pei (康佩), Xu Jun (许军), An Shengli (安胜利)

  • Publication date: 2020-04-30  Release date: 2020-04-20

Application of conditional inference forest in time-to-event data analysis

  

  • Online: 2020-04-30  Published: 2020-04-20

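Abstract (translated from the Chinese): Objective To explore the application and advantages of the conditional inference forest in survival analysis. Methods A simulation study and a real-data application were used to compare the predictive ability of four methods (the proportional hazards model, the accelerated failure time model, the random survival forest, and the conditional inference forest), evaluated with the Brier score. Results The simulation study showed that the two forest models predicted more accurately and stably than the two regression models. The conditional inference forest outperformed the other three models when the data contained multi-category variables, collinearity, or interactions, and this advantage was more apparent in data with large sample sizes and high censoring rates. In the real-data application, the conditional inference forest gave the best predictions. Conclusion The conditional inference forest can be used for survival analysis and, when multi-category variables, collinearity, or interactions are present, it is more accurate and stable than other common survival analysis methods.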

Abstract: Objective To explore the application and advantages of the conditional inference forest in survival analysis. Methods We used simulation experiments and real data to compare the predictive performance of 4 models, namely the Cox proportional hazards model, the accelerated failure time model, the random survival forest model, and the conditional inference forest model, based on their Brier scores. Results The simulation experiments suggested that both forest models predicted more accurately and robustly than the two regression models. The conditional inference forest model was superior to the other 3 models in analyzing time-to-event data with polytomous covariates, collinearity, or interaction, especially with a large sample size and a high censoring rate. The analysis of real data showed that the conditional inference forest model had the best predictive performance among the 4 models. Conclusion Compared with the commonly used survival analysis methods, the conditional inference forest model is more accurate and stable, especially when the data contain polytomous covariates, collinearity, or interaction.
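
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of a Brier score at a single time horizon: the mean squared difference between each subject's observed survival status at time t and the model's predicted survival probability at t. It assumes every event time is fully observed. The abstract does not say how censoring was handled, and censored data would normally need an adjusted version (for example with inverse-probability-of-censoring weights), which is not shown here. The function name `brier_score_at` and the toy numbers are hypothetical and are not taken from the paper.

```python
def brier_score_at(t, event_times, predicted_survival):
    """Brier score at horizon t, assuming no censoring (illustrative only).

    event_times: observed event time for each subject.
    predicted_survival: model's predicted P(T > t) for each subject.
    Lower is better; predicting 0.5 for everyone always scores 0.25.
    """
    if len(event_times) != len(predicted_survival):
        raise ValueError("inputs must have the same length")
    if not event_times:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for time, s_hat in zip(event_times, predicted_survival):
        # Observed status: 1 if the subject is still event-free at t, else 0.
        still_event_free = 1.0 if time > t else 0.0
        total += (still_event_free - s_hat) ** 2
    return total / len(event_times)


# Hypothetical toy data: four subjects, horizon t = 6.
event_times = [2.0, 5.0, 8.0, 12.0]
informative = [0.1, 0.2, 0.9, 0.9]    # predictions that track the outcomes
uninformative = [0.5, 0.5, 0.5, 0.5]  # coin-flip predictions

score_informative = brier_score_at(6.0, event_times, informative)
score_uninformative = brier_score_at(6.0, event_times, uninformative)
```

In this toy case the informative predictions score lower than the coin-flip predictions, which is the sense in which a lower Brier score indicates better predictive performance.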