A Toolbox for Functional Analysis and the Systematic Identification of Diagnostic and Prognostic Gene Expression Signatures Combining Meta-Analysis and Machine Learning

Johannes Vey, Lorenz A. Kapsner, Maximilian Fuchs, Philipp Unberath, Giulia Veronesi, Meik Kunz

Research output: Contribution to journalArticle


The identification of biomarker signatures is important for cancer diagnosis and prognosis. However, the detection of clinical reliable signatures is influenced by limited data availability, which may restrict statistical power. Moreover, methods for integration of large sample cohorts and signature identification are limited. We present a step-by-step computational protocol for functional gene expression analysis and the identification of diagnostic and prognostic signatures by combining meta-analysis with machine learning and survival analysis. The novelty of the toolbox lies in its all-in-one functionality, generic design, and modularity. It is exemplified for lung cancer, including a comprehensive evaluation using different validation strategies. However, the protocol is not restricted to specific disease types and can therefore be used by a broad community. The accompanying R package vignette runs in ~1 h and describes the workflow in detail for use by researchers with limited bioinformatics training.

Original languageEnglish
Issue number10
Publication statusPublished - Oct 21 2019


Cite this