Combining pathway identification and survival prediction via screening-network analysis

by Annalisa Occhipinti, PhD Student Computer Science on 14 June 2015


12th Annual Meeting of the Bioinformatics Italian Society, 3-5 June 2015, Milan, Italy.

Presentation: Combining pathway identification and survival prediction via screening-network analysis

Annalisa Occhipinti (1), Antonella Iuliano (2), Claudia Angelini (2), Italia De Feis (2) and Pietro Lio (1)

1 = Computer Laboratory, University of Cambridge, UK.

2 = Istituto per le Applicazioni del Calcolo “Mauro Picone”, CNR-Naples, Italy.

Annalisa Occhipinti is a second-year PhD student in Computer Science. She works on several applications of Computer Science and Mathematics in cancer research. Annalisa presented her work at the 12th meeting of the Bioinformatics Italian Society in Milan.

Gene expression data from high-throughput assays, such as microarrays, are now often used to predict cancer survival. However, available datasets typically contain a small number of samples (n patients) and a large number of gene expression measurements (p predictors). The main challenge is therefore to cope with high dimensionality, i.e. p >> n. A recent and appealing approach is to use screening procedures to reduce the feature space to a moderate scale.
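To make the idea of screening concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the procedure used in the presented work: each gene is ranked by a marginal Cox score statistic evaluated at a zero coefficient, and only the top d genes are kept. The function name `cox_score_screen`, the choice of statistic, and the simulated data are all assumptions made for this example.

```python
import numpy as np

def cox_score_screen(X, time, event, d):
    """Keep the d features with the largest marginal Cox score statistic.

    Assumption: this statistic is one possible screening utility, chosen
    here for illustration. Tied survival times are not handled specially.
    """
    order = np.argsort(-time)              # decreasing time: risk set of row i is rows 0..i
    Xs = X[order]
    es = event[order].astype(bool)
    n = Xs.shape[0]
    counts = np.arange(1, n + 1)[:, None]  # size of each risk set
    mean_risk = np.cumsum(Xs, axis=0) / counts
    var_risk = np.cumsum(Xs ** 2, axis=0) / counts - mean_risk ** 2
    U = (Xs - mean_risk)[es].sum(axis=0)   # score at beta = 0, one value per feature
    I = var_risk[es].sum(axis=0)           # information at beta = 0
    stat = U ** 2 / np.maximum(I, 1e-12)
    kept = np.sort(np.argsort(-stat)[:d])
    return kept, stat

# Simulated toy data with p >> n (hypothetical, not from the study).
rng = np.random.default_rng(0)
n, p = 40, 1000
X = rng.standard_normal((n, p))
time = rng.exponential(1.0 / np.exp(X[:, 0] - X[:, 1]))  # genes 0 and 1 drive the hazard
event = np.ones(n, dtype=bool)                            # no censoring in this toy example

kept, stat = cox_score_screen(X, time, event, d=20)
print("genes kept after screening:", kept)
```

The value d = 20 is arbitrary here; in practice the number of retained features is a tuning choice.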

In addition, genes are often co-regulated and their expression levels are expected to be highly correlated. Genes that are involved in the same biological process are grouped in pathway structures. In order to incorporate the pathway information of genes, network-based methods have been applied.

Motivated by recent models based on variable screening techniques and on the integration of pathway information into penalized Cox methods, Annalisa’s work proposes a new procedure to obtain more accurate predictions. First, the method identifies the high-risk genes using variable screening techniques. Then it performs Cox regression analysis, integrating network information related to the selected high-risk genes. By combining these two approaches, variable screening and network analysis, the new method selects important core pathways and genes that are related to the survival outcome.
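The second step can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes the penalty is a combination of an L1 term and a graph-Laplacian term built from a gene adjacency matrix, and fits the model by proximal gradient descent with a fixed step size. The names `network_cox`, `lam`, `alpha` and `step`, the chain-shaped toy network, and the simulated data are all assumptions. The input matrix is assumed to contain only genes that survived a prior screening step.

```python
import numpy as np

def cox_gradient(beta, Xs, es):
    """Gradient of the negative Cox partial log-likelihood (rows sorted by decreasing time)."""
    eta = Xs @ beta
    w = np.exp(eta - eta.max())                  # the shift cancels in the ratio below
    S0 = np.cumsum(w)                            # risk-set sums of weights
    S1 = np.cumsum(w[:, None] * Xs, axis=0)      # risk-set weighted sums of features
    return -(Xs - S1 / S0[:, None])[es].sum(axis=0) / es.sum()

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def network_cox(X, time, event, A, lam=0.1, alpha=0.5, step=0.05, n_iter=300):
    """Cox regression with an assumed L1 + Laplacian penalty, via proximal gradient."""
    order = np.argsort(-time)
    Xs = X[order]
    es = event[order].astype(bool)
    L = np.diag(A.sum(axis=1)) - A               # graph Laplacian of the gene network
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = cox_gradient(beta, Xs, es) + lam * (1 - alpha) * (L @ beta)
        beta = soft_threshold(beta - step * grad, step * lam * alpha)
    return beta

# Simulated toy data standing in for screened genes (hypothetical).
rng = np.random.default_rng(1)
n, p = 50, 6
X = rng.standard_normal((n, p))
time = rng.exponential(1.0 / np.exp(X[:, 0] + X[:, 1]))  # genes 0 and 1 raise the hazard
event = rng.random(n) < 0.8                               # crude stand-in for censoring
A = np.eye(p, k=1) + np.eye(p, k=-1)                      # chain network: gene i linked to i+1

beta_hat = network_cox(X, time, event, A)
print("estimated coefficients:", np.round(beta_hat, 3))
```

The Laplacian term pulls the coefficients of connected genes towards each other, while the L1 term sets weak coefficients exactly to zero. The fixed step size is a simplification; a line search would be more robust on real data.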

The new approach combines variable screening procedures and network-penalized Cox models for high-dimensional survival data aimed at determining pathway structures and biomarkers involved in cancer progression. By using this approach, it is possible to obtain a deeper understanding of the gene-regulatory networks and investigate the gene signatures related to the cancer survival time in order to know how patient features (molecular and clinical information) can influence cancer treatment and detection.

The model also predicts patient survival using molecular data from different cancer types, such as ovarian and breast cancer. The approach can also be used to investigate the set of active signature genes and the corresponding pathways involved in the disease process.

Overall, this study shows that the new screening-network analysis is useful both for improving the accuracy of survival prediction and for discovering signature genes across independent datasets.