A genomic data fusion framework to exploit rare and common variants for association discovery

Simone Marini, Ivan Limongelli, Ettore Rizzo, Tan Da, Riccardo Bellazzi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Collapsing methods are used in association studies to exploit the effect of rare genetic variants in disease. In this work we model an enriched collapsing approach that includes data on genes, protein domains, pathways and protein-protein interactions. We applied the collapsing technique to a data set of epileptic (85 cases) and healthy (61 controls) subjects. The method retrieved 4 genes, 5 domains, 33 gene interactions and 14 pathways showing a significant association with the disease. Collapsed data were also used as features for prediction models. We found that using protein-protein interactions as model features increases the area under the ROC curve (+1.5%) compared with the purely gene-based approach.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 101-105
Number of pages: 5
Volume: 9105
ISBN (Print): 9783319195506
DOIs: https://doi.org/10.1007/978-3-319-19551-3_12
Publication status: Published - 2015
Event: 15th Conference on Artificial Intelligence in Medicine, AIME 2015 - Pavia, Italy
Duration: Jun 17 2015 to Jun 20 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9105
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 15th Conference on Artificial Intelligence in Medicine, AIME 2015
Country: Italy
City: Pavia
Period: 6/17/15 to 6/20/15

Keywords

  • Associations study
  • Collapsing method
  • Epilepsy
  • Genetic pathway
  • Machine learning
  • Protein domain
  • Protein-protein interaction
  • Rare genetic variants

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Marini, S., Limongelli, I., Rizzo, E., Da, T., & Bellazzi, R. (2015). A genomic data fusion framework to exploit rare and common variants for association discovery. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9105, pp. 101-105). Springer Verlag. https://doi.org/10.1007/978-3-319-19551-3_12

@inproceedings{7e5e24b03a5c4a51860f3b4bcab2c26e,
title = "A genomic data fusion framework to exploit rare and common variants for association discovery",
abstract = "Collapsing methods are used in association studies to exploit the effect of rare genetic variants in disease. In this work we model an enriched collapsing approach that includes data on genes, protein domains, pathways and protein-protein interactions. We applied the collapsing technique to a data set of epileptic (85 cases) and healthy (61 controls) subjects. The method retrieved 4 genes, 5 domains, 33 gene interactions and 14 pathways showing a significant association with the disease. Collapsed data were also used as features for prediction models. We found that using protein-protein interactions as model features increases the area under the ROC curve (+1.5{\%}) compared with the purely gene-based approach.",
keywords = "Associations study, Collapsing method, Epilepsy, Genetic pathway, Machine learning, Protein domain, Protein-protein interaction, Rare genetic variants",
author = "Simone Marini and Ivan Limongelli and Ettore Rizzo and Tan Da and Riccardo Bellazzi",
year = "2015",
doi = "10.1007/978-3-319-19551-3_12",
language = "English",
isbn = "9783319195506",
volume = "9105",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "101--105",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}
