SPIRE, a modular pipeline for eQTL analysis of RNA-Seq data, reveals a regulatory hotspot controlling miRNA expression in C. elegans

Ivan Kel, Zisong Chang, Nadia Galluccio, Margherita Romeo, Stefano Beretta, Luisa Diomede, Alessandra Mezzelani, Luciano Milanesi, Christoph Dieterich, Ivan Merelli

Research output: Contribution to journal, Article

3 Citations (Scopus)

Abstract

Interpreting genome-wide association studies is difficult because it is hard to understand how polymorphisms affect gene regulation, in particular for trans-regulatory elements located far from the genes they regulate. Using RNA or protein expression data as phenotypes, it is possible to correlate their variation with specific genotypes. This technique is usually referred to as expression Quantitative Trait Loci (eQTL) analysis, and only a few packages exist for integrating genotype patterns and expression profiles. In particular, tools are needed for the analysis of next-generation sequencing (NGS) data on a genome-wide scale, which is essential for identifying eQTLs that control a large number of genes (hotspots). Here we present SPIRE (Software for Polymorphism Identification Regulating Expression), a generic, modular and highly flexible pipeline for eQTL processing. SPIRE integrates different univariate and multivariate approaches for eQTL analysis, paying particular attention to the scalability of the procedure in order to support cis- as well as trans-mapping, thus allowing the identification of hotspots in NGS data. We demonstrated that SPIRE can handle large association study datasets, reproducing published results and improving the identification of trans-eQTLs. Furthermore, we employed the pipeline to analyse novel data on the genotypes of two different C. elegans strains (N2 and Hawaii) and related miRNA expression data obtained using RNA-Seq. A miRNA regulatory hotspot was identified on chromosome 1, overlapping the transcription factor grh-1, known to be involved in the early phases of embryonic development of C. elegans. In a follow-up qPCR experiment we were able to verify most of the predicted eQTLs and to show, for a novel miRNA, a significant difference between the sequences of the two analysed strains.

SPIRE is publicly available as open source software at https://bitbucket.org/bereste/spire, together with some example data, a readme file, supplementary material and a short tutorial.
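
As an illustration of the general idea behind univariate eQTL mapping and hotspot detection described in the abstract, the following is a minimal sketch in plain Python. It is not SPIRE's code or interface: the function names, the toy data, the use of Pearson correlation as the association measure, and the thresholds are all assumptions made for this example. A real analysis would typically use significance tests with multiple-testing correction rather than a fixed correlation cut-off.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no association signal
    return cov / sqrt(var_x * var_y)


def map_eqtls(genotypes, expression, r_threshold=0.9):
    """Return (marker, gene, r) for every pair with |r| >= r_threshold.

    genotypes:  marker name -> allele code per sample (0 or 1)
    expression: gene name   -> expression level per sample
    The threshold is an arbitrary value chosen for this toy example.
    """
    hits = []
    for marker, alleles in genotypes.items():
        for gene, levels in expression.items():
            r = pearson_r(alleles, levels)
            if abs(r) >= r_threshold:
                hits.append((marker, gene, r))
    return hits


def hotspots(hits, min_genes=2):
    """Markers associated with at least min_genes genes."""
    counts = {}
    for marker, _gene, _r in hits:
        counts[marker] = counts.get(marker, 0) + 1
    return {m: c for m, c in counts.items() if c >= min_genes}


# Invented toy data: six samples, allele 0 or 1 at two hypothetical markers,
# and expression levels for three hypothetical miRNAs.
genotypes = {
    "marker_A": [0, 0, 0, 1, 1, 1],
    "marker_B": [0, 1, 0, 1, 0, 1],
}
expression = {
    "mir_x": [1.0, 1.1, 0.9, 3.0, 3.2, 2.9],
    "mir_y": [5.0, 5.2, 4.9, 2.0, 2.1, 1.9],
    "mir_z": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],
}

hits = map_eqtls(genotypes, expression)
print(hotspots(hits))
```

In this toy data set, marker_A is flagged as a hotspot because the expression of two of the three miRNAs tracks its genotype, while marker_B is associated with none of them.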

Original language: English
Pages (from-to): 3447-3458
Number of pages: 12
Journal: Molecular BioSystems
Volume: 12
Issue number: 11
DOI: https://doi.org/10.1039/c6mb00453a
Publication status: Published - 2016

Fingerprint

Quantitative Trait Loci
MicroRNAs
Software
RNA
Genotype
Genes
Chromosomes, Human, Pair 1
Genome-Wide Association Study
Embryonic Development
Transcription Factors
Genome
Phenotype
Proteins

ASJC Scopus subject areas

  • Biotechnology
  • Molecular Biology

Cite this

Kel, I, Chang, Z, Galluccio, N, Romeo, M, Beretta, S, Diomede, L, Mezzelani, A, Milanesi, L, Dieterich, C & Merelli, I 2016, 'SPIRE, a modular pipeline for eQTL analysis of RNA-Seq data, reveals a regulatory hotspot controlling miRNA expression in C. elegans', Molecular BioSystems, vol. 12, no. 11, pp. 3447-3458. https://doi.org/10.1039/c6mb00453a
@article{8ea2cbb0c938426f9f106c78aad55b2d,
title = "SPIRE, a modular pipeline for eQTL analysis of RNA-Seq data, reveals a regulatory hotspot controlling miRNA expression in C. elegans",
author = "Ivan Kel and Zisong Chang and Nadia Galluccio and Margherita Romeo and Stefano Beretta and Luisa Diomede and Alessandra Mezzelani and Luciano Milanesi and Christoph Dieterich and Ivan Merelli",
year = "2016",
doi = "10.1039/c6mb00453a",
language = "English",
volume = "12",
pages = "3447--3458",
journal = "Molecular BioSystems",
issn = "1742-206X",
publisher = "Royal Society of Chemistry",
number = "11",

}
