Roar: Detecting alternative polyadenylation with standard mRNA sequencing libraries

Elena Grassi, Elisa Mariella, Antonio Lembo, Ivan Molineris, Paolo Provero

Research output: Contribution to journal (Article)

Abstract

Background: Post-transcriptional regulation is a complex mechanism that plays a central role in defining multiple cellular identities starting from a common genome. Changes in the length of 3' UTRs play an important role in this context: alternative 3' UTRs can lead to differences in, for example, regulation by microRNAs and cellular localization of the transcripts, thus altering their fate.

Results: We propose a strategy to identify genes undergoing regulation of 3' UTR length using RNA sequencing data obtained from standard libraries; the strategy is therefore widely applicable to data originally generated for classical differential expression analyses. We exploit previously annotated alternative polyadenylation (APA) sites from public databases, in contrast with other recently proposed approaches in which the location of the APA site is inferred from the data together with the relative abundance of the isoforms. We demonstrate the reliability of our method by comparing its results with those of methods based on microarrays or on specialized RNA-seq libraries, and we show that using APA site databases yields higher sensitivity than de novo site prediction.

Conclusions: We implemented the algorithm in a Bioconductor package to facilitate its broad use in the scientific community. Because the approach can detect shortening from libraries with a number of reads comparable to that needed for differential expression analyses, it is useful for investigating whether alternative polyadenylation is relevant in a given biological process without requiring specific experimental assays.
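
As a rough illustration of the idea in the Results paragraph (comparing isoform abundance around a previously annotated APA site between two conditions), the sketch below contrasts read counts upstream and downstream of the site. It is not the Roar package and not the authors' algorithm: the function names, the example counts, and the choice of a Fisher exact test on raw counts are all assumptions made for illustration, and the sketch ignores region lengths and replicates, which a real analysis would need to handle.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def utr_shift(pre_a, post_a, pre_b, post_b):
    """Compare 3' UTR usage between hypothetical conditions A and B.

    pre_*  : reads upstream of the annotated APA site (shared by both isoforms)
    post_* : reads downstream of the site (long isoform only)
    Returns (odds_ratio, p_value); an odds ratio above 1 means relatively
    fewer long-isoform reads in A, i.e. apparent shortening in A versus B.
    """
    denominator = post_a * pre_b
    odds_ratio = (pre_a * post_b) / denominator if denominator else float("inf")
    return odds_ratio, fisher_exact_two_sided(pre_a, post_a, pre_b, post_b)


# Hypothetical counts for one gene: A has few reads past the APA site.
ratio, p_value = utr_shift(pre_a=90, post_a=10, pre_b=50, post_b=50)
print(f"odds ratio = {ratio:.1f}, p = {p_value:.2e}")
```

Relying on annotated sites, as the abstract describes, is what keeps such a comparison simple: the boundary between the two regions is known in advance, so only counts need to be compared.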
Original language: English
Article number: 423
Journal: BMC Bioinformatics
Volume: 17
Issue number: 1
DOI: https://doi.org/10.1186/s12859-016-1254-8
Publication status: Published - Oct 18 2016

Fingerprint

Polyadenylation
3' Untranslated Regions
Messenger RNA
Sequencing
Libraries
Differential Expression
RNA
Alternatives
Databases
RNA Sequence Analysis
Biological Phenomena
Transcriptional Regulation
MicroRNA
Gene Regulation
Microarrays
MicroRNAs
Gene expression
Microarray
Assays
Protein Isoforms

Keywords

  • 3' UTR
  • Bioconductor
  • Polyadenylation
  • RNA-sequencing
  • Software

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Grassi, E., Mariella, E., Lembo, A., Molineris, I., & Provero, P. (2016). Roar: Detecting alternative polyadenylation with standard mRNA sequencing libraries. BMC Bioinformatics, 17(1), Article 423. https://doi.org/10.1186/s12859-016-1254-8

@article{78118e8c631f430bb800cfb1a47426f1,
title = "Roar: Detecting alternative polyadenylation with standard mRNA sequencing libraries",
keywords = "3' UTR, Bioconductor, Polyadenylation, RNA-sequencing, Software",
author = "Elena Grassi and Elisa Mariella and Antonio Lembo and Ivan Molineris and Paolo Provero",
year = "2016",
month = "10",
day = "18",
doi = "10.1186/s12859-016-1254-8",
language = "English",
volume = "17",
pages = "--",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central Ltd.",
number = "1",

}

TY - JOUR

T1 - Roar: Detecting alternative polyadenylation with standard mRNA sequencing libraries

AU - Grassi, Elena

AU - Mariella, Elisa

AU - Lembo, Antonio

AU - Molineris, Ivan

AU - Provero, Paolo

PY - 2016/10/18

Y1 - 2016/10/18

AB - Background: Post-transcriptional regulation is a complex mechanism that plays a central role in defining multiple cellular identities starting from a common genome. Modifications in the length of 3'UTRs have been found to play an important role in this context, since alternative 3' UTRs could lead to differences for example in regulation by microRNAs and cellular localization of the transcripts thus altering their fate. Results: We propose a strategy to identify the genes undergoing regulation of 3' UTR length using RNA sequencing data obtained from standard libraries, thus widely applicable to data originally obtained to perform classical differential expression analyses. We decided to exploit previously annotated APA sites from public databases, in contrast with other approaches recently proposed in which the location of the APA site is inferred from the data together with the relative abundance of the isoforms. We demonstrate the reliability of our method by comparing it to the results of other microarray based or specific RNA-seq libraries methods and show that using APA sites databases results in higher sensitivity compared to de novo site prediction approach. Conclusions: We implemented the algorithm in a Bioconductor package to facilitate its broad usage in the scientific community. The ability of this approach to detect shortening from libraries with a number of reads comparable to that needed for differential expression analyses makes it useful for investigating if alternative polyadenylation is relevant in a certain biological process without requiring specific experimental assays.

KW - 3' UTR

KW - Bioconductor

KW - Polyadenylation

KW - RNA-sequencing

KW - Software

UR - http://www.scopus.com/inward/record.url?scp=84992144136&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84992144136&partnerID=8YFLogxK

U2 - 10.1186/s12859-016-1254-8

DO - 10.1186/s12859-016-1254-8

M3 - Article

VL - 17

SP - -

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - 1

M1 - 423

ER -