A transcriptional sketch of a primary human breast cancer by 454 deep sequencing

Alessandro Guffanti, Michele Iacono, Paride Pelucchi, Namshin Kim, Giulia Soldà, Larry J. Croft, Ryan J. Taft, Ermanno Rizzi, Marjan Askarian-Amiri, Raoul J. Bonnal, Maurizio Callari, Flavio Mignone, Graziano Pesole, Giovanni Bertalot, Luigi Bernardi, Alberto Albertini, Christopher Lee, John S. Mattick, Ileana Zucchi, Gianluca De Bellis

Research output: Contribution to journal, Article

Abstract

Background: The cancer transcriptome is difficult to explore due to the heterogeneity of quantitative and qualitative changes in gene expression linked to the disease status. An increasing number of "unconventional" transcripts, such as novel isoforms, non-coding RNAs, somatic gene fusions and deletions, have been associated with the tumoral state. Massively parallel sequencing techniques provide a framework for exploring the transcriptional complexity inherent to cancer with limited laboratory and financial effort. We developed a deep sequencing and bioinformatics analysis protocol to investigate the molecular composition of a breast cancer poly(A)+ transcriptome. This method uses a cDNA library normalization step to diminish the representation of highly expressed transcripts, and biology-oriented bioinformatic analyses to facilitate detection of rare and novel transcripts.

Results: We analyzed over 132,000 Roche 454 high-confidence deep sequencing reads from a primary human lobular breast cancer tissue specimen, and detected a range of unusual transcriptional events that were subsequently validated by RT-PCR in eight additional primary human breast cancer samples. We identified and validated one deletion, two novel ncRNAs (one intergenic and one intragenic), ten previously unknown or rare transcript isoforms and a novel gene fusion specific to a single primary tissue sample. We also explored the non-protein-coding portion of the breast cancer transcriptome, identifying thousands of novel non-coding transcripts and more than three hundred reads corresponding to the non-coding RNA MALAT1, which is highly expressed in many human carcinomas.

Conclusion: Our results demonstrate that combining 454 deep sequencing with a normalization step and careful bioinformatic analysis facilitates the discovery and quantification of rare transcripts or ncRNAs, and can be used as a qualitative tool to characterize transcriptome complexity, revealing many hitherto unknown transcripts, splice isoforms, gene fusion events and ncRNAs, even at relatively low sequence sampling.

Original language: English
Article number: 163
Journal: BMC Genomics
Volume: 10
DOI: https://doi.org/10.1186/1471-2164-10-163
Publication status: Published - Apr 20 2009

Fingerprint

High-Throughput Nucleotide Sequencing
Transcriptome
Gene Fusion
Computational Biology
Breast Neoplasms
Untranslated RNA
Protein Isoforms
Poly A
Gene Deletion
Gene Library
Neoplasms
Carcinoma
Gene Expression
Polymerase Chain Reaction

ASJC Scopus subject areas

  • Biotechnology
  • Genetics

Cite this

Guffanti, A., Iacono, M., Pelucchi, P., Kim, N., Soldà, G., Croft, L. J., ... De Bellis, G. (2009). A transcriptional sketch of a primary human breast cancer by 454 deep sequencing. BMC Genomics, 10, [163]. https://doi.org/10.1186/1471-2164-10-163

@article{56663111f41a4c87ba429ae56a8bec25,
title = "A transcriptional sketch of a primary human breast cancer by 454 deep sequencing",
author = "Alessandro Guffanti and Michele Iacono and Paride Pelucchi and Namshin Kim and Giulia Sold{\`a} and Croft, {Larry J.} and Taft, {Ryan J.} and Ermanno Rizzi and Marjan Askarian-Amiri and Bonnal, {Raoul J.} and Maurizio Callari and Flavio Mignone and Graziano Pesole and Giovanni Bertalot and Luigi Bernardi and Alberto Albertini and Christopher Lee and Mattick, {John S.} and Ileana Zucchi and {De Bellis}, Gianluca",
year = "2009",
month = "4",
day = "20",
doi = "10.1186/1471-2164-10-163",
language = "English",
volume = "10",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",

}

TY - JOUR

T1 - A transcriptional sketch of a primary human breast cancer by 454 deep sequencing

AU - Guffanti, Alessandro

AU - Iacono, Michele

AU - Pelucchi, Paride

AU - Kim, Namshin

AU - Soldà, Giulia

AU - Croft, Larry J.

AU - Taft, Ryan J.

AU - Rizzi, Ermanno

AU - Askarian-Amiri, Marjan

AU - Bonnal, Raoul J.

AU - Callari, Maurizio

AU - Mignone, Flavio

AU - Pesole, Graziano

AU - Bertalot, Giovanni

AU - Bernardi, Luigi

AU - Albertini, Alberto

AU - Lee, Christopher

AU - Mattick, John S.

AU - Zucchi, Ileana

AU - De Bellis, Gianluca

PY - 2009/4/20

Y1 - 2009/4/20

UR - http://www.scopus.com/inward/record.url?scp=65549155761&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=65549155761&partnerID=8YFLogxK

U2 - 10.1186/1471-2164-10-163

DO - 10.1186/1471-2164-10-163

M3 - Article

C2 - 19379481

AN - SCOPUS:65549155761

VL - 10

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

M1 - 163

ER -