Discovery of 342 putative new genes from the analysis of 5′-end-sequenced full-length-enriched cDNA human transcripts

E. Dalla, F. Mignone, R. Verardo, L. Marchionni, S. Marzinotto, D. Lazarević, J. F. Reid, R. Marzio, E. Klarić, D. Licastro, G. Marcuzzi, R. Gambetta, M. A. Pierotti, G. Pesole, C. Schneider

Research output: Contribution to journal (Article)

Abstract

In this work we describe the process that, starting with the production of human full-length-enriched cDNA libraries using the CAP-Trapper method, led us to the discovery of 342 putative new human genes. Twenty-three thousand full-length-enriched clones, obtained from various cell lines and tissues in different developmental stages, were 5′-end sequenced, allowing the identification of a pool of 5300 unique cDNAs. By comparing these sequences to various human and vertebrate nucleotide databases we found that about 40% of our clones extended previously annotated 5′ ends, 662 clones were likely to represent splice variants of known genes, and finally 342 clones remained unknown, with no or poor functional annotation. cDNA-microarray gene expression analysis showed that 260 of 342 unknown clones are expressed in at least one cell line and/or tissue. Further analysis of their sequences and the corresponding genomic locations allowed us to conclude that most of them represent potential novel genes, with only a small fraction having protein-coding potential.

Original language: English
Pages (from-to): 739-751
Number of pages: 13
Journal: Genomics
Volume: 85
Issue number: 6
DOI: https://doi.org/10.1016/j.ygeno.2005.02.009
Publication status: Published - Jun 2005

Fingerprint

Complementary DNA
Clone Cells
Genes
Cell Line
Oligonucleotide Array Sequence Analysis
Gene Library
Sequence Analysis
Vertebrates
Nucleotides
Databases
Gene Expression
Proteins

Keywords

  • cDNA microarrays
  • Full-length cDNA
  • Gene expression
  • Human transcriptome

ASJC Scopus subject areas

  • Genetics

Cite this

Discovery of 342 putative new genes from the analysis of 5′-end-sequenced full-length-enriched cDNA human transcripts. / Dalla, E.; Mignone, F.; Verardo, R.; Marchionni, L.; Marzinotto, S.; Lazarević, D.; Reid, J. F.; Marzio, R.; Klarić, E.; Licastro, D.; Marcuzzi, G.; Gambetta, R.; Pierotti, M. A.; Pesole, G.; Schneider, C.

In: Genomics, Vol. 85, No. 6, 06.2005, p. 739-751.


Dalla, E, Mignone, F, Verardo, R, Marchionni, L, Marzinotto, S, Lazarević, D, Reid, JF, Marzio, R, Klarić, E, Licastro, D, Marcuzzi, G, Gambetta, R, Pierotti, MA, Pesole, G & Schneider, C 2005, 'Discovery of 342 putative new genes from the analysis of 5′-end-sequenced full-length-enriched cDNA human transcripts', Genomics, vol. 85, no. 6, pp. 739-751. https://doi.org/10.1016/j.ygeno.2005.02.009
@article{1e4043037a804365961bc472f9a5b807,
title = "Discovery of 342 putative new genes from the analysis of 5′-end-sequenced full-length-enriched cDNA human transcripts",
abstract = "In this work we describe the process that, starting with the production of human full-length-enriched cDNA libraries using the CAP-Trapper method, led us to the discovery of 342 putative new human genes. Twenty-three thousand full-length-enriched clones, obtained from various cell lines and tissues in different developmental stages, were 5′-end sequenced, allowing the identification of a pool of 5300 unique cDNAs. By comparing these sequences to various human and vertebrate nucleotide databases we found that about 40{\%} of our clones extended previously annotated 5′ ends, 662 clones were likely to represent splice variants of known genes, and finally 342 clones remained unknown, with no or poor functional annotation. cDNA-microarray gene expression analysis showed that 260 of 342 unknown clones are expressed in at least one cell line and/or tissue. Further analysis of their sequences and the corresponding genomic locations allowed us to conclude that most of them represent potential novel genes, with only a small fraction having protein-coding potential.",
keywords = "cDNA microarrays, Full-length cDNA, Gene expression, Human transcriptome",
author = "E. Dalla and F. Mignone and R. Verardo and L. Marchionni and S. Marzinotto and D. Lazarević and Reid, {J. F.} and R. Marzio and E. Klarić and D. Licastro and G. Marcuzzi and R. Gambetta and Pierotti, {M. A.} and G. Pesole and C. Schneider",
year = "2005",
month = "6",
doi = "10.1016/j.ygeno.2005.02.009",
language = "English",
volume = "85",
pages = "739--751",
journal = "Genomics",
issn = "0888-7543",
publisher = "Academic Press Inc.",
number = "6",

}

TY - JOUR
T1 - Discovery of 342 putative new genes from the analysis of 5′-end-sequenced full-length-enriched cDNA human transcripts
AU - Dalla, E.
AU - Mignone, F.
AU - Verardo, R.
AU - Marchionni, L.
AU - Marzinotto, S.
AU - Lazarević, D.
AU - Reid, J. F.
AU - Marzio, R.
AU - Klarić, E.
AU - Licastro, D.
AU - Marcuzzi, G.
AU - Gambetta, R.
AU - Pierotti, M. A.
AU - Pesole, G.
AU - Schneider, C.
PY - 2005/6
Y1 - 2005/6
AB - In this work we describe the process that, starting with the production of human full-length-enriched cDNA libraries using the CAP-Trapper method, led us to the discovery of 342 putative new human genes. Twenty-three thousand full-length-enriched clones, obtained from various cell lines and tissues in different developmental stages, were 5′-end sequenced, allowing the identification of a pool of 5300 unique cDNAs. By comparing these sequences to various human and vertebrate nucleotide databases we found that about 40% of our clones extended previously annotated 5′ ends, 662 clones were likely to represent splice variants of known genes, and finally 342 clones remained unknown, with no or poor functional annotation. cDNA-microarray gene expression analysis showed that 260 of 342 unknown clones are expressed in at least one cell line and/or tissue. Further analysis of their sequences and the corresponding genomic locations allowed us to conclude that most of them represent potential novel genes, with only a small fraction having protein-coding potential.
KW - cDNA microarrays
KW - Full-length cDNA
KW - Gene expression
KW - Human transcriptome
UR - http://www.scopus.com/inward/record.url?scp=21044454176&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=21044454176&partnerID=8YFLogxK
U2 - 10.1016/j.ygeno.2005.02.009
DO - 10.1016/j.ygeno.2005.02.009
M3 - Article
C2 - 15885500
AN - SCOPUS:21044454176
VL - 85
SP - 739
EP - 751
JO - Genomics
JF - Genomics
SN - 0888-7543
IS - 6
ER -