Computational identification of transcription factor binding sites by functional analysis of sets of genes sharing overrepresented upstream motifs

Davide Corà, Ferdinando Di Cunto, Paolo Provero, Lorenzo Silengo, Michele Caselle

Research output: Contribution to journal › Article

28 Citations (Scopus)

Abstract

Background: Transcriptional regulation is a key mechanism in the functioning of the cell, and is mostly effected through transcription factors binding to specific recognition motifs located upstream of the coding region of the regulated gene. The computational identification of such motifs is made easier by the fact that they often appear several times in the upstream region of the regulated genes, so that the number of occurrences of relevant motifs is often significantly larger than expected by pure chance. Results: To exploit this fact, we construct sets of genes characterized by the statistical overrepresentation of a certain motif in their upstream regions. Then we study the functional characterization of these sets by analyzing their annotation to Gene Ontology terms. For the sets showing a statistically significant specific functional characterization, we conjecture that the upstream motif characterizing the set is a binding site for a transcription factor involved in the regulation of the genes in the set. Conclusions: The method we propose is able to identify many known binding sites in S. cerevisiae and new candidate targets of regulation by known transcription factors. Its application to less well-studied organisms is likely to be valuable in the exploration of their regulatory interaction network.
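
The two statistical steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which tests were used, so the binomial tail for motif counts, the hypergeometric tail for annotation enrichment, the uniform background frequency, the threshold `alpha`, and all function names, gene names, and term labels below are assumptions made for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def count_occurrences(seq, motif):
    """Count occurrences of motif in seq, overlapping ones included."""
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


def genes_with_overrepresented_motif(upstream, motif, background_freq, alpha):
    """Return genes whose upstream region holds more motif copies than chance predicts.

    Assumption: each start position is treated as an independent trial with
    success probability background_freq, which is only an approximation.
    """
    selected = set()
    for gene, seq in upstream.items():
        positions = len(seq) - len(motif) + 1
        if positions <= 0:
            continue
        hits = count_occurrences(seq, motif)
        if hits > 0 and binomial_tail(hits, positions, background_freq) < alpha:
            selected.add(gene)
    return selected


def hypergeom_tail(k, population, annotated, drawn):
    """P(X >= k) when `drawn` genes are sampled from `population`, `annotated` of which carry the term."""
    top = min(annotated, drawn)
    ways = sum(comb(annotated, i) * comb(population - annotated, drawn - i)
               for i in range(k, top + 1))
    return ways / comb(population, drawn)


def enriched_terms(gene_set, annotations, alpha):
    """Return {term: p_value} for terms overrepresented in gene_set."""
    members = [g for g in gene_set if g in annotations]
    totals = {}
    for terms in annotations.values():
        for term in terms:
            totals[term] = totals.get(term, 0) + 1
    result = {}
    for term, total in totals.items():
        hits = sum(1 for g in members if term in annotations[g])
        if hits == 0:
            continue
        p_value = hypergeom_tail(hits, len(annotations), total, len(members))
        if p_value < alpha:
            result[term] = p_value
    return result


# Toy data (hypothetical): four upstream sequences and twenty annotated genes.
upstream = {
    "g1": "ACGT" * 4,
    "g2": "ACGTTTTT" * 2,
    "g3": "T" * 16,
    "g4": "ACGT" + "T" * 12,
}
annotations = {f"g{i}": {"TERM_B"} for i in range(1, 21)}
annotations["g1"].add("TERM_A")
annotations["g2"].add("TERM_A")

selected = genes_with_overrepresented_motif(upstream, "ACGT", 0.25 ** 4, 0.01)
enriched = enriched_terms(selected, annotations, 0.01)
print(sorted(selected))
print(enriched)
```

On this toy input, g1 and g2 form the motif set and only the hypothetical label TERM_A passes the enrichment threshold, so under the abstract's reasoning the motif would be proposed as a candidate binding site for a factor regulating genes with that function. A real analysis would also need a realistic background model and a correction for the many motifs and terms tested.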

Original language: English
Article number: 57
Journal: BMC Bioinformatics
Volume: 5
DOI: 10.1186/1471-2105-5-57
Publication status: Published - May 11 2004

Fingerprint

Functional analysis
Transcription factors
Binding sites
Transcription Factor
Sharing
Genes
Gene
Gene Ontology
Transcriptional Regulation
Saccharomyces cerevisiae
Ontology
Annotation
Coding
Likely
Target
Cell

ASJC Scopus subject areas

  • Medicine (all)
  • Structural Biology
  • Applied Mathematics

Cite this

Computational identification of transcription factor binding sites by functional analysis of sets of genes sharing overrepresented upstream motifs. / Corà, Davide; Di Cunto, Ferdinando; Provero, Paolo; Silengo, Lorenzo; Caselle, Michele.

In: BMC Bioinformatics, Vol. 5, 57, 11.05.2004.

@article{d175cdf1bbae4aaaa29128698bb0d84d,
title = "Computational identification of transcription factor binding sites by functional analysis of sets of genes sharing overrepresented upstream motifs",
abstract = "Background: Transcriptional regulation is a key mechanism in the functioning of the cell, and is mostly effected through transcription factors binding to specific recognition motifs located upstream of the coding region of the regulated gene. The computational identification of such motifs is made easier by the fact that they often appear several times in the upstream region of the regulated genes, so that the number of occurrences of relevant motifs is often significantly larger than expected by pure chance. Results: To exploit this fact, we construct sets of genes characterized by the statistical overrepresentation of a certain motif in their upstream regions. Then we study the functional characterization of these sets by analyzing their annotation to Gene Ontology terms. For the sets showing a statistically significant specific functional characterization, we conjecture that the upstream motif characterizing the set is a binding site for a transcription factor involved in the regulation of the genes in the set. Conclusions: The method we propose is able to identify many known binding sites in S. cerevisiae and new candidate targets of regulation by known transcription factors. Its application to less well-studied organisms is likely to be valuable in the exploration of their regulatory interaction network.",
author = "Davide Cor{\`a} and {Di Cunto}, Ferdinando and Paolo Provero and Lorenzo Silengo and Michele Caselle",
year = "2004",
month = "5",
day = "11",
doi = "10.1186/1471-2105-5-57",
language = "English",
volume = "5",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central Ltd.",
}

TY - JOUR

T1 - Computational identification of transcription factor binding sites by functional analysis of sets of genes sharing overrepresented upstream motifs

AU - Corà, Davide

AU - Di Cunto, Ferdinando

AU - Provero, Paolo

AU - Silengo, Lorenzo

AU - Caselle, Michele

PY - 2004/5/11

Y1 - 2004/5/11

N2 - Background: Transcriptional regulation is a key mechanism in the functioning of the cell, and is mostly effected through transcription factors binding to specific recognition motifs located upstream of the coding region of the regulated gene. The computational identification of such motifs is made easier by the fact that they often appear several times in the upstream region of the regulated genes, so that the number of occurrences of relevant motifs is often significantly larger than expected by pure chance. Results: To exploit this fact, we construct sets of genes characterized by the statistical overrepresentation of a certain motif in their upstream regions. Then we study the functional characterization of these sets by analyzing their annotation to Gene Ontology terms. For the sets showing a statistically significant specific functional characterization, we conjecture that the upstream motif characterizing the set is a binding site for a transcription factor involved in the regulation of the genes in the set. Conclusions: The method we propose is able to identify many known binding sites in S. cerevisiae and new candidate targets of regulation by known transcription factors. Its application to less well-studied organisms is likely to be valuable in the exploration of their regulatory interaction network.

UR - http://www.scopus.com/inward/record.url?scp=2942580909&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=2942580909&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-5-57

DO - 10.1186/1471-2105-5-57

M3 - Article

C2 - 15137914

AN - SCOPUS:2942580909

VL - 5

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 57

ER -