Fish the ChIPs: A pipeline for automated genomic annotation of ChIP-Seq data

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Background: High-throughput sequencing is generating massive amounts of data at a pace that largely exceeds the throughput of data analysis routines. Here we introduce Fish the ChIPs (FC), a computational pipeline aimed at a broad public of users and designed to perform complete ChIP-Seq data analysis of an unlimited number of samples, thus increasing throughput, reproducibility and saving time.

Results: Starting from short read sequences, FC performs the following steps: 1) quality controls, 2) alignment to a reference genome, 3) peak calling, 4) genomic annotation, 5) generation of raw signal tracks for visualization on the UCSC and IGV genome browsers. FC exploits some of the fastest and most effective tools today available. Installation on a Mac platform requires very basic computational skills while configuration and usage are supported by a user-friendly graphic user interface. Alternatively, FC can be compiled from the source code on any Unix machine and then run with the possibility of customizing each single parameter through a simple configuration text file that can be generated using a dedicated user-friendly web-form. Considering the execution time, FC can be run on a desktop machine, even though the use of a computer cluster is recommended for analyses of large batches of data. FC is perfectly suited to work with data coming from Illumina Solexa Genome Analyzers or ABI SOLiD and its usage can potentially be extended to any sequencing platform.

Conclusions: Compared to existing tools, FC has two main advantages that make it suitable for a broad range of users. First of all, it can be installed and run by wet biologists on a Mac machine. Besides it can handle an unlimited number of samples, being convenient for large analyses. In this context, computational biologists can increase reproducibility of their ChIP-Seq data analyses while saving time for downstream analyses.

Reviewers: This article was reviewed by Gavin Huttley, George Shpakovski and Sarah Teichmann.

Original language: English
Article number: 51
Journal: Biology Direct
Volume: 6
DOI: 10.1186/1745-6150-6-51
Publication status: Published - Oct 6 2011

Fingerprint

Fish
Genomics
Annotation
Chip
Pipelines
Genome
Genes
Quality Control
Throughput
Reproducibility
Biologists
Data analysis
Sequencing
User interface

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Immunology
  • Applied Mathematics
  • Modelling and Simulation
  • Ecology, Evolution, Behavior and Systematics

Cite this

@article{1e68d445fa884ebba54bdbffbc6885b0,
title = "Fish the ChIPs: A pipeline for automated genomic annotation of ChIP-Seq data",
abstract = "Background: High-throughput sequencing is generating massive amounts of data at a pace that largely exceeds the throughput of data analysis routines. Here we introduce Fish the ChIPs (FC), a computational pipeline aimed at a broad public of users and designed to perform complete ChIP-Seq data analysis of an unlimited number of samples, thus increasing throughput, reproducibility and saving time. Results: Starting from short read sequences, FC performs the following steps: 1) quality controls, 2) alignment to a reference genome, 3) peak calling, 4) genomic annotation, 5) generation of raw signal tracks for visualization on the UCSC and IGV genome browsers. FC exploits some of the fastest and most effective tools today available. Installation on a Mac platform requires very basic computational skills while configuration and usage are supported by a user-friendly graphic user interface. Alternatively, FC can be compiled from the source code on any Unix machine and then run with the possibility of customizing each single parameter through a simple configuration text file that can be generated using a dedicated user-friendly web-form. Considering the execution time, FC can be run on a desktop machine, even though the use of a computer cluster is recommended for analyses of large batches of data. FC is perfectly suited to work with data coming from Illumina Solexa Genome Analyzers or ABI SOLiD and its usage can potentially be extended to any sequencing platform. Conclusions: Compared to existing tools, FC has two main advantages that make it suitable for a broad range of users. First of all, it can be installed and run by wet biologists on a Mac machine. Besides it can handle an unlimited number of samples, being convenient for large analyses. In this context, computational biologists can increase reproducibility of their ChIP-Seq data analyses while saving time for downstream analyses. Reviewers: This article was reviewed by Gavin Huttley, George Shpakovski and Sarah Teichmann.",
author = "Iros Barozzi and Alberto Termanini and Saverio Minucci and Gioacchino Natoli",
year = "2011",
month = "10",
day = "6",
doi = "10.1186/1745-6150-6-51",
language = "English",
volume = "6",
journal = "Biology Direct",
issn = "1745-6150",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Fish the ChIPs

T2 - A pipeline for automated genomic annotation of ChIP-Seq data

AU - Barozzi, Iros

AU - Termanini, Alberto

AU - Minucci, Saverio

AU - Natoli, Gioacchino

PY - 2011/10/6

Y1 - 2011/10/6

AB - Background: High-throughput sequencing is generating massive amounts of data at a pace that largely exceeds the throughput of data analysis routines. Here we introduce Fish the ChIPs (FC), a computational pipeline aimed at a broad public of users and designed to perform complete ChIP-Seq data analysis of an unlimited number of samples, thus increasing throughput, reproducibility and saving time. Results: Starting from short read sequences, FC performs the following steps: 1) quality controls, 2) alignment to a reference genome, 3) peak calling, 4) genomic annotation, 5) generation of raw signal tracks for visualization on the UCSC and IGV genome browsers. FC exploits some of the fastest and most effective tools today available. Installation on a Mac platform requires very basic computational skills while configuration and usage are supported by a user-friendly graphic user interface. Alternatively, FC can be compiled from the source code on any Unix machine and then run with the possibility of customizing each single parameter through a simple configuration text file that can be generated using a dedicated user-friendly web-form. Considering the execution time, FC can be run on a desktop machine, even though the use of a computer cluster is recommended for analyses of large batches of data. FC is perfectly suited to work with data coming from Illumina Solexa Genome Analyzers or ABI SOLiD and its usage can potentially be extended to any sequencing platform. Conclusions: Compared to existing tools, FC has two main advantages that make it suitable for a broad range of users. First of all, it can be installed and run by wet biologists on a Mac machine. Besides it can handle an unlimited number of samples, being convenient for large analyses. In this context, computational biologists can increase reproducibility of their ChIP-Seq data analyses while saving time for downstream analyses. Reviewers: This article was reviewed by Gavin Huttley, George Shpakovski and Sarah Teichmann.

UR - http://www.scopus.com/inward/record.url?scp=80053524735&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80053524735&partnerID=8YFLogxK

U2 - 10.1186/1745-6150-6-51

DO - 10.1186/1745-6150-6-51

M3 - Article

C2 - 21978789

AN - SCOPUS:80053524735

VL - 6

JO - Biology Direct

JF - Biology Direct

SN - 1745-6150

M1 - 51

ER -