Mulcom: A multiple comparison statistical test for microarray data in Bioconductor

Research output: Contribution to journal (Article)

Abstract

Background: Many microarray experiments search for genes with differential expression between a common "reference" group and multiple "test" groups. In such cases, currently employed statistical approaches based on t-tests or close derivatives have limited efficacy, mainly because the standard error is estimated from only two groups at a time. Alternative approaches based on ANOVA correctly capture within-group variance from all the groups, but do not compare single test groups with the reference. Ideally, a t-test better suited for this type of data would compare each test group with the reference, but use within-group variance calculated from all the groups.

Results: We implemented an R-Bioconductor package named Mulcom, with a statistical test derived from Dunnett's t-test, designed to compare multiple test groups individually against a common reference. Dunnett's test uses, for the denominator of each comparison, a within-group standard error aggregated from all the experimental groups. In addition to the basic Dunnett's t value, the package includes an optional minimal fold-change threshold, m. Thanks to the automated, permutation-based estimation of the False Discovery Rate (FDR), the package also permits fast optimization of the test, to obtain the maximum number of significant genes at a given FDR value. When applied to a time-course experiment profiled in parallel on two microarray platforms, and compared with two commonly used tests, Mulcom displayed better concordance of significant genes in the two array platforms (39% vs. 26% or 15%), and higher enrichment in functional annotation to categories related to the biology of the experiment (p value <0.001 in 4 categories vs. 3).

Conclusions: The Mulcom package provides a powerful tool for the identification of differentially expressed genes when several experimental conditions are compared against a common reference. The results of the practical example presented here show that lists of differentially expressed genes generated by Mulcom are particularly consistent across microarray platforms and enriched in genes belonging to functionally significant groups.
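
The test described in the abstract can be illustrated with a minimal sketch for a single gene. This is not the Mulcom R API: the function names below are hypothetical, Python is used only for illustration, and the exact way the threshold m enters the score (subtracted from the absolute mean difference before dividing by the pooled standard error) is an assumption based on the abstract's description.

```python
import math


def pooled_within_group_variance(groups):
    """Mean squared error pooled over ALL groups (reference and tests)."""
    sum_of_squares = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        sum_of_squares += sum((x - mean) ** 2 for x in g)
    # Degrees of freedom: total observations minus number of groups.
    degrees_of_freedom = sum(len(g) for g in groups) - len(groups)
    return sum_of_squares / degrees_of_freedom


def mulcom_style_scores(reference, tests, m=0.0):
    """Dunnett-style score of each test group against the reference.

    reference: expression values of one gene in the reference group.
    tests: list of lists, one per test group, for the same gene.
    m: optional minimal fold-change threshold (assumed to be on the
       same, e.g. log2, scale as the expression values).
    """
    mse = pooled_within_group_variance([reference] + list(tests))
    ref_mean = sum(reference) / len(reference)
    scores = []
    for g in tests:
        # Standard error uses the variance pooled from all groups,
        # not only the two groups being compared.
        se = math.sqrt(mse * (1.0 / len(g) + 1.0 / len(reference)))
        diff = abs(sum(g) / len(g) - ref_mean)
        scores.append((diff - m) / se)
    return scores


if __name__ == "__main__":
    ref = [1.0, 2.0, 3.0]
    test_groups = [[4.0, 5.0, 6.0], [1.0, 2.0, 3.0]]
    print(mulcom_style_scores(ref, test_groups))
    print(mulcom_style_scores(ref, test_groups, m=1.0))
```

In this sketch, raising m lowers every score by the same amount in units of the pooled standard error, so comparisons with small mean differences drop out first. The permutation-based FDR estimation mentioned in the abstract is not shown.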

Original language: English
Article number: 382
Journal: BMC Bioinformatics
Volume: 12
DOI: 10.1186/1471-2105-12-382
Publication status: Published - Sep 28 2011

Fingerprint

Multiple Comparisons
Statistical tests
Microarrays
Statistical test
Microarray Data
Genes
Gene
t-test
Microarray
Multiple Tests
Experiments
Analysis of variance (ANOVA)
Standard error
Analysis of Variance
Derivatives
Experiment
Concordance
Differential Expression
Denominator
p-Value

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
  • Structural Biology

Cite this

Mulcom: A multiple comparison statistical test for microarray data in Bioconductor. / Isella, Claudio; Renzulli, Tommaso; Corà, Davide; Medico, Enzo.

In: BMC Bioinformatics, Vol. 12, 382, 28.09.2011.

@article{33cfb48c9ffc440f87df0f3d12f486ea,
title = "Mulcom: A multiple comparison statistical test for microarray data in Bioconductor",
abstract = "Background: Many microarray experiments search for genes with differential expression between a common {"}reference{"} group and multiple {"}test{"} groups. In such cases currently employed statistical approaches based on t-tests or close derivatives have limited efficacy, mainly because estimation of the standard error is done on only two groups at a time. Alternative approaches based on ANOVA correctly capture within-group variance from all the groups, but then do not confront single test groups with the reference. Ideally, a t-test better suited for this type of data would compare each test group with the reference, but use within-group variance calculated from all the groups. Results: We implemented an R-Bioconductor package named Mulcom, with a statistical test derived from the Dunnett's t-test, designed to compare multiple test groups individually against a common reference. Interestingly, the Dunnett's test uses for the denominator of each comparison a within-group standard error aggregated from all the experimental groups. In addition to the basic Dunnett's t value, the package includes an optional minimal fold-change threshold, m. Due to the automated, permutation-based estimation of False Discovery Rate (FDR), the package also permits fast optimization of the test, to obtain the maximum number of significant genes at a given FDR value. When applied to a time-course experiment profiled in parallel on two microarray platforms, and compared with two commonly used tests, Mulcom displayed better concordance of significant genes in the two array platforms (39{\%} vs. 26{\%} or 15{\%}), and higher enrichment in functional annotation to categories related to the biology of the experiment (p value <0.001 in 4 categories vs. 3). Conclusions: The Mulcom package provides a powerful tool for the identification of differentially expressed genes when several experimental conditions are compared against a common reference. The results of the practical example presented here show that lists of differentially expressed genes generated by Mulcom are particularly consistent across microarray platforms and enriched in genes belonging to functionally significant groups.",
author = "Claudio Isella and Tommaso Renzulli and Davide Cor{\`a} and Enzo Medico",
year = "2011",
month = "9",
day = "28",
doi = "10.1186/1471-2105-12-382",
language = "English",
volume = "12",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central Ltd.",

}

TY - JOUR

T1 - Mulcom

T2 - A multiple comparison statistical test for microarray data in Bioconductor

AU - Isella, Claudio

AU - Renzulli, Tommaso

AU - Corà, Davide

AU - Medico, Enzo

PY - 2011/9/28

Y1 - 2011/9/28

AB - Background: Many microarray experiments search for genes with differential expression between a common "reference" group and multiple "test" groups. In such cases currently employed statistical approaches based on t-tests or close derivatives have limited efficacy, mainly because estimation of the standard error is done on only two groups at a time. Alternative approaches based on ANOVA correctly capture within-group variance from all the groups, but then do not confront single test groups with the reference. Ideally, a t-test better suited for this type of data would compare each test group with the reference, but use within-group variance calculated from all the groups. Results: We implemented an R-Bioconductor package named Mulcom, with a statistical test derived from the Dunnett's t-test, designed to compare multiple test groups individually against a common reference. Interestingly, the Dunnett's test uses for the denominator of each comparison a within-group standard error aggregated from all the experimental groups. In addition to the basic Dunnett's t value, the package includes an optional minimal fold-change threshold, m. Due to the automated, permutation-based estimation of False Discovery Rate (FDR), the package also permits fast optimization of the test, to obtain the maximum number of significant genes at a given FDR value. When applied to a time-course experiment profiled in parallel on two microarray platforms, and compared with two commonly used tests, Mulcom displayed better concordance of significant genes in the two array platforms (39% vs. 26% or 15%), and higher enrichment in functional annotation to categories related to the biology of the experiment (p value <0.001 in 4 categories vs. 3). Conclusions: The Mulcom package provides a powerful tool for the identification of differentially expressed genes when several experimental conditions are compared against a common reference. The results of the practical example presented here show that lists of differentially expressed genes generated by Mulcom are particularly consistent across microarray platforms and enriched in genes belonging to functionally significant groups.

UR - http://www.scopus.com/inward/record.url?scp=80053352043&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80053352043&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-12-382

DO - 10.1186/1471-2105-12-382

M3 - Article

C2 - 21955789

AN - SCOPUS:80053352043

VL - 12

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 382

ER -