Not proper ROC curves as new tool for the analysis of differentially expressed genes in microarray experiments

Stefano Parodi, Vito Pistoia, Marco Muselli

Research output: Contribution to journal › Article

13 Citations (Scopus)

Abstract

Background: Most microarray experiments are carried out to identify genes whose expression varies in relation to specific conditions or in response to environmental stimuli. In such studies, genes showing similar mean expression values between two or more groups are considered not differentially expressed, even if hidden subclasses with different expression values may exist. In this paper we propose a new method for identifying differentially expressed genes, based on the area between the ROC curve and the rising diagonal (ABCR). ABCR represents a more general approach than the standard area under the ROC curve (AUC), because it can identify both proper (i.e., concave) and not proper ROC curves (NPRC). In particular, NPRC may correspond to those genes that tend to escape standard selection methods.

Results: We assessed the performance of our method using data from a publicly available database of 4026 genes, including 14 normal B cell samples (NBC) and 20 heterogeneous lymphomas (namely: 9 follicular lymphomas and 11 chronic lymphocytic leukemias). The NBC group also included two subclasses, i.e., 6 heavily stimulated and 8 slightly or not stimulated samples. We identified 1607 differentially expressed genes with an estimated False Discovery Rate of 15%. Among them, 16 corresponded to NPRC and all escaped standard selection procedures based on AUC and t statistics. Moreover, simple inspection of the shape of such plots allowed the two subclasses in either class to be identified in 13 cases (81%).

Conclusion: NPRC represent a useful new tool for the analysis of microarray data.
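To illustrate why an area measured against the rising diagonal can flag genes that AUC misses, here is a minimal Python sketch. It assumes that ABCR is the integral of |ROC(t) - t| over the false positive rate t, which is one reading of "the area between the ROC curve and the rising diagonal"; the paper's exact estimator may differ. The function names (`roc_points`, `auc`, `abcr`) and the toy expression values are hypothetical and are not taken from the paper or its dataset.

```python
def roc_points(neg, pos):
    """Empirical ROC points (FPR, TPR), calling a sample positive when value >= threshold."""
    thresholds = sorted(set(neg) | set(pos), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        fpr = sum(v >= t for v in neg) / len(neg)
        tpr = sum(v >= t for v in pos) / len(pos)
        points.append((fpr, tpr))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def abcr(points):
    """Assumed ABCR: integral of |TPR - FPR| along the piecewise-linear ROC curve."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        width = x1 - x0
        d0, d1 = y0 - x0, y1 - x1  # signed distance from the diagonal at each end
        if d0 * d1 >= 0:
            # Segment stays on one side of the diagonal: plain trapezoid.
            total += width * (abs(d0) + abs(d1)) / 2
        else:
            # Segment crosses the diagonal: sum of the two triangles.
            total += width * (d0 * d0 + d1 * d1) / (2 * (abs(d0) + abs(d1)))
    return total


# Toy gene (hypothetical values): the "positive" class hides a low and a
# high subclass, so its mean is close to that of the "negative" class.
hidden = roc_points(neg=[4, 5, 6], pos=[1, 2, 8, 9])
print(round(auc(hidden), 3))   # 0.5
print(round(abcr(hidden), 3))  # 0.25

# Toy gene with fully separated classes, for comparison.
separated = roc_points(neg=[1, 2, 3], pos=[4, 5, 6])
print(round(auc(separated), 3))   # 1.0
print(round(abcr(separated), 3))  # 0.5
```

In the first toy gene the ROC curve crosses the diagonal, so AUC sits at the chance level of 0.5 while the assumed ABCR is half of its maximum value of 0.5. This is the kind of not proper curve that the abstract says escapes AUC-based selection.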

Original language: English
Article number: 410
Journal: BMC Bioinformatics
Volume: 9
DOI: 10.1186/1471-2105-9-410
Publication status: Published - Oct 3 2008

Fingerprint

Receiver Operating Characteristic Curve
Microarrays
Microarray
ROC Curve
Genes
Gene
Experiment
Experiments
B Cells
Area Under Curve
B-Lymphocytes
Gene expression
Follicular Lymphoma
Leukemia
Selection Procedures
Inspection
B-Cell Chronic Lymphocytic Leukemia
Microarray Analysis
Cells
Microarray Data

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Structural Biology
  • Applied Mathematics

Cite this

Not proper ROC curves as new tool for the analysis of differentially expressed genes in microarray experiments. / Parodi, Stefano; Pistoia, Vito; Muselli, Marco.

In: BMC Bioinformatics, Vol. 9, 410, 03.10.2008.

@article{cf1e49babcf84273a6dd051cf6c77cab,
title = "Not proper ROC curves as new tool for the analysis of differentially expressed genes in microarray experiments",
abstract = "Background: Most microarray experiments are carried out with the purpose of identifying genes whose expression varies in relation with specific conditions or in response to environmental stimuli. In such studies, genes showing similar mean expression values between two or more groups are considered as not differentially expressed, even if hidden subclasses with different expression values may exist. In this paper we propose a new method for identifying differentially expressed genes, based on the area between the ROC curve and the rising diagonal (ABCR). ABCR represents a more general approach than the standard area under the ROC curve (AUC), because it can identify both proper (i.e., concave) and not proper ROC curves (NPRC). In particular, NPRC may correspond to those genes that tend to escape standard selection methods. Results: We assessed the performance of our method using data from a publicly available database of 4026 genes, including 14 normal B cell samples (NBC) and 20 heterogeneous lymphomas (namely: 9 follicular lymphomas and 11 chronic lymphocytic leukemias). Moreover, NBC also included two sub-classes, i.e., 6 heavily stimulated and 8 slightly or not stimulated samples. We identified 1607 differentially expressed genes with an estimated False Discovery Rate of 15{\%}. Among them, 16 corresponded to NPRC and all escaped standard selection procedures based on AUC and t statistics. Moreover, a simple inspection to the shape of such plots allowed to identify the two subclasses in either one class in 13 cases (81{\%}). Conclusion: NPRC represent a new useful tool for the analysis of microarray data.",
author = "Stefano Parodi and Vito Pistoia and Marco Muselli",
year = "2008",
month = "10",
day = "3",
doi = "10.1186/1471-2105-9-410",
language = "English",
volume = "9",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central Ltd."
}

TY - JOUR

T1 - Not proper ROC curves as new tool for the analysis of differentially expressed genes in microarray experiments

AU - Parodi, Stefano

AU - Pistoia, Vito

AU - Muselli, Marco

PY - 2008/10/3

Y1 - 2008/10/3

N2 - Background: Most microarray experiments are carried out with the purpose of identifying genes whose expression varies in relation with specific conditions or in response to environmental stimuli. In such studies, genes showing similar mean expression values between two or more groups are considered as not differentially expressed, even if hidden subclasses with different expression values may exist. In this paper we propose a new method for identifying differentially expressed genes, based on the area between the ROC curve and the rising diagonal (ABCR). ABCR represents a more general approach than the standard area under the ROC curve (AUC), because it can identify both proper (i.e., concave) and not proper ROC curves (NPRC). In particular, NPRC may correspond to those genes that tend to escape standard selection methods. Results: We assessed the performance of our method using data from a publicly available database of 4026 genes, including 14 normal B cell samples (NBC) and 20 heterogeneous lymphomas (namely: 9 follicular lymphomas and 11 chronic lymphocytic leukemias). Moreover, NBC also included two sub-classes, i.e., 6 heavily stimulated and 8 slightly or not stimulated samples. We identified 1607 differentially expressed genes with an estimated False Discovery Rate of 15%. Among them, 16 corresponded to NPRC and all escaped standard selection procedures based on AUC and t statistics. Moreover, a simple inspection to the shape of such plots allowed to identify the two subclasses in either one class in 13 cases (81%). Conclusion: NPRC represent a new useful tool for the analysis of microarray data.

AB - Background: Most microarray experiments are carried out with the purpose of identifying genes whose expression varies in relation with specific conditions or in response to environmental stimuli. In such studies, genes showing similar mean expression values between two or more groups are considered as not differentially expressed, even if hidden subclasses with different expression values may exist. In this paper we propose a new method for identifying differentially expressed genes, based on the area between the ROC curve and the rising diagonal (ABCR). ABCR represents a more general approach than the standard area under the ROC curve (AUC), because it can identify both proper (i.e., concave) and not proper ROC curves (NPRC). In particular, NPRC may correspond to those genes that tend to escape standard selection methods. Results: We assessed the performance of our method using data from a publicly available database of 4026 genes, including 14 normal B cell samples (NBC) and 20 heterogeneous lymphomas (namely: 9 follicular lymphomas and 11 chronic lymphocytic leukemias). Moreover, NBC also included two sub-classes, i.e., 6 heavily stimulated and 8 slightly or not stimulated samples. We identified 1607 differentially expressed genes with an estimated False Discovery Rate of 15%. Among them, 16 corresponded to NPRC and all escaped standard selection procedures based on AUC and t statistics. Moreover, a simple inspection to the shape of such plots allowed to identify the two subclasses in either one class in 13 cases (81%). Conclusion: NPRC represent a new useful tool for the analysis of microarray data.

UR - http://www.scopus.com/inward/record.url?scp=55349084073&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=55349084073&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-9-410

DO - 10.1186/1471-2105-9-410

M3 - Article

VL - 9

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 410

ER -