The l1-l2 regularization framework unmasks the hypoxia signature hidden in the transcriptome of a set of heterogeneous neuroblastoma cell lines

Paolo Fardin, Annalisa Barla, Sofia Mosci, Lorenzo Rosasco, Alessandro Verri, Luigi Varesio

Research output: Contribution to journal › Article

Abstract

Background: Gene expression signatures are clusters of genes that discriminate between different cell statuses, and defining them is critical for understanding the molecular bases of diseases. Identifying a gene signature is complicated by the high-dimensional nature of the data and by the genetic heterogeneity of the responding cells. The l1-l2 regularization is an embedded feature selection technique that fulfills all the desirable properties of a variable selection algorithm and has the potential to generate a specific signature even in biologically complex settings. We studied the application of this algorithm to detect the signature characterizing the transcriptional response of neuroblastoma tumor cell lines to hypoxia, a condition of low oxygen tension that occurs in the tumor microenvironment.

Results: We determined the gene expression profile of 9 neuroblastoma cell lines cultured under normoxic and hypoxic conditions. We studied a heterogeneous set of neuroblastoma cell lines to mimic the in vivo situation and to test the robustness and validity of the l1-l2 regularization with double optimization. Hierarchical, spectral, and k-means clustering, as well as a supervised approach based on the t-test, divided the cell lines on the basis of their genetic differences. This strong transcriptional difference between cell lines completely masked the more subtle response to hypoxia. Different results were obtained when we applied the l1-l2 regularization framework. The algorithm distinguished the normoxic and hypoxic statuses, defining signatures comprising 3 to 38 probesets, with a leave-one-out error of 17%. A consensus hypoxia signature was established by setting the frequency score at 50% and the correlation parameter ε equal to 100. This signature is composed of 11 probesets representing 8 well-characterized genes known to be modulated by hypoxia.

Conclusion: We demonstrate that l1-l2 regularization outperforms more conventional approaches, allowing the identification and definition of a gene expression signature under complex experimental conditions. Together, l1-l2 regularization and cross-validation generate an unbiased and objective output with a low classification error. We feel that the application of this algorithm to tumor biology will be instrumental in analyzing gene expression signatures hidden in the transcriptome that, like hypoxia, may be major determinants of the course of the disease.
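
The selection procedure described in the abstract can be illustrated with a minimal, self-contained sketch. It is not the authors' implementation: it assumes a squared loss, a plain coordinate-descent solver, a synthetic dataset, and arbitrary values for the sparsity parameter `tau` and the l2 parameter `mu` (the latter playing a role analogous to the abstract's correlation parameter ε, on a different scale). All function and variable names are hypothetical. Each leave-one-out split refits the model, and a feature's frequency score is the fraction of splits in which its coefficient is nonzero; features reaching 50% form the consensus signature.

```python
import random


def soft_threshold(v, t):
    """Proximal map of t*|v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1l2(X, y, tau, mu, n_iter=100):
    """Coordinate descent for the (assumed) l1-l2 functional
    (1/2n)*||y - X b||^2 + tau*||b||_1 + (mu/2)*||b||_2^2."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - X b, starting from b = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, tau) / (col_sq[j] + mu)
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                b[j] = new_bj
    return b


def loo_frequency(X, y, tau, mu):
    """Leave-one-out loop: per-feature selection frequency and LOO error."""
    n, p = len(X), len(X[0])
    counts = [0] * p
    errors = 0
    for k in range(n):
        b = fit_l1l2(X[:k] + X[k + 1:], y[:k] + y[k + 1:], tau, mu)
        for j in range(p):
            if b[j] != 0.0:
                counts[j] += 1
        score = sum(X[k][j] * b[j] for j in range(p))
        pred = 1.0 if score >= 0.0 else -1.0
        errors += pred != y[k]
    return [c / n for c in counts], errors / n


# Synthetic toy data (an assumption, not the paper's measurements):
# 9 "cell lines", each profiled under two conditions, 30 features.
random.seed(0)
n_lines, n_features = 9, 30
X, y = [], []
for _ in range(n_lines):
    for label in (-1.0, 1.0):  # -1 = normoxia, +1 = hypoxia
        row = [random.uniform(-1.0, 1.0) for _ in range(n_features)]
        # Features 0 and 1 are planted, mutually correlated responders.
        row[0] = 2.0 * label + random.gauss(0.0, 0.1)
        row[1] = 2.0 * label + random.gauss(0.0, 0.1)
        X.append(row)
        y.append(label)

freq, loo_error = loo_frequency(X, y, tau=0.3, mu=1.0)
signature = [j for j, f in enumerate(freq) if f >= 0.5]  # 50% frequency score
print("consensus signature:", signature)
print("leave-one-out error:", loo_error)
```

In this sketch, raising `tau` yields sparser signatures, while raising `mu` tends to keep correlated features in the model together rather than picking one of them arbitrarily. The toy data are built so that the two planted features are the ones expected to reach the frequency threshold.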

Original language: English
Article number: 1471
Pages (from-to): 474
Number of pages: 1
Journal: BMC Genomics
Volume: 10
DOI: 10.1186/1471-2164-10-474
Publication status: Published - Oct 15 2009

Fingerprint

Neuroblastoma
Transcriptome
Cell Line
Tumor Microenvironment
Genetic Heterogeneity
Multigene Family
Tumor Cell Line
Genes
Cluster Analysis
Hypoxia
Oxygen
Neoplasms

ASJC Scopus subject areas

  • Biotechnology
  • Genetics

Cite this

The l1-l2 regularization framework unmasks the hypoxia signature hidden in the transcriptome of a set of heterogeneous neuroblastoma cell lines. / Fardin, Paolo; Barla, Annalisa; Mosci, Sofia; Rosasco, Lorenzo; Verri, Alessandro; Varesio, Luigi.

In: BMC Genomics, Vol. 10, 1471, 15.10.2009, p. 474.

@article{c11ba35dc82c42978408e3cd8dd3ae11,
title = "The l1-l2 regularization framework unmasks the hypoxia signature hidden in the transcriptome of a set of heterogeneous neuroblastoma cell lines",
author = "Paolo Fardin and Annalisa Barla and Sofia Mosci and Lorenzo Rosasco and Alessandro Verri and Luigi Varesio",
year = "2009",
month = "10",
day = "15",
doi = "10.1186/1471-2164-10-474",
language = "English",
volume = "10",
pages = "474",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",

}

TY - JOUR

T1 - The l1-l2 regularization framework unmasks the hypoxia signature hidden in the transcriptome of a set of heterogeneous neuroblastoma cell lines

AU - Fardin, Paolo

AU - Barla, Annalisa

AU - Mosci, Sofia

AU - Rosasco, Lorenzo

AU - Verri, Alessandro

AU - Varesio, Luigi

PY - 2009/10/15

Y1 - 2009/10/15

UR - http://www.scopus.com/inward/record.url?scp=70449726777&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70449726777&partnerID=8YFLogxK

U2 - 10.1186/1471-2164-10-474

DO - 10.1186/1471-2164-10-474

M3 - Article

C2 - 19832978

AN - SCOPUS:70449726777

VL - 10

SP - 474

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

M1 - 1471

ER -