MaskedPainter: Feature selection for microarray data analysis

Daniele Apiletti, Elena Baralis, Giulia Bruno, Alessandro Fiori

Research output: Contribution to journal › Article › peer-review

Abstract

Selecting a small number of discriminative genes from thousands is a fundamental task in microarray data analysis. Effective feature selection allows biologists to investigate only a subset of genes instead of the entire set, thus avoiding insignificant, noisy, and redundant features. This paper presents the MaskedPainter feature selection method for gene expression data. The proposed method measures the ability of each gene to classify samples belonging to different classes and ranks genes by computing an overlap score. A density-based technique is used to smooth the effects of outliers in the overlap score computation. As in other approaches, the number of selected genes can be set by the user. Alternatively, the algorithm can automatically detect the minimum set of genes that yields the best classification coverage of the training set samples. The effectiveness of the approach is demonstrated through an empirical study on public microarray datasets with different characteristics. Experimental results show that the proposed approach yields higher classification accuracy than widely used feature selection techniques.

Original language: English
Pages (from-to): 717-737
Number of pages: 21
Journal: Intelligent Data Analysis
Volume: 16
Issue number: 4
Publication status: Published - 2012

Keywords

  • Data mining
  • Feature selection
  • Microarray analysis
  • Tumor classification

ASJC Scopus subject areas

  • Artificial Intelligence
  • Theoretical Computer Science
  • Computer Vision and Pattern Recognition
