Data mining techniques for the identification of genes with expression levels related to breast cancer prognosis

Gabriele Giarratana, Marco Pizzera, Marco Masseroli, Enzo Medico, Pier Luca Lanzi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Providing clinical predictions for cancer patients by analyzing their genetic make-up is a difficult and very important problem. To identify the genes most correlated with breast cancer prognosis, we used data mining techniques to study the gene expression values of breast cancer patients with known clinical outcome. The focus of our work was the creation of a classification model to be used in clinical practice to support therapy prescription. We randomly subdivided a gene expression dataset of 311 samples into a training set to learn the model and a test set to validate the model and assess its performance. We evaluated several learning algorithms in their unweighted and weighted forms; we defined the weighted forms to take into account the different clinical importance of false positive and false negative classifications. Based on our results, the weighted algorithms, especially when used in their combined form, appear to provide better results.
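The two evaluation ideas in the abstract, a random training/test partition and an error measure that penalizes false positives and false negatives differently, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the split ratio, the cost values, or which outcome is treated as the positive class, so `train_fraction`, `fp_cost`, `fn_cost`, and both function names are hypothetical choices made only for illustration.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Randomly partition sample indices into training and test lists.

    train_fraction is an assumed value; the abstract gives no ratio.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(round(n_samples * train_fraction))
    return indices[:n_train], indices[n_train:]


def weighted_error(y_true, y_pred, fp_cost=1.0, fn_cost=5.0):
    """Average misclassification cost with separate FP and FN penalties.

    Labels are 0 (negative) or 1 (positive). The default costs are
    assumptions, chosen only to show an asymmetric penalty.
    """
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 0:
            total += fp_cost  # false positive
        elif pred == 0 and truth == 1:
            total += fn_cost  # false negative
    return total / len(y_true)


# Partition the 311 samples mentioned in the abstract.
train_idx, test_idx = split_indices(311)

# Toy labels: one false negative and one false positive out of four.
example_cost = weighted_error([1, 0, 1, 0], [0, 1, 1, 0])
```

Setting `fp_cost` and `fn_cost` to the same value reduces `weighted_error` to a scaled plain error rate, which corresponds to the unweighted setting.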

Original language: English
Title of host publication: Proceedings of the 2009 9th IEEE International Conference on Bioinformatics and BioEngineering, BIBE 2009
Pages: 295-300
Number of pages: 6
DOI: https://doi.org/10.1109/BIBE.2009.37
Publication status: Published - 2009
Event: 2009 9th IEEE International Conference on Bioinformatics and BioEngineering, BIBE 2009 - Taichung, Taiwan, Province of China
Duration: Jun 22 2009 to Jun 24 2009

Fingerprint

Data Mining
Genes
Breast Neoplasms
Gene Expression
Prescriptions
Learning
Learning algorithms
Neoplasms
Therapeutics
Datasets

Keywords

  • Breast cancer prognosis
  • Data mining
  • Gene expression

ASJC Scopus subject areas

  • Information Systems
  • Biomedical Engineering
  • Health Informatics

Cite this

Giarratana, G., Pizzera, M., Masseroli, M., Medico, E., & Lanzi, P. L. (2009). Data mining techniques for the identification of genes with expression levels related to breast cancer prognosis. In Proceedings of the 2009 9th IEEE International Conference on Bioinformatics and BioEngineering, BIBE 2009 (pp. 295-300). [5211265] https://doi.org/10.1109/BIBE.2009.37

@inproceedings{eb5d063e03984445900198f3d4be1f3c,
title = "Data mining techniques for the identification of genes with expression levels related to breast cancer prognosis",
abstract = "Providing clinical predictions for cancer patients by analyzing their genetic make-up is a difficult and very important issue. With the goal of identifying genes more correlated with the prognosis of breast cancer, we used data mining techniques to study the gene expression values of breast cancer patients with known clinical outcome. Focus of our work was the creation of a classification model to be used in the clinical practice to support therapy prescription. We randomly subdivided a gene expression dataset of 311 samples into a training set to learn the model and a test set to validate the model and assess its performance. We evaluated several learning algorithms in their not weighted and weighted form, which we defined to take into account the different clinical importance of false positive and false negative classifications. Based on our results, these last, especially when used in their combined form, appear to provide better results.",
keywords = "Breast cancer prognosis, Data mining, Gene expression",
author = "Gabriele Giarratana and Marco Pizzera and Marco Masseroli and Enzo Medico and Lanzi, {Pier Luca}",
year = "2009",
doi = "10.1109/BIBE.2009.37",
language = "English",
isbn = "9780769536569",
pages = "295--300",
booktitle = "Proceedings of the 2009 9th IEEE International Conference on Bioinformatics and BioEngineering, BIBE 2009",

}

TY - GEN

T1 - Data mining techniques for the identification of genes with expression levels related to breast cancer prognosis

AU - Giarratana, Gabriele

AU - Pizzera, Marco

AU - Masseroli, Marco

AU - Medico, Enzo

AU - Lanzi, Pier Luca

PY - 2009

Y1 - 2009

AB - Providing clinical predictions for cancer patients by analyzing their genetic make-up is a difficult and very important issue. With the goal of identifying genes more correlated with the prognosis of breast cancer, we used data mining techniques to study the gene expression values of breast cancer patients with known clinical outcome. Focus of our work was the creation of a classification model to be used in the clinical practice to support therapy prescription. We randomly subdivided a gene expression dataset of 311 samples into a training set to learn the model and a test set to validate the model and assess its performance. We evaluated several learning algorithms in their not weighted and weighted form, which we defined to take into account the different clinical importance of false positive and false negative classifications. Based on our results, these last, especially when used in their combined form, appear to provide better results.

KW - Breast cancer prognosis

KW - Data mining

KW - Gene expression

UR - http://www.scopus.com/inward/record.url?scp=70449345858&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70449345858&partnerID=8YFLogxK

U2 - 10.1109/BIBE.2009.37

DO - 10.1109/BIBE.2009.37

M3 - Conference contribution

AN - SCOPUS:70449345858

SN - 9780769536569

SP - 295

EP - 300

BT - Proceedings of the 2009 9th IEEE International Conference on Bioinformatics and BioEngineering, BIBE 2009

ER -