Clustering breast cancer data by consensus of different validity indices

D. Soria, J. M. Garibaldi, F. Ambrogi, P. J G Lisboa, P. Boracchi, E. Biganzoli

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Clustering algorithms generally either partition a given data set into a pre-specified number of clusters or produce a hierarchy of clusters. In this paper we analyse several clustering techniques and apply them to a breast cancer data set. When the best number of groups is not known a priori, we use a range of validity indices to assess the quality of the clustering results and to determine the best number of clusters. For the K-means method the indices do not agree completely on the best number of clusters, whereas for the partitioning around medoids (PAM) algorithm all the indices indicate 4 as the best number of clusters.
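
The idea of choosing the number of clusters by agreement among validity indices can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the indices used, so the silhouette, Calinski-Harabasz and Davies-Bouldin indices stand in for them; the consensus rule is a simple majority vote; the data are a toy set of three well-separated 2-D groups rather than the breast cancer data; and only a basic K-means (with deterministic farthest-first seeding) is shown, not PAM. The function names are hypothetical.

```python
from collections import Counter
from math import dist  # Python 3.8+

def centroid(pts):
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def kmeans(points, k, iters=100):
    """Basic K-means with farthest-first seeding (an illustrative choice)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new == labels:
            break  # assignments stopped changing
        labels = new
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centre if a cluster empties
                centers[j] = centroid(members)
    return labels

def groups(points, labels):
    out = {}
    for p, l in zip(points, labels):
        out.setdefault(l, []).append(p)
    return list(out.values())

def silhouette(points, labels):
    """Mean silhouette width (higher is better); singletons score 0."""
    cl = groups(points, labels)
    total = 0.0
    for own in cl:
        if len(own) == 1:
            continue
        for p in own:
            a = sum(dist(p, q) for q in own) / (len(own) - 1)
            b = min(sum(dist(p, q) for q in other) / len(other)
                    for other in cl if other is not own)
            total += (b - a) / max(a, b)
    return total / len(points)

def calinski_harabasz(points, labels):
    """Between/within dispersion ratio (higher is better)."""
    cl = groups(points, labels)
    g = centroid(points)
    between = sum(len(c) * dist(centroid(c), g) ** 2 for c in cl)
    within = sum(dist(p, m) ** 2 for c in cl for m in [centroid(c)] for p in c)
    k, n = len(cl), len(points)
    return (between / (k - 1)) / (within / (n - k))

def davies_bouldin(points, labels):
    """Average worst-case cluster similarity (lower is better)."""
    cl = groups(points, labels)
    cents = [centroid(c) for c in cl]
    scat = [sum(dist(p, m) for p in c) / len(c) for c, m in zip(cl, cents)]
    k = len(cl)
    return sum(max((scat[i] + scat[j]) / dist(cents[i], cents[j])
                   for j in range(k) if j != i) for i in range(k)) / k

def consensus_k(points, ks):
    """Let each index vote for its preferred k; return the majority choice."""
    parts = {k: kmeans(points, k) for k in ks}
    votes = {
        "silhouette": max(ks, key=lambda k: silhouette(points, parts[k])),
        "calinski_harabasz": max(ks, key=lambda k: calinski_harabasz(points, parts[k])),
        "davies_bouldin": min(ks, key=lambda k: davies_bouldin(points, parts[k])),
    }
    best, _ = Counter(votes.values()).most_common(1)[0]
    return best, votes

# Toy data: three tight groups of five points each (not the paper's data).
offsets = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (5, 10)] for dx, dy in offsets]
best_k, votes = consensus_k(points, range(2, 6))
print(best_k, votes)
```

On real data the indices need not agree, as the abstract reports for K-means; inspecting `votes` rather than only `best_k` shows how strong the consensus is.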

Original language: English
Title of host publication: IET Conference Publications
Edition: 540 CP
Publication status: Published - 2008
Event: 4th IET International Conference on Advances in Medical, Signal and Information Processing, MEDSIP 2008 - Santa Margherita Ligure, Italy
Duration: Jul 14 2008 → Jul 16 2008

Keywords

  • Breast cancer
  • Clustering algorithms
  • Validity indices

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
