TY - JOUR
T1 - A new ensemble method for detecting anomalies in gene expression matrices
AU - Selicato, Laura
AU - Esposito, Flavia
AU - Gargano, Grazia
AU - Vegliante, Maria Carmela
AU - Opinto, Giuseppina
AU - Zaccaria, Gian Maria
AU - Ciavarella, Sabino
AU - Guarini, Attilio
AU - Del Buono, Nicoletta
N1 - Funding Information:
Acknowledgments: This work was supported in part by the GNCS-INDAM (Gruppo Nazionale per il Calcolo Scientifico of the Istituto Nazionale di Alta Matematica Francesco Severi), P.le Aldo Moro, Roma, Italy.
Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2021/4/2
Y1 - 2021/4/2
AB - One of the main problems in the analysis of real data is the presence of anomalies: anomalous cases can spoil the resulting analysis, yet they can also contain valuable information. In both cases, the ability to detect them is very important. In the biomedical field, the correct identification of outliers could allow the development of new biological hypotheses that would not otherwise be considered when looking at experimental biological data. In this work, we address the problem of detecting outliers in gene expression data, focusing on microarray analysis. We propose an ensemble approach for detecting anomalies in gene expression matrices based on Hierarchical Clustering and Robust Principal Component Analysis, which allows us to derive a novel pseudo-mathematical classification of anomalies.
KW - Anomaly
KW - Clustering
KW - Gene expression
KW - Low rank decomposition
KW - Outliers
UR - http://www.scopus.com/inward/record.url?scp=85104672959&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85104672959&partnerID=8YFLogxK
U2 - 10.3390/math9080882
DO - 10.3390/math9080882
M3 - Article
AN - SCOPUS:85104672959
VL - 9
JO - Mathematics
JF - Mathematics
SN - 2227-7390
IS - 8
M1 - 882
ER -