TY - JOUR
T1 - Application of data mining in a cohort of Italian subjects undergoing myocardial perfusion imaging at an academic medical center
AU - Ricciardi, Carlo
AU - Cantoni, Valeria
AU - Improta, Giovanni
AU - Iuppariello, Luigi
AU - Latessa, Imma
AU - Cesarelli, Mario
AU - Triassi, Maria
AU - Cuocolo, Alberto
N1 - Publisher Copyright: © 2020. Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020/6
Y1 - 2020/6
AB - Introduction: Coronary artery disease (CAD) is still one of the primary causes of death in developed countries. Stress single-photon emission computed tomography is used to evaluate myocardial perfusion and ventricular function in patients with suspected or known CAD. This study sought to test data mining and machine learning tools and to compare several supervised learning algorithms in a large cohort of Italian subjects with suspected or known CAD who underwent stress myocardial perfusion imaging. Methods: The dataset consisted of 10,265 patients with suspected or known CAD. The analysis was conducted using the KNIME Analytics Platform to implement Random Forests, C4.5, Gradient boosted tree, Naïve Bayes, and K-nearest neighbor (KNN) after a feature filtering procedure. K-fold cross-validation was employed. Results: Accuracy, error, precision, recall, and specificity were computed for each of the above-mentioned algorithms. Random Forests and Gradient boosted tree obtained the highest accuracy (>95%), while the accuracy of the other algorithms ranged between 83% and 88%. The highest sensitivity was obtained by C4.5 (99.3%) and the highest specificity by Gradient boosted tree (96.9%). Naïve Bayes had the lowest precision (70.9%) and specificity (72.0%), and KNN had the lowest recall, that is, sensitivity (79.2%). Conclusions: The high scores obtained by the algorithms suggest that health facilities should consider including advanced data analysis services to help clinicians in decision-making. Similar applications of this kind of study in other contexts could support this idea.
KW - Analytics platform
KW - Cardiology
KW - Data mining
KW - Decision-making
KW - Myocardial perfusion imaging
UR - http://www.scopus.com/inward/record.url?scp=85078089775&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078089775&partnerID=8YFLogxK
U2 - 10.1016/j.cmpb.2020.105343
DO - 10.1016/j.cmpb.2020.105343
M3 - Article
C2 - 31981760
AN - SCOPUS:85078089775
VL - 189
JO - Computer Methods and Programs in Biomedicine
JF - Computer Methods and Programs in Biomedicine
SN - 0169-2607
M1 - 105343
ER -