Background: Machine learning (ML) can extract patterns from data and construct data-driven models. We use ML models to assess the relative importance of variables for predicting obstructive coronary artery disease (CAD) in the Coronary Computed Tomographic Angiography for Selective Cardiac Catheterization (CONSERVE) study, and to compare their prediction of obstructive CAD with that of the CAD consortium clinical score (CAD2). We further use ML to examine the role of imaging and clinical variables in revascularization.
Methods: For prediction of obstructive CAD, the entire invasive coronary angiography (ICA) arm of the study, comprising 719 patients, was used. For revascularization, 1,028 patients were randomized to ICA or coronary computed tomographic angiography (CCTA). Data were randomly split into training (80%) and test (20%) sets for model building and validation. Models were built with extreme gradient boosting (XGBoost).
Results: Mean age was 60.6 ± 11.5 years and 64.3% were female. For the prediction of obstructive CAD, the area under the receiver operating characteristic curve (AUC) was significantly higher for ML, at 0.779 (95% CI: 0.672–0.886), than for CAD2, at 0.696 (95% CI: 0.594–0.798) (P = 0.01). Body mass index (BMI), age, and angina severity were the most important variables. For revascularization, the model obtained an overall AUC of 0.958 (95% CI: 0.933–0.983). Performance did not differ whether the imaging parameters came from ICA (AUC 0.947, 95% CI: 0.903–0.990) or CCTA (AUC 0.941, 95% CI: 0.895–0.988) (P = 0.90). The ML model obtained a sensitivity of 89.2% and a specificity of 92.9%. Number of vessels with ≥70% stenosis, maximum segment stenosis severity (SSS), and BMI were the most important variables. Excluding imaging variables reduced performance to an AUC of 0.705 (95% CI: 0.614–0.795) (P < 0.0001).
Conclusions: For obstructive CAD, the ML model outperformed CAD2. BMI was an important variable, although it is currently not included in most scores. In this ML model, imaging variables were most associated with revascularization. Imaging modality did not influence model performance, whereas removal of imaging variables reduced it.
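The performance measures reported here (AUC with a 95% CI, sensitivity, and specificity) can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the toy labels and scores, the 0.5 decision threshold, and the percentile bootstrap used for the interval are all assumptions made for illustration, since the abstract does not state how the threshold or the confidence intervals were obtained.

```python
import random

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample needs both classes for the AUC to be defined
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data (hypothetical): 1 = event, 0 = no event; scores are model probabilities.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

auc = roc_auc(labels, scores)                           # 15 of 16 pairs ranked correctly: 0.9375
sens, spec = sensitivity_specificity(labels, scores, 0.5)  # 0.75 and 0.75
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
```

In practice, a library routine such as scikit-learn's `roc_auc_score` would normally replace the hand-written pairwise count; the explicit version is shown only to make the definition of the AUC visible.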