TY - JOUR
T1 - External validation of radiomics-based predictive models in low-dose CT screening for early lung cancer diagnosis
AU - Garau, Noemi
AU - Paganelli, Chiara
AU - Summers, Paul
AU - Choi, Wookjin
AU - Alam, Sadegh
AU - Lu, Wei
AU - Fanciullo, Cristiana
AU - Bellomi, Massimo
AU - Baroni, Guido
AU - Rampinelli, Cristiano
N1 - Funding Information:
The work was supported by AIRC (Associazione Italiana per la Ricerca contro il Cancro, grant number IG2018 – 21701) and by the Italian Ministry of Health through Ricerca Corrente and 5x1000 funds. Prof. Lu W., Dr. Choi W. and Dr. Alam S. would like to thank the NIH/NCI grant R01 CA172638 and the NIH/NCI Cancer Center Support Grant P30 CA008748.
Publisher Copyright:
© 2020 American Association of Physicists in Medicine
PY - 2020/9/1
Y1 - 2020/9/1
AB - Purpose: Low-dose CT screening allows early lung cancer detection, but is affected by frequent false-positive results, inter- and intra-observer variation, and uncertain diagnoses of lung nodules. Radiomics-based models have recently been introduced to overcome these issues, but limited evidence of their generalizability on independent datasets is slowing their introduction into clinical practice. The aim of this study is to evaluate two radiomics-based models to classify malignant pulmonary nodules in low-dose CT screening, and to externally validate them on an independent cohort. The effect of a radiomics feature harmonization technique on the classification of lung nodules from multicenter data is also investigated. Methods: Pulmonary nodules from two independent cohorts were considered in this study; the first cohort (110 subjects, 113 nodules) was used to train prediction models, and the second cohort (72 nodules) to externally validate them. Literature-based radiomics features were extracted and, after feature selection, used as predictive variables in models for malignancy identification. An in-house prediction model based on an artificial neural network (ANN) was implemented and evaluated, along with an alternative model from the literature, based on a support vector machine (SVM) classifier coupled with a least absolute shrinkage and selection operator (LASSO). External validation was performed on the second cohort to evaluate the models’ generalization ability. Additionally, the ComBat harmonization method was investigated as a way to compensate for variability across multicenter datasets. The models were retrained on harmonized features from the first cohort, then tested separately on the harmonized and non-harmonized features of the second cohort. Results: Preliminary results showed good accuracy of the investigated models in distinguishing benign from malignant pulmonary nodules with both sets of radiomics features (i.e., non-harmonized and harmonized). The performance of the models, quantified in terms of area under the curve (AUC), was > 0.89 in the training set and > 0.82 in the external validation set for all the investigated scenarios, outperforming the clinical standard (AUC of 0.76). Slightly higher performance was observed for the SVM-LASSO model than for the ANN in the external dataset, although the difference was not statistically significant. For both harmonized and non-harmonized features, no statistically significant difference was found between the receiver operating characteristic (ROC) curves of the training and test sets for either model. Conclusions: Although no significant improvements were observed when applying the ComBat harmonization method, both in-house and literature-based models were able to classify lung nodules with good generalization to an independent dataset, thus showing their potential as tools for clinical decision-making in lung cancer screening.
KW - low-dose CT screening
KW - lung nodules classification
KW - radiomics
UR - http://www.scopus.com/inward/record.url?scp=85087163532&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85087163532&partnerID=8YFLogxK
U2 - 10.1002/mp.14308
DO - 10.1002/mp.14308
M3 - Article
C2 - 32488865
AN - SCOPUS:85087163532
VL - 47
SP - 4125
EP - 4136
JO - Medical Physics
JF - Medical Physics
SN - 0094-2405
IS - 9
ER -