New public QSAR model for carcinogenicity

Natalja Fjodorova, Marjan Vračko, Marjana Novič, Alessandra Roncaglioni, Emilio Benfenati

Research output: Contribution to journal > Article

Abstract

Background: One of the main goals of the new chemical regulation REACH (Registration, Evaluation and Authorization of Chemicals) is to fill gaps in data on the properties of chemicals that affect human health. (Q)SAR models are accepted as a suitable source of information. The EU-funded CAESAR project aimed to develop models for the prediction of five endpoints for regulatory purposes. Carcinogenicity is one of the endpoints under consideration.

Results: Models for the prediction of carcinogenic potency were developed according to the specific requirements of chemical regulation. A dataset of 805 non-congeneric chemicals extracted from the Carcinogenic Potency Database (CPDBAS) was used. The Counter Propagation Artificial Neural Network (CP ANN) algorithm was implemented. The article describes two alternative models for the prediction of carcinogenicity. The first model (model A) employed eight MDL descriptors and the second (model B) twelve Dragon descriptors. The CAESAR models were assessed according to the OECD principles for the validation of QSAR models, and a wide series of statistical checks was used to evaluate model validity. Models A and B yielded training set (644 compounds) accuracies of 91% and 89%, respectively; on the test set (161 compounds), accuracy was 73% and 69% and specificity was 69% and 61%, respectively. Sensitivity was 75% in both cases. The accuracy of the leave-20%-out cross-validation on the training set was 66% and 62% for models A and B, respectively. To verify whether the models perform correctly on new compounds, an external validation was carried out on an external test set of 738 compounds. For models A and B, respectively, the external validation gave an accuracy of 61.4% and 60.0%, a sensitivity of 64.0% and 61.8%, and a specificity of 58.9% and 58.4%.

Conclusion: Carcinogenicity is a particularly important endpoint, and QSAR models are not expected to replace the opinions of human experts or conventional methods. However, we believe that a combination of several methods will provide useful support for the overall evaluation of carcinogenicity. In the present paper, models for the classification of carcinogenic compounds using MDL and Dragon descriptors were developed. The models could be used to set priorities among chemicals for further testing. The models at the CAESAR site were implemented in Java and are publicly accessible.
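The abstract reports accuracy, sensitivity and specificity, and a leave-20%-out cross-validation on the 644-compound training set. The sketch below shows how these quantities are conventionally computed from a binary confusion matrix, and how a training set can be partitioned into five folds of roughly 20% each. It is only an illustration under stated assumptions: the names `ConfusionCounts` and `leave_fraction_out_folds` are hypothetical, the example counts are invented for the arithmetic and are not taken from the paper, and the contiguous, unshuffled fold assignment is a simplification that may differ from the procedure the authors used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion matrix; 'positive' means carcinogenic."""
    tp: int  # carcinogens predicted as carcinogens
    fn: int  # carcinogens predicted as non-carcinogens
    tn: int  # non-carcinogens predicted as non-carcinogens
    fp: int  # non-carcinogens predicted as carcinogens


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all compounds classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of carcinogens correctly identified."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-carcinogens correctly identified."""
    return c.tn / (c.tn + c.fp)


def leave_fraction_out_folds(n_items: int, n_folds: int = 5) -> list[list[int]]:
    """Split indices 0..n_items-1 into n_folds nearly equal contiguous folds.

    With n_folds=5, each fold holds about 20% of the items, so leaving one
    fold out at a time corresponds to a leave-20%-out scheme.
    """
    base, extra = divmod(n_items, n_folds)
    folds = []
    start = 0
    for i in range(n_folds):
        size = base + (1 if i < extra else 0)  # spread the remainder
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Hypothetical counts chosen only to make the arithmetic easy to follow.
example = ConfusionCounts(tp=30, fn=10, tn=45, fp=15)
print(accuracy(example), sensitivity(example), specificity(example))

# Fold sizes for a training set of 644 compounds, as in the abstract.
print([len(fold) for fold in leave_fraction_out_folds(644)])
```

In practice the compounds would usually be shuffled, and often stratified by class, before being assigned to folds; the contiguous split here only makes the fold sizes easy to inspect.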

Original language: English
Article number: S3
Journal: Chemistry Central Journal
Volume: 4
Issue number: SUPPL. 1
DOI: 10.1186/1752-153X-4-S1-S3
Publication status: Published - Jul 29 2010

Fingerprint

Health
Neural networks
Testing

ASJC Scopus subject areas

  • Chemistry(all)

Cite this

New public QSAR model for carcinogenicity. / Fjodorova, Natalja; Vračko, Marjan; Novič, Marjana; Roncaglioni, Alessandra; Benfenati, Emilio.

In: Chemistry Central Journal, Vol. 4, No. SUPPL. 1, S3, 29.07.2010.

@article{2e7e184553b84d3795f34959c2244a02,
title = "New public QSAR model for carcinogenicity",
author = "Natalja Fjodorova and Marjan Vračko and Marjana Novič and Alessandra Roncaglioni and Emilio Benfenati",
year = "2010",
month = "7",
day = "29",
doi = "10.1186/1752-153X-4-S1-S3",
language = "English",
volume = "4",
journal = "Chemistry Central Journal",
issn = "1752-153X",
publisher = "Chemistry Central",
number = "SUPPL. 1",

}

TY - JOUR

T1 - New public QSAR model for carcinogenicity

AU - Fjodorova, Natalja

AU - Vračko, Marjan

AU - Novič, Marjana

AU - Roncaglioni, Alessandra

AU - Benfenati, Emilio

PY - 2010/7/29

Y1 - 2010/7/29

UR - http://www.scopus.com/inward/record.url?scp=77955393756&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77955393756&partnerID=8YFLogxK

U2 - 10.1186/1752-153X-4-S1-S3

DO - 10.1186/1752-153X-4-S1-S3

M3 - Article

C2 - 20678182

AN - SCOPUS:77955393756

VL - 4

JO - Chemistry Central Journal

JF - Chemistry Central Journal

SN - 1752-153X

IS - SUPPL. 1

M1 - S3

ER -