New clues on carcinogenicity-related substructures derived from mining two large datasets of chemical compounds

Azadi Golbamaki, Emilio Benfenati, Nazanin Golbamaki, Alberto Manganaro, Erinc Merdivan, Alessandra Roncaglioni, Giuseppina Gini

Research output: Contribution to journal › Article › peer-review


ABSTRACT: In this study, new molecular fragments associated with genotoxic and nongenotoxic carcinogens are introduced to estimate the carcinogenic potential of compounds. Two rule-based carcinogenesis models were developed with the aid of SARpy: model R (from rodents' experimental data) and model E (from human carcinogenicity data). SARpy's structural alert extraction method works in a completely automated and unbiased manner and is based on statistical significance. The carcinogenicity models developed in this study are collections of potentially carcinogenic fragments extracted from two carcinogenicity databases: the ANTARES carcinogenicity dataset, with information from bioassays on rats, and the combination of the ISSCAN and CGX datasets, which takes into account human-based assessment. The performance of these two models was evaluated by cross-validation and by external validation on a 258-compound case study dataset. Combining the predictions of models R and E, and scoring a positive or negative result when both models are concordant on a prediction, increased accuracy to 72% and specificity to 79% on the external test set. The carcinogenic fragments present in the two models were compared and analyzed from the point of view of chemical class. The results of this study show that the developed rule sets will be a useful tool to identify some new structural alerts of carcinogenicity and provide effective information on the molecular structures of carcinogenic chemicals.
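The concordance rule mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy labels, and the choice to score only compounds on which both models agree are assumptions made here for illustration, and the numbers are invented rather than taken from the paper.

```python
from typing import List, Optional, Sequence, Tuple


def consensus(pred_r: bool, pred_e: bool) -> Optional[bool]:
    """Return the shared call when both models agree, else None (no call)."""
    return pred_r if pred_r == pred_e else None


def accuracy_and_specificity(
    truth: Sequence[bool], preds: Sequence[Optional[bool]]
) -> Tuple[float, float]:
    """Score only compounds that received a consensus call (assumption)."""
    scored = [(t, p) for t, p in zip(truth, preds) if p is not None]
    tp = sum(t and p for t, p in scored)              # carcinogen, called positive
    tn = sum((not t) and (not p) for t, p in scored)  # non-carcinogen, called negative
    fp = sum((not t) and p for t, p in scored)        # non-carcinogen, called positive
    accuracy = (tp + tn) / len(scored) if scored else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, specificity


# Hypothetical toy labels (True = carcinogenic), not data from the study.
truth = [True, True, False, False, False]
model_r = [True, False, False, True, False]
model_e = [True, True, False, True, False]

calls: List[Optional[bool]] = [consensus(r, e) for r, e in zip(model_r, model_e)]
acc, spec = accuracy_and_specificity(truth, calls)
```

In this toy example the second compound receives no call because the two models disagree, so it is left out of the score. That trade-off is the usual cost of a concordance rule: fewer compounds are predicted in exchange for more reliable calls.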

Original language: English
Pages (from-to): 97-113
Number of pages: 17
Journal: Journal of Environmental Science and Health - Part C Environmental Carcinogenesis and Ecotoxicology Reviews
Issue number: 2
Publication status: Published - Apr 2 2016


Keywords

  • Carcinogenicity, QSAR, structural alerts, SARpy, in silico, molecular structures

ASJC Scopus subject areas

  • Health, Toxicology and Mutagenesis
  • Cancer Research

