Comparison of SMILES and molecular graphs as the representation of the molecular structure for QSAR analysis for mutagenic potential of polyaromatic amines

A. A. Toropov, A. P. Toropova, S. E. Martyanov, E. Benfenati, G. Gini, D. Leszczynska, J. Leszczynski

Research output: Contribution to journal › Article

Abstract

Optimal descriptors calculated with the simplified molecular input line entry system (SMILES), the hydrogen-suppressed molecular graph (HSG), the hydrogen-filled molecular graph (HFG), and the graph of atomic orbitals (GAO) have been studied as a basis for building models of the mutagenicity of polyaromatic amines. The optimal descriptors are calculated with correlation weights of molecular fragments. In the case of the molecular graph, the chemical elements (C, N, O, etc.) or their electronic structure (1s², 2p³, 3d¹⁰, etc.), together with their Morgan vertex degrees, are the basis for calculating the descriptor. In the case of SMILES, the chemical elements (C, O, N, etc.), together with the presence of cycles (1, 2, 3, etc.), cis/trans isomerism ('\' and '/'), and other attributes, are the basis for calculating the descriptor. In both cases, the descriptor is a mathematical function of the correlation weights of the above-mentioned molecular features. The correlation weights are calculated by Monte Carlo optimization, in which the target function is the correlation coefficient between experimental and predicted endpoint values. SMILES-based optimal descriptors showed better predictive ability than the graph-based ones. The CORAL software (http://www.insilico.eu/coral/) was used to build models of the mutagenic potential as a function of the molecular structure. Analysis of three probes of the Monte Carlo optimization with six random splits showed that there are three kinds of molecular features encoded by SMILES attributes: promoters of an increase in mutagenic potential, promoters of a decrease, and features without a defined role.
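The idea described in the abstract can be illustrated with a minimal sketch. It is not the CORAL implementation, and it rests on several assumptions that are not stated in the abstract: the descriptor is taken to be a plain sum of correlation weights over SMILES attributes, the tokenizer is deliberately simplistic, the Monte Carlo step is a greedy random perturbation of one weight at a time, and the endpoint values in the example are placeholders rather than measured mutagenicity data. All function names (`tokenize`, `descriptor`, `optimize_weights`) are hypothetical.

```python
import math
import random
import re

# Simplified SMILES tokenizer (assumption): two-letter halogens and bracket
# atoms first, then single characters (atoms, ring digits, bonds, '/' and '\').
TOKEN_RE = re.compile(r"Cl|Br|\[[^\]]*\]|.")


def tokenize(smiles):
    """Split a SMILES string into attributes such as 'c', 'N', '1', '/'."""
    return TOKEN_RE.findall(smiles)


def descriptor(smiles, weights):
    """Additive descriptor (assumed form): sum of the correlation weights
    of all SMILES attributes; unknown attributes contribute 0."""
    return sum(weights.get(tok, 0.0) for tok in tokenize(smiles))


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either side is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def optimize_weights(smiles_list, endpoints, n_steps=2000, step=0.2, seed=0):
    """Greedy Monte Carlo search: perturb one randomly chosen weight and keep
    the change only if the training-set correlation coefficient improves."""
    rng = random.Random(seed)
    attrs = sorted({t for s in smiles_list for t in tokenize(s)})
    weights = {a: 1.0 for a in attrs}

    def score(w):
        return pearson([descriptor(s, w) for s in smiles_list], endpoints)

    best = score(weights)
    for _ in range(n_steps):
        a = rng.choice(attrs)
        old = weights[a]
        weights[a] = old + rng.uniform(-step, step)
        new = score(weights)
        if new > best:
            best = new          # accept the improving move
        else:
            weights[a] = old    # reject and restore the previous weight
    return weights, best


# Toy usage. The endpoint values are placeholders for illustration only,
# NOT experimental mutagenicity data.
train_smiles = [
    "Nc1ccccc1",
    "Nc1ccc2ccccc2c1",
    "Nc1ccc(cc1)c1ccccc1",
    "Nc1ccc2cc3ccccc3cc2c1",
    "Cc1ccccc1N",
]
train_endpoints = [-1.0, 0.5, 0.2, 1.5, -0.8]

weights, r_train = optimize_weights(train_smiles, train_endpoints)
print("training-set correlation coefficient:", round(r_train, 3))
```

In a sketch like this, an attribute whose optimized weight is consistently positive across repeated runs and splits would correspond to a promoter of increase, a consistently negative one to a promoter of decrease, and one whose sign varies to a feature without a defined role. The threshold at zero is an assumption of this illustration, and a realistic study would also evaluate the correlation on an external test set rather than on the training set alone.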

Original language: English
Pages (from-to): 94-100
Number of pages: 7
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 109
Issue number: 1
Publication status: Published - Nov 15 2011

Keywords

  • Monte Carlo method
  • Mutagenicity
  • Optimal descriptor
  • QSAR

ASJC Scopus subject areas

  • Analytical Chemistry
  • Computer Science Applications
  • Software
  • Process Chemistry and Technology
  • Spectroscopy
