Simplified molecular input line entry system (SMILES) as an alternative for constructing quantitative structure-property relationships (QSPR)

Andrey A. Toropov, Alla P. Toropova, Dilya V. Mukhamedzhanova, Ivan Gutman

Research output: Contribution to journal › Article

Abstract

Flexible descriptors calculated from correlation weights of fragments in the SMILES notation of molecular systems have been used to model the normal boiling points of acyclic carbonyl compounds. Four variants of the Optimization of Correlation Weights of SMILES Fragments (OCWSF) have been examined. They differ in the number of symbols per SMILES fragment: fragments of one, two, three, and four symbols have been examined. Correlation weights for three calculable features of SMILES are also used in the OCWSF scheme: the number of oxygen atoms (NO), the number of double bonds (NDB), and (NO - NDB + 10). These three features are included in order to take hydrogen-bond interactions into account. The best OCWSF model is based on three-symbol fragments together with these three features of the SMILES notation. Its statistical characteristics are: n = 100, r² = 0.9795, s = 5.35 °C, F = 4673 (training set); n = 100, r² = 0.9764, s = 5.38 °C, F = 4055 (test set).
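
As a rough illustration of the ingredients named in the abstract, the following Python sketch splits a SMILES string into k-symbol fragments and computes the three features NO, NDB, and (NO - NDB + 10). Several points are assumptions and are not taken from the paper: fragments are read as overlapping character windows, every character counts as one symbol, oxygen atoms and double bonds are counted as the characters "O" and "=", and the descriptor is the plain sum of the correlation weights. The function names and the weight values are hypothetical.

```python
def smiles_fragments(smiles: str, k: int) -> list[str]:
    """Return overlapping k-character windows of a SMILES string.

    Assumption: each character is one 'symbol'; the paper may define
    symbols and fragments differently.
    """
    return [smiles[i:i + k] for i in range(len(smiles) - k + 1)]


def smiles_features(smiles: str) -> dict[str, int]:
    """Compute the three SMILES features named in the abstract.

    Assumption: oxygen atoms and double bonds are counted as the
    characters 'O' and '=' in the SMILES string.
    """
    n_o = smiles.count("O")
    n_db = smiles.count("=")
    return {"NO": n_o, "NDB": n_db, "NO-NDB+10": n_o - n_db + 10}


def descriptor(smiles: str, k: int, weights: dict[str, float],
               default: float = 0.0) -> float:
    """Sum correlation weights over fragments and feature values.

    Assumption: an additive descriptor; 'weights' maps fragment strings
    and 'feature=value' keys to hypothetical correlation weights.
    """
    keys = smiles_fragments(smiles, k)
    keys += [f"{name}={value}" for name, value in smiles_features(smiles).items()]
    return sum(weights.get(key, default) for key in keys)


# Acetone, written as CC(=O)C, with made-up weights for illustration only.
example_weights = {"(=O": 2.0, "NO=1": 0.5}
print(smiles_fragments("CC(=O)C", 3))
print(smiles_features("CC(=O)C"))
print(descriptor("CC(=O)C", 3, example_weights))
```

In an actual OCWSF model the weights would be obtained by optimization against the training set, and the boiling point would then be regressed on the resulting descriptor; neither step is shown here.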

Original language: English
Pages (from-to): 1545-1552
Number of pages: 8
Journal: Indian Journal of Chemistry - Section A Inorganic, Physical, Theoretical and Analytical Chemistry
Volume: 44
Issue number: 8
Publication status: Published - Aug 2005

ASJC Scopus subject areas

  • Chemistry(all)
