Quantitative structure-activity relationships (QSAR) for toxicity towards Pimephales promelas are established by optimizing the so-called correlation weights of attributes of the simplified molecular input-line entry system (SMILES). A new SMILES attribute has been utilized in this work. This attribute is a molecular descriptor that reflects (i) the presence of different kinds of bonds (double, triple, and stereochemical bonds); (ii) the presence of nitrogen, oxygen, sulphur, and phosphorus atoms; and (iii) the presence of fluorine, chlorine, bromine, and iodine atoms. The statistical characteristics of the best model are the following: n = 226, r² = 0.7630, RMSE = 0.654 (training set); n = 114, r² = 0.7024, RMSE = 0.766 (calibration set); n = 226, r² = 0.6292, RMSE = 0.870 (validation set). A new criterion for selecting a preferable split into the training and validation sets is suggested and discussed.
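As a rough illustration of the kind of information such an attribute carries, the following Python sketch derives presence flags for bond types, heteroatoms, and halogens directly from a SMILES string. It is not the encoding used by the CORAL software: the function names `atom_symbols` and `presence_flags` and the dictionary layout are assumptions made for this example, and the parsing is a simplified regular-expression pass rather than a full SMILES parser.

```python
import re

# Hypothetical helper patterns (not part of CORAL): bracket atoms first,
# then two-letter halogens, then one-letter organic-subset atoms.
_ATOM_RE = re.compile(r"\[[^\]]*\]|Br|Cl|[BCNOSPFI]|[bcnops]")
# Inside a bracket atom: optional isotope digits, then the element symbol.
_BRACKET_SYMBOL_RE = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})")


def atom_symbols(smiles):
    """Return the set of element symbols found in a SMILES string."""
    symbols = set()
    for token in _ATOM_RE.findall(smiles):
        if token.startswith("["):
            match = _BRACKET_SYMBOL_RE.match(token)
            if match:
                # Aromatic symbols such as 'n' or 'se' are normalised to 'N', 'Se'.
                symbols.add(match.group(1).capitalize())
        else:
            symbols.add(token.capitalize())
    return symbols


def presence_flags(smiles):
    """Illustrative presence flags for bonds, heteroatoms, and halogens."""
    symbols = atom_symbols(smiles)
    flags = {
        "double_bond": "=" in smiles,
        "triple_bond": "#" in smiles,
        # '/', '\\' mark cis/trans bonds; '@' marks tetrahedral chirality.
        "stereo": any(ch in smiles for ch in "/\\@"),
    }
    for element in ("N", "O", "S", "P", "F", "Cl", "Br", "I"):
        flags[element] = element in symbols
    return flags


if __name__ == "__main__":
    # 1,2-difluoroethene written with cis/trans bond markers.
    print(presence_flags("F/C=C/F"))
```

Checking bracket atoms separately keeps two-letter elements such as sodium (`[Na+]`) from being miscounted as nitrogen, which a plain substring search would do.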
- CORAL software
- Monte Carlo method
- Pimephales promelas
ASJC Scopus subject areas
- Health, Toxicology and Mutagenesis