The average numbers of outliers over groups of various splits into training and test sets: A criterion of the reliability of a QSPR? A case of water solubility

Alla P. Toropova, Andrey A. Toropov, Emilio Benfenati, Giuseppina Gini, Danuta Leszczynska, Jerzy Leszczynski

Research output: Contribution to journal (article)

Abstract

The validation of quantitative structure-property/activity relationships (QSPR/QSAR) is an important challenge in modern theoretical chemistry. Analyzing QSPR models obtained from different splits of the available data into training and test sets can be a useful way to estimate the reliability of QSPR predictions. The balance of correlation is an approach to building a QSPR model that uses three components of the available data: (a) a sub-training set (developer), (b) a calibration set (critic), and (c) a test set (estimator). Computational experiments have shown that a probabilistic interdependence exists between the way the available data are distributed into the sub-training, calibration, and test sets and the average number of outliers in the test set.
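The idea of averaging test-set outlier counts over several splits can be illustrated with a minimal Python sketch. It is not the authors' method. The one-descriptor least-squares model, the 50/25/25 split fractions, the outlier rule (absolute residual above `k` times the root-mean-square residual on the calibration set), and all function names are assumptions made for illustration only.

```python
import math
import random
import statistics


def split_three(n, fractions=(0.5, 0.25, 0.25), seed=0):
    """Randomly partition indices 0..n-1 into sub-training, calibration, test.

    The fractions are an assumption; the abstract does not state split sizes.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_sub = int(fractions[0] * n)
    n_cal = int(fractions[1] * n)
    return idx[:n_sub], idx[n_sub:n_sub + n_cal], idx[n_sub + n_cal:]


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (stand-in for a real QSPR model)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def count_test_outliers(x, y, split, k=2.0):
    """Count test compounds whose residual exceeds k * RMS calibration residual.

    This outlier rule is hypothetical, chosen only to make the sketch concrete.
    """
    sub, cal, test = split
    a, b = fit_line([x[i] for i in sub], [y[i] for i in sub])
    cal_res = [y[i] - (a + b * x[i]) for i in cal]
    scale = math.sqrt(statistics.fmean(r * r for r in cal_res))
    return sum(abs(y[i] - (a + b * x[i])) > k * scale for i in test)


def average_outliers(x, y, n_splits=10, k=2.0):
    """Average the test-set outlier count over n_splits random splits."""
    counts = [
        count_test_outliers(x, y, split_three(len(x), seed=s), k)
        for s in range(n_splits)
    ]
    return statistics.fmean(counts)
```

Here `x` would hold one descriptor value per compound and `y` the measured property (for example, water solubility). Under these assumptions, comparing `average_outliers` across groups of splits gives a rough view of how sensitive the predictions are to the choice of split.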

Original language: English
Pages (from-to): 134-137
Number of pages: 4
Journal: Chemical Physics Letters
Volume: 542
Publication status: Published - Jul 23 2012

ASJC Scopus subject areas

  • Physical and Theoretical Chemistry
  • Physics and Astronomy (all)
