Predicting persistence in the sediment compartment with a new automatic software based on the k-Nearest Neighbor (k-NN) algorithm

Alberto Manganaro, Fabiola Pizzo, Anna Lombardo, Alberto Pogliaghi, Emilio Benfenati

Research output: Contribution to journal › Article › peer-review

Abstract

The ability of a substance to resist degradation and persist in the environment needs to be readily identified in order to protect the environment and human health. Many regulations require the assessment of persistence for substances commonly manufactured and marketed. Besides laboratory-based testing methods, in silico tools may be used to obtain a computational prediction of persistence. We present a new program to develop k-Nearest Neighbor (k-NN) models. The k-NN algorithm is a similarity-based approach that predicts the property of a substance in relation to the experimental data for its most similar compounds. We employed this software to identify persistence in the sediment compartment. Data on half-life (HL) in sediment were obtained from different sources and, after careful data pruning, the final dataset, containing 297 organic compounds, was divided into four experimental classes. We developed several models with satisfactory performance: accuracy on both the training and test sets ranged between 0.90 and 0.96. We finally selected one model, which will be made available in the near future in the freely available software platform VEGA. This model offers a valuable in silico tool for fast and inexpensive screening.
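The similarity-based prediction described in the abstract can be illustrated with a minimal sketch. It is not the authors' software or the VEGA model. It assumes compounds are represented as sets of structural features, that similarity is measured with the Tanimoto index, and that the class is chosen by a similarity-weighted vote; the function names, the feature sets, and the two class labels are all hypothetical, and the toy data are invented for illustration only (the actual dataset uses four experimental classes).

```python
from collections import Counter


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def knn_predict(query, training, k=3):
    """Predict a class from the k training compounds most similar to the query.

    Each neighbour votes for its own experimental class, weighted by its
    similarity to the query (an assumed weighting scheme).
    """
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter()
    for features, label in ranked[:k]:
        votes[label] += tanimoto(query, features)
    return votes.most_common(1)[0][0]


# Invented toy training set: (feature set, hypothetical class label).
training = [
    ({"Cl", "aromatic", "ether"}, "persistent"),
    ({"Cl", "aromatic"}, "persistent"),
    ({"Cl", "aromatic", "Br"}, "persistent"),
    ({"ester", "aliphatic"}, "not_persistent"),
    ({"ester", "aliphatic", "OH"}, "not_persistent"),
]

prediction = knn_predict({"Cl", "aromatic", "OH"}, training, k=3)
print(prediction)
```

In this sketch the query shares most of its features with the chlorinated aromatic compounds, so its three nearest neighbours all carry the same label and the vote is unanimous. A real model would also need a rule for queries that have no sufficiently similar neighbour.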

Original language: English
Pages (from-to): 1624-1630
Number of pages: 7
Journal: Chemosphere
Volume: 144
Publication status: Published - Feb 1 2016

Keywords

  • Half-life
  • In silico
  • PBT
  • Persistence
  • Read across
  • Sediment

ASJC Scopus subject areas

  • Environmental Chemistry
  • Chemistry (all)
