Handling of dioxin measurement data in the presence of non-detectable values: Overview of available methods and their application in the Seveso chloracne study

Andrea Baccarelli, Ruth Pfeiffer, Dario Consonni, Angela C. Pesatori, Matteo Bonzini, Donald G. Patterson, Pier Alberto Bertazzi, Maria Teresa Landi

Research output: Contribution to journal › Article

Abstract

Exposure measurements of concentrations that are non-detectable or near the detection limit (DL) are common in environmental research. Proper statistical treatment of non-detects is critical to avoid bias and unnecessary loss of information. In the present work, we present an overview of possible statistical strategies for handling non-detectable values, including deletion, simple substitution, distributional methods, and distribution-based imputation. Simple substitution methods (e.g., substituting 0, DL/2, DL/√2, or DL for the non-detects) are the most commonly applied, even though the EPA Guidance for Data Quality Assessment discouraged their use when the percentage of non-detects is >15%. Distribution-based multiple imputation methods, also known as robust or "fill-in" procedures, may produce dependable results even when 50-70% of the observations are non-detects and can be performed using commonly available statistical software. Any statistical analysis can be conducted on the imputed datasets. Results properly reflect the presence of non-detectable values and produce valid statistical inference. We describe the use of distribution-based multiple imputation in a recent investigation conducted on subjects from the Seveso population exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), in which 55.6% of plasma TCDD measurements were non-detects. We suggest that distribution-based multiple imputation be the preferred method to analyze environmental data when substantial proportions of observations are non-detects.
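
The two families of methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes a single detection limit, a lognormal exposure distribution, and non-detects encoded as None, and all function names and the sample data are hypothetical. The lognormal parameters are estimated with an EM algorithm for left-censored data, and each non-detect is then filled in with a random draw from the fitted distribution truncated at the DL. For brevity the fitted parameters are held fixed across imputations, which understates uncertainty; a fuller version would also redraw the parameters (for example, by bootstrapping) before each imputation.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf, cdf, inv_cdf


def substitute(values, fill):
    """Simple substitution: replace every non-detect (None) with one fixed value."""
    return [fill if v is None else v for v in values]


def fit_censored_lognormal(values, dl, iters=200):
    """EM estimate of (mu, sigma) on the log scale, treating None as '< dl'."""
    logs = [math.log(v) for v in values if v is not None]
    n, n_cens = len(values), sum(v is None for v in values)
    c = math.log(dl)
    # Start from the detected values only (biased upward, refined by EM).
    mu = sum(logs) / len(logs)
    sigma = max(1e-6, (sum((x - mu) ** 2 for x in logs) / len(logs)) ** 0.5)
    for _ in range(iters):
        # E-step: first two moments of a normal truncated above at c.
        z = (c - mu) / sigma
        lam = STD.pdf(z) / max(STD.cdf(z), 1e-300)
        e1 = mu - sigma * lam
        e2 = sigma ** 2 * (1 - z * lam - lam ** 2) + e1 ** 2
        # M-step: update mean and standard deviation from completed moments.
        s1 = sum(logs) + n_cens * e1
        s2 = sum(x * x for x in logs) + n_cens * e2
        mu = s1 / n
        sigma = max(1e-6, (s2 / n - mu ** 2) ** 0.5)
    return mu, sigma


def impute_once(values, dl, mu, sigma, rng):
    """Fill each non-detect with a draw from the fitted lognormal below dl."""
    p_below = STD.cdf((math.log(dl) - mu) / sigma)
    out = []
    for v in values:
        if v is None:
            u = rng.uniform(1e-12, p_below)  # inverse-CDF sampling below the DL
            v = math.exp(mu + sigma * STD.inv_cdf(u))
        out.append(v)
    return out


def multiple_imputation(values, dl, m=5, seed=0):
    """Return m completed datasets; analyze each one, then pool the results."""
    rng = random.Random(seed)
    mu, sigma = fit_censored_lognormal(values, dl)
    return [impute_once(values, dl, mu, sigma, rng) for _ in range(m)]


if __name__ == "__main__":
    # Hypothetical measurements with a hypothetical detection limit of 3.0.
    data = [None, None, 3.2, None, 5.1, 12.0, None, 8.4, None, 20.5]
    print(substitute(data, 3.0 / math.sqrt(2)))
    for completed in multiple_imputation(data, dl=3.0, m=3, seed=1):
        print([round(v, 2) for v in completed])
```

Each completed dataset would then be analyzed with the method of interest and the estimates combined, for example with Rubin's rules, so that the pooled variance reflects the uncertainty added by the non-detects.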

Original language: English
Pages (from-to): 898-906
Number of pages: 9
Journal: Chemosphere
Volume: 60
Issue number: 7
DOI: 10.1016/j.chemosphere.2005.01.055
Publication status: Published - Aug 2005

Keywords

  • 2,3,7,8-Tetrachlorodibenzo-p-dioxin
  • Detection limit
  • Exposure assessment
  • Multiple imputation
  • Non-detects
  • Seveso

ASJC Scopus subject areas

  • Environmental Chemistry
  • Environmental Science(all)

Cite this

Handling of dioxin measurement data in the presence of non-detectable values: Overview of available methods and their application in the Seveso chloracne study. / Baccarelli, Andrea; Pfeiffer, Ruth; Consonni, Dario; Pesatori, Angela C.; Bonzini, Matteo; Patterson, Donald G.; Bertazzi, Pier Alberto; Landi, Maria Teresa.

In: Chemosphere, Vol. 60, No. 7, 08.2005, p. 898-906.

@article{44aa1e9532894b9bb2488f25f5f9cf9e,
title = "Handling of dioxin measurement data in the presence of non-detectable values: Overview of available methods and their application in the Seveso chloracne study",
abstract = "Exposure measurements of concentrations that are non-detectable or near the detection limit (DL) are common in environmental research. Proper statistical treatment of non-detects is critical to avoid bias and unnecessary loss of information. In the present work, we present an overview of possible statistical strategies for handling non-detectable values, including deletion, simple substitution, distributional methods, and distribution-based imputation. Simple substitution methods (e.g., substituting 0, DL/2, DL/√2, or DL for the non-detects) are the most commonly applied, even though the EPA Guidance for Data Quality Assessment discouraged their use when the percentage of non-detects is >15{\%}. Distribution-based multiple imputation methods, also known as robust or {"}fill-in{"} procedures, may produce dependable results even when 50-70{\%} of the observations are non-detects and can be performed using commonly available statistical software. Any statistical analysis can be conducted on the imputed datasets. Results properly reflect the presence of non-detectable values and produce valid statistical inference. We describe the use of distribution-based multiple imputation in a recent investigation conducted on subjects from the Seveso population exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), in which 55.6{\%} of plasma TCDD measurements were non-detects. We suggest that distribution-based multiple imputation be the preferred method to analyze environmental data when substantial proportions of observations are non-detects.",
keywords = "2,3,7,8-Tetrachlorodibenzo-p-dioxin, Detection limit, Exposure assessment, Multiple imputation, Non-detects, Seveso",
author = "Andrea Baccarelli and Ruth Pfeiffer and Dario Consonni and Pesatori, {Angela C.} and Matteo Bonzini and Patterson, {Donald G.} and Bertazzi, {Pier Alberto} and Landi, {Maria Teresa}",
year = "2005",
month = "8",
doi = "10.1016/j.chemosphere.2005.01.055",
language = "English",
volume = "60",
pages = "898--906",
journal = "Chemosphere",
issn = "0045-6535",
publisher = "Elsevier Limited",
number = "7",
}

TY  - JOUR
T1  - Handling of dioxin measurement data in the presence of non-detectable values
T2  - Overview of available methods and their application in the Seveso chloracne study
AU  - Baccarelli, Andrea
AU  - Pfeiffer, Ruth
AU  - Consonni, Dario
AU  - Pesatori, Angela C.
AU  - Bonzini, Matteo
AU  - Patterson, Donald G.
AU  - Bertazzi, Pier Alberto
AU  - Landi, Maria Teresa
PY  - 2005/8
Y1  - 2005/8
N2  - Exposure measurements of concentrations that are non-detectable or near the detection limit (DL) are common in environmental research. Proper statistical treatment of non-detects is critical to avoid bias and unnecessary loss of information. In the present work, we present an overview of possible statistical strategies for handling non-detectable values, including deletion, simple substitution, distributional methods, and distribution-based imputation. Simple substitution methods (e.g., substituting 0, DL/2, DL/√2, or DL for the non-detects) are the most commonly applied, even though the EPA Guidance for Data Quality Assessment discouraged their use when the percentage of non-detects is >15%. Distribution-based multiple imputation methods, also known as robust or "fill-in" procedures, may produce dependable results even when 50-70% of the observations are non-detects and can be performed using commonly available statistical software. Any statistical analysis can be conducted on the imputed datasets. Results properly reflect the presence of non-detectable values and produce valid statistical inference. We describe the use of distribution-based multiple imputation in a recent investigation conducted on subjects from the Seveso population exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), in which 55.6% of plasma TCDD measurements were non-detects. We suggest that distribution-based multiple imputation be the preferred method to analyze environmental data when substantial proportions of observations are non-detects.
AB  - Exposure measurements of concentrations that are non-detectable or near the detection limit (DL) are common in environmental research. Proper statistical treatment of non-detects is critical to avoid bias and unnecessary loss of information. In the present work, we present an overview of possible statistical strategies for handling non-detectable values, including deletion, simple substitution, distributional methods, and distribution-based imputation. Simple substitution methods (e.g., substituting 0, DL/2, DL/√2, or DL for the non-detects) are the most commonly applied, even though the EPA Guidance for Data Quality Assessment discouraged their use when the percentage of non-detects is >15%. Distribution-based multiple imputation methods, also known as robust or "fill-in" procedures, may produce dependable results even when 50-70% of the observations are non-detects and can be performed using commonly available statistical software. Any statistical analysis can be conducted on the imputed datasets. Results properly reflect the presence of non-detectable values and produce valid statistical inference. We describe the use of distribution-based multiple imputation in a recent investigation conducted on subjects from the Seveso population exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), in which 55.6% of plasma TCDD measurements were non-detects. We suggest that distribution-based multiple imputation be the preferred method to analyze environmental data when substantial proportions of observations are non-detects.
KW  - 2,3,7,8-Tetrachlorodibenzo-p-dioxin
KW  - Detection limit
KW  - Exposure assessment
KW  - Multiple imputation
KW  - Non-detects
KW  - Seveso
UR  - http://www.scopus.com/inward/record.url?scp=21244465964&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=21244465964&partnerID=8YFLogxK
U2  - 10.1016/j.chemosphere.2005.01.055
DO  - 10.1016/j.chemosphere.2005.01.055
M3  - Article
C2  - 15992596
AN  - SCOPUS:21244465964
VL  - 60
SP  - 898
EP  - 906
JO  - Chemosphere
JF  - Chemosphere
SN  - 0045-6535
IS  - 7
ER  -