Risk of bias in nonrandomized studies of interventions showed low inter-rater reliability and challenges in its application

Silvia Minozzi, Michela Cinquini, Silvia Gianola, Greta Castellini, Chiara Gerardi, Rita Banzi

Research output: Contribution to journal › Article

Abstract

Objective: To assess the inter-rater reliability (IRR) and usability of the risk of bias in nonrandomized studies of interventions tool (ROBINS-I). Study Design and Setting: We designed a cross-sectional study. Five raters independently applied ROBINS-I to the nonrandomized cohort studies in three systematic reviews on vaccines, opiate abuse, and rehabilitation. We calculated Fleiss' Kappa for multiple raters as a measure of IRR and discussed the application of ROBINS-I to identify difficulties and possible reasons for disagreement. Results: Thirty-one studies were included (195 evaluations). IRRs were slight for overall judgment (IRR 0.06, 95% CI 0.001 to 0.12) and individual domains (from 0.04, 95% CI −0.04 to 0.12 for the domain “selection of reported results” to 0.18, 95% CI 0.10 to 0.26 for the domain “deviation from intended interventions”). Mean time to apply the tool was 27.8 minutes (SD 12.6) per study. The main difficulties were due to poor reporting of primary studies, misunderstanding of the question, translation of questions into a final judgment, and incomplete guidance. Conclusion: We found ROBINS-I difficult and demanding, even for raters with substantial expertise in systematic reviews. Calibration exercises and intensive training before its application are needed to improve reliability.
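
The abstract reports Fleiss' Kappa for multiple raters but does not show how it was computed. As a minimal sketch, assuming the standard Fleiss (1971) formula, the statistic can be derived from a table in which each row is one study and each column counts how many raters chose a given risk-of-bias category. The function name, the category layout, and the example ratings below are hypothetical and are not taken from the study; the sketch does not compute the confidence intervals quoted above.

```python
import math
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects-by-categories table of rater counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    if n_subjects == 0:
        raise ValueError("need at least one rated subject")
    n_categories = len(counts[0])
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per subject")
    for row in counts:
        if len(row) != n_categories or sum(row) != n_raters:
            raise ValueError("every row needs the same categories and rater total")

    total = n_subjects * n_raters
    # Share of all ratings that fell into each category.
    p_j = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Observed agreement among rater pairs, computed per subject.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    if math.isclose(p_e, 1.0):
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical data: 4 studies, 5 raters, 4 judgment categories
    # (for example low, moderate, serious, critical).
    ratings = [
        [3, 2, 0, 0],
        [1, 2, 2, 0],
        [0, 1, 3, 1],
        [2, 2, 1, 0],
    ]
    print(f"Fleiss' kappa: {fleiss_kappa(ratings):.3f}")
```

Values near 0 indicate agreement no better than chance, which is the range the abstract describes as slight; a value of 1 indicates perfect agreement.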

Original language: English
Pages (from-to): 28-35
Number of pages: 8
Journal: Journal of Clinical Epidemiology
Volume: 112
DOIs: 10.1016/j.jclinepi.2019.04.001
Publication status: Published - Aug 1 2019

Keywords

  • Inter-rater reliability
  • Nonrandomized studies
  • Risk of bias
  • ROBINS-I
  • Systematic reviews

ASJC Scopus subject areas

  • Epidemiology

Cite this

Risk of bias in nonrandomized studies of interventions showed low inter-rater reliability and challenges in its application. / Minozzi, Silvia; Cinquini, Michela; Gianola, Silvia; Castellini, Greta; Gerardi, Chiara; Banzi, Rita.

In: Journal of Clinical Epidemiology, Vol. 112, 01.08.2019, p. 28-35.

@article{3cb0bee0d9d8444a80dac3353278c454,
title = "Risk of bias in nonrandomized studies of interventions showed low inter-rater reliability and challenges in its application",
abstract = "Objective: To assess the inter-rater reliability (IRR) and usability of the risk of bias in nonrandomized studies of interventions tool (ROBINS-I). Study Design and Setting: We designed a cross-sectional study. Five raters independently applied ROBINS-I to the nonrandomized cohort studies in three systematic reviews on vaccines, opiate abuse, and rehabilitation. We calculated Fleiss' Kappa for multiple raters as a measure of IRR and discussed the application of ROBINS-I to identify difficulties and possible reasons for disagreement. Results: Thirty-one studies were included (195 evaluations). IRRs were slight for overall judgment (IRR 0.06, 95{\%} CI 0.001 to 0.12) and individual domains (from 0.04, 95{\%} CI −0.04 to 0.12 for the domain “selection of reported results” to 0.18, 95{\%} CI 0.10 to 0.26 for the domain “deviation from intended interventions”). Mean time to apply the tool was 27.8 minutes (SD 12.6) per study. The main difficulties were due to poor reporting of primary studies, misunderstanding of the question, translation of questions into a final judgment, and incomplete guidance. Conclusion: We found ROBINS-I difficult and demanding, even for raters with substantial expertise in systematic reviews. Calibration exercises and intensive training before its application are needed to improve reliability.",
keywords = "Inter-rater reliability, Nonrandomized studies, Risk of bias, ROBINS-I, Systematic reviews",
author = "Silvia Minozzi and Michela Cinquini and Silvia Gianola and Greta Castellini and Chiara Gerardi and Rita Banzi",
year = "2019",
month = "8",
day = "1",
doi = "10.1016/j.jclinepi.2019.04.001",
language = "English",
volume = "112",
pages = "28--35",
journal = "Journal of Clinical Epidemiology",
issn = "0895-4356",
publisher = "Elsevier USA",

}

TY - JOUR

T1 - Risk of bias in nonrandomized studies of interventions showed low inter-rater reliability and challenges in its application

AU - Minozzi, Silvia

AU - Cinquini, Michela

AU - Gianola, Silvia

AU - Castellini, Greta

AU - Gerardi, Chiara

AU - Banzi, Rita

PY - 2019/8/1

Y1 - 2019/8/1

N2 - Objective: To assess the inter-rater reliability (IRR) and usability of the risk of bias in nonrandomized studies of interventions tool (ROBINS-I). Study Design and Setting: We designed a cross-sectional study. Five raters independently applied ROBINS-I to the nonrandomized cohort studies in three systematic reviews on vaccines, opiate abuse, and rehabilitation. We calculated Fleiss' Kappa for multiple raters as a measure of IRR and discussed the application of ROBINS-I to identify difficulties and possible reasons for disagreement. Results: Thirty-one studies were included (195 evaluations). IRRs were slight for overall judgment (IRR 0.06, 95% CI 0.001 to 0.12) and individual domains (from 0.04, 95% CI −0.04 to 0.12 for the domain “selection of reported results” to 0.18, 95% CI 0.10 to 0.26 for the domain “deviation from intended interventions”). Mean time to apply the tool was 27.8 minutes (SD 12.6) per study. The main difficulties were due to poor reporting of primary studies, misunderstanding of the question, translation of questions into a final judgment, and incomplete guidance. Conclusion: We found ROBINS-I difficult and demanding, even for raters with substantial expertise in systematic reviews. Calibration exercises and intensive training before its application are needed to improve reliability.

AB - Objective: To assess the inter-rater reliability (IRR) and usability of the risk of bias in nonrandomized studies of interventions tool (ROBINS-I). Study Design and Setting: We designed a cross-sectional study. Five raters independently applied ROBINS-I to the nonrandomized cohort studies in three systematic reviews on vaccines, opiate abuse, and rehabilitation. We calculated Fleiss' Kappa for multiple raters as a measure of IRR and discussed the application of ROBINS-I to identify difficulties and possible reasons for disagreement. Results: Thirty-one studies were included (195 evaluations). IRRs were slight for overall judgment (IRR 0.06, 95% CI 0.001 to 0.12) and individual domains (from 0.04, 95% CI −0.04 to 0.12 for the domain “selection of reported results” to 0.18, 95% CI 0.10 to 0.26 for the domain “deviation from intended interventions”). Mean time to apply the tool was 27.8 minutes (SD 12.6) per study. The main difficulties were due to poor reporting of primary studies, misunderstanding of the question, translation of questions into a final judgment, and incomplete guidance. Conclusion: We found ROBINS-I difficult and demanding, even for raters with substantial expertise in systematic reviews. Calibration exercises and intensive training before its application are needed to improve reliability.

KW - Inter-rater reliability

KW - Nonrandomized studies

KW - Risk of bias

KW - ROBINS-I

KW - Systematic reviews

UR - http://www.scopus.com/inward/record.url?scp=85065236113&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85065236113&partnerID=8YFLogxK

U2 - 10.1016/j.jclinepi.2019.04.001

DO - 10.1016/j.jclinepi.2019.04.001

M3 - Article

C2 - 30981833

AN - SCOPUS:85065236113

VL - 112

SP - 28

EP - 35

JO - Journal of Clinical Epidemiology

JF - Journal of Clinical Epidemiology

SN - 0895-4356

ER -