TY - JOUR
T1 - Comparison of Bioinformatics Pipelines and Operating Systems for the Analyses of 16S rRNA Gene Amplicon Sequences in Human Fecal Samples
AU - Marizzoni, Moira
AU - Gurry, Thomas
AU - Provasi, Stefania
AU - Greub, Gilbert
AU - Lopizzo, Nicola
AU - Ribaldi, Federica
AU - Festari, Cristina
AU - Mazzelli, Monica
AU - Mombelli, Elisa
AU - Salvatore, Marco
AU - Mirabelli, Peppino
AU - Franzese, Monica
AU - Soricelli, Andrea
AU - Frisoni, Giovanni B.
AU - Cattaneo, Annamaria
N1 - Funding Information:
This study was funded by Ricerca Corrente (Italian Ministry of Health), Clinical Research Center (Geneva University Hospitals and Faculty of Medicine, Geneva) and donations from: APRA - Association Suisse pour la Recherche sur l'Alzheimer, Geneva; Mr. Ivan Pictet, Geneva; Segre Foundation, Geneva; Velux Foundation, Zurich; Edmond J. Safra Foundation, Geneva; and anonymous donors.
Publisher Copyright:
Copyright © 2020 Marizzoni, Gurry, Provasi, Greub, Lopizzo, Ribaldi, Festari, Mazzelli, Mombelli, Salvatore, Mirabelli, Franzese, Soricelli, Frisoni and Cattaneo.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020/6/17
Y1 - 2020/6/17
AB - Amplicon high-throughput sequencing of the 16S ribosomal RNA (rRNA) gene is currently the most widely used technique to investigate complex gut microbial communities. Microbial identification might be influenced by several factors, including the choice of bioinformatic pipelines, making comparisons across studies difficult. Here, we compared four commonly used pipelines (QIIME2, Bioconductor, UPARSE and mothur) run on two operating systems (OS) (Linux and Mac), to evaluate the impact of bioinformatic pipeline and OS on the taxonomic classification of 40 human stool samples. We applied the SILVA 132 reference database for all the pipelines. We compared phyla and genera identification and relative abundances across the four pipelines using the Friedman rank sum test. QIIME2 and Bioconductor provided identical outputs on Linux and Mac OS, while UPARSE and mothur reported only minimal differences between OS. Taxa assignments were consistent at both phylum and genus level across all the pipelines. However, a difference in terms of relative abundance was identified for all phyla (p < 0.013) and for the majority of the most abundant genera (p < 0.028), such as Bacteroides (QIIME2: 24.5%, Bioconductor: 24.6%, UPARSE-linux: 23.6%, UPARSE-mac: 20.6%, mothur-linux: 22.2%, mothur-mac: 21.6%, p < 0.001). The use of different bioinformatic pipelines affects the estimation of the relative abundance of the gut microbial community, indicating that studies using different pipelines cannot be directly compared. A harmonization procedure is needed to move the field forward.
KW - 16S rRNA amplicon sequencing
KW - bioconductor
KW - fecal human samples
KW - microbiome
KW - mothur
KW - QIIME2
KW - UPARSE
UR - http://www.scopus.com/inward/record.url?scp=85087452341&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85087452341&partnerID=8YFLogxK
U2 - 10.3389/fmicb.2020.01262
DO - 10.3389/fmicb.2020.01262
M3 - Article
AN - SCOPUS:85087452341
VL - 11
JO - Frontiers in Microbiology
JF - Frontiers in Microbiology
SN - 1664-302X
M1 - 1262
ER -