ZERO-INFLATED NEGATIVE BINOMIAL REGRESSION FOR DIFFERENTIAL ABUNDANCE TESTING IN MICROBIOME STUDIES

Xinyan Zhang
Himel Mallick
Nengjun Yi

Abstract

Motivation: The human microbiome plays an important role in human health and disease. Its composition is influenced by multiple factors, and understanding these factors is critical both for elucidating the role of the microbiome in health and disease and for developing new microbiome-based diagnostics and therapeutic targets. Targeted amplicon sequencing of the 16S ribosomal RNA (rRNA) gene is a commonly used approach for determining the taxonomic composition of a bacterial community. Sequence reads are clustered into operational taxonomic units (OTUs), which are then used to determine whether and how microbial abundance is correlated with sample characteristics such as health/disease status, smoking status, or dietary habits. However, OTU count data are not only overdispersed but also contain an excess of zero counts due to undersampling. Efficient analytical tools that simultaneously account for overdispersion and sparsity are therefore needed for downstream statistical analysis of microbiome data.

Article Details

How to Cite
Zhang, X., Mallick, H., & Yi, N. (2016). ZERO-INFLATED NEGATIVE BINOMIAL REGRESSION FOR DIFFERENTIAL ABUNDANCE TESTING IN MICROBIOME STUDIES. JOURNAL OF BIOINFORMATICS AND GENOMICS, 2(2). https://doi.org/10.18454/jbg.2016.2.2.1
Section
Novel computational tools and databases
