Xinyan Zhang
Himel Mallick
Nengjun Yi


Motivation: The human microbiome plays an important role in human health and disease. Its composition is influenced by multiple factors, and understanding these factors is critical both for elucidating the role of the microbiome in health and disease and for developing new microbiome-based diagnostics or therapeutic targets. 16S ribosomal RNA (rRNA) gene-targeted amplicon sequencing is a commonly used approach for determining the taxonomic composition of a bacterial community. Sequence reads are clustered into operational taxonomic units (OTUs), which are then used to determine whether and how microbial abundance is correlated with sample characteristics such as health/disease status, smoking status, or dietary habits. However, OTU count data are not only overdispersed but also contain an excess of zero counts due to undersampling. Efficient analytical tools that simultaneously account for overdispersion and sparsity in microbiome data are therefore needed for downstream statistical analysis.


