Research projects and interests.

Broadly speaking, I’m interested in developing statistical methods that leverage prior biological knowledge for machine learning and statistical analysis of high-dimensional genomic data sets. I hope to deploy these tools to help generate meaningful, mechanistically driven hypotheses from population-level surveys.

My work so far has been focused on microbiome research, where I seek to identify microbial functional groups and utilize them to enrich the interpretation of microbiome-outcome analyses.

First-author manuscripts

Associations between the gut microbiome and the metabolome in early life
Quang P Nguyen, Margaret R. Karagas, Juliette C. Madan, Erika Dade, Tom J. Palys, Hilary G. Morrison, Wimal W. Pathmasiri, Susan McRitchie, Susan J. Sumner, H. Robert Frost, Anne G. Hoen
[Source Code] [Manuscript]
Abstract: The infant intestinal microbiome plays an important role in metabolism and immune development with impacts on lifelong health. The linkage between the taxonomic composition of the microbiome and its metabolic phenotype is undefined and complicated by redundancies in the taxon-function relationships within microbial communities. To inform a more mechanistic understanding of the relationship between the microbiome and health, we performed an integrative statistical and machine learning-based analysis of microbe taxonomic structure and metabolic function in order to characterize the taxa-function relationship in early life.

CBEA: Competitive isometric log-ratio for taxonomic enrichment analysis
Quang P Nguyen, Anne G. Hoen, H. Robert Frost
Under Review
[Analysis Code] [Package] [Pre-print]
Abstract: The study of human-associated microbiomes relies on genomic surveys via high-throughput sequencing. However, microbiome taxonomic data is sparse and high dimensional, which prevents the application of standard statistical techniques. One approach to address this problem is to perform analyses at the level of taxon sets. Set-based analysis has a long history in the genomics literature, with demonstrable impact in improving both power and interpretability. Unfortunately, there is limited interest in developing new set-based tools tailored for microbiome taxonomic data given its unique features compared to other ‘omics data types. We developed a new tool to generate taxon set enrichment scores at the sample level through a novel log-ratio formulation based on the competitive null hypothesis. Our scores can be used for statistical inference at both the sample and population levels, as well as inputs to other downstream analyses such as prediction models. We demonstrate the performance of our method against competing approaches across both real data analyses and simulation studies.

Evaluating trait-based databases for taxonomic enrichment analysis
Quang P Nguyen, Anne G. Hoen, H. Robert Frost
In prep
Abstract: Microbiomes exhibit a high degree of variation in taxonomic composition between individuals while maintaining relatively stable functional profiles, which suggests that microbes found to be differentially abundant might not correspond to shifts in functional outcomes in a one-to-one fashion. Therefore, incorporating microbial ecological roles in population-level microbiome association analyses can increase the interpretability of results and uncover the underlying mechanistic interactions between the microbiome and its host. These roles can be obtained at the species level via trait databases, which aggregate experimentally determined microbial features from both automatic and hand-curated sources. Here, we evaluate leveraging trait databases to construct relevant functionally driven taxon sets and then test for enrichment using taxonomic compositions with set-based methods.

Co-author manuscripts

Rebecca M Lebeaux, Juliette C Madan, Quang P Nguyen, Modupe O Coker, Erika F Dade, Yuka Moroishi, Thomas J Palys, Benjamin D Ross, Melinda M Pettigrew, Hilary G Morrison, Margaret R Karagas, Anne G Hoen. Impact of antibiotics to off-target infant gut microbiota and resistance genes in cohort studies. medRxiv, 2021. doi:

Jie Zhou, Anne G Hoen, Susan McRitchie, Wimal Pathmasiri, Weston D Viles, Quang P Nguyen, Juliette C Madan, Erika Dade, Margaret R Karagas, Jiang Gui. Information enhanced model selection for Gaussian graphical model with application to metabolomic data. Biostatistics, 2021, kxab006.

Robert A. Shumsky, Laurens Debo, Rebecca M. Lebeaux, Quang P. Nguyen, Anne G. Hoen. Retail store customer flow and COVID-19 transmission. Proceedings of the National Academy of Sciences Mar 2021, 118 (11) e2019225118; DOI: 10.1073/pnas.2019225118

Winterbottom, E.F., Moroishi, Y., Halchenko, Y., (incl. Nguyen Q.P.). Prenatal arsenic exposure alters the placental expression of multiple epigenetic regulators in a sex-dependent manner. Environ Health 18, 18 (2019).

Nguyen HL, Ha DA, Goldberg RJ, Kiefe CI, Chiriboga G, Ly HN, et al. (incl. Nguyen, Q.P.) (2018) Culturally adaptive storytelling intervention versus didactic intervention to improve hypertension control in Vietnam- 12 month follow up results: A cluster randomized controlled feasibility trial. PLoS ONE 13(12): e0209912.

Nguyen, H.L., Allison, J.J., Ha, D.A. et al (incl. Nguyen, Q.P.). Culturally adaptive storytelling intervention versus didactic intervention to improve hypertension control in Vietnam: a cluster-randomized controlled feasibility trial. Pilot Feasibility Stud 3, 22 (2017).