Correlating high-dimensional longitudinal microbial features with time-varying outcomes with FLORAL

bioRxiv [Preprint]. 2025 Feb 19:2025.02.17.638558. doi: 10.1101/2025.02.17.638558.

Abstract

Correlating time-dependent patient characteristics with matched microbiome samples can help identify biomarkers in longitudinal microbiome studies. Existing approaches typically repeat a pre-specified modeling approach for each taxonomic feature, followed by a multiple testing adjustment step for false discovery rate (FDR) control. In this work, we develop an alternative strategy based on log-ratio penalized generalized estimating equations, which directly models the longitudinal patient characteristic of interest as the outcome variable and treats microbial features as high-dimensional compositional covariates. A cross-validation procedure is developed for variable selection and for model selection among different working correlation structures. In extensive simulations, the proposed method achieved higher sensitivity than state-of-the-art methods while robustly controlling the FDR. In analyses correlating longitudinal dietary intake with microbial features from matched samples of cancer patients, the proposed method effectively identified gut health indicators and clinically relevant microbial markers, demonstrating its utility in real-world applications. The method is implemented in the open-source R package FLORAL, which is available at https://vdblab.github.io/FLORAL/.
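
To make the idea of treating microbial features as compositional covariates more concrete, the following is a minimal Python sketch of a log-ratio (log-contrast) linear predictor. It assumes the standard formulation in which coefficients on log-transformed abundances are constrained to sum to zero; the abstract does not spell out this constraint, so it should be read as an assumption rather than a description of the FLORAL implementation. The function name `log_contrast_predictor`, the counts, and the coefficients are all hypothetical and chosen for illustration only.

```python
import math


def log_contrast_predictor(abundances, beta, tol=1e-12):
    """Linear predictor sum_j beta_j * log(x_j) for one sample.

    Assumes strictly positive abundances and zero-sum coefficients
    (the usual log-contrast constraint; an assumption of this sketch).
    """
    if abs(sum(beta)) > tol:
        raise ValueError("coefficients must sum to zero")
    return sum(b * math.log(x) for b, x in zip(beta, abundances))


# Hypothetical taxa counts for one sample (illustrative, not study data).
counts = [120.0, 30.0, 50.0, 800.0]

# The same sample expressed as relative abundances.
total = sum(counts)
relative = [c / total for c in counts]

# Sparse zero-sum coefficients: only the first two taxa are "selected",
# mimicking what a sparsity penalty would produce.
beta = [0.7, -0.7, 0.0, 0.0]

eta_counts = log_contrast_predictor(counts, beta)
eta_relative = log_contrast_predictor(relative, beta)

# With these coefficients the predictor reduces to a single log-ratio.
eta_ratio = 0.7 * math.log(counts[0] / counts[1])

print(eta_counts, eta_relative, eta_ratio)
```

Under the zero-sum assumption, the predictor is unchanged when counts are rescaled to relative abundances, and a sparse coefficient vector can be read as a small set of log-ratios between selected taxa. In practice zero counts would need handling (for example a pseudo-count) before taking logarithms, which this sketch omits.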

Publication types

  • Preprint