Frequentist Grouped Weighted Quantile Sum Regression for Correlated Chemical Mixtures

Stat Med. 2025 Mar 30;44(7):e70078. doi: 10.1002/sim.70078.

Abstract

As individuals are exposed to a myriad of potentially harmful pollutants every day, it is important to determine which actors have the greatest influence on health outcomes. However, jointly modeling the associations of multiple pollutant exposures is often hindered by the presence of highly correlated chemicals originating from a common source. A popular approach to analyzing associations between a disease outcome and several highly correlated exposures is Weighted Quantile Sum Regression (WQSR) modeling. WQSR provides increased stability in estimating model parameters but requires data splitting to estimate individual and group effects of chemicals, which reduces the power of the approach. A recent Bayesian implementation of WQSR provides a model fitting procedure that avoids data splitting, at the cost of high computational expense on large data. In this paper, we introduce a Frequentist Grouped Weighted Quantile Sum Regression (FGWQSR) model that can be fitted efficiently to large datasets without requiring data splitting. FGWQSR produces estimates of the joint effect of mixture groups and of individual chemicals, together with likelihood-ratio-based tests that account for FGWQSR's non-standard asymptotics. We demonstrate that FGWQSR is well calibrated for type-I errors while outperforming both Bayesian Grouped Weighted Quantile Sum Regression (BGWQSR) and Quantile Logistic Regression in terms of statistical power to detect the effects of mixture groups and individual chemicals. In addition, we show that FGWQSR is robust to model misspecification and can be fitted on large datasets in a fraction of the time required for BGWQSR. We apply FGWQSR to a dataset of 317 767 mother-child pairs with exposure profiles generated by chemical transport models to study the associations between several components found in particulate matter with an aerodynamic diameter smaller than 2.5 μm (PM2.5) and child Autism Spectrum Disorder (ASD) diagnosis before age 5. PM2.5 copper and PM2.5 crustal material are found to be statistically significantly associated with ASD diagnosis by five years of age.
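The grouped weighted quantile sum construction that the abstract refers to can be illustrated with a short sketch. It is not the authors' implementation and does not fit anything: it only shows, under the usual WQS assumptions, how exposures are converted to quantile scores and combined into one index per mixture group, with non-negative weights that sum to one within each group. The function names, the group label "metals", and all numeric values are hypothetical and chosen for illustration only.

```python
import numpy as np


def quantile_scores(x, n_quantiles=4):
    """Map each exposure column to integer quantile scores 0..n_quantiles-1."""
    x = np.asarray(x, dtype=float)
    scores = np.empty(x.shape, dtype=int)
    for j in range(x.shape[1]):
        # Interior cut points, e.g. the 25th/50th/75th percentiles for quartiles.
        probs = np.linspace(0, 1, n_quantiles + 1)[1:-1]
        cuts = np.quantile(x[:, j], probs)
        # A value equal to a cut point is assigned to the upper bin.
        scores[:, j] = np.searchsorted(cuts, x[:, j], side="right")
    return scores


def grouped_wqs_logit(scores, groups, weights, group_effects, intercept=0.0):
    """Linear predictor (log-odds) of a grouped WQS logistic model.

    groups        : dict, group name -> list of column indices (hypothetical names)
    weights       : dict, group name -> weights (non-negative, summing to 1)
    group_effects : dict, group name -> joint effect of that group's index
    """
    eta = np.full(scores.shape[0], intercept, dtype=float)
    for name, cols in groups.items():
        w = np.asarray(weights[name], dtype=float)
        if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
            raise ValueError(f"weights for group {name!r} must lie on the simplex")
        # Weighted quantile sum index for this group, scaled by its joint effect.
        eta += group_effects[name] * (scores[:, cols] @ w)
    return eta


# Illustrative data: two exposures in one hypothetical group.
x = np.column_stack([np.arange(8), np.arange(8)[::-1]])
q = quantile_scores(x)
eta = grouped_wqs_logit(
    q,
    groups={"metals": [0, 1]},
    weights={"metals": [0.75, 0.25]},
    group_effects={"metals": 2.0},
)
```

In this parameterization each chemical's coefficient is the product of its group effect and its weight, so all chemicals in a group share the sign of the group effect. This matches the "group sign constrained regression" keyword, although the estimation and testing procedures of FGWQSR itself are not reproduced here.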

Keywords: autism spectrum disorder; chemical mixture modeling; constrained optimization; group sign constrained regression; non‐regular likelihood asymptotics; pollutant mixture modeling; weighted quantile sum regression.

MeSH terms

  • Autism Spectrum Disorder
  • Bayes Theorem
  • Child
  • Computer Simulation
  • Environmental Exposure* / adverse effects
  • Female
  • Humans
  • Likelihood Functions
  • Models, Statistical*
  • Pregnancy
  • Regression Analysis