Using machine learning to identify subgroups with the highest expected benefit in a population-based water, sanitation, handwashing, and nutrition intervention

medRxiv [Preprint]. 2025 Jun 18:2025.06.17.25329796. doi: 10.1101/2025.06.17.25329796.

Abstract

Background: Understanding who benefits most from investments in water, sanitation, and hygiene (WaSH) interventions can elucidate causal pathways, uncover complex interactions between population characteristics and interventions, and inform targeted implementation. We applied machine learning to identify and describe households of children that benefited most from WaSH and nutrition interventions.

Methods: We used causal forests and baseline characteristics of pregnant women enrolled in a trial in Bangladesh (2013-2015) to test for heterogeneous treatment effects on the primary trial outcomes at two years (length-for-age Z-score [LAZ-score] and diarrhoea prevalence) and one secondary outcome (child development [EASQ Z-score]) for each treatment-outcome combination. We split households into three groups based on predicted treatment effect magnitude and compared characteristics of those that benefited the most (Tercile 3) versus the least (Tercile 1).
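
The tercile comparison described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the predicted treatment effects are simulated here as a stand-in for causal forest output, the rank-based tercile rule is an assumption, and all names (assign_terciles, compare_terciles, chickens_indoors) are hypothetical.

```python
import random


def assign_terciles(predicted_effects):
    """Label each household 1, 2, or 3 by the rank of its predicted effect.

    Assumption: terciles are equal-sized groups formed by rank, with
    Tercile 3 holding the largest predicted effects.
    """
    n = len(predicted_effects)
    order = sorted(range(n), key=lambda i: predicted_effects[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = 1 + (3 * rank) // n
    return labels


def compare_terciles(labels, characteristic):
    """Return the mean of a baseline characteristic in Tercile 3 and Tercile 1."""
    top = [c for lab, c in zip(labels, characteristic) if lab == 3]
    bottom = [c for lab, c in zip(labels, characteristic) if lab == 1]
    return sum(top) / len(top), sum(bottom) / len(bottom)


# Simulated data only: a binary baseline characteristic and predicted
# effects that are larger, on average, where the characteristic is present.
rng = random.Random(0)
n_households = 900
chickens_indoors = [1 if rng.random() < 0.4 else 0 for _ in range(n_households)]
predicted_effects = [0.1 + 0.4 * c + rng.gauss(0, 0.1) for c in chickens_indoors]

labels = assign_terciles(predicted_effects)
share_t3, share_t1 = compare_terciles(labels, chickens_indoors)
print(f"Tercile 3: {share_t3:.0%} vs. Tercile 1: {share_t1:.0%}")
```

In a real analysis the predicted effects would come from a fitted causal forest rather than a simulation, and the comparison would be repeated for each baseline characteristic of interest.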

Results: Heterogeneity was detected in the effect of Sanitation on EASQ Z-score, compared to Control; children in Tercile 3 were estimated to gain 0.51 SD (95% CI: 0.35, 0.67) whereas children in Tercile 1 were estimated to have no benefit. At baseline, households of children in Tercile 3 were more likely to report that chickens always entered the house (85% vs. 4%) and had animal feces observed in the child's play area (84% vs. 18%) when compared with Tercile 1. Tercile 3 households also owned less land and assets and lived further from Dhaka, any population center, or a market. We did not detect heterogeneity for any other treatment-outcome comparison.

Conclusions: We did not detect heterogeneity in any treatment arm for the outcomes of diarrhoea or LAZ-score, suggesting that children from all backgrounds benefit similarly from effective interventions, regardless of household characteristics. We found heterogeneity in the effect of receiving sanitation improvements on child development, where poorer households located in more remote areas and potentially with higher levels of animal fecal contamination had the highest expected benefit.

Publication types

  • Preprint