Evaluating the impact of the Modifiable Areal Unit Problem on ecological model inference: A case study of COVID-19 data in Queensland, Australia

Infect Dis Model. 2025 May 10;10(3):1002-1019. doi: 10.1016/j.idm.2025.05.003. eCollection 2025 Sep.

Abstract

Accurate identification of spatial patterns and risk factors of disease occurrence is crucial for public health interventions. However, the Modifiable Areal Unit Problem (MAUP) poses challenges in disease modelling by affecting the reliability of statistical inferences drawn from spatially aggregated data. This study examines the effect of MAUP on ecological model inference using locally acquired and overseas-acquired COVID-19 case data from 2020 to 2023 in Queensland, Australia. Bayesian spatial Besag-York-Mollié (BYM) models were applied across four Statistical Area (SA) levels, as defined by the Australian Statistical Geography Standard, with and without two covariates: Socio-Economic Indexes for Areas (SEIFA) and overseas-acquired (OA) COVID-19 cases. OA COVID-19 cases were also modelled as a response variable. Results indicated that finer spatial scales (SA1 and SA2) captured localized patterns and significant spatial autocorrelation, while coarser levels (SA3 and SA4) smoothed spatial variability, masking potential outbreak clusters. Incorporating SEIFA as a covariate for locally-acquired (LA) cases reduced spatial autocorrelation in the residuals, effectively capturing socioeconomic disparities. Conversely, OA cases showed limited effectiveness in reducing autocorrelation at finer scales. For LA cases, higher socioeconomic disadvantage was associated with increased COVID-19 incidence at finer scales, but this association became non-significant at coarser scales. OA cases showed a significant positive association with higher SEIFA scores at finer scales. Model parameters displayed narrower credible intervals at finer scales, indicating greater precision, while coarser levels had increased uncertainty. SA2 emerged as an arguably optimal scale, striking a balance between spatial resolution, model stability, and interpretability. To improve inference on COVID-19 incidence, it is recommended to use data from both SA1 and SA2 levels to leverage their respective strengths.
The findings emphasize the importance of selecting appropriate spatial scales and covariates, or of evaluating the inferential impact of multiple scales, to address MAUP and support more reliable spatial analysis. The study advocates exploring intermediate aggregation levels and multi-scale approaches to better capture nuanced disease dynamics, extending these analyses across Australia, and replicating them in other countries with low population densities to enhance generalizability.
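
The scale effect described above can be illustrated with a minimal sketch. It is not the study's method or data: the grid, the synthetic counts, the hotspot, and the helper names `aggregate` and `morans_i` are all assumptions made for illustration, and a global Moran's I with rook adjacency stands in for the Bayesian BYM models used in the paper. The sketch sums the same synthetic case counts into coarser square units and compares the spatial autocorrelation measured at each level.

```python
import random

def aggregate(grid, factor):
    """Sum a square grid of counts into factor x factor blocks (coarser units)."""
    m = len(grid) // factor
    return [
        [
            sum(
                grid[r][c]
                for r in range(i * factor, (i + 1) * factor)
                for c in range(j * factor, (j + 1) * factor)
            )
            for j in range(m)
        ]
        for i in range(m)
    ]

def morans_i(grid):
    """Global Moran's I using binary rook (shared-edge) adjacency weights."""
    n = len(grid)
    values = [v for row in grid for v in row]
    mean = sum(values) / len(values)
    denom = sum((v - mean) ** 2 for v in values)
    num = 0.0
    w_sum = 0
    for r in range(n):
        for c in range(n):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n:
                    num += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    w_sum += 1
    return (len(values) / w_sum) * (num / denom)

# Synthetic (hypothetical) fine-scale counts on a 16 x 16 grid:
# low background noise plus one 4 x 4 hotspot in a corner.
rng = random.Random(42)
fine = [
    [rng.randint(0, 5) + (20 if r < 4 and c < 4 else 0) for c in range(16)]
    for r in range(16)
]

# Aggregate the same cases into 4 x 4 coarse units (each covers 16 fine cells).
coarse = aggregate(fine, 4)

i_fine = morans_i(fine)
i_coarse = morans_i(coarse)
print(f"Moran's I, fine units:   {i_fine:.3f}")
print(f"Moran's I, coarse units: {i_coarse:.3f}")
```

In this toy setting the hotspot spans many neighbouring fine units but collapses into a single coarse unit, so the clustering that is visible at the fine level is no longer detectable after aggregation, even though the total case count is unchanged.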

Keywords: Bayesian models; COVID-19; Model inference; Modifiable Areal Unit Problem (MAUP); Socio-Economic Indexes for Areas (SEIFA); Spatial patterns.