A data quality assurance process to improve the precision of analysis of routinely collected administrative data for the NHS (National Health Service) UK

Health Informatics J. 2025 Apr-Jun;31(2):14604582251334338. doi: 10.1177/14604582251334338. Epub 2025 May 21.

Abstract

Objective: This paper demonstrates a data quality assurance (DQA) process as a means to identify and handle flaws in data, and hence improve the accuracy of an investigation into the prevalence of harmful versus non-harmful/near-miss incident reports in a single NHS acute provider.

Methods: The three-step DQA process consists of an initial univariate data quality analysis, followed by a bivariate missingness analysis, and concluding with the design of appropriate multiple imputation techniques. With data quality established, the acuity and incident data were aggregated and aligned to the Ward-Month level for the period August 2015 to December 2020 inclusive. The final analysis was performed using binary regression, pooling results across the imputed datasets via Rubin's rules.

Results: The three-step quality assurance process detected and corrected for common data quality issues. The resulting analysis identified a Ward dependency for the effect of Covid-19 lockdown measures on incident reporting culture, which would have been missed without the applied imputation strategy.

Conclusions: Our approach outlines a replicable methodology for understanding and fixing data quality issues in operational data. As daily operational decisions are increasingly guided by data, it is important to use appropriate imputation techniques so that those decisions rest on sound analysis.
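The pooling step named in the Methods can be illustrated with a minimal sketch of Rubin's rules for a single regression coefficient. This is not the authors' code: the function name `pool_rubin` and the input numbers are hypothetical, and the abstract does not say which software or how many imputations were used. The sketch assumes each imputed dataset has already been fitted separately, giving one point estimate and one squared standard error per dataset.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")

    q_bar = mean(estimates)      # pooled point estimate
    u_bar = mean(variances)      # within-imputation variance
    b = variance(estimates)      # between-imputation variance (divides by m - 1)
    t = u_bar + (1 + 1 / m) * b  # total variance
    return q_bar, t, math.sqrt(t)


# Illustrative values only; these are not results from the paper.
coefs = [0.42, 0.47, 0.39, 0.45, 0.44]
ses = [0.10, 0.11, 0.10, 0.12, 0.10]
pooled, total_var, pooled_se = pool_rubin(coefs, [s ** 2 for s in ses])
print(f"pooled estimate = {pooled:.3f}, pooled SE = {pooled_se:.3f}")
```

The between-imputation term is what separates this from simply averaging standard errors: it carries the uncertainty due to the missing values themselves into the pooled standard error.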

Keywords: data quality; missing data; routinely collected data.

MeSH terms

  • COVID-19 / epidemiology
  • Data Accuracy*
  • Humans
  • Routinely Collected Health Data*
  • SARS-CoV-2
  • State Medicine* / organization & administration
  • State Medicine* / statistics & numerical data
  • United Kingdom