A data quality assurance process to improve the precision of analysis of routinely collected administrative data for the NHS (National Health Service) UK

Robert M Cook; Alisen Dube; Md Asaduzzaman; Tim Beales; Ross Pearce; Luke Blackwell; Claire Whitehouse; Joshua Miller; Malcolm Gough; Mark Radford; Alison Leary; Sarahjane Jones

doi:10.1177/14604582251334338

Health Informatics J. 2025 Apr-Jun;31(2):14604582251334338. doi: 10.1177/14604582251334338. Epub 2025 May 21.

Affiliations

¹ University of Staffordshire, Centre for Health Innovation, Stafford, UK.
² Department of Engineering, School of Digital, Technology, Innovation and Business, University of Staffordshire, Stoke-on-Trent, UK.
³ James Paget University Hospitals, Great Yarmouth, UK.
⁴ West Midlands Ambulance Service, Brierley Hill, UK.
⁵ NHS England, London, UK.
⁶ London Southbank University, London, UK.

PMID: 40398885
DOI: 10.1177/14604582251334338

Abstract

Objective: This paper demonstrates a data quality assurance (DQA) process as a means to identify and handle flaws in data, and hence improve the accuracy of an investigation into the prevalence of harmful versus non-harmful/near-miss incident reports in a single NHS acute provider.

Methods: The three-step DQA process consists of an initial univariate data quality analysis, followed by a bivariate missingness analysis, and concluding with the design of appropriate multiple imputation techniques. With data quality established, the acuity and incident data were aggregated and aligned to the Ward-Month level for the period August 2015 to December 2020 inclusive. The final analysis was performed using binary regression, pooling results across the imputed datasets via Rubin's rules.

Results: The application of our three-step quality assurance process was able to detect and correct for common data quality issues. The resulting analysis identified a Ward dependency for the effect of Covid-19 lockdown measures on incident reporting culture which would have been missed without the applied imputation strategy.

Conclusions: Our approach outlines a replicable methodology for understanding and fixing data quality issues in operational data. As daily operational decisions are being guided by data, it is important to leverage appropriate imputation techniques and ensure an optimal decision is reached.
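The pooling step named in the Methods can be illustrated with a minimal sketch of Rubin's rules for a single regression coefficient. This is not the authors' code: the function name `pool_rubin` and the example numbers are hypothetical, and it assumes each imputed dataset has already been fitted separately, yielding one coefficient estimate and one squared standard error per dataset.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: per-imputation point estimates of the coefficient
    variances: per-imputation squared standard errors of that coefficient
    (Hypothetical helper for illustration only.)
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")

    pooled_estimate = mean(estimates)      # average of the point estimates
    within_var = mean(variances)           # average within-imputation variance
    between_var = variance(estimates)      # sample variance (divisor m - 1)
    total_var = within_var + (1 + 1 / m) * between_var

    return {
        "estimate": pooled_estimate,
        "within_var": within_var,
        "between_var": between_var,
        "total_var": total_var,
        "std_error": sqrt(total_var),
    }


# Illustrative, made-up values for three imputed datasets.
result = pool_rubin([0.4, 0.5, 0.6], [0.01, 0.01, 0.01])
print(result["estimate"], result["std_error"])
```

The total variance combines the average uncertainty within each imputed dataset with the extra uncertainty caused by the missing values themselves (the between-imputation term), so the pooled standard error is generally wider than any single-dataset standard error.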

Keywords: data quality; missing data; routinely collected data.

MeSH terms

COVID-19 / epidemiology
Data Accuracy*
Humans
Routinely Collected Health Data*
SARS-CoV-2
State Medicine* / organization & administration
State Medicine* / statistics & numerical data
United Kingdom