Background: Persistent severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection differs from long coronavirus disease 2019 (COVID-19), which is characterized by symptoms lasting ≥ 12 weeks post-clearance. The Omicron BA.5 variant has a shorter median clearance time (10-14 days) than the Delta variant, suggesting that the traditional 20-day diagnostic threshold may delay interventions in high-risk populations. This study integrated multi-threshold analysis (14/20/30 days), whole-genome sequencing, and machine learning to investigate diagnostic thresholds for persistent SARS-CoV-2 infection and to develop a generalizable risk prediction model.
Methods: This retrospective study analyzed data from 1,216 patients with COVID-19 hospitalized at Aerospace Center Hospital between January 2021 and October 2024. We used whole-genome sequencing to genotype all COVID-19 cases and to identify major variants (such as Omicron BA.5 and Delta). The outcome, "persistent SARS-CoV-2 infection," was defined as viral nucleic acid positivity for ≥ 14 days. Risk factors associated with persistent infection were identified through subgroup analysis with multiple logistic regression (adjusted for age, comorbidities, vaccination status, and virus strain) and machine learning models (70% training and 30% testing datasets).
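As a minimal sketch of the labeling and data-split steps described in the Methods, the following Python snippet flags persistent infection at each of the three thresholds and performs a 70/30 split. The function names, the random seed, and the toy positivity durations are illustrative assumptions and are not taken from the study data or the authors' code.

```python
import random

# Thresholds (in days) examined in the multi-threshold analysis.
THRESHOLDS_DAYS = (14, 20, 30)


def label_persistent(days_positive, threshold_days=14):
    """Return True when nucleic acid positivity lasted at least threshold_days."""
    return days_positive >= threshold_days


def split_train_test(records, train_fraction=0.7, seed=0):
    """Shuffle records reproducibly and split them into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # the seed is an arbitrary assumption
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical durations of viral positivity in days (toy values, not study data).
toy_days = [5, 9, 13, 14, 16, 21, 25, 30, 33, 41]

# One list of outcome labels per diagnostic threshold.
labels = {t: [label_persistent(d, t) for d in toy_days] for t in THRESHOLDS_DAYS}

train, test = split_train_test(toy_days)
```

Lowering the threshold from 30 to 14 days can only increase the number of cases labeled persistent, which is why the choice of threshold directly affects how early high-risk patients are flagged.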
Results: Persistent SARS-CoV-2 infection was identified in 15.5% (188/1,216) of hospitalized COVID-19 patients. Key predictors included comorbidities (hypertension, diabetes, and active malignancy) and immune dysfunction, marked by reduced B-cell and CD4+ T-cell counts. Unvaccinated patients exhibited an 82% higher risk of persistent infection. Elevated inflammatory markers (C-reactive protein and interleukin-6) and bilateral lung infiltrates on computed tomography further distinguished persistent cases. The predictive model demonstrated strong discrimination, with an area under the curve (AUC) of 0.847 (95% confidence interval: 0.815-0.879) and an AUC of 0.81 in external validation, underscoring its clinical utility for risk stratification.
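To illustrate what the reported AUC measures, the sketch below computes it as the probability that a randomly chosen persistent case receives a higher predicted risk than a randomly chosen non-persistent case, with ties counted as one half. This is a generic rank-based formulation, not necessarily the procedure the authors used, and the labels and scores are hypothetical toy values. The last line only re-derives the reported proportion 188/1,216.

```python
def auc_from_scores(labels, scores):
    """Compute AUC by comparing every positive score with every negative score."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive case ranked above negative case
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = persistent infection) and predicted risks.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_auc = auc_from_scores(toy_labels, toy_scores)

# Arithmetic check of the reported prevalence: 188 of 1,216 patients.
prevalence_percent = round(100 * 188 / 1216, 1)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so values such as 0.847 and 0.81 indicate that the model ranks most persistent cases above non-persistent ones.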
Conclusions: Hypertension, diabetes, malignancy, immunosuppression (low B-cell and CD4+ T-cell counts), and non-vaccination are independent risk factors for persistent SARS-CoV-2 infection. Integrating these factors into clinical risk stratification may optimize management of high-risk populations.
Keywords: Clinical manifestations; Persistent infection; Predictive model construction; Risk factors; SARS-CoV-2.
© 2025. The Author(s).