Hospital outpatient volume is influenced by a variety of factors, including environmental conditions and healthcare resource availability. Accurate prediction of outpatient demand can enhance operational efficiency and improve the allocation of medical resources. This study aimed to develop a predictive model for daily hospital outpatient volume using the XGBoost algorithm and to compare its forecasting performance with that of the Seasonal AutoRegressive Integrated Moving Average with exogenous regressors (SARIMAX) and Random Forest (RF) models. The dataset comprises daily climate data (e.g., temperature, precipitation, PM2.5 levels), historical outpatient volume records, and the number of outpatient specialists available each day, covering the period from January 1, 2014, to October 31, 2024. Data preprocessing involved addressing missing values and encoding categorical variables. Model performance was assessed using four metrics: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and R-squared (R²). The XGBoost model exhibited superior predictive accuracy compared to both the SARIMAX and RF models, achieving the lowest MAE, RMSE, and MAPE and the highest R², and it captured key relationships between climate factors, resource availability, and outpatient volume. The most influential predictors were the number of outpatient specialists, temporal variables (year, quarter, month, and weekday), meteorological conditions (average temperature), and air quality (PM2.5). This study underscores the potential of machine learning algorithms such as XGBoost for predicting hospital outpatient demand. The findings can help hospitals adjust their resource allocation proactively, thereby improving their service capacity.
Keywords: Climate data; Hospital resource planning; Machine learning; Outpatient volume; Predictive analytics; XGBoost.
© 2025. The Author(s).