Developing and validating an accelerated online Canadian training in data science for the health sciences community

Digit Health. 2025 May 14;11:20552076251343792. doi: 10.1177/20552076251343792. eCollection 2025 Jan-Dec.

Abstract

Background: With the rise of digital health, proficiency in data science is becoming increasingly important. University curricula often lack adequate data analysis and programming education, hence the need to develop an accessible training platform.

Methods: A free, accelerated R programming course was developed for healthcare trainees with no prior programming experience. The first module comprised seven video capsules delivered over 10 days to teach foundational R programming for clinical research. A pretest-posttest study assessed participants' skills before training, immediately after training, and three months later. Participants (students, researchers, professors) were recruited from Montreal's academic healthcare community. A Real-Time Delphi method guided test development, and mixed-effects models were used to compare scores across time points.

Results: Of 102 enrolled participants, 100 were analyzed; most were aged 20-30 (72%) and medical students (92%, 69.6%). Of these, 84% successfully completed the course within 10 days (95% CI [77%-91%]). Mean test scores increased from 4.5/10 at pretest (95% CI [4.1-4.8]) to 8.4/10 immediately posttraining (95% CI [8.1-8.7]) (p < .001; Cohen's d = 2.5). Scores at three months (6.8/10, 95% CI [6.4-7.2]) remained significantly higher than baseline (p < .001), despite a slight expected decline.
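
As an illustration of how the interval around the completion rate can be read, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion. The abstract does not state which interval method the authors used, so the Wald formula is an assumption here, as are the counts of 84 completers out of 100 analyzed participants (inferred from the reported 84%) and the helper name `wald_interval`.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Hypothetical helper for illustration; not the authors' stated method.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Assumed counts: 84 of the 100 analyzed participants completed the course.
low, high = wald_interval(84, 100)
print(f"{low:.1%} to {high:.1%}")
```

Under these assumptions the bounds round to 77% and 91%, which agrees with the reported interval; other methods (for example Wilson or exact intervals) would give slightly different bounds.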

Conclusion: This accelerated R programming course effectively improves data science skills in healthcare trainees with no prior knowledge. It addresses key gaps in formal data science education and has the potential to enhance independent research and analysis skills as a complement to university curricula.

Keywords: R; Training program; data science; health science; medical education; programming; university students.