To address the challenges of low sample utilization efficiency and the handling of temporal dependencies, this paper proposes an efficient path planning method for mobile robots in dynamic environments based on an improved twin delayed deep deterministic policy gradient (TD3) algorithm. The proposed method, named PL-TD3, integrates prioritized experience replay (PER) and long short-term memory (LSTM) neural networks, enhancing both sample efficiency and the ability to handle time-series data. To verify the effectiveness of the proposed method, simulation and practical experiments were designed and conducted. The simulation experiments included both static and dynamic obstacles in the test environment, along with experiments assessing generalization capability; the algorithm demonstrated superior performance in both execution time and path efficiency. The practical experiments, based on the assumptions of the simulation tests, further confirmed that PL-TD3 improves the effectiveness and robustness of path planning for mobile robots in dynamic environments.
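To make the PER component concrete, the following is a minimal sketch of a proportional prioritized replay buffer of the kind PL-TD3 builds on: transitions are sampled with probability proportional to their priority raised to the power alpha, and importance-sampling weights correct the resulting bias. All names and hyperparameter values (alpha, beta, capacity) are illustrative assumptions, not taken from the paper.

```python
import numpy as np


class PrioritizedReplayBuffer:
    """Minimal proportional PER sketch (illustrative, not the paper's code)."""

    def __init__(self, capacity, alpha=0.6, beta=0.4):
        self.capacity = capacity
        self.alpha = alpha    # how strongly priorities skew sampling
        self.beta = beta      # importance-sampling correction strength
        self.buffer = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current maximum priority so they are
        # replayed at least once before their TD error is known.
        max_prio = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            self.buffer[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        prios = self.priorities[: len(self.buffer)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        indices = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights (N * P(i))^-beta, normalized by the max.
        weights = (len(self.buffer) * probs[indices]) ** (-self.beta)
        weights /= weights.max()
        batch = [self.buffer[i] for i in indices]
        return batch, indices, weights

    def update_priorities(self, indices, td_errors, eps=1e-6):
        # Priority is the TD-error magnitude plus a small epsilon,
        # so no transition's sampling probability ever reaches zero.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + eps
```

In a TD3-style training loop, the critic update would scale each transition's loss by its importance weight and then call `update_priorities` with the new TD errors.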
Keywords: Long short-term memory (LSTM); Path planning in dynamic environment; Prioritized experience replay (PER); Reinforcement learning; Twin delayed deep deterministic policy gradient (TD3) algorithm.
© 2025. The Author(s).