Deep Reinforcement Learning for mobile robot navigation in ROS Gazebo (TD3 + PyTorch). The project uses a simulated Velodyne-like LiDAR and trains a TD3 agent to navigate a Pioneer P3DX robot to random goals while avoiding obstacles. Trained/tested with ROS Noetic on Ubuntu 20.04, Python 3.8.10 and PyTorch 1.10.
Goal-Driven Autonomous Exploration Through Deep Reinforcement Learning — Reinis Cimurs, Il Hong Suh, Jin Han Lee. IEEE RA-L, 2022.
- Clone the repository

  ```bash
  cd ~
  # replace <your-repo-url> with your repo
  git clone https://github.com/infinityengi/goal-driven-td3-nav.git
  cd DRL-robot-navigation
  ```

- Build and run the Docker image

  ```bash
  # Either use the provided helpers
  docker build -t drl-robot-nav .  # builds the image (script documents details)
  ./run.sh                         # runs the container and starts helper entrypoints
  ```
- Compile the ROS workspace (inside the container, or on a host where ROS Noetic is installed)

  ```bash
  cd ~/DRL-robot-navigation/catkin_ws
  catkin_make_isolated
  source devel_isolated/setup.bash
  ```

- Environment variables (these are also set by the `run.sh` script; repeat if needed)
  ```bash
  export ROS_HOSTNAME=localhost
  export ROS_MASTER_URI=http://localhost:11311
  export ROS_PORT_SIM=11311
  export GAZEBO_RESOURCE_PATH=~/DRL-robot-navigation/catkin_ws/src/multi_robot_scenario/launch
  source ~/.bashrc
  cd ~/DRL-robot-navigation/catkin_ws
  source devel_isolated/setup.bash
  ```

- Run training (start from the repository root or the TD3 folder)
  ```bash
  cd ~/DRL-robot-navigation/TD3
  python3 train_velodyne_td3.py
  ```

- Monitor training with TensorBoard
  ```bash
  cd ~/DRL-robot-navigation/TD3
  tensorboard --logdir runs
  # open the address TensorBoard prints (usually http://localhost:6006)
  ```

- Stop training
  - Preferred: press `Ctrl+C` in the terminal running the training script.
  - If training processes hang or you need a forced kill:

    ```bash
    killall -9 rosout roslaunch rosmaster gzserver nodelet robot_state_publisher gzclient python python3
    ```

- Test a trained model
  ```bash
  cd ~/DRL-robot-navigation/TD3
  python3 test_velodyne_td3.py
  ```

Adjust `<real_time_update_rate>` in your Gazebo world file to run the simulation faster than real time. Example file location in this project:

`catkin_ws/src/multi_robot_scenario/launch/TD3.world`
- Increasing the `real_time_update_rate` speeds up the simulation but may destabilize sensors/plugins if set too high. Try conservative values and test.
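For reference, the relevant block in a Gazebo world file looks roughly like this (the values shown are illustrative real-time defaults, not necessarily those shipped in `TD3.world`):

```xml
<physics type="ode">
  <!-- physics iterations attempted per wall-clock second -->
  <real_time_update_rate>1000</real_time_update_rate>
  <!-- simulated seconds advanced per iteration -->
  <max_step_size>0.001</max_step_size>
</physics>
```

The achievable speed-up is roughly `real_time_update_rate × max_step_size`: 1000 × 0.001 runs at real time, 2000 × 0.001 targets 2× real time, and a rate of 0 removes the cap entirely (run as fast as the hardware allows).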
- Training launches RViz by default (lightweight). The Gazebo GUI (`gzclient`) is not launched by default, to save GPU resources.
- To open the Gazebo GUI for a running simulation, run `gzclient` in a new terminal.
- To launch the GUI automatically, edit `empty_world.launch` in `catkin_ws/src/multi_robot_scenario/launch` and enable the GUI node (the launch file contains comments showing where to change this behavior).
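The stock `gazebo_ros` launch files gate the GUI behind a launch argument; the pattern to look for in `empty_world.launch` (names may differ slightly in this repo's copy) is:

```xml
<arg name="gui" default="false"/>
<!-- start gzclient only when gui:=true -->
<group if="$(arg gui)">
  <node name="gazebo_gui" pkg="gazebo_ros" type="gzclient" respawn="false" output="screen"/>
</group>
```

Flipping the `default` to `true`, or passing `gui:=true` on the `roslaunch` command line, then brings up the Gazebo client automatically.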
Sensor configuration lives in the Velodyne xacro/URDF and the robot xacro where the plugin is called.
Files to check and tune:
- `catkin_ws/src/velodyne_simulator/velodyne_description/urdf/VLP-16.urdf.xacro` — change sample count, min/max angle, etc.
- `catkin_ws/src/multi_robot_scenario/xacro/p3dx/pioneer3dx.xacro` — where the Velodyne plugin is included; set FOV, frequency, and origin here.
Notes:
- Field of View (FOV) is given in radians (left to right). If you need rear sensing, expand the FOV.
- Increase frequency or sample count to get denser scans, but be mindful of CPU/GPU and simulation stability.
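In the `velodyne_simulator` package these knobs are parameters of the VLP-16 xacro macro; a typical include looks like the following (parameter names follow the upstream `velodyne_description` package — verify against your local copy):

```xml
<xacro:include filename="$(find velodyne_description)/urdf/VLP-16.urdf.xacro"/>
<xacro:VLP-16 parent="base_link" name="velodyne" topic="/velodyne_points"
              hz="10" samples="360" gpu="false"
              min_angle="-${pi}" max_angle="${pi}">
  <!-- sensor pose relative to the parent link -->
  <origin xyz="0 0 0.2" rpy="0 0 0"/>
</xacro:VLP-16>
```

Here `min_angle`/`max_angle` of ±π give full 360° coverage; shrink them for a forward-facing FOV, and raise `hz` or `samples` for denser scans at higher compute cost.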
- TD3 (Twin Delayed DDPG) is an actor-critic method for continuous control. It uses two critic networks to reduce Q-value overestimation and delays policy updates.
- In this robotics context: the actor outputs continuous linear and angular velocity commands; the critics estimate the Q-value of state-action pairs.
- Observations are laser/LiDAR readings (Velodyne), optionally concatenated with goal polar coordinates.
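The goal's polar coordinates mentioned above can be derived from the robot pose and the goal position; a minimal sketch (function and variable names are illustrative, not taken from the repo):

```python
import math

def goal_polar(robot_x, robot_y, robot_yaw, goal_x, goal_y):
    """Return (distance, heading_error) of the goal in the robot frame.

    heading_error is the angle the robot must turn to face the goal,
    wrapped to [-pi, pi].
    """
    dx, dy = goal_x - robot_x, goal_y - robot_y
    distance = math.hypot(dx, dy)
    heading_error = math.atan2(dy, dx) - robot_yaw
    # wrap to [-pi, pi]
    heading_error = (heading_error + math.pi) % (2 * math.pi) - math.pi
    return distance, heading_error

# Example: goal 1 m ahead and 1 m to the left of a robot facing +x
d, a = goal_polar(0.0, 0.0, 0.0, 1.0, 1.0)
```

These two scalars are simply concatenated onto the flattened LiDAR vector to form the state the networks see.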
Overall, TD3 trains an actor network that outputs continuous commands (linear and angular velocities) and two critic networks that estimate the expected return. Using a replay buffer and delayed policy updates, the agent learns to reach randomly placed goals while avoiding obstacles, using LiDAR observations.
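As a concrete illustration of the clipped double-Q idea, here is a scalar sketch of TD3's critic target and target-policy smoothing (illustrative names; this is not code from `train_velodyne_td3.py`):

```python
import random

def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller target-critic estimate."""
    q_next = min(q1_next, q2_next)  # pessimistic estimate counters overestimation
    return reward + gamma * (1.0 - done) * q_next

def smoothed_target_action(mu, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    """Target-policy smoothing: add clipped Gaussian noise to the target action."""
    noise = max(-noise_clip, min(noise_clip, random.gauss(0.0, noise_std)))
    return max(-max_action, min(max_action, mu + noise))

# Non-terminal transition: bootstrap from min(Q1', Q2') = 4.0
y = td3_target(reward=1.0, done=0.0, q1_next=5.0, q2_next=4.0)   # 1.0 + 0.99 * 4.0
# Terminal transition: target collapses to the reward
y_term = td3_target(reward=1.0, done=1.0, q1_next=5.0, q2_next=4.0)
```

In the full algorithm these scalars are per-transition tensors sampled from the replay buffer, and the actor and target networks are updated only every few critic steps — the "delayed" part of TD3.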
- `assets/training/loss_plot.png` — add your loss plot here and reference it in the README near the training section.
- `networks/td3_architecture.png` or `td3_architecture.svg` — visual diagram of your actor/critic networks.
- `training.gif` — keep at the repo root so it appears at the top of the README (as requested).
Use the issue template under `.github/ISSUE_TEMPLATE/bug_report.md` to collect environment details and reproduction steps. Example fields to request from reporters:
- OS, ROS distro, Python and PyTorch versions
- Exact command that failed
- TensorBoard screenshot or the `runs/` folder
- Small reproduction (launch file, small bag, or minimal steps)
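A minimal skeleton for `.github/ISSUE_TEMPLATE/bug_report.md` covering those fields could look like this (front-matter keys follow GitHub's issue-template convention; section names are illustrative):

```markdown
---
name: Bug report
about: Report a training or simulation problem
labels: bug
---

**Environment** (OS, ROS distro, Python and PyTorch versions):

**Exact command that failed**:

**Logs / TensorBoard screenshot or `runs/` folder**:

**Minimal reproduction** (launch file, small bag, or steps):
```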
- Tutorial series by the original author (installation, environment, training):
  - Part 1 (installation): https://medium.com/@reinis_86651/deep-reinforcement-learning-in-mobile-robot-navigation-tutorial-part1-installation-d62715722303
  - Part 3 (training): https://medium.com/@reinis_86651/deep-reinforcement-learning-in-mobile-robot-navigation-tutorial-part3-training-13b2875c7b51
  - Part 4 (environment): https://medium.com/@reinis_86651/deep-reinforcement-learning-in-mobile-robot-navigation-tutorial-part4-environment-7e4bc672f590
  - Part 5 (extra): https://medium.com/@reinis_86651/deep-reinforcement-learning-in-mobile-robot-navigation-tutorial-part5-some-extra-stuff-b744852345ac
- TD3 algorithm overview and background: see the original TD3 paper (Fujimoto et al., "Addressing Function Approximation Error in Actor-Critic Methods") and the OpenAI Spinning Up resources for algorithm intuition.
- Velodyne simulator used: https://github.com/lmark1/velodyne_simulator
- IEEE paper and citation on IEEE Xplore: https://ieeexplore.ieee.org/document/9645287

