
baxter_dqn

Work in progress

Reinforcement learning in a simulated environment for control of the Baxter robot manipulator.

BaxterEnv.lua interfaces with the Atari DQN, providing a custom environment that conforms to the DQN's environment API. It passes a resized 4x60x60 tensor (an RGB image plus a fourth channel containing motor position information) from the simulator to the DQN, and in return passes action commands back to the simulator.
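For reference, a minimal sketch of what this interface looks like, assuming the rlenvs-style API (getStateSpec, getActionSpec, start, step) used by the Atari DQN; the helper functions (resetSim, readObservation, applyAction, computeReward) are hypothetical placeholders, not the actual BaxterEnv.lua implementation:

-- Minimal rlenvs-style environment sketch (assumed API; helpers are hypothetical)
local BaxterEnv = {}

-- 4x60x60 real-valued state: RGB image plus a motor-position channel
function BaxterEnv:getStateSpec()
  return {'real', {4, 60, 60}, {0, 1}}
end

-- One discrete action index per step
function BaxterEnv:getActionSpec()
  return {'int', 1, {1, self.nActions}}
end

-- Reset the simulator and return the initial observation
function BaxterEnv:start()
  self:resetSim()
  return self:readObservation()
end

-- Send a command to the simulator; return reward, next state, terminal flag
function BaxterEnv:step(action)
  self:applyAction(action)
  local reward, terminal = self:computeReward()
  return reward, self:readObservation(), terminal
end

return BaxterEnv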

A coloured sphere, cylinder or box is spawned at a random orientation at the start of each episode and on reset. The Baxter robot attempts to navigate its arm to pick up the object. Movement of the arm is currently limited to rotation at the wrist and shoulder, plus the ability to extend the reach while the gripper is forced to face downwards.
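As an illustration, this discrete action set might be encoded along these lines; the indices, names and step sizes below are hypothetical, not the repository's actual mapping:

-- Hypothetical action table, for illustration only
local actions = {
  {joint = 'wrist',    delta =  0.1},   -- rotate wrist one way
  {joint = 'wrist',    delta = -0.1},   -- rotate wrist the other way
  {joint = 'shoulder', delta =  0.1},   -- rotate shoulder one way
  {joint = 'shoulder', delta = -0.1},   -- rotate shoulder the other way
  {joint = 'reach',    delta =  0.05},  -- extend reach, gripper kept facing down
  {joint = 'reach',    delta = -0.05},  -- retract reach, gripper kept facing down
  {joint = 'gripper'},                  -- attempt pickup (ends the episode)
}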

An attempt to pick up the object terminates the episode, as unsuccessful attempts often throw the object out of reach. Success is gauged by checking that the pose of the object approximately matches the pose of the end-effector at the end of the pickup action. A partial reward is given if the robot makes contact with the object. At termination the environment is reset.
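A sketch of this reward and termination logic, with hypothetical thresholds and helper names (the repository's actual values and checks may differ):

-- Illustrative reward/termination logic; thresholds and helpers are assumptions
local CONTACT_REWARD = 0.5   -- partial reward for touching the object
local SUCCESS_REWARD = 1.0   -- full reward for a successful pickup
local POSE_TOLERANCE = 0.05  -- how close object and end-effector poses must be

function BaxterEnv:computeReward()
  if self.pickupAttempted then
    -- Any pickup attempt terminates the episode; success is judged by
    -- whether the object pose roughly matches the end-effector pose
    local d = self:poseDistance(self.objectPose, self.endEffectorPose)
    local reward = (d < POSE_TOLERANCE) and SUCCESS_REWARD or 0
    return reward, true
  elseif self:inContact() then
    return CONTACT_REWARD, false
  end
  return 0, false
end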

Requirements

Torch (to run the Atari DQN)
The Atari DQN repository
A ROS workspace with the Baxter simulator (baxter_gazebo) installed

Installation

Clone the Atari repository, then place BaxterNet.lua and BaxterEnv.lua in the Atari folder.

Place the baxter_dqn_ros package in your ROS workspace alongside the Baxter simulator installation.

While in the ROS workspace, rebuild by running

source ./devel/setup.bash
catkin_make
catkin_make install

Use

Launch baxter_gazebo with

./baxter.sh sim
roslaunch baxter_gazebo baxter_world.launch

Once loaded, in a new terminal run

source ./devel/setup.bash
rosrun baxter_dqn_ros torch_control.py

Navigate to the Atari directory and run

th main.lua -env BaxterEnv -modelBody BaxterNet -doubleQ false -duel false -bootstraps 0 -PALpha 0

These flags select the custom environment and network body, and disable double Q-learning, the duelling architecture, bootstrapped heads and persistent advantage learning.

Known issues

The origin of an object is not always at its centre, which causes some variation in pose when spawning at a random orientation. As a result it is sometimes impossible to pick up the object; this should be resolved when the number of actions is increased.

Acknowledgements

Kai Arulkumaran for his assistance

Reference

Network architecture based on: Sergey Levine et al. (2016). Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection.
