Practice Deep Reinforcement Learning (DRL) with Gymnasium.
- Easy hands-on exercises on your own laptop (macOS/Windows/Linux).
- No long training runs.
Check the Command Guide for the step-by-step commands:
- Create the conda environment and install the dependencies with pip.
- Exercise
  - For each exercise, implement all `NotImplementedError`s in the `*_exercise.py` file, then train it with the provided command (a rough stub sketch follows this list).
  - [Optional] Generate a video and push the video/results to the Hugging Face Hub (see the `RecordVideo` sketch below).
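As a rough illustration of the workflow (class and function names here are hypothetical, not this repo's actual API), an `*_exercise.py` file marks the parts you need to fill in with `NotImplementedError`:

```python
# Hypothetical sketch of an *_exercise.py stub; names are illustrative only.
import numpy as np


class EpsilonGreedyPolicy:
    """Picks a random action with probability epsilon, otherwise the greedy one."""

    def __init__(self, n_actions: int, epsilon: float = 0.1):
        self.n_actions = n_actions
        self.epsilon = epsilon

    def select_action(self, q_values: np.ndarray) -> int:
        # The unsolved exercise would contain only:
        #     raise NotImplementedError
        # A possible solution:
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.n_actions)
        return int(np.argmax(q_values))
```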
Don't pick a game that is too hard or a neural network that is too big, but feel free to try on your own.
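For the optional video step, Gymnasium ships a `RecordVideo` wrapper. A minimal sketch (the environment, folder name, and random policy are just placeholders for your trained agent):

```python
import gymnasium as gym
from gymnasium.wrappers import RecordVideo

# render_mode="rgb_array" is required so frames can be captured.
env = gym.make("CartPole-v1", render_mode="rgb_array")
env = RecordVideo(env, video_folder="videos", name_prefix="eval",
                  episode_trigger=lambda ep: True)  # record every episode

obs, info = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # replace with the trained policy
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
env.close()  # flushes the video file(s) to disk
```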
| Exercise | Algorithm | Verification Game | Challenge Game | State | Action |
|---|---|---|---|---|---|
| 1. q_learning | Q-table | FrozenLake | Taxi | 📊 | 📊 |
| 2. dqn | Deep Q-Network -> Rainbow | LunarLander-v3 (1D state) | LunarLander-v3 (image) | 🌊 | 📊 |
| 3. reinforce | REINFORCE (Monte Carlo) | CartPole-v1 | - | 🌊 | 📊 |
| 4. curiosity | Curiosity (REINFORCE, baseline, reward shaping) | - | MountainCar-v0 | 🌊 | 📊 |
| 5. A2C | A2C + GAE (or A2C + n-step TD) | CartPole-v1 | LunarLander-v3 | 🌊 | 📊 |
| 6. A3C | A3C (using A2C + GAE) | CartPole-v1 | LunarLander-v3 | 🌊 | 📊 |
| 7. PPO | PPO | CartPole-v1 | LunarLander-v3 | 🌊 | 📊 |
| 8. TD3 | Twin Delayed DDPG (TD3) | Pendulum-v1 | Walker2d-v5 | 🌊 | 🌊 |
| 9. SAC | Soft Actor-Critic (SAC) | Pendulum-v1 | Walker2d-v5 | 🌊 | 🌊 |
| 10. PPO+DDP | PPO + Curiosity | Reacher-v5 | Pusher-v5 | 🌊 | 🌊 |
| 11. SAC+DDP | SAC + PER | Reacher-v5 | Pusher-v5 | 🌊 | 🌊 |
| 12. MBPO | Model-Based Policy Optimization | Pusher-v5 | Walker2d-v5 | 🌊 | 🌊 |
where 🌊 = continuous and 📊 = discrete.
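For reference, a minimal tabular Q-learning loop for the FrozenLake verification game of exercise 1 might look like the sketch below (hyperparameters and episode count are illustrative, not the ones used in the exercise):

```python
import gymnasium as gym
import numpy as np

env = gym.make("FrozenLake-v1", is_slippery=True)
n_states, n_actions = env.observation_space.n, env.action_space.n
q_table = np.zeros((n_states, n_actions))

alpha, gamma, epsilon = 0.1, 0.99, 0.1  # illustrative hyperparameters

for episode in range(5000):
    state, _ = env.reset()
    done = False
    while not done:
        # epsilon-greedy action selection from the Q table
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_table[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # one-step Q-learning (TD) update; bootstrap only if not terminal
        td_target = reward + gamma * np.max(q_table[next_state]) * (not terminated)
        q_table[state, action] += alpha * (td_target - q_table[state, action])
        state = next_state

env.close()
```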
After studying Hugging Face's Deep RL Course and Pieter Abbeel's Foundations of Deep RL in 6 Lectures, I wanted to build a deeper and broader understanding by coding the algorithms myself.
- RL Algorithms
- OpenAI's Spinning Up
- Stable Baselines3