Hey, this is the first project I am publishing and my first time using GitHub outside of CS50. If you have any questions, feel free to let me know!
This project demonstrates a reinforcement learning approach, specifically Q-learning, to guide an agent through a maze. The agent learns an optimal policy for navigating the maze from a starting position to a goal position.
├── maze.py # Maze class definition (Environment)
├── q_learning.py # QLearningAgent class definition (Agent)
├── train.py # Training and testing logic
└── visualize.py # Functions for visualization
- Environment: The `maze.py` file defines the `Maze` class, which represents the maze environment. This environment provides the agent with states and rewards; the rewards are negative for hitting walls or taking steps.
- Agent: The `q_learning.py` file implements the `QLearningAgent` class, the core of the reinforcement learning solution. The agent uses a Q-table to represent the value function, which estimates the expected future reward for taking specific actions in different states. The agent learns through:
  - Exploration: Trying out random actions to gather information about the environment (using an exploration rate that decays over time).
  - Exploitation: Selecting actions that lead to the highest expected reward based on the current Q-table estimates.
  - Q-table Update: Updating Q-values based on observed rewards and the discounted future rewards of subsequent states.
- Training: The `train.py` file manages the training process. It creates the `Maze` environment, initializes the `QLearningAgent`, and runs the training loop. The agent learns over multiple episodes, accumulating experience and refining its policy.
- Visualization: The `visualize.py` file provides functions for:
  - Visualizing the Maze: Displaying the maze environment with the starting and goal locations.
  - Tracking the Agent's Path: Animating the agent's movement through the maze during training and testing, allowing you to observe its learning progress.
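To make the exploration, exploitation, and Q-table update steps concrete, here is a minimal, self-contained sketch of tabular Q-learning on a tiny grid. It is not the code from `maze.py` or `q_learning.py`: the 4x4 layout, the function names (`step`, `train`, `greedy_path`), the reward values (-10 for a wall, -1 per step, +100 at the goal), and the hyperparameters are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 4x4 maze (0 = free cell, 1 = wall); not the layout used in maze.py.
GRID = np.array([
    [0, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
])
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    r = state[0] + ACTIONS[action][0]
    c = state[1] + ACTIONS[action][1]
    rows, cols = GRID.shape
    if not (0 <= r < rows and 0 <= c < cols) or GRID[r, c] == 1:
        return state, -10.0, False  # hit a wall or the border: stay put (assumed penalty)
    if (r, c) == GOAL:
        return (r, c), 100.0, True  # assumed goal reward
    return (r, c), -1.0, False      # assumed per-step penalty


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=1.0,
          epsilon_decay=0.99, epsilon_min=0.01, max_steps=200, seed=0):
    """Run epsilon-greedy Q-learning and return the learned Q-table."""
    rng = np.random.default_rng(seed)
    q = np.zeros(GRID.shape + (len(ACTIONS),))  # one row of action values per cell
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = int(rng.integers(len(ACTIONS)))  # exploration: random action
            else:
                action = int(np.argmax(q[state]))         # exploitation: best known action
            next_state, reward, done = step(state, action)
            # Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
            future = 0.0 if done else gamma * np.max(q[next_state])
            q[state + (action,)] += alpha * (reward + future - q[state + (action,)])
            state = next_state
            if done:
                break
        epsilon = max(epsilon_min, epsilon * epsilon_decay)  # decay exploration per episode
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from START and return the visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        state, _, done = step(state, int(np.argmax(q[state])))
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

Running the script trains for 500 episodes and prints the cells visited when the agent follows its learned Q-table greedily from the start. The per-step penalty is what pushes the agent toward shorter routes, since every extra move lowers the total return.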
- Install Dependencies:
pip install numpy matplotlib
You can customize aspects of the maze environment, rewards, learning rate, exploration parameters, and other settings within the train.py file.
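When tuning the exploration parameters, it helps to know how long the agent keeps exploring. The sketch below assumes a multiplicative decay applied once per episode; the names `epsilon_start`, `epsilon_decay`, and `epsilon_min` are hypothetical and may not match the variables in `train.py`.

```python
def episodes_until_floor(epsilon_start, epsilon_decay, epsilon_min):
    """Count the episodes until a multiplicative decay brings epsilon down to its floor."""
    if not 0 < epsilon_decay < 1:
        raise ValueError("epsilon_decay must be between 0 and 1 (exclusive)")
    if not 0 < epsilon_min <= epsilon_start:
        raise ValueError("need 0 < epsilon_min <= epsilon_start")
    episodes, epsilon = 0, epsilon_start
    while epsilon > epsilon_min:
        epsilon *= epsilon_decay  # one decay step per episode (assumed schedule)
        episodes += 1
    return episodes


if __name__ == "__main__":
    # With these assumed values, exploration reaches its floor after 459 episodes.
    print(episodes_until_floor(1.0, 0.99, 0.01))
```

If the number of training episodes is much smaller than this count, the agent is still acting mostly at random when training ends, so either a faster decay or more episodes may be worth trying.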
The first bug I found is that closing the Python window too quickly causes a blank graph to show up.
- More Complex Environments: I will try to have the agent solve differently shaped mazes and add obstacles.
- Advanced RL Algorithms: Experiment with other reinforcement learning algorithms, such as Deep Q-Networks for handling more complex state spaces or SARSA for a different update rule.
- Applications in my other projects: I will apply the Q-learning technique I implemented in this project to other projects in the future.
Awesome friends and YouTube resources