Reinforcement Learning with Proximal Policy Optimization (PPO) algorithm for the Bitcraze Crazyflie 2.1
This project implements the Proximal Policy Optimization (PPO) algorithm for the Crazyflie 2.1 drone. It includes both a simulated environment and a real-world setup for training PPO to control the Crazyflie 2.1 so that it hovers at a specific altitude.
- Linux Operating System
- Anaconda Distribution (optional but recommended)
- Crazyflie 2.1
- Crazyflie Radio
- Flow deck v2
- Batteries
This project uses a Linux operating system with Python 3.9.20.
To get started, it is recommended to install Anaconda on your laptop or PC. Once Anaconda is installed, follow the steps below:
Open a terminal and run the following commands:
conda create --name crazyflie_env python=3.9.20
conda activate crazyflie_env
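Once the environment is active, a quick check like the one below can confirm that the interpreter matches the Python version this project targets. It is a minimal sketch: the helper name version_matches is hypothetical, and it compares only the major and minor version numbers.

```python
import sys

# Version this project was developed with (major, minor), taken from the text above.
REQUIRED = (3, 9)


def version_matches(actual, required=REQUIRED):
    """Return True when the major and minor version numbers agree."""
    return tuple(actual[:2]) == tuple(required)


if __name__ == "__main__":
    current = sys.version_info
    status = "OK" if version_matches(current) else "unexpected version"
    print(f"Python {current.major}.{current.minor}.{current.micro}: {status}")
```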
Begin by upgrading the pip module to the latest version. Run the following command in your terminal:
pip3 install --upgrade pip
Then, download and unzip this repository to your computer. Once unzipped, navigate to the folder. For example, if the folder is located in your Downloads directory, use the following command to change to that directory:
cd /home/[Your_Name]/Downloads/Crazyflie_RL-main
pip3 install -r requirements.txt
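After the requirements are installed, a short script can report which modules are still missing. This is a sketch under assumptions: the list below contains only cflib and wandb because this guide mentions them, and requirements.txt may list other packages. The helper name missing_modules is hypothetical.

```python
import importlib.util

# Assumed module names; adjust to match the contents of requirements.txt.
EXPECTED_MODULES = ["cflib", "wandb"]


def missing_modules(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules are importable.")
```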
It is recommended to connect the wandb module for logging and experiment tracking. Follow these steps:
First, sign up (create an account) on the Wandb website. Next, run the following command in your terminal to connect your machine to your account:
wandb login
After you run wandb login, the terminal prompts you for your API key. You can find your API key under User Settings -> API Keys on the Wandb website. Copy the key and paste it into the terminal when prompted.
To ensure the installation was successful, run the following command in the terminal:
cfclient
If the installation is successful, the cfclient graphical interface opens.
Note: If the command does not work, repeat the installation steps above or troubleshoot the reported error before continuing.
Verify that git is installed by running:
git --version
If git is not installed, use the following commands to install it:
sudo apt update
sudo apt install git
Next, create a folder for the firmware download. The following commands set up the directory; in this example, the folder is created in Documents:
cd /home/[Your_Name]/Documents
mkdir crazyflie_doc
cd crazyflie_doc
Follow the steps outlined in the installation guide. First, install the build tools:
sudo apt-get install make gcc-arm-none-eabi
Verify that tkinter is installed by running:
python3 -m tkinter
If tkinter is not installed, use the following command to install it:
sudo apt install python3-tk
If the installation is successful, running python3 -m tkinter again opens a small test window.
Press "QUIT".
Download the Crazyflie firmware by running the following command:
git clone --recursive https://github.com/bitcraze/crazyflie-firmware.git
Verify that the firmware has been downloaded by checking the created folder (crazyflie_doc).
Run the following commands to initialize and update the submodules:
cd crazyflie-firmware
git submodule init
git submodule update
Make sure no errors are reported.
Compile the firmware using:
make cf2_defconfig
make -j$(nproc)
If no errors occur, the setup is complete.
- Connect the Crazyflie radio to your laptop or PC.
- Turn on the Crazyflie drone.
- Verify communication between the drone and the radio:
  - Run the following command in the terminal: cfclient
  - In the cfclient interface, press scan. The interface should detect the radio and drone.
    - If the drone isn't detected, enter the address manually by writing 0xE7E7E7E7E7. I changed mine, so the modified address is 0xE7E7E7E704.
  - Press connect.
  - If you move the drone, you will start to see changes in the Flight Data part.
- We don't need this interface any further. The only thing we need from it is the Crazyflie radio and address information (e.g. 'radio://0/80/2M/E7E7E7E7E7'). We can then close this interface.
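The URI shown above packs the radio index, channel, data rate, and drone address into one string. The sketch below composes and splits such a string so the pieces noted from cfclient can be checked before they go into a script. The helper names build_uri and parse_uri are hypothetical and are not part of cflib.

```python
import re

# Data rates that appear in Crazyradio URIs.
VALID_DATARATES = ("250K", "1M", "2M")
URI_PATTERN = re.compile(r"radio://(\d+)/(\d+)/(250K|1M|2M)/([0-9A-Fa-f]{10})")


def build_uri(channel, datarate, address, radio_index=0):
    """Compose a radio URI such as 'radio://0/80/2M/E7E7E7E7E7'."""
    if not 0 <= channel <= 125:
        raise ValueError("channel must be between 0 and 125")
    if datarate not in VALID_DATARATES:
        raise ValueError(f"datarate must be one of {VALID_DATARATES}")
    if not 0 <= address <= 0xFFFFFFFFFF:
        raise ValueError("address must fit in 5 bytes")
    # The address is written as 10 uppercase hex digits.
    return f"radio://{radio_index}/{channel}/{datarate}/{address:010X}"


def parse_uri(uri):
    """Split a full radio URI into (radio_index, channel, datarate, address)."""
    match = URI_PATTERN.fullmatch(uri)
    if match is None:
        raise ValueError(f"not a full radio URI: {uri!r}")
    index, channel, datarate, address = match.groups()
    return int(index), int(channel), datarate, int(address, 16)


if __name__ == "__main__":
    print(build_uri(channel=80, datarate="2M", address=0xE7E7E7E7E7))
    print(parse_uri("radio://0/100/2M/E7E7E7E704"))
```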
Make sure cflib is installed by running:
pip3 show cflib
If not, run the following command:
pip3 install cflib
Follow the steps given on the Step-by-Step: Connecting, logging and parameters webpage to check whether the Crazyflie is connected. Start with the following Python script:
import logging
import time
import cflib.crtp
from cflib.crazyflie import Crazyflie
from cflib.crazyflie.syncCrazyflie import SyncCrazyflie
# URI to the Crazyflie to connect to
uri = 'radio://0/80/2M/E7E7E7E7E7'
def simple_connect():
    print("Yeah, I'm connected! :D")
    time.sleep(3)
    print("Now I will disconnect :'(")


if __name__ == '__main__':
    # Initialize the low-level drivers
    cflib.crtp.init_drivers()

    # The connection is opened on entry and closed on exit of the with block
    with SyncCrazyflie(uri, cf=Crazyflie(rw_cache='./cache')) as scf:
        simple_connect()
If an error occurs, the most likely cause is the URI. Go back to the cfclient interface, check the radio and address information, and fix the code accordingly. For example, my URI is: uri = 'radio://0/100/2M/E7E7E7E704'.
If the code works properly, you will see the output below:
Yeah, I'm connected! :D
Now I will disconnect :'(
The Flow deck is a sensor board that the Crazyflie uses to estimate its position. We need to make sure it is attached and detected. Run the following code:
import logging
import sys
import time
from threading import Event
import cflib.crtp
from cflib.crazyflie import Crazyflie
from cflib.crazyflie.syncCrazyflie import SyncCrazyflie
from cflib.utils import uri_helper
URI = uri_helper.uri_from_env(default='radio://0/80/2M/E7E7E7E7E7')
deck_attached_event = Event()
logging.basicConfig(level=logging.ERROR)
def param_deck_flow(_, value_str):
    # Called with the value of the deck.bcFlow2 parameter (nonzero if attached)
    value = int(value_str)
    print(value)
    if value:
        deck_attached_event.set()
        print('Deck is attached!')
    else:
        print('Deck is NOT attached!')


if __name__ == '__main__':
    cflib.crtp.init_drivers()

    with SyncCrazyflie(URI, cf=Crazyflie(rw_cache='./cache')) as scf:
        scf.cf.param.add_update_callback(group='deck', name='bcFlow2',
                                         cb=param_deck_flow)
        # Give the callback time to run before disconnecting
        time.sleep(1)
If the Flow deck is connected properly, the output will include the line below:
Deck is attached!
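The script above pauses for a fixed second so the parameter callback has time to run. Waiting on the event itself with a timeout is a common alternative. The sketch below shows that pattern without any hardware: a timer thread stands in for the radio link and delivers the parameter value. The helper name wait_for_deck is hypothetical.

```python
from threading import Event, Timer

deck_attached_event = Event()


def param_deck_flow(_, value_str):
    """Same shape as the callback above: set the event when the value is nonzero."""
    if int(value_str):
        deck_attached_event.set()


def wait_for_deck(timeout):
    """Block until the callback reports the deck or the timeout (seconds) expires."""
    return deck_attached_event.wait(timeout=timeout)


if __name__ == "__main__":
    # Stand-in for the radio link: deliver the parameter value after 0.1 s.
    Timer(0.1, param_deck_flow, args=("deck.bcFlow2", "1")).start()
    if wait_for_deck(timeout=2.0):
        print("Deck is attached!")
    else:
        print("No flow deck detected!")
```

Event.wait returns as soon as the event is set, so the script does not sleep longer than necessary, and it returns False when the timeout passes without a callback.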
More detailed information is provided in Step-by-Step: Motion Commander.
Before assigning specific tasks, test basic motions such as take-off, forward movement, backward movement, and landing. Place the drone in a safe, open area to prevent any accidents.
Run the following code:
import logging
import sys
import time
from threading import Event
import cflib.crtp
from cflib.crazyflie import Crazyflie
from cflib.crazyflie.syncCrazyflie import SyncCrazyflie
from cflib.positioning.motion_commander import MotionCommander
from cflib.utils import uri_helper
URI = uri_helper.uri_from_env(default='radio://0/80/2M/E7E7E7E7E7')
DEFAULT_HEIGHT = 0.5
deck_attached_event = Event()
logging.basicConfig(level=logging.ERROR)
def move_linear_simple(scf):
    # Entering the with block takes off to DEFAULT_HEIGHT
    with MotionCommander(scf, default_height=DEFAULT_HEIGHT) as mc:
        time.sleep(1)
        mc.forward(0.5)
        time.sleep(1)
        mc.back(0.5)
        time.sleep(1)
        mc.land()


def param_deck_flow(_, value_str):
    value = int(value_str)
    print(value)
    if value:
        deck_attached_event.set()
        print('Deck is attached!')
    else:
        print('Deck is NOT attached!')


if __name__ == '__main__':
    cflib.crtp.init_drivers()

    with SyncCrazyflie(URI, cf=Crazyflie(rw_cache='./cache')) as scf:
        scf.cf.param.add_update_callback(group='deck', name='bcFlow2',
                                         cb=param_deck_flow)
        time.sleep(1)

        # Do not fly without the Flow deck
        if not deck_attached_event.wait(timeout=5):
            print('No flow deck detected!')
            sys.exit(1)

        move_linear_simple(scf)
Other motion commands for the Crazyflie are documented on the motion_commander and commander webpages.
To inspect the position and orientation of the Crazyflie, we need to log these values so they can be exported and analyzed.
The following code prints the x, y and z position of the Crazyflie in real time:
import logging
import sys
import time
from threading import Event
import cflib.crtp
from cflib.crazyflie import Crazyflie
from cflib.crazyflie.log import LogConfig
from cflib.crazyflie.syncCrazyflie import SyncCrazyflie
from cflib.positioning.motion_commander import MotionCommander
from cflib.utils import uri_helper
URI = uri_helper.uri_from_env(default='radio://0/80/2M/E7E7E7E7E7')
DEFAULT_HEIGHT = 0.5
deck_attached_event = Event()
logging.basicConfig(level=logging.ERROR)
position_estimate = [0, 0, 0]
def move_linear_simple(scf):
    with MotionCommander(scf, default_height=DEFAULT_HEIGHT) as mc:
        time.sleep(1)
        mc.land()


def log_pos_callback(timestamp, data, logconf):
    # Called for every log packet; data maps variable names to values
    print(data)
    global position_estimate
    position_estimate[0] = data['stateEstimate.x']
    position_estimate[1] = data['stateEstimate.y']
    position_estimate[2] = data['stateEstimate.z']


def param_deck_flow(_, value_str):
    value = int(value_str)
    print(value)
    if value:
        deck_attached_event.set()
        print('Deck is attached!')
    else:
        print('Deck is NOT attached!')


if __name__ == '__main__':
    cflib.crtp.init_drivers()

    with SyncCrazyflie(URI, cf=Crazyflie(rw_cache='./cache')) as scf:
        scf.cf.param.add_update_callback(group='deck', name='bcFlow2',
                                         cb=param_deck_flow)
        time.sleep(1)

        # Request the estimated position every 10 ms
        logconf = LogConfig(name='Position', period_in_ms=10)
        logconf.add_variable('stateEstimate.x', 'float')
        logconf.add_variable('stateEstimate.y', 'float')
        logconf.add_variable('stateEstimate.z', 'float')
        scf.cf.log.add_config(logconf)
        logconf.data_received_cb.add_callback(log_pos_callback)

        if not deck_attached_event.wait(timeout=5):
            print('No flow deck detected!')
            sys.exit(1)

        logconf.start()
        move_linear_simple(scf)
        logconf.stop()
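The callback above only prints each packet and keeps the latest position. To export the data for later analysis, the samples can be collected and written out as CSV. The following is a minimal sketch that uses fake packets in place of a live connection; the names samples and rows_to_csv are hypothetical.

```python
import csv
import io

FIELDS = ["timestamp", "stateEstimate.x", "stateEstimate.y", "stateEstimate.z"]
samples = []  # one dict per received log packet


def log_pos_callback(timestamp, data, logconf):
    """Store each packet, with its timestamp, instead of only printing it."""
    row = {"timestamp": timestamp}
    row.update({name: data[name] for name in FIELDS[1:]})
    samples.append(row)


def rows_to_csv(rows):
    """Render the collected rows as CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Fake packets standing in for data received from the Crazyflie.
    log_pos_callback(10, {"stateEstimate.x": 0.0, "stateEstimate.y": 0.0,
                          "stateEstimate.z": 0.1}, None)
    log_pos_callback(20, {"stateEstimate.x": 0.0, "stateEstimate.y": 0.0,
                          "stateEstimate.z": 0.2}, None)
    print(rows_to_csv(samples))
```

In a real run, the same callback would be registered with logconf.data_received_cb.add_callback, and the returned text could be written to a file after logconf.stop().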
If you are also interested in the roll, pitch and yaw orientation of the Crazyflie, replace the position variables with the orientation variables, or log both.
def log_pos_callback(timestamp, data, logconf):
    print(data)
    global position_estimate
    position_estimate[0] = data['stabilizer.roll']
    position_estimate[1] = data['stabilizer.pitch']
    position_estimate[2] = data['stabilizer.yaw']
.
.
.
logconf = LogConfig(name='Stabilizer', period_in_ms=10)
logconf.add_variable('stabilizer.roll', 'float')
logconf.add_variable('stabilizer.pitch', 'float')
logconf.add_variable('stabilizer.yaw', 'float')
scf.cf.log.add_config(logconf)
logconf.data_received_cb.add_callback(log_pos_callback)
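When both position and orientation are logged, one callback can handle packets from either log configuration by copying only the variables present in each packet. The sketch below assumes the same callback is registered on both configurations; the names TRACKED, latest_state and log_state_callback are hypothetical, and the packets are faked so the snippet runs without a drone.

```python
# Variables to track; all six names appear in the snippets above.
TRACKED = (
    "stateEstimate.x", "stateEstimate.y", "stateEstimate.z",
    "stabilizer.roll", "stabilizer.pitch", "stabilizer.yaw",
)
latest_state = {name: 0.0 for name in TRACKED}


def log_state_callback(timestamp, data, logconf):
    """Copy whichever tracked variables are present in this packet."""
    for name in TRACKED:
        if name in data:
            latest_state[name] = data[name]


if __name__ == "__main__":
    # Fake packets: one from a position config, one from a stabilizer config.
    log_state_callback(10, {"stateEstimate.x": 0.1, "stateEstimate.y": 0.0,
                            "stateEstimate.z": 0.5}, None)
    log_state_callback(10, {"stabilizer.roll": 1.5, "stabilizer.pitch": -0.3,
                            "stabilizer.yaw": 90.0}, None)
    print(latest_state)
```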
This work utilizes the Proximal Policy Optimization (PPO) algorithm in both simulated and real-world Crazyflie environments to achieve stable hovering at a specified altitude.
If you have not already done so, download and unzip this repository. Then change to its folder. For example, if it is in Downloads, run:
cd /home/[Your_Name]/Downloads/Crazyflie_RL-main
To train the PPO algorithm in the simulated Crazyflie environment, use the following command:
python train.py --task simulation
To train the PPO algorithm in the real-world Crazyflie environment, use this command:
python train.py --task real
To specify the desired altitude, use the following command:
python train.py --target_altitude 1.0
In the example above, the target altitude is set to 1 meter.
For safety, it is recommended to specify the maximum allowable velocity (±) for the Crazyflie's actions. Use the following command to define the range:
python train.py --action_range 0.20
In this example, the maximum velocity is set to 0.20 meters per second.
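One plausible way such a limit could be enforced is to clamp every commanded velocity to the symmetric interval before it is sent to the drone. This is an illustrative sketch only: the function name clip_velocity is hypothetical, and the repository may apply the limit differently.

```python
def clip_velocity(velocity, action_range):
    """Limit a commanded velocity (m/s) to [-action_range, +action_range]."""
    if action_range < 0:
        raise ValueError("action_range must be non-negative")
    return max(-action_range, min(action_range, velocity))


if __name__ == "__main__":
    for requested in (-0.5, 0.1, 0.5):
        print(requested, "->", clip_velocity(requested, action_range=0.20))
```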
A command that incorporates these parameters might look like this:
python train.py --task simulation --target_altitude 1.0 --action_range 0.20
Other parameters can also be adjusted in a similar manner using command line arguments. Alternatively, they can be manually edited in the arguments.py file.
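For readers unfamiliar with command-line parsing, the sketch below shows how the three flags used above could be declared with argparse. Only the flag names come from this guide; the defaults, help texts, and the function name build_parser are assumptions, and the real arguments.py may differ.

```python
import argparse


def build_parser():
    """Declare the three flags shown in the commands above (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Train PPO on a Crazyflie hover task")
    parser.add_argument("--task", choices=["simulation", "real"], default="simulation",
                        help="train in the simulated or the real environment")
    parser.add_argument("--target_altitude", type=float, default=1.0,
                        help="hover altitude in meters")
    parser.add_argument("--action_range", type=float, default=0.20,
                        help="maximum velocity magnitude in meters per second")
    return parser


# Equivalent to:
#   python train.py --task simulation --target_altitude 1.0 --action_range 0.20
args = build_parser().parse_args(
    ["--task", "simulation", "--target_altitude", "1.0", "--action_range", "0.20"]
)
print(args.task, args.target_altitude, args.action_range)
```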
The train.py file is the main executable responsible for parsing the arguments defined in arguments.py. It also initializes both the environment and RL algorithm using the parameters specified in the file.
The arguments.py file defines a set of arguments intended to be parsed from the command line. These arguments are utilized by the train.py file to configure the RL algorithm and the Crazyflie environment. Detailed instructions for using the command line can be found in the Running the RL Environment section.
The networks.py file contains the neural network architectures for the policy and value function used in the PPO algorithm. Parameters for these neural networks can be adjusted either in the arguments.py file or directly via the command line.
The ppo_agent.py file implements the PPO algorithm and manages the training process.
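As background on what a PPO implementation optimizes, the sketch below computes the standard clipped surrogate objective in plain Python. It is not taken from ppo_agent.py: the function names are hypothetical, and the clipping value 0.2 is a commonly used default rather than a setting confirmed for this repository.

```python
def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """PPO clipped objective for one sample: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


def policy_loss(ratios, advantages, clip_eps=0.2):
    """Negative mean of the clipped objective, so that minimizing it improves the policy."""
    terms = [clipped_surrogate(r, a, clip_eps) for r, a in zip(ratios, advantages)]
    return -sum(terms) / len(terms)


if __name__ == "__main__":
    # ratio = new_policy_probability / old_policy_probability for the taken action
    print(clipped_surrogate(ratio=1.5, advantage=1.0))
    print(policy_loss([1.0, 1.5, 0.5], [0.5, 1.0, -1.0]))
```

The clipping removes the incentive to move the probability ratio further than clip_eps away from 1 in the direction that the advantage favors, which keeps each policy update close to the policy that collected the data.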
The crazyflie_env.py file defines the environment used for training with the PPO algorithm. It includes the reset(), step(), and render() functions, supporting both simulation and real environments. This file specifies the observation space, action space, and reward function, all of which can be customized based on the task.
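To illustrate the reset() and step() interface described here, the toy environment below models altitude only, treats the action as a vertical velocity command, and rewards closeness to the target. It is a sketch under stated assumptions and not the repository's crazyflie_env.py: the class name, the observation layout, the reward, and the time step are all hypothetical.

```python
class HoverEnvSketch:
    """Toy 1-D altitude environment for illustration only."""

    def __init__(self, target_altitude=1.0, action_range=0.20, dt=0.1, max_steps=200):
        self.target_altitude = target_altitude
        self.action_range = action_range
        self.dt = dt
        self.max_steps = max_steps
        self.altitude = 0.0
        self.steps = 0

    def reset(self):
        """Start on the ground and return the initial observation."""
        self.altitude = 0.0
        self.steps = 0
        return self._observation()

    def step(self, action):
        """Apply a vertical velocity command (m/s) for one time step."""
        velocity = max(-self.action_range, min(self.action_range, action))
        self.altitude = max(0.0, self.altitude + velocity * self.dt)
        self.steps += 1
        # Assumed reward: negative absolute altitude error.
        reward = -abs(self.target_altitude - self.altitude)
        done = self.steps >= self.max_steps
        return self._observation(), reward, done, {}

    def _observation(self):
        # Observation: current altitude and signed error to the target.
        return (self.altitude, self.target_altitude - self.altitude)


if __name__ == "__main__":
    env = HoverEnvSketch()
    obs = env.reset()
    for _ in range(3):
        # Simple hand-written policy: climb while below the target.
        obs, reward, done, info = env.step(0.20 if obs[1] > 0 else -0.20)
        print(obs, reward, done)
```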
If you have any questions, feel free to email [email protected]


