Geometric Shepherding RL

This project studies the shepherding problem: how a single dog can guide a flock of sheep to a goal in a continuous 2D environment with obstacles and limited visibility.

The main goal of the project is to compare three approaches under the same environment:

a geometric heuristic baseline
behavioral cloning from expert demonstrations
reinforcement learning with PPO / recurrent PPO

Project Idea

The dog is the only controlled agent.
The sheep follow rule-based flocking and escape behavior.
The challenge is to move the flock to the goal while keeping it visible, compact, and away from obstacles.
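
To make the sheep behavior concrete, here is a minimal sketch of one rule-based update step. It is not the environment's actual dynamics: the function name `sheep_step`, the gains, and the flee radius are all assumptions chosen for illustration. Each sheep drifts toward the flock centroid and is pushed directly away from the dog when the dog is close.

```python
import math

def sheep_step(sheep, dog, flee_radius=8.0, flee_gain=1.0,
               cohesion_gain=0.3, speed=0.5):
    """Move every sheep one step (hypothetical rules, not the repo's model)."""
    n = len(sheep)
    cx = sum(x for x, _ in sheep) / n
    cy = sum(y for _, y in sheep) / n
    new_positions = []
    for x, y in sheep:
        # Flocking: pull toward the flock centroid.
        vx = cohesion_gain * (cx - x)
        vy = cohesion_gain * (cy - y)
        # Escape: push directly away from the dog if it is within range.
        dx, dy = x - dog[0], y - dog[1]
        dist = math.hypot(dx, dy)
        if 0.0 < dist < flee_radius:
            vx += flee_gain * dx / dist
            vy += flee_gain * dy / dist
        # Cap the step length at the maximum sheep speed.
        norm = math.hypot(vx, vy)
        if norm > speed:
            vx, vy = vx * speed / norm, vy * speed / norm
        new_positions.append((x + vx, y + vy))
    return new_positions
```

With the dog far away the flock slowly contracts; with the dog nearby the escape term dominates and the sheep move away from it.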

This repository is built around a shared research environment so the different methods can be compared fairly on:

success rate
episode return
distance to goal
efficiency
generalization to unseen scenarios

What Is Implemented

Heuristic Baseline

A hand-designed collect-and-drive controller based on flock geometry.
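
As a rough illustration of the collect-and-drive idea, the sketch below picks a target point for the dog from flock geometry alone. The function name `collect_and_drive_target` and the parameters `compact_radius` and `standoff` are hypothetical, and obstacle handling is omitted, so the controller in `baselines/` may differ in its details.

```python
import math

def collect_and_drive_target(sheep, goal, compact_radius=3.0, standoff=2.0):
    """Return (mode, target point) for the dog. Illustrative only."""
    n = len(sheep)
    cx = sum(x for x, _ in sheep) / n
    cy = sum(y for _, y in sheep) / n
    # The sheep furthest from the centroid decides how spread out the flock is.
    far = max(sheep, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
    spread = math.hypot(far[0] - cx, far[1] - cy)
    if spread <= compact_radius:
        # Drive: stand behind the centroid, on the side opposite the goal.
        mode, anchor, toward = "drive", (cx, cy), goal
    else:
        # Collect: stand behind the stray sheep, opposite the centroid.
        mode, anchor, toward = "collect", far, (cx, cy)
    dx, dy = anchor[0] - toward[0], anchor[1] - toward[1]
    d = math.hypot(dx, dy) or 1.0  # avoid division by zero
    return mode, (anchor[0] + standoff * dx / d, anchor[1] + standoff * dy / d)
```

The dog then steers toward the returned point; pushing from behind the anchor moves the stray sheep toward the flock, or the flock toward the goal.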

Behavioral Cloning

A supervised learning pipeline that trains a random forest to imitate the heuristic expert from engineered geometric features.

Reinforcement Learning

Feedforward and recurrent PPO agents trained in the same shepherding environment.
The recurrent policy matters because the environment is partially observable: a memory state lets the agent carry information across steps instead of acting only on the current observation.

Environment

The main environment is src/shepherding/envs/herding_env_v3.py.

It includes:

partial observability
obstacle-aware movement
reward shaping based on flock geometry
domain-randomized and structured training setups
deterministic unseen evaluation scenarios
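
Partial observability can be pictured as a fixed-size observation in which sheep outside the dog's sensing range are masked out. The sketch below is an assumption about one possible encoding, not the layout used in herding_env_v3.py: `build_observation`, `sensing_radius`, and the `(dx, dy, visible)` triple per sheep are hypothetical, and occlusion by obstacles is ignored.

```python
import math

def build_observation(dog, sheep, max_sheep, sensing_radius=10.0):
    """Flat observation: (dx, dy, visible_flag) per sheep slot. Illustrative."""
    obs = []
    for x, y in sheep[:max_sheep]:
        dx, dy = x - dog[0], y - dog[1]
        if math.hypot(dx, dy) <= sensing_radius:
            obs.extend([dx, dy, 1.0])    # visible: dog-relative offset
        else:
            obs.extend([0.0, 0.0, 0.0])  # hidden: zeroed and flagged
    # Pad unused slots so the observation length never changes.
    unused = max_sheep - min(len(sheep), max_sheep)
    obs.extend([0.0, 0.0, 0.0] * unused)
    return obs
```

A feedforward policy sees only this snapshot, while a recurrent policy can also use what it observed on earlier steps.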

The reward combines signals such as:

progress toward the goal
flock compactness
visibility maintenance
collision penalties
driving the flock from a useful position
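
One simple way to combine these signals is a weighted sum, sketched below. The function name `shaped_reward`, the input quantities, and every weight are placeholders for illustration; the actual terms and coefficients live in the environment code and may be defined differently.

```python
import math

def shaped_reward(prev_centroid, centroid, goal, flock_radius,
                  visible_fraction, drive_alignment, collided, weights=None):
    """Weighted sum of shaping terms (hypothetical weights and inputs)."""
    w = {"progress": 1.0, "compact": 0.1, "visible": 0.1,
         "drive": 0.1, "collision": 1.0}
    if weights:
        w.update(weights)
    # Progress: how much closer the flock centroid got to the goal this step.
    progress = math.dist(prev_centroid, goal) - math.dist(centroid, goal)
    return (w["progress"] * progress
            - w["compact"] * flock_radius        # penalize a spread-out flock
            + w["visible"] * visible_fraction    # fraction of sheep in view
            + w["drive"] * drive_alignment       # in [-1, 1]; 1 = dog behind flock
            - w["collision"] * (1.0 if collided else 0.0))
```

Here `drive_alignment` is assumed to be the cosine between the dog-to-centroid and centroid-to-goal directions, so it is highest when the dog pushes from directly behind the flock.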

Repository Structure

src/shepherding/
├── baselines/      # Heuristic controller
├── envs/           # Environment implementations
├── imitation/      # Behavioral cloning pipeline
├── research/       # RL model building, evaluation, benchmarking
├── scenarios/      # Scenario definitions
└── utils/          # Geometry and helper functions

scripts/            # Training, evaluation, plotting, rendering
configs/            # Experiment configs
tests/              # Unit and regression tests
streamlit_app.py    # Presentation dashboard

Installation

git clone https://github.com/<your-username>/geometric-shepherding-rl.git
cd geometric-shepherding-rl
uv sync

Main Workflows

Train RL

uv run python scripts/train_v3_recurrent.py --seed 0

Train Behavioral Cloning

uv run python scripts/generate_bc_dataset.py
uv run python scripts/train_bc.py

Run Benchmark Comparison

uv run python scripts/benchmark_v3.py \
  --config configs/research/v3_fast.yaml \
  --benchmark-config configs/research/benchmark_v3_fast_ds.yaml \
  --output-dir results/research_v3_fast/ds_benchmark

Generate Comparison Figures

uv run python scripts/analyze_ds_results.py \
  --results-dir results/research_v3_fast/ds_benchmark \
  --bc-metrics models/imitation_fast/random_forest/metrics.json

Launch Presentation Dashboard

uv run streamlit run streamlit_app.py

Results

The project supports two kinds of evaluation:

offline evaluation for behavioral cloning, such as regression error and angle error
online evaluation in the environment, such as success rate and goal proximity

This makes it possible to compare not only how well a model imitates expert actions, but also how well it actually controls the flock when rolled out in the environment.
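
As an example of an offline metric, the sketch below computes a mean angle error between predicted and expert actions. It assumes each action is a 2D direction vector, which may not match the action format used in this repository, and `mean_angle_error_deg` is a hypothetical name.

```python
import math

def mean_angle_error_deg(predicted, expert):
    """Mean absolute heading difference in degrees between action pairs."""
    total = 0.0
    for (px, py), (ex, ey) in zip(predicted, expert):
        diff = math.atan2(py, px) - math.atan2(ey, ex)
        # Wrap into [-pi, pi) so headings on either side of 180 degrees
        # count as close together.
        diff = (diff + math.pi) % (2.0 * math.pi) - math.pi
        total += abs(diff)
    return math.degrees(total / len(predicted))
```

A low angle error does not guarantee good herding: small per-step errors can compound during a rollout, which is why the online metrics are reported alongside it.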

Where To Start Reading

If you want the quickest understanding of the project, start with:

Testing

uv run python -m unittest discover -s tests -t .

License

MIT License. See LICENSE.
