Research code for testing multiple learned "brains" on a MuJoCo quadruped and related embodied-control assets. The repo combines a JAX evolution-strategy trainer, a MuJoCo rollout backend, CPG and symmetry controller priors, and a small benchmark for comparing research-inspired policy architectures.
The current focus is not a single final walking policy. It is a clean experimental scaffold for asking: which inductive biases make quadruped control easier to learn under short, noisy, reproducible training budgets?
- MuJoCo quadruped simulation with typed runtime configs and quality gates.
- OpenAI-ES style JAX trainer with pluggable policy architectures.
- Three retained research-brain variants:
  - research_contact_memory_cpg: contact-memory CPG modulation.
  - research_symmetry_cpg: mirror-projected CPG modulation with left/right equivariance.
  - research_limb_attention: shared limb-token attention for direct motor targets.
- Invented benchmark tasks for forward reach, offset recovery, and contact-rhythm stability.
- Native MuJoCo viewers for quadruped, battlebot, and Tesselate assets.
- Unit tests covering configs, control projection, MuJoCo backend behavior, model registry loading, research brain plugins, and smoke quality gates.
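To make the "OpenAI-ES style" trainer in the list above concrete, here is a minimal sketch of one evolution-strategy update. It is written in plain Python rather than JAX so it runs without dependencies, and it is not the repo's trainer: the function name `es_step`, the antithetic sampling, the centered-rank shaping, and all hyperparameter defaults are assumptions chosen for illustration.

```python
import random


def es_step(theta, fitness_fn, rng, population=16, sigma=0.1, lr=0.05):
    """One OpenAI-ES style update (illustrative sketch, not the repo's code).

    Uses antithetic sampling (each noise vector is tried as +eps and -eps)
    and centered-rank fitness shaping. `population` must be even and >= 2.
    """
    dim = len(theta)
    noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
              for _ in range(population // 2)]

    # Evaluate every perturbed candidate: (fitness, sign, noise index).
    scored = []
    for i, eps in enumerate(noises):
        for sign in (1.0, -1.0):
            candidate = [t + sign * sigma * e for t, e in zip(theta, eps)]
            scored.append((fitness_fn(candidate), sign, i))

    # Centered ranks in [-0.5, 0.5] make the update invariant to fitness scale.
    n = len(scored)
    order = sorted(range(n), key=lambda k: scored[k][0])
    ranks = [0.0] * n
    for rank, k in enumerate(order):
        ranks[k] = rank / (n - 1) - 0.5

    # Rank-weighted sum of noise directions approximates the search gradient.
    grad = [0.0] * dim
    for (_, sign, i), weight in zip(scored, ranks):
        for d in range(dim):
            grad[d] += weight * sign * noises[i][d]

    # Gradient ascent: higher fitness is better.
    return [t + lr * g / (n * sigma) for t, g in zip(theta, grad)]
```

A policy architecture plugs in by flattening its parameters into `theta` and returning a rollout score from `fitness_fn`; the update itself never needs gradients of the simulator.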
The scores in the table below come from the short benchmark in reports/research_brain_benchmark.json. They are fast regression and comparison signals, not claims of converged locomotion performance.
| Rank | Model | Mean task score | Control mode | Research idea |
|---|---|---|---|---|
| 1 | research_contact_memory_cpg | 5.602 | cpg_params | CPG-RL plus contact-state memory |
| 2 | research_symmetry_cpg | 5.023 | cpg_params | Morphological symmetry projection |
| 3 | research_limb_attention | 4.846 | motor_targets | Morphology-aware limb-token attention |
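
The table distinguishes a `cpg_params` control mode from direct `motor_targets`. As a rough illustration of the difference, the sketch below shows how a policy that outputs oscillator parameters (frequency and amplitude) could be turned into per-leg joint targets by a central pattern generator. The leg names, the trot phase offsets, the sinusoidal waveform, and both function names are assumptions for this example, not the repo's controller.

```python
import math

# Assumed leg order and trot offsets (diagonal legs move in phase).
LEGS = ("front_left", "front_right", "rear_left", "rear_right")
TROT_OFFSETS = (0.0, 0.5, 0.5, 0.0)  # fractions of one gait cycle


def advance_phase(phase, frequency_hz, dt):
    """Integrate the shared oscillator phase and wrap it into [0, 1)."""
    return (phase + frequency_hz * dt) % 1.0


def cpg_targets(phase, amplitude, offset=0.0):
    """Map one shared phase to a target joint angle per leg (radians)."""
    return {
        leg: offset + amplitude * math.sin(2.0 * math.pi * (phase + leg_offset))
        for leg, leg_offset in zip(LEGS, TROT_OFFSETS)
    }


# Usage: the policy picks frequency and amplitude; the CPG produces targets.
phase = 0.0
for _ in range(5):
    phase = advance_phase(phase, frequency_hz=2.0, dt=0.02)
    targets = cpg_targets(phase, amplitude=0.3)
```

In this framing the learned policy only has to modulate a few smooth parameters, while a `motor_targets` policy must produce every joint command itself.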
Run the benchmark:
python3 tools/benchmark_research_quadruped_brains.py --generations 1
Set up a virtual environment, install dependencies, and run the tests:
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt
python3 -m unittest discover -s tests -v
Run the fastest quality gate:
python3 train_headless.py --config configs/smoke.yaml --quality-only
Train a retained research brain for a short local run:
python3 train_headless.py --config configs/research_contact_cpg.yaml --generations 10
Convenience targets are available through make:
make test
make quality
make benchmark
Repository layout:
brains/
  config/        Typed runtime config loading and validation.
  controllers/   CPG and symmetry controller building blocks.
  harnesses/     Scripted and robot-specific experiment harnesses.
  models/        Model registry, built-in policies, and research brain plugins.
  research/      Research benchmark tasks and scoring.
  runtime/       Checkpoints, manifests, logging, and quality gates.
  sim/           MuJoCo backend, action projection, and asset loading.
configs/         Reproducible runtime specs for smoke, default, and research runs.
docs/            Architecture, experiments, reproducibility, and asset notes.
reports/         Checked-in benchmark summaries.
assets/mujoco/   MJCF scenes and generated STL assets.
notebooks/       Exploratory notebooks; not the CI source of truth.
tests/           Unittest suite used as the main regression gate.
tools/           Training, benchmark, viewer, and CAD-generation entry points.
The repo currently keeps only the top three research-brain candidates in configs/model_registry.json. Their implementations live in brains/models/research_quadruped_brains.py, and the benchmark lives in brains/research/quadruped_brain_benchmark.py.
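
For readers unfamiliar with the idea behind research_limb_attention, the following is a minimal sketch of shared limb-token self-attention: each leg contributes one feature vector, and the same attention rule is applied to every leg. It uses a single head with identity query/key/value projections for brevity, which is an assumption; a learned policy would add trainable projections, and the repo's implementation may differ in every detail.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def limb_attention(tokens):
    """Single-head self-attention over per-limb feature vectors.

    tokens: list of equal-length lists, one per limb. Queries, keys, and
    values are the tokens themselves (identity projections, for brevity).
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    mixed = []
    for query in tokens:
        # Scaled dot-product score of this limb against every limb.
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in tokens]
        weights = softmax(scores)
        # Weighted average of all limb tokens.
        mixed.append([sum(w * tok[d] for w, tok in zip(weights, tokens))
                      for d in range(dim)])
    return mixed
```

Because every limb is processed by the same rule, reordering the input limbs reorders the outputs in the same way, which is the weight-sharing property that makes this kind of policy morphology-aware.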
The design draws from:
- CPG-RL and SYNLOCO style central pattern generator modulation.
- MS-PPO style morphological symmetry equivariance.
- UniLegs and related morphology-aware attention policies.
- Curriculum-style multi-task evaluation, adapted here into a tiny local benchmark.
See docs/EXPERIMENTS.md for the retained task set and scoring logic, and papers.md for the broader reading map.
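
As a small illustration of the left/right equivariance idea in the list above, the sketch below wraps an arbitrary policy so that mirroring the observation mirrors the action. It averages the policy with its reflected copy, which is one common way to obtain equivariance and not necessarily how the repo's symmetry projection works. The leg order (FL, FR, RL, RR) and the pure swap without sign flips are simplifying assumptions; real lateral joints usually also change sign under reflection.

```python
def mirror(values):
    """Swap left and right entries; assumed order is FL, FR, RL, RR."""
    fl, fr, rl, rr = values
    return [fr, fl, rr, rl]


def symmetrize(policy):
    """Return a policy that is equivariant under the left/right mirror.

    The wrapped output is the average of policy(obs) and the mirrored
    output of the policy on the mirrored observation.
    """
    def wrapped(obs):
        direct = policy(obs)
        reflected = mirror(policy(mirror(obs)))
        return [0.5 * (a + b) for a, b in zip(direct, reflected)]
    return wrapped


# Usage with a deliberately asymmetric toy policy.
def toy_policy(obs):
    return [2.0 * obs[0], obs[1] + 1.0, obs[2] - obs[3], obs[3] * obs[0]]


balanced = symmetrize(toy_policy)
obs = [0.1, -0.2, 0.3, 0.4]
# balanced(mirror(obs)) matches mirror(balanced(obs)) up to float rounding.
```

The averaging step removes whatever left/right bias the raw policy has, so the optimizer only searches over gaits that treat both sides of the body consistently.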
Primary local checks:
python3 -m unittest discover -s tests -v
python3 train_headless.py --config configs/smoke.yaml --quality-only
python3 tools/benchmark_research_quadruped_brains.py --generations 1
Docker check:
docker build -t multi-brain-quadruped:test .
docker run --rm multi-brain-quadruped:test
Longer training writes checkpoints under:
checkpoints/<model_type>_<log_id>/
See docs/REPRODUCIBILITY.md for exact commands, expected outputs, and artifact locations.
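
If a script needs to locate a run's checkpoints, the naming pattern above can be reproduced with a small helper. This is a sketch under assumptions: the function name `checkpoint_dir` and the example log id are hypothetical, and the actual format of `<log_id>` is defined by the repo's runtime code, not here.

```python
from pathlib import Path


def checkpoint_dir(model_type, log_id, root="checkpoints"):
    """Build the checkpoints/<model_type>_<log_id>/ path for one run."""
    return Path(root) / f"{model_type}_{log_id}"


# Hypothetical log id, used only to show the resulting path shape.
run_dir = checkpoint_dir("research_contact_memory_cpg", "run001")
```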
On macOS, the MuJoCo passive viewer must run on the main thread, so use mjpython from the mujoco package. On Linux, regular python3 is usually sufficient.
.venv/bin/mjpython main.py --config configs/smoke.yaml
.venv/bin/mjpython tools/view_scripted_rollout.py --robot quadruped
.venv/bin/mjpython tools/view_battlebot.py
.venv/bin/mjpython tools/view_tesselate.py
Open MJCF scenes directly:
python3 -m mujoco.viewer --mjcf assets/mujoco/scene.xml
python3 -m mujoco.viewer --mjcf assets/mujoco/battlebot_scene.xml
python3 -m mujoco.viewer --mjcf assets/mujoco/tesselate_scene.xml
Runtime simulation only needs the checked-in MJCF/STL assets. STEP conversion is optional and requires CAD dependencies.
# Battlebot STEP to STL/MJCF
python3 tools/generate_mujoco_from_step.py
# Tesselate STEP to STL/MJCF
python3 tools/generate_tesselate_mujoco_from_step.py
For CAD conversion environments:
conda install -c conda-forge pythonocc-core
# or, for the Tesselate assembly converter on systems with wheels:
python3 -m pip install cadquery
See docs/ASSETS.md for what is generated and what is hand-authored.
- The quadruped is the main training target used by tests and quality gates.
- The battlebot and Tesselate assets are included as additional embodied-control scenes and CAD-to-MJCF examples.
- Stunt commands such as front flip, back flip, and side roll are experimental placeholders. They are not presented as solved behaviors or quality gates.
- The benchmark is intentionally short so it can run on a laptop and in CI-like loops. For publication-quality claims, increase generations, seeds, and evaluation episodes.
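
When scaling the benchmark up across seeds, as the last note suggests, it helps to report spread alongside the mean. The helper below is a minimal sketch for that; the name `summarize_scores` is hypothetical and the input is assumed to be one task score per seed.

```python
import math
import statistics


def summarize_scores(scores):
    """Return (mean, standard error of the mean) for per-seed scores.

    With fewer than two seeds the standard error is undefined, so NaN
    is returned in its place.
    """
    mean = statistics.fmean(scores)
    if len(scores) < 2:
        return mean, float("nan")
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, sem
```

Two models whose means differ by less than a couple of standard errors should not be treated as ranked with confidence.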
If this repo helps your work, cite it with CITATION.cff.