A Visual Testing Harness for AI Coding Agents in Robot Simulation
Let Claude Code and Codex see what the robot is doing, judge if it's working, and iterate autonomously.
Front view: Plan → Pregrasp → Approach → Close → Lift → Holding
Top-down view: object alignment and grasp closure
Auto-generated from CI on every push to main — MuJoCo grasp, G1 WBC reach, G1 locomotion, native LeRobot G1 (GR00T + SONIC planner), standalone SONIC planner, and SONIC tracking.
| Demo | Description | Report | Run |
|---|---|---|---|
| MuJoCo Grasp | Scripted grasp with Meshcat 3D, multi-view captures | Live | python examples/mujoco_grasp.py --report |
| G1 WBC Reach | Whole-body IK reaching (Pinocchio + Pink) | Live | python examples/g1_wbc_reach.py --report |
| G1 Locomotion | GR00T RL stand→walk→stop, HuggingFace model | Live | python examples/lerobot_g1.py --report |
| G1 Native LeRobot (GR00T) | Official make_env() factory + GR00T Balance + Walk | Live | python examples/lerobot_g1_native.py --controller groot --report |
| G1 Native LeRobot (SONIC) | Official make_env() factory + SONIC planner | Live | python examples/lerobot_g1_native.py --controller sonic --report |
| SONIC Planner | Standalone GEAR-SONIC planner demo on G1 | Live | python examples/sonic_locomotion.py --report |
| SONIC Motion Tracking | Real encoder+decoder tracking demo on G1 | Live | python examples/sonic_tracking.py --report |
pip install roboharness # core (numpy only)
pip install roboharness[demo] # demo dependencies (MuJoCo, Meshcat, Gymnasium, Rerun, etc.)
pip install roboharness[demo,wbc] # + whole-body control (Pinocchio, Pink)
pip install roboharness[dev]      # development + SONIC real-model test deps

MuJoCo Grasp
pip install roboharness[demo]
python examples/mujoco_grasp.py --report

| pre_grasp | contact | grasp | lift |
|---|---|---|---|
| Gripper above cube | Lowered onto cube | Fingers closed | Cube lifted |
G1 Humanoid WBC Reach
pip install roboharness[demo,wbc]
python examples/g1_wbc_reach.py --report

Whole-body control (WBC) for the Unitree G1 humanoid via Pinocchio + Pink differential IK: the controller reaches with the upper body while maintaining lower-body balance, solving inverse kinematics for both arms simultaneously so the robot can reach arbitrary 3D targets without falling over.
Checkpoint captures: stand → reach_left → reach_both → retract
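The differential-IK loop behind the reach demo can be sketched with plain NumPy on a toy planar 2-link arm. The real example uses Pinocchio + Pink on the full 43-DOF model; everything below, including link lengths, gains, and step counts, is illustrative only:

```python
import numpy as np

L1, L2 = 0.30, 0.25  # toy link lengths (metres); illustrative only

def fk(q):
    """Forward kinematics of a planar 2-link arm: joint angles -> end-effector xy."""
    x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
    y = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def jacobian(q):
    """Analytic Jacobian d(fk)/dq of the same arm."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-L1 * s1 - L2 * s12, -L2 * s12],
                     [ L1 * c1 + L2 * c12,  L2 * c12]])

def reach(q, target, steps=200, dt=0.05, gain=2.0):
    """Velocity-level IK: q_dot = J^+ (gain * position_error), integrated over time."""
    for _ in range(steps):
        err = target - fk(q)
        qdot = np.linalg.pinv(jacobian(q)) @ (gain * err)
        q = q + dt * qdot
    return q

target = np.array([0.35, 0.20])
q = reach(np.array([0.5, 1.0]), target)
print(np.round(fk(q), 3))  # end-effector converges onto the target
```

A full WBC stack adds tasks (balance, posture, joint limits) and solves them jointly as a QP, but the core loop is this same error-driven velocity update.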
LeRobot G1 Locomotion
pip install roboharness[demo]
python examples/lerobot_g1.py --report

Integrates the real Unitree G1 43-DOF model from HuggingFace with GR00T WBC locomotion policies (Balance + Walk). The example downloads the model and ONNX policies automatically, runs the G1 through stand → walk → stop phases, and captures multi-camera checkpoints via RobotHarnessWrapper.
Native LeRobot Integration
pip install torch --index-url https://download.pytorch.org/whl/cpu # CPU-only
pip install roboharness[demo] lerobot
MUJOCO_GL=osmesa python examples/lerobot_g1_native.py --controller groot --report
MUJOCO_GL=osmesa python examples/lerobot_g1_native.py --controller sonic --report

Uses LeRobot's official make_env("lerobot/unitree-g1-mujoco") factory for standardized env creation. The published native demo reports are split by controller: one report for GR00T and one for SONIC. DDS-ready for sim-to-real transfer when hardware is available. See #83 for details.
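Because make_env() returns a vectorized env, a harness needs an adapter that presents a single env's unbatched view for capture. A minimal sketch of that idea with a stand-in batched env (the class and method names here are illustrative, not LeRobot's actual API):

```python
import numpy as np

class ToyVectorEnv:
    """Stand-in for a batched env: obs and rewards carry a leading batch dimension."""
    def __init__(self, num_envs=4, obs_dim=3):
        self.num_envs, self.obs_dim = num_envs, obs_dim
    def reset(self):
        return np.zeros((self.num_envs, self.obs_dim)), {}
    def step(self, actions):
        obs = np.tile(actions[:, :1], (1, self.obs_dim))  # dummy dynamics
        return obs, np.zeros(self.num_envs), False, False, {}

class SingleEnvView:
    """Adapter: expose env `index` of a vectorized env as an unbatched env."""
    def __init__(self, venv, index=0):
        self.venv, self.index = venv, index
    def reset(self):
        obs, info = self.venv.reset()
        return obs[self.index], info
    def step(self, action):
        actions = np.zeros((self.venv.num_envs, len(action)))
        actions[self.index] = action  # place the single action into the batch
        obs, rew, term, trunc, info = self.venv.step(actions)
        return obs[self.index], rew[self.index], term, trunc, info

env = SingleEnvView(ToyVectorEnv(), index=0)
obs, _ = env.reset()
obs, rew, *_ = env.step(np.array([1.0, 0.0, 0.0]))
print(obs.shape)  # unbatched observation: (3,)
```

The real VectorEnvAdapter additionally has to handle rendering and per-env termination, but the batch-slicing pattern is the same.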
SONIC Planner
pip install roboharness[demo]
MUJOCO_GL=osmesa python examples/sonic_locomotion.py --report --assert-success

Standalone NVIDIA GEAR-SONIC planner demo on the real Unitree G1 MuJoCo model. This path uses planner_sonic.onnx only: velocity commands go in, full-body pose trajectories come out, and the example uses a lightweight virtual torso harness for stable visual debugging. This is the same standalone planner path published at /sonic-planner/.
from roboharness.robots.unitree_g1 import SonicLocomotionController, SonicMode
ctrl = SonicLocomotionController()
action = ctrl.compute(
    command={"velocity": [0.3, 0.0, 0.0], "mode": SonicMode.WALK},
    state={"qpos": qpos, "qvel": qvel},
)

For a planner demo wired through LeRobot's official make_env() stack, see G1 Native LeRobot (SONIC) above. The planner path and the encoder+decoder tracking path are different inference stacks with different ONNX contracts; see docs/sonic-inference-stacks.md for the exact split, validation policy, and joint-order conventions.
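Joint-order conventions matter whenever qpos crosses a stack boundary (simulator order vs. the order a policy was trained on). A self-contained sketch of the remapping idea, using made-up joint names rather than the real G1 conventions documented in docs/sonic-inference-stacks.md:

```python
import numpy as np

# Hypothetical orderings; the real conventions are documented per inference stack.
SIM_ORDER    = ["hip_l", "knee_l", "hip_r", "knee_r"]
POLICY_ORDER = ["hip_l", "hip_r", "knee_l", "knee_r"]

# Precompute index maps once; remapping then becomes a cheap fancy-index.
sim_to_policy = np.array([SIM_ORDER.index(n) for n in POLICY_ORDER])
policy_to_sim = np.array([POLICY_ORDER.index(n) for n in SIM_ORDER])

qpos_sim = np.array([0.1, 0.2, 0.3, 0.4])                  # in simulator order
qpos_policy = qpos_sim[sim_to_policy]                      # reorder for the policy
assert np.allclose(qpos_policy[policy_to_sim], qpos_sim)   # round-trips exactly
print(qpos_policy)  # [0.1 0.3 0.2 0.4]
```

Building the maps from names instead of hand-written index lists makes a silent off-by-one in the joint layout much harder to introduce.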
SONIC Motion Tracking
pip install roboharness[demo]
MUJOCO_GL=osmesa python examples/sonic_tracking.py --report --assert-success

Real encoder+decoder tracking demo on the Unitree G1. This path uses model_encoder.onnx + model_decoder.onnx directly, replays a motion clip via set_tracking_clip(...), and records checkpoint metrics for torso height, tracking-frame progress, and joint-tracking error. This is the same path published at /sonic/.
from roboharness.robots.unitree_g1 import MotionClipLoader, SonicLocomotionController
ctrl = SonicLocomotionController()
clip = MotionClipLoader.load("path/to/dance_clip/")
ctrl.set_tracking_clip(clip)
action = ctrl.compute(
    command={"tracking": True},
    state={"qpos": qpos, "qvel": qvel},
)

Models (planner_sonic.onnx, model_encoder.onnx, model_decoder.onnx) are downloaded from HuggingFace (nvidia/GEAR-SONIC) on first use. Requires pip install roboharness[demo]. See docs/sonic-inference-stacks.md for the exact split between planner and tracking, plus the validation policy and joint-order conventions. See #86 (Phase 1) and #92 (Phase 2).
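At its core, clip replay means indexing the clip by elapsed time and interpolating between adjacent frames. A self-contained sketch of that mechanism (the frame rate, array shapes, and class name are illustrative; the real loader handles the GEAR-SONIC clip format and full-body poses):

```python
import numpy as np

class ToyClip:
    """A motion clip as an (n_frames, n_joints) array sampled at a fixed fps."""
    def __init__(self, frames, fps=30.0):
        self.frames = np.asarray(frames, dtype=float)
        self.fps = fps

    def sample(self, t):
        """Linearly interpolate joint targets at time t, clamped to the clip length."""
        x = np.clip(t * self.fps, 0.0, len(self.frames) - 1)
        i = int(np.floor(x))
        j = min(i + 1, len(self.frames) - 1)
        a = x - i
        return (1.0 - a) * self.frames[i] + a * self.frames[j]

clip = ToyClip(frames=[[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]], fps=30.0)
print(clip.sample(1 / 60))  # halfway between frames 0 and 1
```

The tracking controller's per-step job is then to feed the sampled reference frame (plus the current state) through the encoder and decoder to get joint actions, and the checkpoint metrics compare achieved pose against these sampled targets.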
import gymnasium as gym
from roboharness.wrappers import RobotHarnessWrapper
env = gym.make("CartPole-v1", render_mode="rgb_array")
env = RobotHarnessWrapper(
    env,
    checkpoints=[{"name": "early", "step": 10}, {"name": "mid", "step": 50}],
    output_dir="./harness_output",
)
obs, info = env.reset()
for _ in range(200):
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    if "checkpoint" in info:
        print(f"Checkpoint '{info['checkpoint']['name']}' captured!")

from roboharness import Harness
from roboharness.backends.mujoco_meshcat import MuJoCoMeshcatBackend
backend = MuJoCoMeshcatBackend(model_path="robot.xml", cameras=["front", "side"])
harness = Harness(backend, output_dir="./output", task_name="pick_and_place")
harness.add_checkpoint("pre_grasp", cameras=["front", "side"])
harness.add_checkpoint("lift", cameras=["front", "side"])
harness.reset()
result = harness.run_to_next_checkpoint(actions)
# result.views → multi-view screenshots, result.state → joint angles, poses

| Simulator | Status | Integration |
|---|---|---|
| MuJoCo + Meshcat | ✅ Implemented | Native backend adapter |
| LeRobot (G1 MuJoCo) | ✅ Implemented | Gymnasium Wrapper + Controllers |
| LeRobot Native (make_env) | ✅ Implemented | make_env() + VectorEnvAdapter |
| Isaac Lab | ✅ Implemented | Gymnasium Wrapper (GPU required for E2E) |
| ManiSkill | ✅ Implemented | Gymnasium Wrapper |
| LocoMuJoCo / MuJoCo Playground / unitree_rl_gym | 📋 Roadmap | Various |
- Harness only does "pause → capture → resume" — agent logic stays in your code
- Gymnasium Wrapper for zero-change integration — works with Isaac Lab, ManiSkill, etc.
- SimulatorBackend protocol — implement a few methods, plug in any simulator
- Agent-consumable output — PNG + JSON files that any coding agent can read
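The backend-protocol idea can be sketched with typing.Protocol: any object with the right methods plugs in, no inheritance required. The method names below are illustrative assumptions, not the actual SimulatorBackend contract; check the source for the real one:

```python
from typing import Protocol
import json

class SimulatorBackend(Protocol):
    """Illustrative backend contract: step physics, render named cameras, dump state."""
    def step(self, action) -> None: ...
    def render(self, camera: str) -> bytes: ...
    def get_state(self) -> dict: ...

class DummyBackend:
    """Trivial structural implementation; satisfies the Protocol without subclassing."""
    def __init__(self):
        self.t = 0
    def step(self, action) -> None:
        self.t += 1
    def render(self, camera: str) -> bytes:
        return b""  # a real backend returns image bytes here
    def get_state(self) -> dict:
        return {"step": self.t}

def checkpoint(backend: SimulatorBackend, name: str, cameras: list[str]) -> str:
    """Capture views + state as agent-readable JSON, mirroring the harness output idea."""
    views = {cam: len(backend.render(cam)) for cam in cameras}  # byte sizes as a stand-in
    return json.dumps({"name": name, "state": backend.get_state(), "views": views})

b = DummyBackend()
b.step(None)
print(checkpoint(b, "pre_grasp", ["front", "side"]))
```

Structural typing is what keeps the integration cost low: an Isaac Lab or ManiSkill adapter only needs to implement the few methods the harness actually calls.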
See docs/context.en.md for full background and motivation.
Roboharness builds on ideas from several research efforts in AI-driven robot evaluation and code-as-policy:
- FAEA — LLM agents as embodied manipulation controllers without demonstrations or fine-tuning (Tsui et al., 2026)
- CaP-X — Benchmark framework for coding agents that program robot manipulation tasks (Fu et al., 2026)
- StepEval — VLM-based subgoal evaluation for scoring intermediate robot manipulation steps (ElMallah et al., 2025)
- SOLE-R1 — Video-language reasoning as the sole reward signal for on-robot RL (Schroeder et al., 2026)
- AOR — Multimodal coding agents that iteratively rewrite control code from visual observations (Kumar, 2026)
If you use Roboharness in academic work, please cite it using the metadata in
CITATION.cff or the "Cite this repository" button on GitHub.
Contributions welcome — including from AI coding agents! See CONTRIBUTING.md.
MIT









