feat(console): operator UI mounted on the gateway; retire visualizer.py #558

Open
jeffreysijuntan wants to merge 1 commit into pr-3-harness-harbor from pr-4-console

@jeffreysijuntan

Contributor

Summary

Stacked on #557. The largest PR in the stack — replaces the standalone rllm/eval/visualizer.py (~1k LOC) with an SPA + panel-registry backend mounted onto the model gateway. rllm view boots a view-only gateway (TraceStore + console; no proxy workers) and serves the console at /console.

What changes

Frontend (rllm-console/) — bun + Vite 6 + React 19 + RR7 + Tailwind v4 + TanStack Query 5

  • Sessions panel: cross-run trace feed (gateway TraceStore) with filter bar (run / model / harness / has_error / latency) and chat-message rendering. The schema-v2 denormalised columns from #551 (feat(model-gateway): trace store schema v2) are what make this filter-heavy view fast.
  • Runs panel: per-eval-run grid → episode viewer → live Step view, reading from ~/.rllm/eval_results/<run>/ and the gateway's shared db.
  • Datasets panel: registry browser, entry inspector, pull-from-UI with PTY-streamed progress.
  • Settings panel: config viewer + env-var manager (writes to ~/.rllm/.env; presets in known.py).
  • Sandboxes / Eval Launcher / Training panels: placeholders for follow-up iterations.
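
To illustrate why flat, denormalised columns suit the Sessions filter bar, here is a minimal sketch using an in-memory SQLite table. The table name, column names (`trace_id`, `run_id`, `latency_ms`, ...), the `TraceFilter` class, and the choice of SQLite are all assumptions made for this example; they are not taken from the TraceStore schema in #551.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceFilter:
    """Hypothetical filter-bar state; an unset field means 'no constraint'."""
    run: Optional[str] = None
    model: Optional[str] = None
    harness: Optional[str] = None
    has_error: Optional[bool] = None
    min_latency_ms: Optional[float] = None


def build_query(f: TraceFilter) -> tuple[str, list]:
    """Build a parameterised SELECT; each active filter maps to one flat column."""
    clauses: list[str] = []
    params: list = []
    if f.run is not None:
        clauses.append("run_id = ?")
        params.append(f.run)
    if f.model is not None:
        clauses.append("model = ?")
        params.append(f.model)
    if f.harness is not None:
        clauses.append("harness = ?")
        params.append(f.harness)
    if f.has_error is not None:
        clauses.append("has_error = ?")
        params.append(int(f.has_error))
    if f.min_latency_ms is not None:
        clauses.append("latency_ms >= ?")
        params.append(f.min_latency_ms)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    return f"SELECT trace_id FROM traces{where} ORDER BY trace_id", params


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway table with made-up rows (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE traces (trace_id TEXT, run_id TEXT, model TEXT, "
        "harness TEXT, has_error INTEGER, latency_ms REAL)"
    )
    conn.executemany(
        "INSERT INTO traces VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("t1", "run-a", "m1", "h1", 0, 120.0),
            ("t2", "run-a", "m1", "h1", 1, 900.0),
            ("t3", "run-b", "m2", "h1", 1, 300.0),
        ],
    )
    return conn


def search(conn: sqlite3.Connection, f: TraceFilter) -> list[str]:
    """Run the filter and return matching trace ids."""
    sql, params = build_query(f)
    return [row[0] for row in conn.execute(sql, params)]
```

With the demo rows, `search(make_demo_db(), TraceFilter(run="run-a", has_error=True))` matches only `t2`. Because every filter is a plain column comparison, no JSON parsing or joins are needed per request, which is the property the denormalised columns are meant to provide.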

Backend (rllm/console/)

  • mount_console(app, eval_results_root, url_prefix="/console") attaches the panel registry to any FastAPI app (typically rllm-model-gateway, but it works on any host).
  • Panels are Panel(id, title, icon, nav_order, router, placeholder) dataclasses registered via register_panel. Each panel ships its own api.py (FastAPI APIRouter) and a frontend panels/<id>/ module mounted by id.
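
The panel-registry idea can be sketched in a few lines of standard-library Python. The field names follow the `Panel(id, title, icon, nav_order, router, placeholder)` signature quoted above, but the field types, the defaults, the `shell_info` helper, and the `/api/<id>` path layout are assumptions for illustration. The `router` field is left untyped here so the sketch runs without FastAPI installed.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Panel:
    id: str
    title: str
    icon: str
    nav_order: int
    router: Optional[Any] = None  # stand-in for a FastAPI APIRouter (assumed optional)
    placeholder: bool = False     # assumed default: a real, non-placeholder panel


_REGISTRY: dict[str, Panel] = {}


def register_panel(panel: Panel) -> Panel:
    """Add a panel to the registry, rejecting duplicate ids."""
    if panel.id in _REGISTRY:
        raise ValueError(f"panel {panel.id!r} is already registered")
    _REGISTRY[panel.id] = panel
    return panel


def shell_info(url_prefix: str = "/console") -> list[dict]:
    """Hypothetical navigation payload for the SPA shell, ordered by nav_order."""
    panels = sorted(_REGISTRY.values(), key=lambda p: p.nav_order)
    return [
        {
            "id": p.id,
            "title": p.title,
            "icon": p.icon,
            "placeholder": p.placeholder,
            "api": f"{url_prefix}/api/{p.id}",  # path layout is an assumption
        }
        for p in panels
    ]


# Registration order does not matter; nav_order decides the sidebar order.
register_panel(Panel("runs", "Runs", "list", nav_order=20))
register_panel(Panel("sessions", "Sessions", "activity", nav_order=10))
register_panel(Panel("training", "Training", "cpu", nav_order=30, placeholder=True))
```

In a real mount, each panel's `router` would presumably be attached to the host app under the URL prefix, for example with FastAPI's `app.include_router(router, prefix=...)`. Keeping the registry separate from the host app is what would let the same panels mount on any FastAPI application.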

CLI

  • rllm/cli/view.py rewritten to launch the gateway+console pair.
  • rllm/eval/trace_loader.py (new) loads run/episode/step data for the Runs panel.
  • rllm/eval/visualizer.py (deleted, ~1k LOC).
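
As a rough sketch of the first step a run loader might take, the function below enumerates run directories under an eval-results root. It shows only directory discovery; `list_runs` is a hypothetical name, and nothing here reflects the actual file layout inside a run directory or the real API of `trace_loader.py`.

```python
from pathlib import Path


def list_runs(eval_results_root: Path) -> list[str]:
    """Return the names of run directories under the root, sorted by name.

    A missing root yields an empty list, so a fresh install shows no runs
    instead of raising.
    """
    if not eval_results_root.is_dir():
        return []
    # Only directories count as runs; stray files in the root are ignored.
    return sorted(p.name for p in eval_results_root.iterdir() if p.is_dir())
```

A caller could pass `Path.home() / ".rllm" / "eval_results"` to match the location mentioned for the Runs panel.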

Packaging

  • pyproject.toml: console/static/** added to package-data so the built SPA ships in the wheel; visualizer.py ruff exclusion dropped.
  • .gitignore: console/static/ (built SPA) + rllm-console/ build artefacts.

Stack

  • #551, #552, #553: gateway
  • #556: eval pipeline flip
  • #557: harnesses + Harbor 0.5
  • #558 (this PR): console operator UI

Test plan

  • python -m pytest tests/console/ tests/eval/test_trace_loader.py — 77 passed
  • Full suite across all stacked PRs: python -m pytest tests/eval/ tests/sandbox/ tests/cli/test_eval_command.py tests/harnesses/ tests/console/ — 577 passed
  • rllm view boots cleanly; SPA shell + panel API endpoints reachable

Follow-ups (separate PRs)

  • CI workflow that runs cd rllm-console && bun install && bun run build before packaging the wheel. Until that lands, users who pip install -e start with an empty console/static/, which stays empty until rllm view auto-builds the SPA.
  • Eval Launcher panel (placeholder today) — kicks off rllm eval runs from the UI.
  • Sandboxes panel (placeholder today) — live sandbox orchestration view.
  • Training panel (placeholder today) — surfaces training-run state from the gateway's worker-pool side.

🤖 Generated with Claude Code

Replaces the standalone Streamlit-style ``rllm/eval/visualizer.py``
(~1k LOC) with a proper SPA + panel-registry backend mounted onto
the model gateway. ``rllm view`` boots a view-only gateway
(TraceStore + console; no proxy workers) and serves the console at
``/console``.

Frontend (``rllm-console/``):
- bun + Vite 6 + React 19 + RR7 + Tailwind v4 + TanStack Query 5.
- Sessions panel: cross-run trace feed (gateway TraceStore) with
  filter bar (run / model / harness / has_error / latency) and
  chat-message rendering.
- Runs panel: per-eval-run grid → episode viewer → live Step view
  reading from ``~/.rllm/eval_results/<run>/`` and the gateway's
  shared db.
- Datasets panel: registry browser, entry inspector, pull-from-UI
  with PTY-streamed progress.
- Settings panel: config viewer + env-var manager (writes to
  ``~/.rllm/.env`` for the env-var store; presets in ``known.py``).
- Sandboxes / Eval Launcher / Training panels: placeholders for the
  next iteration.

Backend (``rllm/console/``):
- ``mount_console(app, eval_results_root, url_prefix="/console")``
  attaches the panel registry to any FastAPI app — typically
  rllm-model-gateway, but works on any host.
- Panels are ``Panel(id, title, icon, nav_order, router, placeholder)``
  dataclasses registered via ``register_panel``. Each panel ships its
  own ``api.py`` (FastAPI APIRouter) and a frontend ``panels/<id>/``
  module mounted by id.

CLI:
- ``rllm/cli/view.py`` rewritten to launch the gateway+console pair.
- ``rllm/eval/trace_loader.py`` (new) loads run/episode/step data for
  the Runs panel; ``rllm/eval/visualizer.py`` (deleted, ~1k LOC).

Packaging:
- ``pyproject.toml``: ``console/static/**`` package-data; visualizer.py
  ruff exclusion dropped.
- ``.gitignore``: ``console/static/`` (built SPA) + ``rllm-console/``
  build artefacts (node_modules/, build/).

Tests: 77 new — ``test_mount`` (panel registry + shell-info
contract), ``test_panels_{datasets,runs,sessions,settings}``,
``test_trace_loader``. Full suite across all stacked PRs: 577 passing.

Stacked on PR #557.