
Support delta-only routed experts replay #2632

Merged
S1ro1 merged 4 commits into main from r3-delta
May 28, 2026

Conversation

@S1ro1 S1ro1 commented May 25, 2026

Collaborator

Summary:

  • Preserve dense routed experts while transferring only delta routed-experts sidecars with explicit start offsets.
  • Reconstruct routed prefix chunks for branched/compacted trajectories, then finalize trainer samples with dense routed experts.
  • Match the longest active prefix in interleave_rollout so regenerated/compacted branches do not get folded into shorter active samples.
  • Pin deps/verifiers to merged main with routed_experts_prompt_start support and vllm-router to the released v0.1.26 wheel.
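
As a rough illustration of the first bullet, the inference side can be thought of as slicing the dense routing at a start offset and shipping only the tail together with that offset. This is a minimal sketch, not the code in this PR: `RoutedExpertsDelta`, `make_delta`, and the flat per-token `Routing` layout are hypothetical names and simplifications, and only the idea of an explicit start offset (`routed_experts_prompt_start`) comes from the description above.

```python
from dataclasses import dataclass

# Hypothetical, simplified layout: one row of expert ids per token.
# The real sidecar format is not shown in this PR description.
Routing = list[list[int]]


@dataclass(frozen=True)
class RoutedExpertsDelta:
    start: int        # index of the first token covered by `experts`
    experts: Routing  # routing rows for tokens [start, start + len(experts))


def make_delta(dense: Routing, prompt_start: int) -> RoutedExpertsDelta:
    """Keep only the routing rows at or after `prompt_start`."""
    if not 0 <= prompt_start <= len(dense):
        raise ValueError(f"prompt_start={prompt_start} outside [0, {len(dense)}]")
    return RoutedExpertsDelta(start=prompt_start, experts=dense[prompt_start:])


dense = [[0, 1], [2, 3], [4, 5], [6, 7]]
delta = make_delta(dense, 2)  # rows for tokens 2 and 3 only
```

Carrying `start` alongside the rows lets the receiver check where the chunk belongs instead of inferring it from lengths.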

Validation:

  • uv run pytest tests/unit/orchestrator/test_trajectories.py tests/unit/inference/test_serving_tokens.py -q
  • uv run ruff check src/prime_rl/inference/vllm/routed_experts.py src/prime_rl/inference/vllm/serving_tokens.py src/prime_rl/orchestrator/trajectories.py tests/unit/inference/test_serving_tokens.py tests/unit/orchestrator/test_trajectories.py

Note

Medium Risk
Changes affect MoE training signal assembly and multi-step rollout merging; incorrect routing alignment would skew RL updates, though behavior is heavily asserted in unit tests.

Overview
Adds delta-only routed experts with an explicit start offset on compact sidecars: inference serializes/captures using routed_experts_prompt_start, and the orchestrator rebuilds dense per-token routing for training by stitching prefix chunks, enforcing delta alignment on extensions, and padding to full sequence length.
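
The rebuild step described above can be sketched as follows. This is an assumption-laden illustration rather than the orchestrator's actual code: `rebuild_dense` and its parameters are hypothetical, routing is simplified to one row of expert ids per token, and the rule that a delta may not start past the end of the known prefix is one plausible reading of "enforcing delta alignment".

```python
# Hypothetical, simplified layout: one row of expert ids per token.
Routing = list[list[int]]


def rebuild_dense(
    prefix: Routing,
    start: int,
    delta: Routing,
    seq_len: int,
    pad_row: list[int],
) -> Routing:
    """Stitch `delta` onto `prefix` at `start`, then pad to `seq_len` rows."""
    if start < 0 or start > len(prefix):
        # A delta starting past the known prefix would leave a gap.
        raise ValueError(f"delta start {start} not aligned with prefix of {len(prefix)} rows")
    # For a plain extension start == len(prefix); for a branched or compacted
    # trajectory only the first `start` rows of the prefix are reused.
    dense = prefix[:start] + delta
    if len(dense) > seq_len:
        raise ValueError(f"{len(dense)} routing rows exceed sequence length {seq_len}")
    # Pad so every token position in the sample has a routing row.
    return dense + [list(pad_row) for _ in range(seq_len - len(dense))]


prefix = [[0, 1], [2, 3]]
full = rebuild_dense(prefix, start=2, delta=[[4, 5]], seq_len=4, pad_row=[0, 0])
```

Failing loudly on a misaligned start matters here because a silent off-by-one would attach expert choices to the wrong tokens.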

interleave_rollout now picks the longest matching active prefix when several samples overlap (e.g. compaction/rollback), with a warning on ambiguous matches, so regenerated branches are not merged into shorter stale samples.
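
A minimal sketch of longest-prefix selection, assuming each active sample is represented by its token ids. `longest_prefix_match` is a hypothetical helper, not the function in `trajectories.py`, and treating a tie at the longest length as the "ambiguous" case is an assumption.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def longest_prefix_match(active: list[list[int]], tokens: list[int]) -> Optional[int]:
    """Return the index of the active sample that is the longest prefix of `tokens`."""
    best_len = -1
    best: list[int] = []
    for i, sample in enumerate(active):
        n = len(sample)
        if n <= len(tokens) and tokens[:n] == sample:
            if n > best_len:
                best_len, best = n, [i]
            elif n == best_len:
                best.append(i)
    if not best:
        return None  # no active sample is a prefix: the caller starts a new one
    if len(best) > 1:
        # Assumed meaning of "ambiguous": several samples tie at the longest length.
        logger.warning("ambiguous prefix match among samples %s; using %d", best, best[0])
    return best[0]


active = [[1, 2], [1, 2, 3], [7]]
idx = longest_prefix_match(active, [1, 2, 3, 4])  # selects index 1, not the shorter index 0
```

Picking the first match instead would fold the step into sample 0 here, which is the stale-sample merge the PR is trying to avoid.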

Dependency pin: vllm-router 0.1.25 → 0.1.26 (lockfile updated).

Reviewed by Cursor Bugbot for commit cda9b16.

@S1ro1 S1ro1 marked this pull request as ready for review May 27, 2026 23:50
@S1ro1 S1ro1 merged commit ae748fd into main May 28, 2026
18 checks passed

2 participants