Add token export mask visualizer #2606

Open
samsja wants to merge 6 commits into main from codex/token-export-mask-visualizer

@samsja samsja commented May 23, 2026

Summary

  • Add scripts/token_export_mask_visualizer.py for rendering static HTML from token export JSONL files.
  • Compare DPPO prob_delta masking against IcePop importance-ratio masking.
  • Include token-level KL gradient, DPPO/IcePop overlay toggles, live threshold tuning, quantitative mask/disagreement counts, and top-token examples.
  • Support file, step-directory, and token_exports-root inputs with optional rank/env filters and optional tokenizer decoding.
  • Add docs/token-export-mask-visualizer.md with quick-start commands, common options, mask-rule notes, UI color meanings, and an agent workflow.
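To illustrate why the two mask rules can disagree on the same token, here is a minimal sketch. It is not taken from scripts/token_export_mask_visualizer.py: the function names, the thresholds (0.2 for the probability delta, [0.5, 2.0] for the importance ratio), and the choice of an absolute, unsigned delta are all assumptions made for illustration. It assumes each token carries a trainer log-probability and an inference log-probability.

```python
import math


def compare_masks(train_logprobs, infer_logprobs,
                  dppo_threshold=0.2, icepop_low=0.5, icepop_high=2.0):
    """Return per-token mask decisions under two hypothetical rules.

    All thresholds are illustrative assumptions, not the script's defaults.
    """
    rows = []
    for lp_train, lp_infer in zip(train_logprobs, infer_logprobs):
        # Probability-delta rule (assumed DPPO-style): absolute gap in probability space.
        prob_delta = math.exp(lp_train) - math.exp(lp_infer)
        dppo_masked = abs(prob_delta) > dppo_threshold

        # Importance-ratio rule (assumed IcePop-style): ratio outside [low, high].
        ratio = math.exp(lp_train - lp_infer)
        icepop_masked = not (icepop_low <= ratio <= icepop_high)

        rows.append({
            "prob_delta": prob_delta,
            "ratio": ratio,
            "dppo_masked": dppo_masked,
            "icepop_masked": icepop_masked,
        })
    return rows


def summarize(rows):
    """Count masked tokens per rule and tokens where the rules disagree."""
    return {
        "dppo_masked": sum(r["dppo_masked"] for r in rows),
        "icepop_masked": sum(r["icepop_masked"] for r in rows),
        "disagree": sum(r["dppo_masked"] != r["icepop_masked"] for r in rows),
    }


# Three tokens: a high-probability shift, a low-probability shift, and no shift.
train = [math.log(0.9), math.log(0.01), math.log(0.5)]
infer = [math.log(0.5), math.log(0.001), math.log(0.5)]
print(summarize(compare_masks(train, infer)))
```

Under these assumed thresholds, the first token moves 0.4 in probability but only 1.8x in ratio, so only the delta rule masks it. The second moves 10x in ratio but less than 0.01 in probability, so only the ratio rule masks it. Disagreement counts are meant to surface tokens like these.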

Validation

  • python -m py_compile scripts/token_export_mask_visualizer.py
  • uvx ruff check --config=pyproject.toml
  • uvx ruff format --check --config=pyproject.toml
  • Smoke render on /beegfs/outputs/glm5.1-rlm-prod/token_exports --step 8 --rank 0 --top-records 1 --window-tokens 20 --no-decode

Note

Low Risk
Changes are additive (a new debugging/visualization script plus docs) and only tweak formatting config, with no impact on training or runtime code paths.

Overview
Adds scripts/token_export_mask_visualizer.py, a CLI tool that scans token export JSONL (a single file, a step_N directory, or the token_exports root), computes DPPO vs IcePop mask decisions, and generates a self-contained HTML page with a KL heatmap, mask overlays, record filtering, live threshold tuning, and top-token example tables.
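The three accepted input shapes could be resolved along these lines. This is a sketch under assumptions, not the script's actual code: the helper names resolve_export_files and iter_records are hypothetical, and the layout (step_N directories holding *.jsonl files directly under the root) is inferred only from the description above. Rank and env filtering is left out because the export file naming is not stated in this PR.

```python
import json
from pathlib import Path


def resolve_export_files(path, step=None):
    """Map a file, a step_N directory, or an exports root to a list of JSONL files.

    Hypothetical helper: the directory layout is an assumption.
    """
    path = Path(path)
    if path.is_file():
        return [path]
    if path.name.startswith("step_"):
        return sorted(path.rglob("*.jsonl"))
    # Otherwise treat the path as the root that contains step_N directories.
    # Note: plain sorting is lexicographic, so step_10 sorts before step_8.
    step_dirs = sorted(d for d in path.glob("step_*") if d.is_dir())
    if step is not None:
        step_dirs = [d for d in step_dirs if d.name == f"step_{step}"]
    files = []
    for step_dir in step_dirs:
        files.extend(sorted(step_dir.rglob("*.jsonl")))
    return files


def iter_records(files):
    """Yield one parsed JSON object per non-empty line across the given files."""
    for file in files:
        with open(file, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:
                    yield json.loads(line)
```

Keeping resolution separate from parsing means a caller can list the files that would be scanned before reading any of them, which is handy when pointing the tool at a large exports root.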

Documents the tool in docs/token-export-mask-visualizer.md and links it from the docs index/navigation. Updates pyproject.toml to exclude the new script from ruff format.

Reviewed by Cursor Bugbot for commit 038cdaf.

@samsja samsja marked this pull request as ready for review May 23, 2026 22:43