Releases: Metabuilder-Labs/tokenjam

v0.3.3 — TokenMaxx report polish + Opus 4.5 pricing

09 Jun 18:34
d7145ee

A launch-readiness polish release for the tj tokenmaxx social moment, plus one more pricing-accuracy fix from a community contributor.

TokenMaxx Report — visual + structural polish

The tj tokenmaxx output is now a bordered report panel designed to be a clean screenshot artifact:

╭─ TokenJam TokenMaxxing Report ──────────────────────────────────────────────╮
│                                                                              │
│  🔥🔥 You're a TokenGigaChad.                                                │
│                                                                              │
│  Touch grass. Then run tj optimize.                                          │
│                                                                              │
│  $4056.82 in last 30d across 33 sessions.                                    │
│  That's 40.6× your Max 5x plan cost ($100/mo flat).                          │
│                                                                              │
│  💡 No obvious savings flagged yet — run tj optimize for the full report     │
│  once you have more data.                                                    │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯
  Share your tier: screenshot the above and tag @tokenjamdev
  • Spend now renders at 2 decimals ($4056.82, was $4056.8200)
  • Plan fee strips decimals when whole-dollar ($100, was $100.0000)
  • `tj optimize` rendered bold-green wherever it appears
  • Share line in teal, points at `@tokenjamdev`
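
The two number-formatting rules above can be sketched in a few lines of Python. This is a minimal illustration, not the project's renderer: the function names are hypothetical, and the handling of non-whole plan fees (two decimals) is an assumption the notes do not spell out.

```python
def format_spend(usd: float) -> str:
    """Render a spend figure with exactly two decimals."""
    return f"${usd:.2f}"


def format_plan_fee(usd: float) -> str:
    """Drop the decimals for whole-dollar fees.

    Assumption: fees that are not whole dollars keep two decimals.
    """
    if float(usd).is_integer():
        return f"${int(usd)}"
    return f"${usd:.2f}"


print(format_spend(4056.8200))   # $4056.82
print(format_plan_fee(100.0000))  # $100
```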

TokenMaxx — plan-relative tier ladder

Tiers are now based on the multiplier vs your plan cost, so the tier name means the same thing across Pro / Max-5x / Max-20x users:

| Multiplier (subscription) | Absolute /mo (API) | Tier |
| --- | --- | --- |
| < 1× | < $100 | 💧 TokenSipper |
| 1× – 4× | $100 – $400 | 🥱 TokenModerator |
| 4× – 10× | $400 – $1000 | 💸 TokenMaxxer |
| 10× – 20× | $1000 – $2000 | 🔥 TokenChad |
| 20×+ | $2000+ | 🔥🔥 TokenGigaChad |

API users (no plan to multiply against) fall back to absolute USD thresholds, calibrated against Max-5x = $100/mo so the tier name carries the same meaning in either world. A Pro user at 15× their plan and a Max-5x user at 15× their plan are both TokenChads — the tier reflects "how hard you're maxxing," not raw spend.
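
As a rough sketch of how the ladder above could be applied, the snippet below maps spend to a tier using the multiplier thresholds from the table. The names `classify_tier`, `TIERS`, and `API_BASELINE_USD` are hypothetical, and treating each lower bound as inclusive is an assumption, since the table does not say which tier owns a boundary value.

```python
from typing import Optional

# (lower bound as a plan multiplier, tier name), highest first.
TIERS = [
    (20.0, "TokenGigaChad"),
    (10.0, "TokenChad"),
    (4.0, "TokenMaxxer"),
    (1.0, "TokenModerator"),
    (0.0, "TokenSipper"),
]

# API users have no plan fee, so absolute thresholds are calibrated
# against the Max-5x fee of $100/mo, as described above.
API_BASELINE_USD = 100.0


def classify_tier(spend_usd: float, plan_fee_usd: Optional[float] = None) -> str:
    """Return the tier for 30 days of spend, relative to the plan fee if any."""
    baseline = plan_fee_usd if plan_fee_usd else API_BASELINE_USD
    multiplier = spend_usd / baseline
    for lower_bound, name in TIERS:
        if multiplier >= lower_bound:  # assumption: lower bounds are inclusive
            return name
    return "TokenSipper"


print(classify_tier(4056.82, plan_fee_usd=100.0))  # 40.6x the plan fee
print(classify_tier(1500.0))                       # API user, absolute USD
```

Because the API thresholds are just the multipliers scaled by $100, one table covers both cases; a subscription user at 15× and an API user at $1500 land on the same tier.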

Pricing fix: Claude Opus 4.5

tokenjam/pricing/models.toml had Opus 4.5 at the old $15 / $75 tier. Anthropic moved 4.5 to $5 / $25 (same tier as 4.6 / 4.7 / 4.8). Users on 4.5 were seeing ~3× inflated cost figures; fixed in this release.

The repo-root pricing/models.toml (orphaned since v0.1.x) was also removed — the runtime only reads tokenjam/pricing/models.toml, and the duplicate file was confusing contributors. CLAUDE.md, CONTRIBUTING.md, and the fallback warning string all now point at the real path.

Thanks to @kelter-antunes for the catch and the dedupe.

Install

```
pip install tokenjam==0.3.3
```

TypeScript SDK in lockstep as `@tokenjam/sdk@0.3.3`.

v0.3.2 — TokenMaxx + Opus pricing accuracy

09 Jun 02:49
ee0e0a1

A user-facing feature, a cost-accuracy fix from a community contributor, and an upgrade-safe pricing escape hatch.

New: tj tokenmaxx

A shareable spend-tier command. Reads your last 30 days of spend, classifies it into an ironic tier (TokenSipper / TokenModerator / TokenMaxxer / TokenChad / TokenGigaChad), and surfaces the downsize savings figure inline so the score is always paired with an action.

🔥🔥 You're a TokenGigaChad.
   "Touch grass. Then run `tj optimize`."

$3502.72 in last 30d across 33 sessions.
That's 35.0× your Max 5x plan cost ($100.00/mo flat).

💡 $340/mo of that looks recoverable. Run `tj optimize` to see candidates.

The tier ladder, monthly USD spend:

| Spend | Tier | One-liner |
| --- | --- | --- |
| < $50 | 💧 TokenSipper | "Are you even using AI?" |
| $50–$200 | 🥱 TokenModerator | "Mostly reasonable. Try harder." |
| $200–$500 | 💸 TokenMaxxer | "You're paying Anthropic's rent." |
| $500–$1500 | 🔥 TokenChad | "You're paying their interns' rent too." |
| $1500+ | 🔥🔥 TokenGigaChad | "Touch grass. Then run tj optimize." |

Plan-aware: when [budget.<provider>] plan = "max_5x" (or pro / plus / max_20x) is declared, the output renders the multiplier vs the plan's flat fee — the figure that actually travels socially. API users see absolute spend; team / enterprise see plan label only (contract-priced fees).

--json flag for machine-readable output.

Pricing fix: Claude Opus 4.5 / 4.6 / 4.7 / 4.8

The packaged pricing table had Claude Opus models at the old $15 / $75 per MTok tier. Anthropic dropped Opus 4.5 onward to $5 / $25 per MTok (with $0.50 cache read, $6.25 5-minute cache write). The packaged rates now match Anthropic's published pricing exactly. Users running Opus 4.5–4.8 were seeing ~3× inflated cost figures; this fixes that.

Verified against platform.claude.com/docs/en/docs/about-claude/pricing. New regression tests guard against future drift.

Thanks to @kelter-antunes for catching and fixing this.
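
To see where the ~3× figure comes from, here is a small worked example using only the input and output rates quoted above. The `Rates` dataclass and `cost_usd` helper are illustrative names, not the package's cost engine, and cache rates are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


OLD_OPUS = Rates(input_per_mtok=15.00, output_per_mtok=75.00)
NEW_OPUS = Rates(input_per_mtok=5.00, output_per_mtok=25.00)


def cost_usd(rates: Rates, input_tokens: int, output_tokens: int) -> float:
    """Token counts times per-MTok rates, scaled back to dollars."""
    total = input_tokens * rates.input_per_mtok + output_tokens * rates.output_per_mtok
    return total / 1_000_000


old = cost_usd(OLD_OPUS, 2_000_000, 400_000)  # 30 + 30 = 60.0
new = cost_usd(NEW_OPUS, 2_000_000, 400_000)  # 10 + 10 = 20.0
print(old, new, old / new)                    # 60.0 20.0 3.0
```

Both rates dropped by the same factor, so the inflation is 3× regardless of the input/output mix.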

New: User pricing override file

You can now override packaged rates without editing site-packages (which pip install --upgrade clobbers). Resolution order:

  1. TJ_PRICING_FILE env var (if set)
  2. ~/.config/tj/pricing.toml (if it exists)

Override entries are merged per provider/model over the packaged table — same TOML schema. Missing or malformed override files log a warning and fall back to packaged rates; never breaks cost calculation.

# ~/.config/tj/pricing.toml
[anthropic.some-future-model]
input_per_mtok = 4.00
output_per_mtok = 20.00
cache_read_per_mtok = 0.40
cache_write_per_mtok = 5.00

Thanks to @kelter-antunes for this too.
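
The resolution order and merge behaviour described in this section can be sketched as follows. This is an illustration under stated assumptions, not the shipped loader: the function names are hypothetical, and the sketch assumes an override replaces individual fields of a model entry (the notes only say entries are merged per provider/model).

```python
import logging
import os
from pathlib import Path
from typing import Optional

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters: assumes the tomli backport
    import tomli as tomllib

log = logging.getLogger("pricing-sketch")
DEFAULT_OVERRIDE = Path(os.path.expanduser("~/.config/tj/pricing.toml"))


def find_override_path() -> Optional[Path]:
    """TJ_PRICING_FILE wins; otherwise the per-user file if it exists."""
    env = os.environ.get("TJ_PRICING_FILE")
    if env:
        return Path(env)
    return DEFAULT_OVERRIDE if DEFAULT_OVERRIDE.exists() else None


def load_override(path: Optional[Path]) -> dict:
    """Missing or malformed files log a warning and yield no overrides."""
    if path is None:
        return {}
    try:
        with path.open("rb") as fh:
            return tomllib.load(fh)
    except (OSError, tomllib.TOMLDecodeError) as exc:
        log.warning("ignoring pricing override %s: %s", path, exc)
        return {}


def merge_pricing(packaged: dict, override: dict) -> dict:
    """Overlay override entries per provider/model without mutating inputs."""
    merged = {p: {m: dict(r) for m, r in models.items()} for p, models in packaged.items()}
    for provider, models in override.items():
        for model, rates in models.items():
            merged.setdefault(provider, {}).setdefault(model, {}).update(rates)
    return merged
```

A caller would combine the three steps as `merge_pricing(packaged, load_override(find_override_path()))`, so a bad override file degrades to the packaged table rather than raising.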

Install

pip install tokenjam==0.3.2

TypeScript SDK in lockstep as @tokenjam/sdk@0.3.2.

v0.3.1 — Optimize CLI ergonomics

29 May 23:51
af4a46f

Fast-follow on v0.3.0 to make the optimize CLI match how users think and read about the products.

CLI changes

Positional analyzer args (was: --finding NAME)

tj optimize                       # run all (unchanged)
tj optimize downsize              # run one
tj optimize downsize cache trim   # run several

The old --finding NAME flag is removed. There are no aliases.

Analyzer names match the product names

| Old | New |
| --- | --- |
| model-downgrade | downsize |
| cache-efficacy | cache |
| workflow-restructure | script |
| prompt-bloat | trim |

cache-recommend (sub-finding of Cache) and budget-projection (infra concept) keep their names.

tj report --bloat → tj report --trim

Same motivation — "bloat" wasn't used anywhere outside the CLI.

What's unchanged

Honesty discipline: MODEL_DOWNGRADE_CAVEAT and all "structural match — review before applying" framing stay in place. This is a CLI rename only — no behavior or rendering changes.

Install

pip install tokenjam==0.3.1

TypeScript SDK published in lockstep as @tokenjam/sdk@0.3.1.

Closes #74.

v0.3.0 — Layer-9 cost optimization product

29 May 15:56
46439a7

TokenJam pivots from "observability for AI agents" to a focused cost-optimization product. The OTel-native ingest pipeline and local-first architecture stay; four named analyzers ship on top.

New: cost-optimization analyzers

  • Downsize (tj optimize --finding model-downgrade) — structural candidate detection for cheaper-model routing. Honesty-disciplined: every flagged session is labeled "structural match — review before switching," never "safe to downgrade."
  • Cache (cache-efficacy + cache-recommend) — current caching ratio per (provider, model) and Anthropic-only breakpoint suggestions for stable prefixes.
  • Script (workflow-restructure) — clusters of deterministic (tool_name, arg_shape) sessions that look replaceable with a script.
  • Trim (prompt-bloat) — LLMLingua-2 token-significance classifier behind the optional tokenjam[bloat] extra (~2GB torch + transformers).

All analyzers self-register via @register("name") in tokenjam/core/optimize/analyzers/. Run all with tj optimize; scope to one with --finding <name> (repeatable).

New: backfill adapters

tj backfill langfuse|helicone|otlp — ingest from external observability platforms via live API or JSON dump. Idempotent re-runs via deterministic span IDs. Joins existing tj backfill claude-code.
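
One common way to get idempotent re-runs from deterministic span IDs is to hash stable fields of the source record, so the same record always maps to the same key. The sketch below shows the idea only; the choice of fields, the hash, and the in-memory `store` are assumptions, not the adapters' actual scheme.

```python
import hashlib


def deterministic_span_id(source: str, external_id: str) -> str:
    """Derive a 16-hex-char (8-byte) span ID from stable inputs only."""
    digest = hashlib.sha256(f"{source}:{external_id}".encode("utf-8")).digest()
    return digest[:8].hex()


store: dict[str, dict] = {}


def ingest(source: str, external_id: str, payload: dict) -> None:
    # Keyed upsert: re-ingesting the same record overwrites, never duplicates.
    store[deterministic_span_id(source, external_id)] = payload


for _ in range(2):  # simulate running the same backfill twice
    ingest("langfuse", "obs-123", {"model": "claude-opus-4-5"})

print(len(store))  # 1
```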

New: HTTP API for analyzers

/api/v1/optimize + /api/v1/cost/compare so tj optimize works alongside a running tj serve (previously crashed on the DuckDB write-lock).

New: read-only policy preview

tj policy list consolidates [alerts], [capture], [budget.<provider>], per-agent budget / drift / sensitive_actions / output_schema config into one table. The unified add | edit | apply surface lands next sprint.

New: honest-output rendering

Plan-tier-aware optimize output:

  • API users — dollar-denominated savings projections (unchanged)
  • Subscription users — implied API value + token-share framing; never dollar "spend"
  • Local users — token-only framing for capacity planning
  • Unknown-plan users — dollar figures suppressed with a tj onboard --reconfigure hint

tj optimize --export-config claude-code writes a JSONC routing snippet (with the structural-heuristic caveat baked in as comments) to ~/.config/tokenjam/exports/. Never touches ~/.claude/settings.json.

Codex CLI integration

tj onboard --codex writes [otel] + [mcp_servers.tj] to ~/.codex/config.toml. The new /v1/logs endpoint normalizes Codex event logs (sse_event, user_prompt, tool_decision, etc.) into spans for cost / drift / alerting.

Notable fixes

  • SDK: fail-loud ERROR-level logging on 401 span exports with the configured-secret fingerprint, so silent data-loss is impossible. Was previously a single low-volume warning per batch.
  • API: /metrics aggregates by agent_id before emitting Prometheus rows. Previously emitted duplicate label sets that broke strict scrapers.
  • Doctor: tj doctor no longer reports "DuckDB not writable" when the daemon legitimately holds the write lock.
  • Onboard: explicit error on bare --reconfigure (was a silent early-return); secret-divergence warning when project-local and global configs disagree.
  • Drift / budget / backfill: every CLI command now works under both direct-DB and API-shim modes consistently.
  • Optimize: --compare last-7d and last-30d now override --since so the analysis window matches the comparison period (was 30d-vs-30d when --since defaulted to 30d).
  • Export: routing snippet now written as .jsonc with properly-indented comments — parseable by strict JSONC tooling.
  • Policy list: --json accepted as both root flag and command-level flag; [capture] row always shown (even when all toggles off — an explicit "off" is still a policy choice).
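
The /metrics fix above comes down to summing rows that share a label set before rendering them. A minimal sketch of that step, assuming a hypothetical metric name `tj_cost_usd_total` and a single `agent_id` label:

```python
from collections import defaultdict


def render_cost_metric(rows: list[tuple[str, float]]) -> str:
    """Aggregate (agent_id, cost) rows so each label set is emitted once."""
    totals: dict[str, float] = defaultdict(float)
    for agent_id, cost in rows:
        totals[agent_id] += cost
    lines = [
        f'tj_cost_usd_total{{agent_id="{agent}"}} {total}'
        for agent, total in sorted(totals.items())
    ]
    return "\n".join(lines)


rows = [("agent-a", 1.0), ("agent-b", 2.0), ("agent-a", 0.5)]
print(render_cost_metric(rows))
```

Without the aggregation step, `agent-a` would appear twice with identical labels, which strict Prometheus scrapers reject.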

Install

pip install tokenjam==0.3.0
# Optional Trim analyzer (large download — pulls torch + transformers):
pip install 'tokenjam[bloat]==0.3.0'

TypeScript SDK published in lockstep as @tokenjam/sdk@0.3.0.

Acknowledgements

This release ran the full v0.3.x manual pre-release playbook before tagging — 14 sections covering analyzers, backfill, period comparison, config export, policy preview, server + HTTP fallback, web UI, and cleanup. The 8 polish findings surfaced during the playbook are all addressed in this release.

v0.2.3 — tj optimize

18 May 00:30
6ea3a82

Features

tj optimize — cost-saving recommendations from existing data

Two analyzers run over your captured spans:

  • Model-downgrade candidates — flags sessions whose structural shape (short input, short output, few tool calls) matches a class of work where a cheaper model in the same provider family is worth reviewing. Surfaces example traces; never claims quality equivalence (the caveat line is in the dataclass default so it can't be removed by accident).
  • Budget projection — per-provider monthly projection against any [budget.<provider>] ceiling. Scopes spend by provider, shows exhaustion date, projected overage, and what the run rate would drop to if you acted on the downgrade candidates.
tj optimize                                # both analyzers, last 30d
tj optimize --only budget
tj optimize --budget anthropic --budget-usd 50
tj optimize --json
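
The budget projection described above can be approximated with a simple linear run-rate model. The sketch below is an assumption about the general shape of such a calculation, not the analyzer's code: `project_month` is a hypothetical name, and it assumes spend accrues evenly across the days elapsed so far.

```python
import calendar
import math
from datetime import date, timedelta


def project_month(spend_to_date: float, today: date, budget_usd: float) -> dict:
    """Linear month-end projection against a monthly budget ceiling."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_rate = spend_to_date / today.day  # assumes spend includes today
    projected = daily_rate * days_in_month
    overage = max(0.0, projected - budget_usd)

    exhaustion = None
    if daily_rate > 0 and overage > 0:
        # First day of the month on which cumulative spend reaches the ceiling.
        day = max(1, math.ceil(budget_usd / daily_rate))
        exhaustion = date(today.year, today.month, 1) + timedelta(days=day - 1)

    return {
        "projected_usd": projected,
        "overage_usd": overage,
        "exhaustion_date": exhaustion,
    }


report = project_month(spend_to_date=30.0, today=date(2025, 6, 10), budget_usd=50.0)
print(report)
```

With $30 spent by 10 June against a $50 ceiling, the run rate is $3/day, the month projects to $90, and the ceiling is reached on 17 June.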

Runs alongside a live tj serve via a read-only DuckDB fallback. Also exposed as the new get_optimize_report MCP tool — your coding agent can ask "where could I save money?" mid-session.

tj backfill claude-code

Reads ~/.claude/projects/*.jsonl and ingests historical sessions into the local DB. Idempotent (deterministic span IDs). Auto-invoked at the end of tj onboard --claude-code so first-time users have history immediately and tj optimize returns real numbers on first run.

[budget.<provider>] config

New TOML section for periodic monthly budgets. Distinct from [defaults.budget] / [agents.X.budget] (per-agent alert thresholds). tj onboard --claude-code writes a sensible default [budget.anthropic] usd = 200.

MCP server — 14 tools (up from 13)

Added get_optimize_report.

Fixes

  • Pricing lookup tolerates dated claude-<family>-<ver>-YYYYMMDD model-name suffixes Anthropic ships (e.g. claude-haiku-4-5-20251001).
  • Pricing table: added claude-opus-4-7 and claude-opus-4-5.
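
A dated-suffix-tolerant lookup like the one in the first fix can be sketched with a single regular expression. The table contents and function names here are illustrative; in particular the Haiku rates are placeholders, not published prices.

```python
import re
from typing import Optional

# Matches a trailing -YYYYMMDD, e.g. claude-haiku-4-5-20251001.
_DATE_SUFFIX = re.compile(r"-\d{8}$")

# (input, output) USD per MTok. Haiku values are placeholders for the demo.
PRICING = {
    "claude-opus-4-5": (5.00, 25.00),
    "claude-haiku-4-5": (1.00, 5.00),
}


def normalize_model(model: str) -> str:
    """Strip a dated suffix if present; leave other names untouched."""
    return _DATE_SUFFIX.sub("", model)


def lookup_rates(model: str) -> Optional[tuple[float, float]]:
    """Prefer an exact match, then retry without the dated suffix."""
    if model in PRICING:
        return PRICING[model]
    return PRICING.get(normalize_model(model))


print(normalize_model("claude-haiku-4-5-20251001"))  # claude-haiku-4-5
```

Trying the exact name first means a dated entry added to the table later would still take precedence over the stripped form.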

Docs

  • README now leads with the tj optimize UX (verbatim output) and embeds five Web UI screenshots.
  • CLAUDE.md gains a new rule codifying the honesty constraint on optimize output.
  • Manual test runbooks updated with steps for the new commands.

Install / upgrade

pip install --upgrade tokenjam
npm install @tokenjam/sdk@0.2.3

Full changelog: v0.2.2...v0.2.3

v0.2.2

12 May 23:31
432dd8f

Highlights

New SDK surface

  • tokenjam.sdk.TokenJamClient — public HTTP client that POSTs a single LLM call as an OTLP JSON span to a running tj serve, without depending on the in-process OTel TracerProvider. Designed to be embedded in foreign codebases — most notably the upstream BerriAI/litellm named-callback machinery, which will let any LiteLLM user enable TokenJam with litellm.success_callback = ["tokenjam"]. The single public method, emit_litellm_span(kwargs, response_obj, start_time, end_time, success), translates LiteLLM's callback payload into provider / model / input / output / cache token attributes, attaches a precomputed cost from kwargs["response_cost"] (or response._hidden_params), and tags the span with tj_agent_id / tj_session_id if supplied via kwargs["metadata"]. Non-blocking by design — every error is logged at debug and the event is dropped, so the client can never propagate an exception into the caller's request path. For in-process tokenjam users, patch_litellm() remains the preferred path. (#61)
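
The "non-blocking by design" contract above (log at debug, drop the event, never raise) is a general pattern that can be shown without the real client. The sketch below uses a hypothetical `emit_quietly` helper and an injected `send` callable; it does not reproduce `TokenJamClient` or its constructor.

```python
import logging
from typing import Any, Callable

log = logging.getLogger("emit-sketch")


def emit_quietly(send: Callable[[dict[str, Any]], None], span: dict[str, Any]) -> bool:
    """Try to send a span; on any failure, log at debug and drop it."""
    try:
        send(span)
        return True
    except Exception as exc:  # deliberately broad: telemetry must not break the caller
        log.debug("dropping span: %s", exc)
        return False


def broken_transport(span: dict[str, Any]) -> None:
    raise ConnectionError("tj serve is not reachable")


sent: list[dict[str, Any]] = []
print(emit_quietly(sent.append, {"model": "claude-opus-4-5"}))       # True
print(emit_quietly(broken_transport, {"model": "claude-opus-4-5"}))  # False
```

The trade-off is silent data loss when the server is down, which is the intended behaviour for a callback running inside someone else's request path.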

Upgrading

pip install --upgrade tokenjam==0.2.2. No breaking changes; the TypeScript SDK is unchanged but bumped in lockstep at @tokenjam/sdk@0.2.2.

v0.2.1

12 May 23:20
64e99d0

TokenJam 0.2.1

Maintenance release that ties up loose ends from the rename to TokenJam (tj / tokenjam), adds a
DuckDB stats-corruption recovery path to tj doctor, and lands a handful of deferred fixes
from 0.1.9 / issue #48.

🩹 Fixes

  • DuckDB spans column-statistics corruption: tj doctor now detects and repairs
    the rare DuckDB stats corruption that caused tj cost / tj traces to fail with
    INTERNAL Error: Failed to combine statistics. Run tj doctor --fix to recover
    without losing data. (#56)
  • Codex onboard, purge legacy ocw config: tj onboard --codex now strips stale
    [mcp_servers.ocw] and any ocw-managed [otel] block from ~/.codex/config.toml
    left over from pre-rename installs. (#57)
  • FastAPI lifespan migration: tj serve replaces the deprecated
    @app.on_event("startup"/"shutdown") hooks with an asynccontextmanager lifespan. As
    a side benefit, the retention scheduler now only starts after uvicorn binds the port,
    so a failed bind no longer leaves an orphan scheduler running. (#59, Greptile P2 from
    #47)
  • tj doctor drift baseline noise — agents still collecting their baseline (<10
    completed sessions) are no longer reported as a warning. Downgraded to info with a
    "Collecting baseline: agent (N/M)" message; agents at threshold are silent. (#59,
    deferred from #48)
  • mypy: handle fetchone() returning None in doctor stats repair path.

🔧 Refactor

  • Naming migration ocw → tj / tokenjam: completes the rename across the
    codebase, including the dashboard. PyPI package is tokenjam, CLI is tj, Python
    module is tokenjam. (#55)
  • Dashboard — monochrome theme with light/dark/system toggle and updated
    tokenjam.dev branding. (#53)
  • README — icon now points at the website-hosted SVG.
  • Manual test docs updated for the rename and new dashboard. (#54)

v0.2.0 — TokenJam

07 May 18:41
4e2f01d

First release under the TokenJam name.

This release renames the project (formerly OpenClawWatch / Token Juice) and publishes the new packages to PyPI (`tokenjam`) and npm (`@tokenjam/sdk`).

What changed

  • Project renamed: `openclawwatch` → `tokenjam`
  • Python package directory: `ocw/` → `tj/` → `tokenjam/` (final, isolated from the existing PyPI `tj` package)
  • CLI entry point: `ocw` → `tj`
  • Config paths: `ocw.toml` → `tj.toml`, `.ocw/` → `.tj/`, `~/.config/ocw/` → `~/.config/tj/`
  • Daemon identifiers: `com.openclawwatch.serve` → `com.tokenjam.serve`, `openclawwatch.service` → `tokenjam.service`
  • Env vars: `OCW_` → `TJ_`
  • TypeScript SDK: `@openclawwatch/sdk` → `@tokenjam/sdk`
  • All internal classes renamed (`OcwSpanExporter` → `TjSpanExporter`, etc.)
  • All docs, examples, incidents, tests updated

Migration

If you were running an earlier `openclawwatch` build:

  1. `pip uninstall openclawwatch` and `pip install tokenjam`
  2. Move `~/.config/ocw/` → `~/.config/tj/` (or run `tj onboard` fresh)
  3. Replace `OCW_` env vars with `TJ_`
  4. CLI command is now `tj` instead of `ocw`

v0.1.9

27 Apr 20:06
bb5bbee

Highlights

New integrations

  • Codex CLI observability: ocw onboard --codex configures the OpenAI Codex CLI to send OTLP logs to OCW. Codex events (sse_event, user_prompt, tool_decision, tool_result, api_request) are converted to spans for cost tracking, drift detection, and alerting. All Codex traces land under the codex_exec agent. (#40, #45)
  • Agent Incident Library — new ocw demo command runs zero-config reproductions of common AI agent incidents: retry-loop, surprise-cost, hallucination-drift. No API keys, no live agents needed — each scenario injects synthetic spans through the production pipeline. (#41)

Bug fixes

  • Alerts and drift detection now actually fire in production. Both ocw serve and the SDK auto-bootstrap were constructing the ingest pipeline with only a cost engine — alert_engine, drift_detector, and schema_validator were silently None. Spans were stored and costed correctly, but no alerts ever fired and no drift baseline was ever built outside the ocw demo path. New build_default_pipeline() factory wires up all four engines uniformly. (#43)
  • ocw onboard --claude-code skip-reinstall now works. _install_launchd was using plain launchctl load after ocw stop's launchctl unload -w, which left the daemon in a Disabled state — the load was a silent no-op while returning success. Subsequent onboards always fell through to "Daemon: installing..." and triggered another macOS "Background Items Added" prompt. Switched to launchctl load -w to clear the Disabled flag. (#45)
  • ocw onboard --codex writes to global config. Was reading server.state and writing to whatever config the running server happened to load (often project-local), silently rotating that secret on every onboard. Now mirrors --claude-code and always writes to ~/.config/ocw/config.toml. (#45)
  • ocw stop actually stops everything. Returned immediately after a successful launchctl unload, leaving any orphan foreground ocw serve & process holding port 7391. Now sweeps for foreground stragglers after the launchd unload, with a bounded loop and per-PID dedup so a slow-shutting process can't cause an infinite signal loop. (#45)
  • Drift web UI shows actual drift status. When the baseline stddev was 0 (the demo's case — every baseline session has identical token counts), the UI short-circuited to a green "pass" badge. Now mirrors the CLI's z_score() semantics: stddev=0 with latest != avg shows inf and a red drift badge. (#46)
  • ocw traces TYPE column populated for every row. Was a non-deterministic correlated subquery in DuckDB that returned NULL for most rows when the outer table was un-aliased — replaced with FIRST(name ORDER BY start_time). (#46)
  • ocw status shows real session data over a running server. ApiBackend.get_completed_sessions was a hardcoded return [] stub; the CLI showed every agent as "idle" with zeros while ocw serve was running. Now calls /api/v1/status. (#46)
  • ocw doctor global-config fallback for projects without a local .ocw/config.toml. (#36)
  • ocw onboard --codex skip-on-rerun message no longer strips [otel] and [mcp_servers.ocw] section names (Rich was interpreting brackets as markup). (#45)
  • server.state no longer clobbered by failed-to-bind serve processes — write moved to a FastAPI startup event that fires only after a successful uvicorn bind. (#47)
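
The zero-stddev behaviour described in the drift web UI fix above can be illustrated with a short function. This is a sketch of the stated semantics, not the project's `z_score()`: returning 0.0 when the latest value equals the average, and signing the infinity by the direction of the deviation, are both assumptions.

```python
import math


def z_score_sketch(latest: float, avg: float, stddev: float) -> float:
    """Z-score that treats any deviation from a flat baseline as infinite."""
    if stddev == 0:
        if latest == avg:
            return 0.0  # assumption: identical to a flat baseline means no drift
        # Flat baseline, different value: infinitely many stddevs away.
        return math.copysign(math.inf, latest - avg)
    return (latest - avg) / stddev


print(z_score_sketch(120.0, 100.0, 0.0))   # inf
print(z_score_sketch(130.0, 100.0, 10.0))  # 3.0
```

Short-circuiting to "pass" whenever stddev is 0 hides exactly the case the demo exercises, where every baseline session is identical and the newest one is not.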

Demo UX

  • The examples/alerts_and_drift/ demos (sensitive_actions, budget_breach, drift) now seed their own [agents.<id>] config block on startup, so they fire alerts on a fresh ocw onboard without manual TOML editing. Drift detection also falls back to default AgentConfig for unconfigured agents, so it works for any observed agent without requiring an explicit config block. (#43)

Documentation

  • README restructured around three integration paths (#39)
  • CLAUDE.md expanded with sections on Daemon lifecycle, MCP server fallback, and Codex CLI integration (#42)
  • Manual test playbooks fully refreshed and validated end-to-end against this release

Upgrading

pip install --upgrade openclawwatch for the Python package, npm install @openclawwatch/sdk@0.1.9 for the TypeScript SDK.

If you're upgrading from a release older than 0.1.8 and have a project-local .ocw/config.toml plus a global one, note that ocw onboard --codex and ocw onboard --claude-code now both write to ~/.config/ocw/config.toml exclusively. Project-local configs are no longer rotated by these flows.

v0.1.8

15 Apr 00:36
07ba86b

Features

  • Agents API endpoint: GET /api/v1/agents with first_seen, last_seen, lifetime_cost_usd
  • Alert acknowledge endpoint: PATCH /alerts/{alert_id}/acknowledge REST endpoint
  • Dashboard auto-polling — Traces, Cost, Alerts, Drift, and Budget views now refresh every 10s (Status already polled at 5s)
  • TS SDK feature parity — retry logic, ClaudeCodeEvents, agentName/agentVersion, cacheCreateTokens, configurable serviceName, updated README
  • MCP auto-registration: ocw onboard --claude-code auto-registers the OCW MCP server via claude mcp add; ocw uninstall deregisters it

Bug Fixes

  • Pricing path off by one level: PRICING_FILE resolved to site-packages/pricing/ instead of site-packages/ocw/pricing/ when pip-installed, causing all costs to show $0.000000
  • PosixPath not TOML serializable: _serialise() included config_path (a Path object); now skips Path instances
  • Claude Code telemetry pipeline — daemon ran without --config, picking up stale project config whose ingest_secret didn't match ~/.claude/settings.json → every telemetry POST silently 401'd
  • Global config for Claude Code: --claude-code now always writes to ~/.config/ocw/config.toml (shared across projects) instead of per-project .ocw/config.toml
  • Daemon reinstall: launchctl unload before load on reinstall so old registration is replaced cleanly
  • Uninstall ordering — read projects.json before deleting ~/.config/ocw/ (was read after, always missed)
  • MCP error handling: all MCP tool calls wrapped in try/except; field name normalization (cost_today → cost_today_usd); HTTP mode fixes for agents, sessions, auth headers, and acknowledge proxying
  • CORS — add PATCH to allowed methods
  • Publish workflow — install [mcp] extra for tests (matches ci.yml)

Cleanup

  • Remove redundant --install-daemon flag from ocw onboard (auto-installs by default; --no-daemon remains)
  • Add daemon auto-install notice in --claude-code path so users know a service is being installed
  • Update contributor install instructions with editable + MCP extras
  • Rewrite Claude Code integration docs for current install/uninstall flow
  • Update manual test playbooks — modernize pre-release and post-release checklists for daemon lifecycle, budget commands, MCP verification, and Claude Code onboarding

Tests

  • Add regression tests for pricing path, config serialization roundtrip, and daemon prompt behavior