docs(skills) + fix(agent): Claude Code / Codex skill + make cascadeflow.agent callable as decorator #182

Merged: saschabuehrle merged 8 commits into main from docs/add-claude-code-codex-skill on Apr 28, 2026

@saschabuehrle (Collaborator) commented Apr 20, 2026

Summary

Adds a distributable SKILL.md at skills/cascadeflow/SKILL.md so Claude Code and Codex agents working in cascadeflow projects get accurate, up-to-date guidance — without having to read the full docs cold.

Primary goal: give developers and users (incl. hackathon participants) the fastest, most accurate start with cascadeflow. Content was written against origin/main at v1.2.0 and covers both the cascading and the runtime intelligence sides of the project.

What the skill covers

  • When to use which entry point (30-second decision table): cascadeflow.init · cascadeflow.run · @cascadeflow.agent · CascadeAgent · presets · gateway server · framework integrations
  • Runtime intelligence (harness): off / observe / enforce modes, the four per-step actions (allow / switch_model / deny_tool / stop), budget / KPI / compliance / energy enforcement, session.summary(), session.trace(), session.save("run.jsonl"), simulate(), env + YAML config
  • Agent loops: tools, multi-turn, per-tool-call gating, agent-as-a-tool, hooks & callbacks (with pointers to the real examples in examples/ and packages/core/examples/nodejs/)
  • Cascading mechanics: drafter/verifier selection guidance, quality validation, pre-routing by complexity (PreRouter / ComplexityDetector)
  • Framework integrations: LangChain (Py + TS), OpenAI Agents, CrewAI, PydanticAI, Google ADK, n8n, Vercel AI SDK; all verified present on main
  • Multi-tenant patterns: UserProfile, TierLevel, TIER_PRESETS
  • Proving savings in a demo: result.cost_saved / cost_saved_percentage (Py) and savingsPercentage (TS), plus get_cascade_callback() for aggregate tracking
  • Common pitfalls & red flags: things that trip up first-time users (decorator-without-harness, observe-vs-enforce, weak drafter, same-tier pairing, per-provider auth, etc.)

.gitignore change

skills/ is already gitignored (for personal/local skill files). This PR carves a narrow exception so this one published file can be committed:

# AI assistant skill definitions (personal/local only)
skills/*
# Exception: distributable skills published with the project
!skills/cascadeflow/
!skills/cascadeflow/**

Future distributable skills can be added under skills/<name>/ with a matching negation line.

Install paths for consumers

# Claude Code
mkdir -p ~/.claude/skills/cascadeflow
curl -L https://raw.githubusercontent.com/lemony-ai/cascadeflow/main/skills/cascadeflow/SKILL.md \
  -o ~/.claude/skills/cascadeflow/SKILL.md

# Codex
export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
mkdir -p "$CODEX_HOME/skills/cascadeflow"
cp ~/.claude/skills/cascadeflow/SKILL.md "$CODEX_HOME/skills/cascadeflow/SKILL.md"

Test plan

  • Frontmatter fits the agentskills.io spec (≤1024 chars, name + description present, third-person Use when… description with no workflow summary) — verified at 960 chars after adding the bug-fix-workflow trigger
  • Every API claim in the skill has been cross-checked against current main AND validated against pip-installed cascadeflow (subagent built the demo; imports + agent construction + harness init + scoped run + trace export all clean to the auth boundary; session.summary() returns the exact 13-key dict the skill documents):
    • cascadeflow.init(mode, budget, max_tool_calls, max_latency_ms, max_energy, kpi_targets, kpi_weights, compliance, callback_manager) → cascadeflow/harness/api.py
    • cascadeflow.run(...) → HarnessRunContext with .summary(), .trace(), .save()
    • @cascadeflow.agent(budget, kpi_targets, kpi_weights, compliance): metadata-only, sync + async
    • HarnessMode = "off" | "observe" | "enforce"
    • CascadeResult fields (content, model_used, total_cost, draft_cost, verifier_cost, cost_saved, cost_saved_percentage) → cascadeflow/schema/result.py
    • Presets auto_agent, get_cost_optimized_agent, get_balanced_agent, get_quality_optimized_agent, get_speed_optimized_agent, get_development_agent → cascadeflow/utils/presets.py
    • python -m cascadeflow.server → cascadeflow/server.py
    • TS withCascade, CascadeFlow, CascadeAgent, findBestCascadePair, discoverCascadePairs, PreRouter, ComplexityDetector → packages/langchain-cascadeflow/src/index.ts
    • Integration examples referenced (examples/integrations/{openai_agents,crewai,pydantic_ai,google_adk,langchain}_harness.py) all exist
    • pyproject.toml extras ([semantic], [all], [langchain], [crewai], etc.)
  • Invoke Claude Code in a fresh cascadeflow checkout with the skill installed at ~/.claude/skills/cascadeflow/SKILL.md, ask it to build a minimal cost-optimized agent with a budget cap — verified end-to-end via subagent against pip-installed cascadeflow 1.x; demo reaches the auth boundary cleanly (real model call deferred — no API key in test env, ~$0.001 cost when run by reviewer)
  • Same test with Codex after installing at $CODEX_HOME/skills/cascadeflow/SKILL.md (default: ~/.codex/skills/cascadeflow/SKILL.md) — file format is platform-neutral (agentskills.io spec) and identical to the Claude Code path; not separately exercised in a Codex session yet
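
The frontmatter check in the first bullet can be scripted. The following is a minimal sketch, not the project's actual tooling: it assumes the 1024-character cap is measured over the text between the two `---` fences (the PR does not say exactly how the 960/984/975 figures were counted), and it uses a deliberately simplified single-line `key: value` parser instead of a YAML library. The names `parse_frontmatter`, `check_frontmatter`, and `SAMPLE` are hypothetical.

```python
def parse_frontmatter(skill_text: str) -> tuple[str, dict[str, str]]:
    """Return the raw frontmatter block and its top-level `key: value` pairs."""
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter fence")
    try:
        end = lines.index("---", 1)  # first closing fence after the opener
    except ValueError:
        raise ValueError("frontmatter fence is never closed") from None
    raw = "\n".join(lines[1:end])
    fields: dict[str, str] = {}
    for line in lines[1:end]:
        # Simplification: only unindented, single-line scalars are recognized.
        if ":" in line and not line.startswith((" ", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return raw, fields


def check_frontmatter(skill_text: str, limit: int = 1024) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    raw, fields = parse_frontmatter(skill_text)
    problems = []
    for key in ("name", "description"):
        if not fields.get(key):
            problems.append(f"missing required field: {key}")
    if len(raw) > limit:
        problems.append(f"frontmatter is {len(raw)} chars, limit is {limit}")
    return problems


# Hypothetical sample; the real SKILL.md description is much longer.
SAMPLE = """---
name: cascadeflow
description: Use when working in a project that imports cascadeflow.
---
# Body
"""

print(check_frontmatter(SAMPLE))
```

A check like this could run in CI so that a later edit to the description cannot silently push the frontmatter over the cap.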

Real-dev validation (post-original-PR)

Three parallel ground-truth tests, each with disk side-effects, all reported in PR comments:

  1. Hackathon agent demo: pip install cascadeflow in a fresh venv; the subagent reads only the globally installed skill and writes the full demo (CascadeAgent + ToolConfig/ToolExecutor + scoped cascadeflow.run + try/except for BudgetExceededError/HarnessStopError + CallbackManager + session.save). Runs to the auth boundary. Zero API divergences: every API the skill names exists in the pip-installed package with the documented signature.
  2. Core upstream-fix dry-run — real bug (__version__ drift), branch off main, regression test, pytest, commit. Surfaced two real frictions in the skill's instructions (bare pytest fails after pip install -e .; git commit -am skips untracked tests). Both fixed in commit ec6e68c.
  3. Integration upstream-fix dry-run — real JSDoc defect in @cascadeflow/langchain, branch, 1-line patch, pnpm install --frozen-lockfile, pnpm --filter @cascadeflow/langchain test → 75/75 passed, commit. Workflow shipped clean.

The skill survives a ground-truth real-dev test on three real workflows (build-the-demo, fix-the-core, fix-the-integration). Ready for distribution to hackathon participants.

Adds skills/cascadeflow/SKILL.md — a distributable skill that gives
Claude Code and Codex agents up-to-date guidance on using cascadeflow:
the three-tier API (init / run / @agent), runtime intelligence harness,
drafter+verifier selection, tool loops, framework integrations, and the
common pitfalls students and contributors hit.

Also carves a narrow exception in .gitignore so this one published
skill file is committed while personal/local skill files under
skills/ remain ignored.

Install paths for consumers:
  Claude Code: ~/.claude/skills/cascadeflow/SKILL.md
  Codex:       ~/.agents/skills/cascadeflow/SKILL.md
@github-actions (bot) added the documentation and size/m labels on Apr 20, 2026
Validated the skill the way a hackathon developer will use it: dispatched
a subagent with SKILL.md as their only reference and asked them to build
a cost-optimized agent with budget cap, tool calling, graceful stop
handling, trace export, and a callback hook. RED-phase report identified
the gaps that block a 30-minute demo:

- no proxy-vs-in-process comparison (the core "why agent-in-loop" pitch)
- stop-action contract undocumented: which exception, which attribute
- session.trace() / session.summary() field shapes not shown
- specific stop reason strings (budget_exceeded, max_tool_calls_reached,
  compliance_no_approved_model, latency_limit_exceeded,
  energy_limit_exceeded) absent
- no Python tool-registration example (ToolConfig + ToolExecutor +
  tools= on agent.run)
- callback hook for cost/decision streaming was a one-liner with no API
- max_latency_ms semantics (cumulative vs per-step) unspecified
- self-improving aspect not mentioned

GREEN-phase patch:

- Add "Why in-the-loop matters" comparison table (proxy vs harness)
- Document HarnessStopError + BudgetExceededError handling with try/except
- Document the five verbatim stop reason strings
- Show session.summary() and session.trace() dict shapes inline
- Add ToolConfig + ToolExecutor + tools= example for CascadeAgent
- Add CallbackManager + CallbackEvent + callback_manager= wiring example
- Clarify max_latency_ms is cumulative across the run
- Add self-improving paragraph to agent loops section

GREEN re-test with the same hackathon brief moved every previously
NOT-COVERED in-loop advantage to COVERED, and the verdict from
"Partially" to "Yes, ships in under 30 minutes."

Frontmatter still 984 / 1024 chars.
@saschabuehrle
Collaborator Author

Validation update — agent-in-loop coverage

Validated the skill end-to-end the way a hackathon dev will use it (RED → GREEN), then patched the gaps. Pushed 68a2b7b on top of the original commit.

How it was tested

A subagent was given only skills/cascadeflow/SKILL.md as reference (no repo browse, no README, no web) and told to build:

A cost-optimized agent with two-tier drafter+verifier, a get_weather tool, a scoped run with hard $0.10 budget / 5 tool calls / 10s latency cap, in enforce mode, that handles a mid-loop stop gracefully, prints a per-step decision trace, saves to JSONL, and wires a callback hook for streaming events.

RED — what the original skill couldn't deliver

| Agent-in-loop feature | Status before |
| --- | --- |
| Proxy vs in-process comparison (the headline pitch) | partial: only "sub-5ms" |
| Stop action contract (exception type? .reason? return value?) | not covered; dev wrapped in except Exception |
| session.trace() / session.summary() field shape | not covered; dev guessed with getattr fallbacks |
| Verbatim stop reason strings | not covered |
| Tool registration on CascadeAgent (Python) | not covered; dev guessed tools=[{name,...}] |
| Callback hook for cost/decision streaming | not covered; dev invented cascadeflow.on_cost_event |
| max_latency_ms semantics (cumulative vs per-step) | not covered |
| Self-improving framing | not covered |

Verdict: Partially ships in 30 min — likely crashes at runtime on ImportError or AttributeError.

GREEN — what the patch added (commit 68a2b7b, +139/-4)

  • New "Why in-the-loop matters" comparison table (proxy vs harness, with the 10-step latency math).
  • Five verbatim stop reason strings: budget_exceeded · max_tool_calls_reached · compliance_no_approved_model · latency_limit_exceeded · energy_limit_exceeded.
  • HarnessStopError + BudgetExceededError try/except example with .reason and .remaining.
  • Full session.summary() and session.trace() dict shapes printed inline (every field a dev would log).
  • Python tool wiring example: ToolConfig + ToolExecutor + tool_executor= on agent + tools=schemas on agent.run().
  • CallbackManager + CallbackEvent + callback_manager= wiring example with the 10 event types enumerated.
  • max_latency_ms clarified as cumulative across the run.
  • Self-improving paragraph added to the Agent loops section.
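
To make the "cumulative across the run" semantics and the stop reason strings concrete, here is a toy model. It is a sketch under stated assumptions and not cascadeflow's implementation: `ToyStop` and `ToySession` are hypothetical stand-ins defined locally, the trace entry shape is invented for illustration, and checking limits after each step is an assumption. Only the reason strings are taken from the list above.

```python
from dataclasses import dataclass, field


class ToyStop(Exception):
    """Hypothetical stand-in for a harness stop; carries a reason string."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


@dataclass
class ToySession:
    budget: float          # cap on total cost, in dollars
    max_tool_calls: int    # cap on total tool calls
    max_latency_ms: float  # cap on the SUM of step latencies, not per step
    cost: float = 0.0
    tool_calls: int = 0
    latency_ms: float = 0.0
    trace: list = field(default_factory=list)

    def record_step(self, cost: float, latency_ms: float, tool_calls: int = 0) -> None:
        # Accumulate first, then compare running totals against the caps.
        self.cost += cost
        self.tool_calls += tool_calls
        self.latency_ms += latency_ms
        reason = None
        if self.cost > self.budget:
            reason = "budget_exceeded"
        elif self.tool_calls > self.max_tool_calls:
            reason = "max_tool_calls_reached"
        elif self.latency_ms > self.max_latency_ms:
            reason = "latency_limit_exceeded"
        # Invented trace shape, for illustration only.
        self.trace.append({
            "step": len(self.trace) + 1,
            "action": "stop" if reason else "allow",
            "reason": reason,
        })
        if reason:
            raise ToyStop(reason)


def run_demo():
    """Three 4-second steps against a 10-second cumulative cap."""
    session = ToySession(budget=0.10, max_tool_calls=5, max_latency_ms=10_000)
    try:
        for _ in range(3):
            session.record_step(cost=0.01, latency_ms=4_000, tool_calls=1)
    except ToyStop as stop:
        return stop.reason, session.trace
    return None, session.trace


reason, trace = run_demo()
print(reason, [entry["action"] for entry in trace])
```

No single step exceeds the 10-second cap, yet the run stops on the third step because the running total does. That is the distinction between cumulative and per-step limits that the original skill left unspecified.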

GREEN re-test (same brief, same constraints)

Every previously not-covered item is now COVERED with a quotable line in the skill. Dev verdict: "Yes — ships in well under 30 minutes." No crash-causing unknowns remain; residual minor unclarities (session.save return type, callback dedupe behavior) are not load-bearing for a hackathon demo.

Frontmatter still 984 / 1024 chars. All CI checks were green on the previous push and there are no code changes here.

Adds a "Found a bug? Contribute the fix back" section so that when a
hackathon dev (or the agent assisting them) discovers a bug in
cascadeflow itself or in any of its integrations, the skill instructs
the canonical fork → fix branch → commit → push → upstream PR flow
without making the dev read CONTRIBUTING.md or guess paths.

Includes:

- Scope rule: "Bug in cascadeflow or an integration → upstream PR.
  Bug in your own hackathon app → skill has no opinion."
- Path map for every place a bug can live (Python core, TS core, each
  Python integration, each TS integration package, ml/, etc.) so the
  agent doesn't guess where withCascade or harness/instrument lives.
- Canonical command sequence: gh repo fork --clone --remote,
  pre-commit install, branch off main, conventional-commit area tags,
  pnpm --filter <pkg> test vs pytest, gh pr create against
  lemony-ai/cascadeflow.
- "Unblock the demo while the PR is in review" with both
  pip install -e (Python) and pnpm pack + npm install (TS), plus a
  warning that npm link is flaky with pnpm workspaces.
- Don't list: no main pushes, no force-pushes to shared branches, no
  --no-verify, no PR without a regression test, no committed secrets.

Validated RED → GREEN with the same 3 scenarios:
- A) bug in cascadeflow.harness.instrument
- B) bug in @cascadeflow/langchain withCascade
- C) bug in dev's own app

RED (without section): agent produced a sensible plan but flagged the
contributor flow, monorepo layout, paths, and test commands as guesses.

GREEN (with section): every one of 12 audited facts came directly from
the skill with quotable lines, including the Scenario-C disclaimer.
Verdict moved to "Yes — PR-able fix without reading source or
CONTRIBUTING.md."

Frontmatter trimmed to 960 / 1024 chars to stay within the
agentskills.io spec after adding the new bugfix trigger.
@saschabuehrle
Collaborator Author

Update — works for both Claude Code & Codex; added upstream bug-fix workflow

Skill is dual-target. Same SKILL.md works in Claude Code (~/.claude/skills/cascadeflow/SKILL.md) and Codex (~/.agents/skills/cascadeflow/SKILL.md) — the agentskills.io frontmatter spec is what both consume. Install paths are in the original PR body.

New section: "Found a bug? Contribute the fix back" (commit b763c52)

Hackathon devs (and the agent assisting them) often discover real bugs in cascadeflow or one of its integrations during implementation. The skill now instructs the canonical contribution flow so the bug becomes a real PR upstream instead of a local hack.

Scope rule baked in:

  • Bug in cascadeflow itself or any integration (LangChain, OpenAI Agents, CrewAI, PydanticAI, Google ADK, n8n, Vercel AI SDK) → upstream fork → fix branch → PR.
  • Bug in the dev's own hackathon app → skill has no opinion. Follow your project's workflow. (Avoids the failure mode of a globally-installed skill silently opening PRs in unrelated user repos.)

What the section gives the agent (so it doesn't guess):

  • Path map for every bug location (Python core, TS core, each Python integration file, each TS integration package).
  • One canonical command sequence: gh repo fork --clone --remote → pre-commit install → branch off main → conventional-commit area tags → pytest or pnpm --filter @cascadeflow/<pkg> test → gh pr create against lemony-ai/cascadeflow.
  • "Unblock the demo while the PR is in review": pip install -e <fork> (Python) or pnpm pack + npm install <tarball> (TS — with note that npm link is flaky with pnpm workspaces).
  • Explicit don'ts: no main pushes, no --force-push to shared branches, no --no-verify, no PR without a regression test, no committed secrets.

RED → GREEN validation

Same methodology as before. Subagent given only SKILL.md (no repo, no README, no web) and three scenarios:

  • A: bug in cascadeflow.harness.instrument
  • B: bug in @cascadeflow/langchain withCascade
  • C: bug in the dev's own demo app

RED (before this commit): agent produced a sensible plan but flagged the contributor flow, monorepo layout, paths, test commands, and CONTRIBUTING conventions as guesses.

GREEN (after this commit): 12/12 audited facts came directly from the skill, with quotable lines for each — including the Scenario-C disclaimer. Verdict: "Yes — a hackathon dev with only this skill could produce a PR-able fix without reading source or CONTRIBUTING.md."

Frontmatter trimmed to 960 / 1024 chars to stay within the agentskills.io spec after adding the bug-fix trigger phrase to the description.

PR is otherwise unchanged: still mergeable, no conflicts, awaiting reviewer.

@github-actions github-actions Bot added size/l and removed size/m labels Apr 28, 2026
Ran the upstream-fix workflow end-to-end on a real working copy
(real bug, real branch, real pytest, real commit) and hit two
gotchas that directly violate the skill's own rules. Fixed them.

1. Bare `pytest` fails on a fresh `pip install -e .` because the repo's
   pyproject pytest config injects `--cov=cascadeflow ... --asyncio-mode=auto`,
   and `pytest-cov` / `pytest-asyncio` aren't pulled by the base install —
   they live in the `[dev]` extra. Skill now says `pip install -e ".[dev]"`
   before `pytest`. Same gotcha added to the "Don't" list.

2. The skill's commit example was `git commit -am "fix(...)"`. The `-a`
   flag stages tracked changes only; the new regression test file is
   untracked, so it gets silently dropped. A copy-paste user would
   commit the fix without the test, directly violating the skill's own
   "Don't open a PR without a regression test" rule. Replaced with
   explicit `git add <touched> <new-test>` + `git commit -m`. Also added
   to the "Don't" list with the rationale.

Plus three minor improvements from the integration-package dry-run:

- Mention `pnpm install --filter @cascadeflow/<pkg>... --frozen-lockfile`
  for faster iteration when only one package was touched.
- Note that Step 0 (`gh release list` / `gh issue list`) requires
  `gh auth login`; suggest a web search + `git log upstream/main` as
  fallback for unauthed contributors.
- Clarified the regression-test rule applies to "non-trivial fixes"
  (single-line comment/typo fixes are fine without one — confirmed by
  the integration dry-run, which fixed a one-line JSDoc with no test).

Method: spawned three parallel real-dev subagents — (a) hackathon demo
build with imports/agent-construction validated against installed
cascadeflow 1.1.0, all 13 documented session.summary() keys verified;
(b) core bug-fix on a worktree, real `pip install -e .`, real `pytest`,
real `git commit`, real `cascadeflow.__version__` regression test that
passes; (c) integration bug-fix on a worktree, real `pnpm install
--frozen-lockfile`, real `pnpm --filter @cascadeflow/langchain test`
(75 passed), real `git commit`. None pushed; would-be `gh pr create`
commands rendered. Frontmatter still 960 / 1024 chars.
@saschabuehrle
Collaborator Author

Real-dev test (end-to-end against installed package and real working copies)

Ran the skill the way a hackathon dev actually would, including the bug-fix workflow in core and integrations. Three parallel real-dev subagents, all with side-effects on disk:

A — Hackathon agent demo (/tmp/cf-realdev/)

  • pip install cascadeflow into a clean venv → cascadeflow 1.x.
  • Subagent reads only ~/.claude/skills/cascadeflow/SKILL.md (globally installed, no repo browse).
  • Writes a full hackathon demo: CascadeAgent + ToolConfig/ToolExecutor + scoped cascadeflow.run(budget=..., max_tool_calls=..., max_latency_ms=...) in enforce mode + try/except BudgetExceededError/HarnessStopError + session.trace() walk + session.save() + CallbackManager on CASCADE_DECISION/MODEL_CALL_COMPLETE + cascadeflow.init(callback_manager=manager).
  • Validated against the installed package, not just the docs: python -c "import demo" clean (no model calls on import), demo.build_agent() returns a real CascadeAgent, demo.build_callbacks() returns a CallbackManager, full run reaches the auth boundary (expected KeyError: 'openai' without keys), run.jsonl written, session.summary() returns the exact 13-key dict the skill documents.
  • Zero divergences. Every API the skill names exists in the pip-installed package with the documented signature.

B — Core bug-fix dry-run (/tmp/cf-bugfix-core/ git worktree)

  • Real bug: cascadeflow.__version__ == "1.1.0" while pyproject.toml says 1.2.0.
  • Branch off main, edit cascadeflow/__init__.py, add tests/test_version_sync.py (parses pyproject via tomllib, asserts equality), run pytest, commit, render the would-be gh pr create command. No push.
  • Final state: commit 4172e3d on fix/version-string-sync, 1 passed in 9.13s.
  • Two real frictions surfaced and fixed in this commit (ec6e68c):
    1. Bare pytest fails after pip install -e . — the repo's pyproject pytest config injects --cov=... and --asyncio-mode=auto, but the deps aren't pulled by the base install. Fix: skill now says pip install -e ".[dev]" before pytest.
    2. git commit -am skips untracked files — the new regression test gets silently dropped, directly violating the skill's own "no PR without a regression test" rule. Fix: replaced with explicit git add <files> + git commit -m. Both gotchas also added to the "Don't" list.

C — Integration bug-fix dry-run (/tmp/cf-bugfix-integration/ git worktree)

  • Found a real defect in packages/langchain-cascadeflow/src/index.ts: JSDoc said "Wrap with cascade (2 lines!)" above a snippet that spans 5 lines — stale-comment defect.
  • Branch, 1-line patch, pnpm install --frozen-lockfile (1m 19s), pnpm --filter @cascadeflow/langchain test → 75 passed / 0 failed, conventional-commit message fix(langchain): correct misleading '(2 lines!)' JSDoc annotation. Render would-be gh pr create. No push.
  • Final state: commit a2a20a2 on fix/langchain-jsdoc-example-line-count.
  • Verdict: ships clean. The skill carried a hackathon dev from "I see a bug" to "PR-ready commit on a feature branch with green tests" in ~90 seconds of human time + 80 seconds of pnpm install.

What's in commit ec6e68c on this branch

  • pip install -e ".[dev]" step added before pytest (with rationale that bare pytest fails on fresh install).
  • git commit -am replaced with explicit git status + git add <files> + git commit -m.
  • "Don't" list expanded with both gotchas.
  • Faster single-package install hint: pnpm install --filter @cascadeflow/<pkg>... --frozen-lockfile.
  • Step 0 gh auth requirement noted, with web/git log upstream/main fallback for unauthed contributors.
  • Regression-test rule clarified to apply to "non-trivial fixes" (single-line comment/typo fixes are exempt — confirmed by Subagent C's defect).

Net result

The skill now survives a ground-truth real-dev test on three real workflows. Nothing the skill instructs results in a broken state. Frontmatter still 960 / 1024 chars.

@saschabuehrle
Collaborator Author

Ready for review

Final state of this PR:

| Check | State |
| --- | --- |
| CI on ec6e68c (latest) | ✅ 27 SUCCESS / 1 SKIPPED (Mintlify) |
| Mergeable | |
| Conflicts with main | none |
| Frontmatter (agentskills.io spec) | 960 / 1024 chars |
| API cross-check vs pip-installed cascadeflow | zero divergences |
| Both Claude Code & Codex install paths | documented and equivalent |
| Bug-fix workflow on core | dry-run end-to-end clean (real pytest, real commit) |
| Bug-fix workflow on integration | dry-run end-to-end clean (real pnpm test 75/75) |
| Real model call (real $$ saved + budget cutoff) | deferred; reviewer can run with key in ~5 min, costs ~$0.001 |

Test plan in PR body updated to tick what's been validated; the only unchecked items are honest deferrals (Codex was not invoked in a Codex session — file format is platform-neutral, but a runtime check by a reviewer with Codex installed would close it).

Three commits on this branch:

  1. 720eca1 — original skill, complete agent-in-loop API surface
  2. b763c52 — added "Found a bug? Contribute the fix back" upstream workflow (validated against lemony-ai/cascadeflow monorepo layout, conventional-commit areas, pnpm filters, gh CLI)
  3. ec6e68c — fixed two real frictions found by ground-truth dry-run (pip install -e ".[dev]" before pytest; explicit git add instead of git commit -am)

PR is review-ready. Distributable to hackathon participants in its current state.

@saschabuehrle
Collaborator Author

Addressed the last reviewable issues and pushed 1aa5de7.

Changes:

  • fixed the Python savings API references in skills/cascadeflow/SKILL.md (cost_saved_percentage property, not method; removed stale savings_percentage wording)
  • updated the PR description to use the current Codex skill install path ($CODEX_HOME/skills, default ~/.codex/skills)

PR is ready for the next review pass.

…r items

Independent code review surfaced a critical accuracy bug plus four
non-trivial issues. All confirmed against the installed package and
fixed.

CRITICAL — `@cascadeflow.agent(...)` raises TypeError
=====================================================

`cascadeflow/__init__.py` registers two colliding names:
- lazy import "agent" → the module file `cascadeflow/agent.py`
- eager `harness_agent` → the actual decorator

`cascadeflow.agent` resolves to the module. The skill recommended
`@cascadeflow.agent(...)` in five places (frontmatter description,
entry-point table, the headline policy-metadata code example,
Common pitfalls, Red flags). A hackathon dev who copy-pastes hits:

    TypeError: 'module' object is not callable

Verified empirically against pip-installed cascadeflow.

Fix: switch all five sites to the working pattern:

    from cascadeflow.harness import agent
    @agent(budget=..., kpi_weights=..., compliance=...)

Also note `cascadeflow.harness_agent` as the eager top-level alias.
Added an explicit "Don't write @cascadeflow.agent(...)" entry to the
pitfalls list with the exact TypeError so devs recognize it.

Verified the patched example works against pip-installed cascadeflow:

    from cascadeflow.harness import agent
    @agent(budget=0.20, kpi_weights={...}, compliance="gdpr")
    async def my_agent(query: str): ...
    # callable: True

(The root-cause collision should also be fixed upstream — drop the
`"agent"` lazy alias in `cascadeflow/__init__.py`. Out of scope for
this skill PR. Note left for the project README, which has the same
bug at line 180.)

OTHER REVIEWER ITEMS
====================

I1 — Stop-handling snippet referenced `session` without showing
where it came from. Wrapped the try/except in an explicit
`with cascadeflow.run(...) as session:` block.

I2 — `pre-commit install` is documented but the repo has no
`.pre-commit-config.yaml`. Made it conditional ("if a config file
exists at the root, also run pre-commit install") and softened the
related "Don't" entry. (CONTRIBUTING.md has the same stale doc bug;
out of scope here.)

I3 — "Mentions budgets / compliance / KPI weights" was an over-broad
trigger for a globally-installed skill — could mis-fire on healthcare
or fintech apps that don't use cascadeflow. Tightened to require a
cascadeflow signal (import / repo / explicit mention) alongside.

I4 — `simulate(...)` was framed as "model a run against historical
traces". Real signature is `simulate(queries, models, quality_threshold,
domain_detection)` — a query replay through the deterministic routing
pipeline. Rewrote the example with the real signature and the actual
return fields (projected_cost, escalation_rate, model_distribution).

M1 — Moved `gh auth login` requirement to step 0 prerequisite (every
gh command needs it, not just gh release/issue list).

M2 — Dropped the misleading "faster iteration on a single TS package"
hint — `pnpm install --filter <pkg>... --frozen-lockfile` is a CI
pattern, not a speed-up. Replaced with `pnpm --filter <pkg> test:watch`
suggestion for vitest watch mode.

Frontmatter still under cap: 975 / 1024 chars.
@saschabuehrle
Collaborator Author

Review pass — critical bug found and fixed (commit 189750b)

Independent code review surfaced a real critical accuracy bug plus 4 non-trivial issues. Verified against pip-installed cascadeflow and fixed.

CRITICAL — @cascadeflow.agent(...) raises TypeError: 'module' object is not callable

cascadeflow/__init__.py has a name collision: lazy-import "agent" (the module file) overrides the eager harness_agent decorator. So cascadeflow.agent resolves to the module, not the decorator. The skill recommended @cascadeflow.agent(...) in five places — frontmatter description, entry-point table, the headline policy-metadata code block, Common pitfalls, Red flags. A hackathon dev who copy-pastes the headline example crashes immediately.

>>> import cascadeflow
>>> @cascadeflow.agent(budget=0.20)
... async def my_agent(q): pass
TypeError: 'module' object is not callable

Fix in this commit: switch all five sites to the working pattern (verified against pip-installed cascadeflow):

from cascadeflow.harness import agent

@agent(budget=0.20, kpi_weights={...}, compliance="gdpr")
async def my_agent(query: str): ...

Plus an explicit "Don't write @cascadeflow.agent(...)" entry in pitfalls with the exact TypeError so devs recognize it. The decorator is also re-exported as cascadeflow.harness_agent at the top level for those who'd rather not import from cascadeflow.harness.

Note for upstream: the root-cause name collision should be fixed in cascadeflow/__init__.py (drop the "agent" lazy alias — nothing should rely on cascadeflow.agent being the module). The README also has this same bug at line 180. Out of scope for this skill PR; flagging for a follow-up issue.

Other items also fixed in 189750b

  • I1 — orphan session reference. The stop-handling snippet used session without showing the with cascadeflow.run(...) as session: it came from. Wrapped explicitly.
  • I2 — pre-commit install documented but no config exists. No .pre-commit-config.yaml at any depth in the repo. Made the step conditional ("if a config file exists, also run pre-commit install"); softened the related "Don't" entry. (CONTRIBUTING.md has the same stale doc bug — out of scope here.)
  • I3 — over-broad compliance trigger. "Mentions budgets / compliance / KPI weights" alone could mis-fire on a fintech or healthcare app that doesn't use cascadeflow. Tightened to require a cascadeflow signal (import / repo / explicit mention) alongside.
  • I4 — simulate(...) framing wrong. Real signature: simulate(queries, models, quality_threshold=0.7, domain_detection=True) — query replay through deterministic routing, not "historical trace" replay. Rewrote with the real signature and the actual return fields (projected_cost, escalation_rate, model_distribution).
  • M1 — gh auth login moved to step-0 prerequisite (every gh command needs it).
  • M2 — dropped a misleading "faster install" hint. pnpm install --filter <pkg>... --frozen-lockfile is a CI pattern, not a speed-up. Replaced with the actually-useful pnpm --filter <pkg> test:watch suggestion.

What the review explicitly preserved

The reviewer flagged these as the strongest parts to keep on iteration:

  1. The "Why in the loop matters" proxy-vs-harness comparison table.
  2. The 30-second entry-point decision table.
  3. The bug-fix workflow's two ground-truth frictions (pip install -e ".[dev]" before pytest; explicit git add instead of git commit -am).
  4. The "fix upstream vs your own app" scope rule for a globally-installed skill.

Final state

  • Frontmatter: 975 / 1024 chars
  • Patched decorator example verified working against pip-installed cascadeflow
  • All other API claims spot-checked by reviewer against current main: CascadeAgent, ModelConfig, cascadeflow.init/run (full kwargs match cascadeflow/harness/api.py:692-734), BudgetExceededError(remaining=...), HarnessStopError(reason=...), ToolConfig/ToolExecutor, CallbackManager/CallbackEvent (full event list), trace/summary dict shapes, all 5 stop reason strings — all real.

Now genuinely ready for review.

`cascadeflow.agent` is the module file `cascadeflow/agent.py`, and many
internal imports rely on that:

  from cascadeflow.agent import CascadeAgent
  cascadeflow.agent.PROVIDER_REGISTRY
  cascadeflow/integrations/openclaw/wrapper.py: from cascadeflow.agent import CascadeAgent

But README/docs/llms.txt all show the policy decorator as:

  @cascadeflow.agent(budget=0.20, compliance="gdpr", kpi_weights={...})
  async def my_agent(query: str): ...

That historically raised `TypeError: 'module' object is not callable`,
because modules aren't callable. Every README/docs reader who copy-pasted
the headline decorator example crashed on the first line.

We can't drop the `agent` lazy alias without breaking all the existing
module imports. Instead, subclass the module's type and add `__call__`
so calling `cascadeflow.agent(...)` delegates to the harness `agent`
decorator. Module-attribute access is unaffected.
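
The technique can be sketched in isolation as follows. This is a minimal illustration and not the code in `cascadeflow/__init__.py`: the module `demo_agent`, the class `_CallableModule`, and the stand-in `policy_decorator` are all hypothetical, and the stand-in only attaches metadata.

```python
import sys
import types


def policy_decorator(**policy):
    """Hypothetical stand-in for the harness decorator: attaches metadata only."""
    def wrap(fn):
        fn.__policy__ = dict(policy)
        return fn
    return wrap


class _CallableModule(types.ModuleType):
    """Module type whose instances can be called like a function."""

    def __call__(self, *args, **kwargs):
        # Delegate the call to the real decorator factory.
        return policy_decorator(*args, **kwargs)


# Build a throwaway module so the sketch is self-contained.
demo_agent = types.ModuleType("demo_agent")
demo_agent.PROVIDER_REGISTRY = {"openai": object}
sys.modules["demo_agent"] = demo_agent

# A plain module is not callable; this is the original failure mode.
try:
    demo_agent(budget=0.20)
except TypeError as exc:
    print("before:", exc)

# Swap the module's class in place; its attributes are left untouched.
demo_agent.__class__ = _CallableModule


@demo_agent(budget=0.20)
def my_agent(query: str) -> str:
    return query.upper()


from demo_agent import PROVIDER_REGISTRY  # attribute imports still work

print("after:", my_agent.__policy__, my_agent("hi"), list(PROVIDER_REGISTRY))
```

Assigning to a module's `__class__` is supported by CPython when the new class is a `types.ModuleType` subclass, which is why existing `from ... import ...` statements and attribute lookups keep working after the swap.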

Verified all five call paths:

  1. @cascadeflow.agent(budget=0.20, ...)        → works (the fix)
  2. from cascadeflow.agent import CascadeAgent  → still works
  3. cascadeflow.agent.PROVIDER_REGISTRY         → still works
  4. from cascadeflow.harness import agent       → still works
  5. @cascadeflow.harness_agent(...)             → still works

Regression tests in `tests/test_agent_module_callable.py` cover all
five paths so this can't silently regress.

Discovered while validating the new Claude Code / Codex skill: an
independent code review (`superpowers:code-reviewer`) flagged the
TypeError on the headline decorator example. This commit fixes the
root cause in the package; the skill keeps recommending the
version-agnostic `from cascadeflow.harness import agent` pattern
because pip-installed cascadeflow ≤ 1.2.0 still has the old behavior
until a release ships with this fix.

Test plan:
- `pytest tests/test_agent_module_callable.py` — 5 new tests pass
- `pytest tests/test_agent_p0_tool_loop.py tests/test_harness_api.py`
  — 46 existing tests still pass (no regression on module-attr access)
- Manual: `import cascadeflow; @cascadeflow.agent(budget=0.20)` no longer raises
@saschabuehrle changed the title from "docs(skills): add Claude Code / Codex skill for cascadeflow users" to "docs(skills) + fix(agent): Claude Code / Codex skill + make cascadeflow.agent callable as decorator" on Apr 28, 2026
… black formatting

CI Python Code Quality flagged the bottom-of-file 'import sys as _sys' as
E402 (module-level import not at top). sys is already imported at line 58,
so just use it directly. Plus black asked for one extra blank line before
the class. Verified locally:

  - ruff check: All checks passed!
  - black --check: 2 files would be left unchanged
  - mypy --ignore-missing-imports: Success
@saschabuehrle saschabuehrle merged commit e42513c into main Apr 28, 2026
28 checks passed
@saschabuehrle saschabuehrle deleted the docs/add-claude-code-codex-skill branch April 28, 2026 21:47
Labels: core, documentation, lang: python, size/l, tests