feat: tool-result scanning for langchain / openai-agents / genkit / llamaindex (0.15.0) #9
0.14 wired tool-result scanning into the Mastra processor + MCP adapter
only. 0.15 rolls the same protection out to the four other adapters
that already do tool wrapping at construction time:
- LangChain — wraps tool.invoke (both governTool and governTools)
- OpenAI Agents — wraps tool.invoke AND tool.execute
- Genkit — wraps tool.call
- LlamaIndex — wraps tool.call
Each adapter gains a `createResultScanner` closure (factory bound to
governance instance + agentId + config) and the existing wrapTool calls
the new closure between `await tool.invoke/call/execute(args)` and the
audit + return. The scanner runs scanToolResult at stage tool_result;
on block the redacted detail object replaces the output before it
reaches the agent loop.
Each config gains:
scanToolResults?: boolean // default true, opt-out via false
toolResultInjectionThreshold?: number // local detection threshold, default 0.5
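As a rough illustration of the wrap shape described above, the sketch below shows a scanner closure sitting between the tool call and the return. It is not the adapter code: `scanToolResult` here is a toy stand-in with an assumed signature, the `Tool`, `ScanConfig` and `ScanOutcome` types are hypothetical, and the governance instance, agentId and audit step are left out for brevity.

```typescript
// Hypothetical shapes for illustration only; the real adapter types may differ.
interface BlockedToolResult {
  blocked: true;
  reason: string;
  ruleId: string | null;
}

interface ScanConfig {
  scanToolResults?: boolean; // default true, opt out with false
  toolResultInjectionThreshold?: number; // local detection threshold, default 0.5
}

interface ScanOutcome {
  result: unknown;
}

interface Tool {
  invoke(args: unknown): Promise<unknown>;
}

// Toy stand-in for the library's scanToolResult (assumed signature).
// It scores one known injection phrase instead of running real detection.
async function scanToolResult(output: unknown, threshold: number): Promise<ScanOutcome> {
  const text = typeof output === "string" ? output : String(JSON.stringify(output));
  const score = /ignore (all )?previous instructions/i.test(text) ? 0.9 : 0.0;
  if (score >= threshold) {
    const blocked: BlockedToolResult = {
      blocked: true,
      reason: "possible prompt injection",
      ruleId: null,
    };
    return { result: blocked };
  }
  return { result: output };
}

// Factory bound to config; returns the closure that wrapTool calls.
function createResultScanner(config: ScanConfig) {
  const enabled = config.scanToolResults ?? true;
  const threshold = config.toolResultInjectionThreshold ?? 0.5;
  return async (output: unknown): Promise<unknown> => {
    if (!enabled) return output;
    const scanned = await scanToolResult(output, threshold);
    return scanned.result;
  };
}

function wrapTool(tool: Tool, config: ScanConfig = {}): Tool {
  const scanResult = createResultScanner(config);
  return {
    async invoke(args: unknown): Promise<unknown> {
      const output = await tool.invoke(args);
      // The scan runs after the tool call and before the value is returned.
      return scanResult(output);
    },
  };
}
```

With this shape, a test that mocks tool returns can pass `{ scanToolResults: false }` to `wrapTool` and get the mocked value back untouched.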
What didn't change in 0.15:
- Anthropic / Mistral / Ollama — caller-driven handleToolUse pattern;
tool-result scanning has to be integrated at the call site.
- Vercel AI — no tool-wrapping path on this adapter; follow-up needed.
- Bedrock — entry-gate only; AWS executes tools internally.
- Mastra middleware adapter (mastra.ts, not the processor) — different
wrap shape; coverage to follow.
Drop-in upgrade. No public type breakage. Tests that mock tool returns
may need `scanToolResults: false` to skip the helper.
1,372 tests, 0 failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bugbot caught: the LangChain wrap unsafely cast `input as Record<string, unknown> | undefined` before handing it to scanResult. LangChain's DynamicTool inputs are commonly strings, and an unchecked cast would let a string flow through to ctx.input, which is typed as Record<string, unknown> in EnforcementContext. Condition evaluators reading properties off ctx.input would silently get undefined and never match, defeating tool_result-stage scanning for the most common DynamicTool shape.
Mirror the same `typeof input === "object" && input !== null` guard that createEnforcer already uses on its own input field. Strings now pass through as `undefined` for the args field, which is the correct behaviour: the tool's text content still gets scanned via ctx.outputText (which scanToolResult populates from the tool's return value, regardless of input shape).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
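For readers skimming the fix, the guard can be pictured as a small narrowing helper. The helper name `toArgsRecord` is hypothetical; only the `typeof input === "object" && input !== null` check comes from the commit message.

```typescript
// Hypothetical helper name; the guard expression is the one quoted in the commit message.
function toArgsRecord(input: unknown): Record<string, unknown> | undefined {
  // Strings, numbers and null fall through to undefined instead of being cast.
  return typeof input === "object" && input !== null
    ? (input as Record<string, unknown>)
    : undefined;
}

const fromString = toArgsRecord("search for cats"); // DynamicTool-style string input
const fromObject = toArgsRecord({ query: "cats" }); // structured input
```

Here `fromString` is `undefined` and `fromObject` keeps its `query` property, so a condition evaluator never reads properties off a string.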
Two findings on the 0.15.0 wrapper rollout:
1. LlamaIndex — BlockedToolResult.ruleId is `string | null`, but
LlamaIndexJSONValue explicitly excludes null per the SDK contract.
The unchecked cast `as LlamaIndexJSONValue` would let a null ruleId
slip through and trip downstream JSON walkers expecting only
string|number|boolean|array|object. Coerce to "unknown" on block
so the substitute is a valid JSONValue shape.
2. OpenAI Agents — the SDK types `invoke` as Promise<string> (the
value flows into the Responses API's function_call_output, which
requires a string output field). When scanToolResult substitutes a
BlockedToolResult object on block, returning it raw means the SDK
serialises it as `[object Object]` when building the API payload.
JSON.stringify on the block path so the LLM gets a parseable
{"blocked":true,"reason":"...","ruleId":"..."} string instead.
The execute() path on OpenAI Agents stays unchanged — it's a
governance-wrapper convenience (not in the SDK), so callers there
can already accept arbitrary shapes.
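The two fixes can be sketched side by side as follows. The function names `toJsonSafeBlock`, `isBlocked` and `toInvokeOutput` are hypothetical, the `BlockedToolResult` shape is inferred from the fields named above, and non-blocked output is assumed to already be a string.

```typescript
// Shape inferred from the commit message; the real type may carry more fields.
interface BlockedToolResult {
  blocked: true;
  reason: string;
  ruleId: string | null;
}

// LlamaIndex path: replace a null ruleId so the substitute holds no null values.
function toJsonSafeBlock(b: BlockedToolResult): { blocked: true; reason: string; ruleId: string } {
  return { ...b, ruleId: b.ruleId ?? "unknown" };
}

function isBlocked(value: unknown): value is BlockedToolResult {
  return (
    typeof value === "object" &&
    value !== null &&
    (value as { blocked?: unknown }).blocked === true
  );
}

// OpenAI Agents invoke path: the return value must be a string, so stringify on block.
function toInvokeOutput(scanned: unknown): string {
  if (isBlocked(scanned)) return JSON.stringify(scanned);
  // Assumption: a non-blocked result is the tool's original string output.
  return String(scanned);
}

const coerced = toJsonSafeBlock({ blocked: true, reason: "injection", ruleId: null });
const payload = toInvokeOutput({ blocked: true, reason: "injection", ruleId: "r-1" });
```

`payload` parses back with `JSON.parse`, whereas plain string coercion of the same object would produce `[object Object]`.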
Tests: 1,372, 0 failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 276e651.
});
return scanned.result;
};
}
LangChain scanner omits agentLevel causing false blocks
Medium Severity
The LangChain createResultScanner does not pass agentLevel to scanToolResult, while createEnforcer uses the real result.level from agent registration. The agent_level condition evaluator defaults missing agentLevel to 0, so agents registered at level ≥ 1 will be treated as level 0 during tool-result scanning. This causes agent_level rules to produce false-positive blocks on tool results for trusted, higher-level agents that correctly pass pre-execution enforcement.
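A minimal sketch of the failure mode, assuming an evaluator that treats a missing `agentLevel` as 0 as described above. The names `ScanContext` and `meetsAgentLevel` are hypothetical, and how a rule turns the comparison into a block is not shown in this PR.

```typescript
// Hypothetical context shape; only the agentLevel field matters here.
interface ScanContext {
  agentLevel?: number;
}

// Hypothetical evaluator: a missing agentLevel falls back to 0.
function meetsAgentLevel(ctx: ScanContext, requiredLevel: number): boolean {
  return (ctx.agentLevel ?? 0) >= requiredLevel;
}

const registeredLevel = 2; // level the agent was registered at

// Scanner omits agentLevel: the level-2 agent is evaluated as level 0.
const withoutLevel = meetsAgentLevel({}, 1);

// Scanner forwards the registered level: the same rule passes.
const withLevel = meetsAgentLevel({ agentLevel: registeredLevel }, 1);
```

`withoutLevel` is `false` and `withLevel` is `true`, which is the gap between tool-result scanning and pre-execution enforcement that the finding describes.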
…y pass
The auto-generated release notes only covered #9 (tool-result adapters). Code for #10 (multi-modal scan) and #11 (README honesty pass) shipped in 0.15.0 but neither got a CHANGELOG entry. The auto-release pulled from CHANGELOG.md, so the GitHub Release body and the npm-displayed changelog were both incomplete.
This commit extends the 0.15.0 entry with both missing sections. GitHub Release body has been updated to match. No code change; documentation only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


Description
Checklist
`npm test` passes (all 945+ tests)
`npm run build` compiles clean
`any` types introduced
Note
Medium Risk
Changes the runtime behavior of multiple tool adapters by altering tool return values (including serializing/redacting on block), which could affect downstream expectations and tests even though the API surface change is additive.
Overview
Adds default-on tool-result scanning to the LangChain, OpenAI Agents, Genkit, and LlamaIndex adapters by wrapping each tool’s `invoke/call/execute` to run its returned value through `scanToolResult()` at stage `tool_result`, returning a redacted `{ blocked, reason, ruleId }` object on block/approval so poisoned tool output never reaches the next LLM turn.
Each adapter config gains `scanToolResults?: boolean` (opt-out) and `toolResultInjectionThreshold?: number`; OpenAI Agents also stringifies blocked objects for the Responses API, LlamaIndex coerces `ruleId` to a non-null string for JSONValue compatibility, and LangChain guards non-object tool inputs when passing args into result scanning. Version bumps to `0.15.0` and the changelog documents the new behavior and migration notes.