fix(#496): stop str()-coercing multimodal content in OTel trace + red-team refusal detection by drewdrewthis · Pull Request #546 · langwatch/scenario

drewdrewthis · 2026-05-25T06:54:50Z

Summary

Two sites coerced multimodal message content to str(list), producing Python repr instead of structured data or useful text.

Site A — scenario_executor._broadcast_message (OTel trace input/output):

Before: str(message["content"]) → "[{'type': 'text', 'text': '...'}, {'type': 'audio', ...}]" in LangWatch trace
After: passes list content directly; coerces to str only for plain-text messages

Site B — red_team_agent._get_last_assistant_content / _get_last_user_content (refusal detection):

Before: str(content) → Python repr → substring match against hard/soft refusal patterns silently fails for voice agents
After: new _extract_text() helper concatenates text parts from multimodal lists; refusal detection works correctly for voice replies

Test plan

8 unit tests covering both sites: plain string passthrough, multimodal extraction, audio-only (empty), refusal classification through multimodal content
All pass locally: uv run pytest tests/test_red_team_agent.py::TestExtractText -v

Closes #496

🤖 Generated with Claude Code

…-team refusal detection [grinder] Site A (scenario_executor._broadcast_message): pass structured content (list) directly to _trace.update instead of converting with str(), which mangled multimodal voice messages to Python repr. Site B (red_team_agent): add _extract_text() helper that concatenates the 'text' fields of all text-type content parts. _get_last_assistant_content / _get_last_user_content now call _extract_text() instead of str(), so _detect_refusal correctly classifies hard/soft refusals in voice scenarios. Closes #496 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…multimodal content Pass a plain str to LangWatch trace.update() instead of wrapping in a dict (not a valid SpanInputOutput type) or calling str() on a list (produces unreadable Python repr). The new _extract_text_content helper joins text parts from structured content lists. Fixes Pyright errors at scenario_executor.py:319,327. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

drewdrewthis · 2026-05-25T07:08:13Z

[grinder] READY for human review

CI: green (zero failing, zero pending)
Threads: zero unresolved

Verified by:
`command gh pr checks 546` → all pass: test (3.12) pass 7m9s, python-complete pass, javascript-complete pass, Analyze (python) pass
Latest run: https://github.com/langwatch/scenario/actions/runs/26387868007/job/77670395908

…n bump) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…o_executor) AC3 requires tests for both str()-coercion sites. Site B (RedTeamAgent._extract_text) was already covered; this adds TestExtractTextContent for the module-level helper in scenario_executor.py — plain-string passthrough, multimodal extraction, audio-only (empty), and non-list fallback. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

drewdrewthis · 2026-05-25T10:28:46Z

[grinder] Re-opening for AC gap: /review found that AC3 ("Unit tests covering each site with a multimodal-list input") was only half-satisfied. TestExtractText covered Site B (RedTeamAgent._extract_text) but Site A (_extract_text_content in scenario_executor.py) had zero test coverage. Added TestExtractTextContent class (4 tests) in commit 1b60c44. Waiting for CI to re-verify before re-marking pr-ready.

drewdrewthis · 2026-06-04T17:11:30Z

[grinder] READY for human review

CI: green (zero failing, zero pending — all 17 checks pass/skip)
ACs: met — fixes str()-coercion of multimodal content in OTel trace broadcast and red-team refusal detection; zero review threads
Threads: zero unresolved, zero outdated

Verified by:
`gh pr checks 546` → all 17 checks pass/skip, zero pending/failing
`gh api graphql reviewThreads` → `nodes: []` (zero threads)

…cion-multimodal # Conflicts: # python/uv.lock

github-actions · 2026-06-11T09:29:31Z

Automated low-risk assessment

This PR was evaluated against the repository's Low-Risk Pull Requests procedure and does not qualify as low risk.

The PR changes runtime behavior (how multimodal message content is extracted and passed to telemetry) and alters refusal-detection logic in red_team_agent, which affects security-relevant classification and integration with tracing. These are not limited to UI/docs/tests and touch logic and telemetry integration, so they do not meet the low-risk criteria. If unsure, this should get a normal review rather than automatic low-risk labeling.

This PR requires a manual review before merging.

drewdrewthis added the grinding Grinder is actively managing this PR label May 25, 2026

drewdrewthis added pr-ready and removed grinding Grinder is actively managing this PR labels May 25, 2026

chore(deps): sync uv.lock to reflect v0.7.27 (follows fix/#496 versio…

084fc6e

…n bump) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

drewdrewthis added grinding Grinder is actively managing this PR and removed pr-ready labels May 25, 2026

drewdrewthis added pr-ready and removed grinding Grinder is actively managing this PR labels Jun 4, 2026

Merge remote-tracking branch 'origin/main' into issue496/fix-str-coer…

a062d87

…cion-multimodal # Conflicts: # python/uv.lock

drewdrewthis requested a review from rogeriochaves June 11, 2026 09:26

drewdrewthis added the slack-requested Slack PR review request posted label Jun 11, 2026

rogeriochaves approved these changes Jun 11, 2026

View reviewed changes

drewdrewthis merged commit 83842e1 into main Jun 11, 2026
21 checks passed

drewdrewthis deleted the issue496/fix-str-coercion-multimodal branch June 11, 2026 09:33

rogeriochaves mentioned this pull request Jun 11, 2026

chore(main): release python 0.7.31 #656

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(#496): stop str()-coercing multimodal content in OTel trace + red-team refusal detection#546

fix(#496): stop str()-coercing multimodal content in OTel trace + red-team refusal detection#546
drewdrewthis merged 5 commits into
mainfrom
issue496/fix-str-coercion-multimodal

drewdrewthis commented May 25, 2026

Uh oh!

drewdrewthis commented May 25, 2026

Uh oh!

drewdrewthis commented May 25, 2026

Uh oh!

drewdrewthis commented Jun 4, 2026

Uh oh!

github-actions Bot commented Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

drewdrewthis commented May 25, 2026

Summary

Test plan

Uh oh!

drewdrewthis commented May 25, 2026

Uh oh!

drewdrewthis commented May 25, 2026

Uh oh!

drewdrewthis commented Jun 4, 2026

Uh oh!

github-actions Bot commented Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants