fix(ai-protocols): flatten structured message content in the protocol layer by nic-6443 · Pull Request #13634 · apache/apisix

nic-6443 · 2026-06-30T10:21:07Z

Description

ai-prompt-guard returns 500 when a chat message's content is a structured array instead of a plain string.

OpenAI Chat Completions allows messages[].content to be either a string or an array of typed parts, e.g.:

{ "role": "user", "content": [ { "type": "text", "text": "hello" } ] }

The root cause is in the protocol layer: openai-chat.get_messages() returned body.messages verbatim, so a table-valued content leaked to consumers. ai-prompt-guard then concatenates message content with table.concat, which raises:

ai-prompt-guard.lua: invalid value (table) at index N in table for 'concat'
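The failure can be reproduced in plain Lua, outside APISIX. The sketch below is illustrative only: the `messages` table and the collection loop are assumptions that mimic a consumer concatenating raw content, not the plugin's actual code.

```lua
-- Illustrative only: mimics a consumer that concatenates raw message content.
local messages = {
    { role = "system", content = "be nice" },
    -- structured content: an array of typed parts instead of a plain string
    { role = "user", content = { { type = "text", text = "hello" } } },
}

local contents = {}
for _, msg in ipairs(messages) do
    contents[#contents + 1] = msg.content  -- the second entry is a table
end

-- table.concat accepts only strings and numbers, so this call raises an error;
-- pcall captures it instead of aborting the script.
local ok, err = pcall(table.concat, contents, " ")
print(ok, err)
```

The exact wording of the error differs between Lua versions, but in every case the call fails because one element is a table.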

The canonical {role, content} contract is that get_messages returns content already flattened to a plain string. This PR makes every adapter honor that contract and keeps the flattening in exactly one place per adapter:

- openai-chat / anthropic-messages: get_messages now reuses the adapter's existing append_message_text helper instead of re-implementing the flatten loop.
- bedrock-converse: adds append_block_texts, replacing six copies of the content-block text extraction across extract_response_text, extract_request_content, extract_user_content and get_messages.
- openai-responses: adds append_item_text shared by extract_request_content, extract_user_content and get_messages. This also fixes get_messages silently dropping structured (array) input content parts: the same class of bug, previously masked because it dropped the content instead of crashing.
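The contract described above could be sketched roughly as follows for OpenAI Chat-style bodies. This is a minimal sketch under stated assumptions, not the adapter's code: `flatten_content`, this `get_messages` body, and the single-space separator are all hypothetical; the real adapters use their own helpers (such as append_message_text), whose signatures are not shown in this PR description.

```lua
-- Hypothetical helper: the name and the " " separator are assumptions.
local function flatten_content(content)
    if type(content) == "string" then
        return content
    end
    if type(content) ~= "table" then
        return ""
    end
    local texts = {}
    for _, part in ipairs(content) do
        -- keep only text parts; image_url and other part types are dropped
        if type(part) == "table" and part.type == "text"
                and type(part.text) == "string" then
            texts[#texts + 1] = part.text
        end
    end
    return table.concat(texts, " ")
end

-- Sketch of a get_messages that returns {role, content} with string content.
local function get_messages(body)
    local out = {}
    for _, msg in ipairs(body.messages or {}) do
        out[#out + 1] = {
            role = msg.role,
            content = flatten_content(msg.content),
        }
    end
    return out
end

local msgs = get_messages({
    messages = {
        { role = "system", content = "be nice" },
        { role = "user", content = {
            { type = "text", text = "hello" },
            { type = "image_url", image_url = { url = "https://example.com/a.png" } },
            { type = "text", text = "world" },
        } },
    },
})
```

Because every consumer sees only strings, a plugin can pass `msgs[i].content` straight to table.concat without inspecting its type.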

Non-text parts (e.g. image_url) are dropped, consistent across all adapters. Every get_messages consumer (ai-prompt-guard, ai-lakera-guard, ai-cache) then receives flattened text without re-implementing the extraction; ai-lakera-guard.normalize_messages is reduced to filtering empty turns.

Regression tests cover structured content in conversation history, the latest user message, mixed text/non-text parts (openai-chat), and structured input parts (Responses API); they fail before the fix and pass after. The ai-prompt-guard, ai-lakera-guard, and ai-cache suites are green locally.

Which issue(s) this PR fixes:

Fixes #

Checklist

I have explained the need for this PR and the problem it solves
I have explained the changes or the new features added to this PR
I have added tests corresponding to this change
I have updated the documentation to reflect this change
I have verified that this change is backward compatible

OpenAI Chat allows messages[].content to be either a plain string or an array of typed parts (e.g. [{type="text", text="..."}]). The plugin collected msg.content as-is and then called table.concat, raising "invalid value (table) ... for 'concat'" and returning 500 whenever an inspected message carried array content. Flatten each message's content (string, or the text parts of an array) before concatenation, matching how the protocol adapters already extract text. Add regression tests for array content in conversation history, in the latest user message, and mixed text/non-text parts.

Copilot

Pull request overview

This PR fixes a Lua runtime crash in the ai-prompt-guard plugin when inspecting OpenAI Chat-style messages that use structured (array) messages[].content, by flattening text parts before concatenation.

Changes:

- Add a helper to flatten message content (string or typed-parts array) into plain text prior to table.concat.
- Update prompt aggregation to use the new flattening helper.
- Add regression tests covering structured content in history and the latest user message (including mixed text + non-text parts).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `apisix/plugins/ai-prompt-guard.lua` | Flattens structured message content into text before concatenation to prevent `table.concat` runtime errors. |
| `t/plugin/ai-prompt-guard.t` | Adds tests ensuring structured `content` is scanned/denied correctly without crashing under different match modes. |

Add a case where the deny word lives only in a non-text part (image_url) to lock in that only text parts are inspected. Also declare `local type = type` so the content-flatten helper passes the localize-globals style check.
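The behaviour this test case locks in can be illustrated with a small sketch. Everything here is an assumption made for illustration: `text_of` stands in for the flattening step, `is_denied` uses a plain substring match rather than the plugin's pattern matching, and the deny word and URL are made up.

```lua
-- Localize the global, in the spirit of the style check mentioned above.
local type = type

-- Hypothetical stand-in for the content-flatten helper.
local function text_of(content)
    if type(content) == "string" then
        return content
    end
    if type(content) ~= "table" then
        return ""
    end
    local texts = {}
    for _, part in ipairs(content) do
        if type(part) == "table" and part.type == "text"
                and type(part.text) == "string" then
            texts[#texts + 1] = part.text
        end
    end
    return table.concat(texts, " ")
end

-- Simplification: plain substring search instead of regex deny patterns.
local function is_denied(content, deny_word)
    return text_of(content):find(deny_word, 1, true) ~= nil
end

-- The deny word appears only inside a non-text part, so it is not inspected.
local only_in_image = {
    { type = "text", text = "describe this picture" },
    { type = "image_url", image_url = { url = "https://example.com/badword.png" } },
}
-- The deny word appears in a text part, so it is inspected and matched.
local in_text = {
    { type = "text", text = "please say badword" },
}

print(is_denied(only_in_image, "badword"), is_denied(in_text, "badword"))
```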

…_messages Move the OpenAI Chat structured-content handling out of ai-prompt-guard and into openai-chat.get_messages, so it returns canonical string content like every other adapter (anthropic, bedrock, responses, embeddings) already does. get_messages previously returned body.messages verbatim, so a message whose content is an array of typed parts (e.g. [{type="text", text="..."}]) leaked a table to consumers. ai-prompt-guard then hit "invalid value (table) ... for 'concat'" (500). Flattening in the protocol layer keeps protocol details out of the plugins: ai-prompt-guard, ai-lakera-guard and ai-cache all get flattened text without duplicating the logic.

openai-chat.get_messages now flattens content like the other adapters, so no adapter returns body.messages verbatim anymore. normalize_messages stays as idempotent defense-in-depth; reword the comment to match.

…lize_messages openai-chat.get_messages now flattens content to a string like every other adapter, so normalize_messages no longer needs its own text-part extraction (it only ever runs on get_messages output). Reduce it to the Lakera-specific filtering: keep role-tagged messages with non-empty string content.

…oss adapters Make the protocol layer the single place that flattens structured message content into plain strings, removing the duplicated flatten loops that the get_messages fix would otherwise leave scattered across adapters:
- openai-chat / anthropic-messages: get_messages reuses the existing append_message_text helper instead of re-implementing the loop
- bedrock-converse: add append_block_texts, replacing six copies of the content-block text extraction
- openai-responses: add append_item_text shared by extract_request_content, extract_user_content and get_messages; this also fixes get_messages silently dropping structured (array) input content parts

Add an ai-prompt-guard test for Responses API structured content.
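To show how a single shared helper can replace several copies of the same loop, here is a rough sketch for Bedrock Converse-style content blocks, where a text block is a table with a `text` field. The signature `append_block_texts(parts, blocks)` and the space separator are assumptions; the PR description names the helper but does not show its definition.

```lua
-- Hypothetical signature; the real helper in bedrock-converse may differ.
-- Appends the text of every text block in `blocks` to the `parts` array.
local function append_block_texts(parts, blocks)
    if type(blocks) ~= "table" then
        return
    end
    for _, block in ipairs(blocks) do
        -- non-text blocks (for example image blocks) have no string `text`
        if type(block) == "table" and type(block.text) == "string"
                and block.text ~= "" then
            parts[#parts + 1] = block.text
        end
    end
end

-- The same helper serves different call sites that walk content blocks.
local parts = {}
append_block_texts(parts, { { text = "You are terse." } })
append_block_texts(parts, {
    { text = "What is in this image?" },
    { image = { format = "png" } },
})
local flattened = table.concat(parts, " ")
```

Each call site only decides which block array to pass in, so the text-extraction rule lives in one place per adapter.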

… contract get_messages always returns a table of {role, content} tables it constructs itself, with content already flattened to a string, so normalize_messages keeps only the two live filters: turns without a role (adapters pass the client role through verbatim) and empty content that Lakera /v2/guard has nothing to scan. The re-copy into a fresh table is also unnecessary.
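With content guaranteed to be a string, the remaining filter could look roughly like the sketch below. This is an illustration under assumptions, not the ai-lakera-guard source: the function body and the sample data are hypothetical, and only the two filters named above (missing role, empty content) are modelled.

```lua
-- Hypothetical sketch of the reduced filter.
local function normalize_messages(messages)
    local kept = {}
    for _, msg in ipairs(messages) do
        -- drop turns without a role and turns with nothing to scan;
        -- the message tables themselves are reused, not copied
        if type(msg.role) == "string" and msg.role ~= ""
                and type(msg.content) == "string" and msg.content ~= "" then
            kept[#kept + 1] = msg
        end
    end
    return kept
end

local input = {
    { role = "system", content = "be nice" },
    { content = "turn without a role" },
    { role = "user", content = "" },
    { role = "user", content = "hello" },
}
local kept = normalize_messages(input)
```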

Copilot AI review requested due to automatic review settings June 30, 2026 10:21

dosubot Bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and bug (Something isn't working) labels Jun 30, 2026

Copilot started reviewing on behalf of nic-6443 June 30, 2026 10:21

Copilot AI reviewed Jun 30, 2026

Comment thread apisix/plugins/ai-prompt-guard.lua Outdated

nic-6443 added 2 commits June 30, 2026 18:29

nic-6443 changed the title ~~fix(ai-prompt-guard): handle structured message content~~ fix(ai-protocols): flatten structured content in openai-chat get_messages Jul 1, 2026

nic-6443 added 3 commits July 1, 2026 10:36

docs(ai-lakera-guard): update stale normalize_messages comment

1b86ffb

openai-chat.get_messages now flattens content like the other adapters, so no adapter returns body.messages verbatim anymore. normalize_messages stays as idempotent defense-in-depth; reword the comment to match.

nic-6443 changed the title ~~fix(ai-protocols): flatten structured content in openai-chat get_messages~~ fix(ai-protocols): flatten structured message content in the protocol layer Jul 1, 2026

membphis approved these changes Jul 2, 2026

AlinsRan approved these changes Jul 2, 2026

shreemaan-abhishek approved these changes Jul 2, 2026

nic-6443 merged commit 11dfb18 into apache:master Jul 2, 2026
19 checks passed

nic-6443 deleted the fix/ai-prompt-guard-structured-content branch July 2, 2026 09:32

AlinsRan mentioned this pull request Jul 3, 2026

fix(ai-cache): bypass caching for prompts carrying non-text content #13652

Open

5 tasks

nic-6443 merged 7 commits into apache:master from nic-6443:fix/ai-prompt-guard-structured-content

5 participants
