refactor(ai-aws-content-moderation): moderate decoded LLM content in access phase#13647
Conversation
…access phase Restructure the plugin to match ai-aliyun-content-moderation: run in the access phase after ai-proxy, extract the decoded LLM prompt content via the detected client protocol instead of sending the raw request body to Comprehend, guard on ctx.picked_ai_instance via fail_mode, and return a provider-compatible deny response. Adds check_request, deny_code and deny_message options.
1030 collided with ai-rate-limiting; move to 1031 (still below ai-proxy) and update the priority-ordered plugin list in t/admin/plugins.t.
There was a problem hiding this comment.
Pull request overview
This PR refactors the ai-aws-content-moderation plugin to run after ai-proxy in the access phase, so it can moderate the decoded, protocol-normalized LLM prompt content (instead of scoring the raw JSON envelope). It also updates the deny behavior to return protocol/provider-compatible responses, and aligns tests/docs with the new usage model (co-deployed with ai-proxy/ai-proxy-multi).
Changes:
- Move moderation logic to
access(priority1050→1031) and extract LLM-visible text viaai-protocols. - Add request moderation controls and provider-compatible deny responses (
check_request,deny_code,deny_message). - Update tests and documentation to use
ai-proxyand validate the refactored behavior.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| apisix/plugins/ai-aws-content-moderation.lua | Refactors plugin to access, extracts decoded prompt content via ai-protocols, and builds provider-compatible deny bodies. |
| t/plugin/ai-aws-content-moderation.t | Updates core moderation tests to run with ai-proxy and JSON LLM chat bodies; adjusts assertions for new deny response format. |
| t/plugin/ai-aws-content-moderation2.t | Updates “Comprehend unreachable” test to run with ai-proxy and JSON chat requests. |
| t/plugin/ai-aws-content-moderation-secrets.t | Updates secret-resolution tests to run with ai-proxy and JSON chat requests; keeps SigV4 credential propagation validation. |
| t/admin/plugins.t | Reorders plugin listing to reflect ai-proxy adjacency/dependency expectations. |
| docs/en/latest/plugins/ai-aws-content-moderation.md | Documents protocol-aware extraction, ai-proxy co-usage, and provider-compatible deny responses; adds new config options. |
| docs/zh/latest/plugins/ai-aws-content-moderation.md | Same as EN docs: protocol-aware behavior, ai-proxy co-usage, deny response format, and new config options. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
membphis
left a comment
There was a problem hiding this comment.
[P1] Legacy configurations without ai-proxy now fail open
Before this PR, ai-aws-content-moderation ran in the rewrite phase and moderated the raw request body for routes that used only this plugin. After the refactor, access() returns through binding.on_unsupported when ctx.picked_ai_instance is absent; with the existing default fail_mode=skip, the request is passed through unchecked. The new test also asserts that a toxic request without ai-proxy returns 200.
This is a content-moderation policy regression for existing deployments that already use this plugin without ai-proxy or ai-proxy-multi. Please keep a legacy raw-body moderation fallback, or make this migration fail closed / explicit opt-in with a documented compatibility path.
membphis
left a comment
There was a problem hiding this comment.
Approving this PR. For the legacy configurations without ai-proxy case, I recommend handling the compatibility and fail-open behavior in a separate follow-up PR, with an explicit migration path for standalone ai-aws-content-moderation usage.
Description
Refactor
ai-aws-content-moderationto be structured like its siblingai-aliyun-content-moderation.Previously the plugin ran in the
rewritephase (beforeai-proxy) and sent the raw HTTP request body to AWS Comprehend. As a result Comprehend scored the undecoded JSON envelope (e.g. the literal escaped string"content":"toxic"and{"model":...,"messages":[...]}), while the upstream LLM acts on the decoded prompt, so the two saw different text.What changed
rewritephase to theaccessphase (priority1050->1031, belowai-proxy's1040), so it runs afterai-proxyand can reusectx.ai_client_protocolandctx.picked_ai_instance.ai-protocols), sending only the normalized, decoded content to Comprehend. This is what the siblingai-aliyun-content-moderationplugin does.ctx.picked_ai_instanceviafail_mode, so the plugin reports clearly when it is used withoutai-proxy/ai-proxy-multi.ai-protocols) instead of a raw text body, so AI clients are not broken. Addscheck_request,deny_code(default200) anddeny_messageoptions.The AWS Comprehend decision model (
moderation_categoriesper-category thresholds and the overallmoderation_threshold) is unchanged.Behavior change
The plugin now must be used together with
ai-proxy/ai-proxy-multi, consistent withai-aliyun-content-moderationand with how the plugin is documented. A request that does not pass throughai-proxyis governed byfail_mode(defaultskip, i.e. passes through unchecked; setfail_mode: errorto reject). This supersedes #13528.Which issue(s) this PR fixes:
Fixes #
Checklist