You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
With ai-lakera-guard configured for response scanning (direction=output or both) on streaming traffic with action=alert, the plugin discards the deny result of the response scan. When the Lakera Guard API errors, times out, or is unreachable while fail_open=false, the streamed response is allowed through to the client instead of being blocked.
In the response body filter, moderate() correctly returns a deny code/body on a Lakera error when fail_open=false. The non-streaming path and the streaming block path both honor that return. But the streaming action=alert branch calls moderate_response(ctx, conf, text) and throws away its return value, then returns nothing — so the fail-closed decision is silently lost.
This contradicts the plugin's own documented contract. The action schema description states it affects flagged verdicts only — "Lakera API errors/timeouts stay governed by fail_open even in alert mode" — and fail_open defaults to false (fail-closed).
streaming action=alert branch: calls moderate_response(ctx, conf, text) and ignores the returned (code, body).
Expected Behavior
In streaming action=alert, a Lakera error/timeout with fail_open=false should fail closed (block/replace the response), exactly as the non-streaming and streaming block paths do. action=alert should only suppress flagged verdicts; it must not change error handling.
Caveat for the fix: alert mode streams chunks to the client in real time, so by end-of-stream the body has already been delivered and cannot be retracted. Failing closed in alert mode therefore requires buffering the stream when fail_open=false, or explicitly rejecting/rewriting this config combination and documenting it. Streaming-alert error/timeout tests should be added.
Error Logs
On a Lakera error with fail_open=false, the plugin logs:
…yet the streamed response is still delivered in full. The log claims the response was blocked while the client receives the content — a silent fail-open (no crash).
Create a route with ai-proxy (or ai-proxy-multi) to an OpenAI-compatible upstream, plus ai-lakera-guard configured with direction=output, action=alert, fail_open=false.
Force the Lakera Guard endpoint to fail (point endpoint at an unreachable host, block egress, or force a timeout).
Send a streaming chat completion request ("stream": true).
Observe: the streamed assistant response is delivered in full even though the Lakera scan failed. Expected: a fail-closed (blocked) response.
Current Behavior
With
ai-lakera-guardconfigured for response scanning (direction=outputorboth) on streaming traffic withaction=alert, the plugin discards the deny result of the response scan. When the Lakera Guard API errors, times out, or is unreachable whilefail_open=false, the streamed response is allowed through to the client instead of being blocked.In the response body filter,
moderate()correctly returns a deny code/body on a Lakera error whenfail_open=false. The non-streaming path and the streamingblockpath both honor that return. But the streamingaction=alertbranch callsmoderate_response(ctx, conf, text)and throws away its return value, then returns nothing — so the fail-closed decision is silently lost.This contradicts the plugin's own documented contract. The
actionschema description states it affects flagged verdicts only — "Lakera API errors/timeouts stay governed byfail_openeven in alert mode" — andfail_opendefaults tofalse(fail-closed).Location (PR #13606),
apisix/plugins/ai-lakera-guard.lua:moderate()error branch:core.log.error(... "fail_open=false, blocking ...") ; return conf.deny_code, deny_message(...)action=alertbranch: callsmoderate_response(ctx, conf, text)and ignores the returned(code, body).Expected Behavior
In streaming
action=alert, a Lakera error/timeout withfail_open=falseshould fail closed (block/replace the response), exactly as the non-streaming and streamingblockpaths do.action=alertshould only suppress flagged verdicts; it must not change error handling.Caveat for the fix: alert mode streams chunks to the client in real time, so by end-of-stream the body has already been delivered and cannot be retracted. Failing closed in alert mode therefore requires buffering the stream when
fail_open=false, or explicitly rejecting/rewriting this config combination and documenting it. Streaming-alert error/timeout tests should be added.Error Logs
On a Lakera error with
fail_open=false, the plugin logs:…yet the streamed response is still delivered in full. The log claims the response was blocked while the client receives the content — a silent fail-open (no crash).
Steps to Reproduce
ai-proxy(orai-proxy-multi) to an OpenAI-compatible upstream, plusai-lakera-guardconfigured withdirection=output,action=alert,fail_open=false.endpointat an unreachable host, block egress, or force a timeout)."stream": true).Environment
masterwith PR feat(ai-lakera-guard): scan LLM responses (direction output/both, non-streaming + streaming) #13606 (ai-lakera-guardresponse/output scanning).