bug: ai-lakera-guard streaming action=alert ignores fail_open=false (fail-closed) on Lakera errors

### Current Behavior

With `ai-lakera-guard` configured for response scanning (`direction=output` or `both`) on **streaming** traffic with `action=alert`, the plugin discards the deny result of the response scan. When the Lakera Guard API errors, times out, or is unreachable while `fail_open=false`, the streamed response is allowed through to the client instead of being blocked.

In the response body filter, `moderate()` correctly returns a deny code/body on a Lakera error when `fail_open=false`. The non-streaming path and the streaming `block` path both honor that return value. The streaming `action=alert` branch, however, calls `moderate_response(ctx, conf, text)`, discards its return value, and returns nothing, so the fail-closed decision is silently lost.

This contradicts the plugin's own documented contract. The `action` schema description states it affects flagged verdicts only — *"Lakera API errors/timeouts stay governed by `fail_open` even in alert mode"* — and `fail_open` defaults to `false` (fail-closed).
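
The contract quoted above can be modeled as a small decision function. This is only an illustrative sketch, not code from the plugin: `should_block` and the outcome names `"clean"`, `"flagged"`, and `"error"` are hypothetical, and only `action` and `fail_open` come from the plugin schema.

```lua
-- Hypothetical model of the documented contract; not the plugin's actual code.
-- outcome: "clean", "flagged", or "error" (names assumed for illustration).
local function should_block(outcome, conf)
    if outcome == "error" then
        -- Errors and timeouts are governed by fail_open alone.
        -- A missing fail_open is treated as false (fail-closed), matching the default.
        return not conf.fail_open
    end
    if outcome == "flagged" then
        -- action only changes how flagged verdicts are handled.
        return conf.action == "block"
    end
    return false
end
```

Under this model, `should_block("error", { action = "alert", fail_open = false })` is `true`, which is the case the streaming alert branch currently gets wrong.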

Location (PR #13606), `apisix/plugins/ai-lakera-guard.lua`:
- `moderate()` error branch: `core.log.error(... "fail_open=false, blocking ...") ; return conf.deny_code, deny_message(...)`
- streaming `action=alert` branch: calls `moderate_response(ctx, conf, text)` and ignores the returned `(code, body)`.
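
The shape of the defect described in the two bullets above can be reproduced in isolation. The sketch below is self-contained and does not use the real plugin code: `moderate_response` here is a stand-in that simulates a Lakera error, the two `alert_branch_*` functions are hypothetical names, and the `deny_code` value of 403 and the deny body text are chosen only for illustration.

```lua
-- Stand-in for the plugin's moderate_response(); it simulates a Lakera error.
-- With fail_open=false it returns a deny code and body, otherwise nothing.
local function moderate_response(ctx, conf, text)
    if not conf.fail_open then
        return conf.deny_code, "response blocked: scan failed"
    end
end

-- Mirrors the reported defect: the (code, body) result is dropped.
local function alert_branch_buggy(ctx, conf, text)
    moderate_response(ctx, conf, text)
end

-- Honors the result, as the non-streaming and streaming block paths do.
local function alert_branch_honoring(ctx, conf, text)
    local code, body = moderate_response(ctx, conf, text)
    if code then
        return code, body
    end
end

-- deny_code = 403 is an illustrative value, not necessarily the plugin default.
local conf = { action = "alert", fail_open = false, deny_code = 403 }
local buggy_code = alert_branch_buggy({}, conf, "hello")
local fixed_code, fixed_body = alert_branch_honoring({}, conf, "hello")
```

Here `buggy_code` is `nil` while `fixed_code` is `403`: the caller of the first variant has no way to learn that the scan failed.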

### Expected Behavior

In streaming `action=alert`, a Lakera error/timeout with `fail_open=false` should **fail closed** (block/replace the response), exactly as the non-streaming and streaming `block` paths do. `action=alert` should only suppress *flagged* verdicts; it must not change error handling.

Caveat for the fix: alert mode streams chunks to the client in real time, so by end-of-stream the body has already been delivered and cannot be retracted. Failing closed in alert mode therefore requires buffering the stream when `fail_open=false`, or explicitly rejecting/rewriting this config combination and documenting it. Streaming-alert error/timeout tests should be added.
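
One way to picture the buffering option from the caveat above is sketched below. It is an assumption-laden model rather than a proposed patch: `must_buffer` and `deliver` are hypothetical helpers, `scan` is a stand-in that returns `(code, body)` on a deny and nothing otherwise, and a real fix would have to live in the plugin's body filter rather than operate on a plain table of chunks.

```lua
-- Hypothetical helper: alert mode must hold the stream back when the
-- configuration is fail-closed, because sent chunks cannot be retracted.
local function must_buffer(conf)
    return conf.action == "alert" and not conf.fail_open
end

-- Models what the client would receive for a finished stream.
-- Returns the list of bodies sent and, on a deny, the deny code.
local function deliver(chunks, conf, scan)
    if not must_buffer(conf) then
        -- Pass-through: chunks reach the client as they arrive.
        return chunks, nil
    end
    -- Buffered: nothing is sent until the end-of-stream scan resolves.
    local code, body = scan(table.concat(chunks))
    if code then
        -- Scan failed under fail_open=false: replace the whole response.
        return { body }, code
    end
    return chunks, nil
end
```

The trade-off is latency: with `fail_open=false`, alert mode would lose real-time streaming, which is why rejecting or documenting the combination is listed as the alternative.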

### Error Logs

On a Lakera error with `fail_open=false`, the plugin logs:

```
ai-lakera-guard: <err>; fail_open=false, blocking response
```

Despite this log line, the streamed response is still delivered in full. The log claims the response was blocked while the client receives the content: a silent fail-open, with no crash or other visible failure.

### Steps to Reproduce

1. Run APISIX with PR #13606 applied.
2. Create a route with `ai-proxy` (or `ai-proxy-multi`) to an OpenAI-compatible upstream, plus `ai-lakera-guard` configured with `direction=output`, `action=alert`, `fail_open=false`.
3. Force the Lakera Guard endpoint to fail (point `endpoint` at an unreachable host, block egress, or force a timeout).
4. Send a streaming chat completion request (`"stream": true`).
5. Observe: the streamed assistant response is delivered in full even though the Lakera scan failed. Expected: a fail-closed (blocked) response.

### Environment

- APISIX version: `master` with PR #13606 (`ai-lakera-guard` response/output scanning).
- This is a code-path defect independent of OS / OpenResty / etcd versions; reproducible on any environment running that branch.

