Support native OpenWebUI citations and events in azure_ai_foundry.py pipeline #100

@owndev

Contribution Terms

  • I have reviewed the project’s Code of Conduct and contribution guidelines.
  • I plan to implement this feature myself and submit a pull request.

Feature title

Support native OpenWebUI citations in azure_ai_foundry.py pipeline

Feature overview

The current Azure AI Search citation integration only renders markdown/HTML citation sections; it does not emit structured events or fields that OpenWebUI can use for native citation cards. The proposal is to emit native OpenWebUI citation events and to provide a response["openwebui_citations"] field, so the frontend can display source cards and previews, correlate inline tokens with their sources, and offer a better UX. This would resolve: open-webui/pipelines#229, #64

Target users: OpenWebUI users leveraging Azure AI Search / RAG with source/citation support.
Benefit: Enables the full citation experience (cards, clickable sources, previews) instead of plain markdown.

Implementation details

Proposed approach:

  • Add config to enable structured citation output (AZURE_AI_OPENWEBUI_CITATIONS, default: true).
  • For streaming: In stream_processor_with_citations, as soon as citations are detected in the SSE data or in delta.context/message.context, normalize them and emit native event dicts matching OpenWebUI's documented shape. Example:
# Yield an event dict (for generator-style pipelines)
yield {
    "event": {
        "type": "citation",
        "data": {
            "document": [citation.get("content", "")],
            "metadata": [citation.get("metadata", {})],
            "source": {
                # Fall back from title to filepath to url; "Unknown" if all are missing or empty
                "name": citation.get("title") or citation.get("filepath") or citation.get("url") or "Unknown",
                "url": citation.get("url"),
            },
            "score": citation.get("score"),
            "chunk_id": citation.get("chunk_id"),
            # Example values for the first citation
            "id": "doc1",
            "token": "doc1",
        },
    }
}
  • For non-streaming: Attach normalized citations list to response["openwebui_citations"], e.g.:
{
  "id": "doc1",
  "token": "doc1",
  "title": "Document title or filepath or url",
  "url": "https://...",
  "filepath": "/path/to/file",
  "preview": "Snippet or content",
  "chunk_id": "...",
  "metadata": { ... },
  "score": 0.123
}

Implementation tasks:

  • Add helpers: _extract_citations_from_response, _normalize_citation_for_openwebui, _emit_openwebui_citation_events.
  • Patch stream_processor_with_citations to emit citation events as soon as citation data is seen (emit via event_emitter if available, and/or yield event dicts for generator-style pipelines).
  • Patch non-stream logic to attach response["openwebui_citations"].
  • Add tests and a short README example.

Tasks and milestones

  • Add AZURE_AI_OPENWEBUI_CITATIONS config option (default true)
  • Helper functions for extraction/normalization/event emitting
  • Patch stream logic: emit native event(s) as soon as possible
  • Update docs with usage and example
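
A minimal sketch of how the AZURE_AI_OPENWEBUI_CITATIONS option might be read, assuming it is supplied as an environment variable. The helper name citations_enabled and the accepted truthy spellings are assumptions, not part of the existing pipeline.

```python
import os
from typing import Mapping, Optional

# Spellings treated as "enabled"; this set is an assumption.
_TRUTHY = {"1", "true", "yes", "on"}


def citations_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True unless AZURE_AI_OPENWEBUI_CITATIONS is set to a non-truthy value.

    The option defaults to "true" when unset, as proposed in this issue.
    """
    source = os.environ if env is None else env
    raw = source.get("AZURE_AI_OPENWEBUI_CITATIONS", "true")
    return raw.strip().lower() in _TRUTHY


if __name__ == "__main__":
    print(citations_enabled({}))
    print(citations_enabled({"AZURE_AI_OPENWEBUI_CITATIONS": "false"}))
```

Accepting a mapping argument keeps the helper testable without touching the real process environment.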

Questions or areas for feedback

  • Is event_emitter preferred (server push) or direct yield (generator-style) for streaming SSE events? (Both can be supported.)
  • Should we emit both structured events and keep markdown fallback by default, or make structured output exclusive when enabled?
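
To make the first question concrete, both delivery modes could share one code path. The sketch below is hypothetical: deliver_citation_events and the EventEmitter alias are illustrative names, and event_emitter is assumed to be an async callable that accepts a single event dict. When an emitter is given, events are pushed through it; otherwise they are yielded wrapped in {"event": ...}, as in the streaming example earlier in this issue.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable, Dict, Iterable, List, Optional

Event = Dict[str, Any]
# Assumption: the emitter is an async callable taking one event dict.
EventEmitter = Callable[[Event], Awaitable[None]]


async def deliver_citation_events(
    events: Iterable[Event],
    event_emitter: Optional[EventEmitter] = None,
) -> AsyncIterator[Event]:
    """Push each event via event_emitter if given, else yield it to the caller."""
    for event in events:
        if event_emitter is not None:
            # Server-push mode: nothing is yielded for this event.
            await event_emitter(event)
        else:
            # Generator mode: wrap the event as in the streaming example.
            yield {"event": event}


async def _demo() -> None:
    events: List[Event] = [
        {"type": "citation", "data": {"source": {"name": "Setup guide"}}}
    ]
    pushed: List[Event] = []

    async def emitter(event: Event) -> None:
        pushed.append(event)

    yielded = [item async for item in deliver_citation_events(events)]
    via_emitter = [item async for item in deliver_citation_events(events, emitter)]
    print(len(yielded), len(via_emitter), len(pushed))


if __name__ == "__main__":
    asyncio.run(_demo())
```

Choosing the mode per call keeps the stream processor unaware of how the host consumes events; the markdown fallback from the second question would be a separate switch.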

Additional context or references

Metadata

Labels

  • enhancement: Feature request, performance, process improvement, cost optimization
  • feature: New features
  • integration: Integration-related issues

Projects

Status

In review

Milestone

No milestone
