Description
Contribution Terms
- I have reviewed the project’s Code of Conduct and contribution guidelines.
- I plan to implement this feature myself and submit a pull request.
Feature title
Support native OpenWebUI citations in azure_ai_foundry.py pipeline
Feature overview
The current Azure AI Search citation integration provides only markdown/HTML citation sections; it does not emit structured events or fields that OpenWebUI can use for native citation cards. This proposal emits native OpenWebUI citation events and adds a response["openwebui_citations"] field, so the frontend can display source cards and previews and correlate inline tokens with their sources. This would resolve open-webui/pipelines#229 and #64.
Target users: OpenWebUI users leveraging Azure AI Search / RAG with source/citation support.
Benefit: Enables full citation experience (cards, click sources, preview) instead of plain markdown.
Implementation details
Proposed approach:
- Add config to enable structured citation output (AZURE_AI_OPENWEBUI_CITATIONS, default: true).
- For streaming: In stream_processor_with_citations, as soon as citations are detected in SSE or in delta.context/message.context, normalize and emit native event dicts matching OpenWebUI's documented shape. Example:
# yield a readable event dict for the generator-style pipelines
yield {
    "event": {
        "type": "citation",
        "data": {
            "document": [citation.get("content", "")],
            "metadata": [citation.get("metadata", {})],
            "source": {
                "name": citation.get("title") or citation.get("filepath") or citation.get("url", "Unknown"),
                "url": citation.get("url"),
            },
            "score": citation.get("score"),
            "chunk_id": citation.get("chunk_id"),
            "id": "doc1",
            "token": "doc1",
        },
    }
}
- For non-streaming: Attach normalized citations list to response["openwebui_citations"], e.g.:
{
    "id": "doc1",
    "token": "doc1",
    "title": "Document title or filepath or url",
    "url": "https://...",
    "filepath": "/path/to/file",
    "preview": "Snippet or content",
    "chunk_id": "...",
    "metadata": { ... },
    "score": 0.123
}
- Correlate citations with inline tokens (e.g. [doc1]) by ensuring id/token field alignment.
- Keep existing markdown/HTML fallback (_format_citation_section) for non-structured clients when AZURE_AI_ENHANCE_CITATIONS is true.
- See the comment on "Return citations from pipelines" (open-webui/pipelines#229) for the event format example.
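The normalization and token alignment described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `index` parameter, the 1-based `docN` numbering, and the `_to_citation_event` helper are hypothetical and not part of the current pipeline, and the raw citation keys are the ones used in the example above.

```python
from typing import Any, Dict


def _normalize_citation_for_openwebui(citation: Dict[str, Any], index: int) -> Dict[str, Any]:
    """Map one raw citation dict to the flat shape proposed for response["openwebui_citations"]."""
    # Assumption: inline markers look like [doc1], [doc2], ... in citation order (1-based).
    token = f"doc{index}"
    title = citation.get("title") or citation.get("filepath") or citation.get("url") or "Unknown"
    return {
        "id": token,
        "token": token,
        "title": title,
        "url": citation.get("url"),
        "filepath": citation.get("filepath"),
        "preview": citation.get("content", ""),
        "chunk_id": citation.get("chunk_id"),
        "metadata": citation.get("metadata", {}),
        "score": citation.get("score"),
    }


def _to_citation_event(normalized: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: wrap a normalized citation in the event dict shape shown above."""
    return {
        "event": {
            "type": "citation",
            "data": {
                "document": [normalized["preview"]],
                "metadata": [normalized["metadata"]],
                "source": {"name": normalized["title"], "url": normalized["url"]},
                "score": normalized["score"],
                "chunk_id": normalized["chunk_id"],
                "id": normalized["id"],
                "token": normalized["token"],
            },
        }
    }


# Made-up sample data, for illustration only.
raw = [
    {"title": "Handbook", "url": "https://example.com/handbook", "content": "Section 1"},
    {"filepath": "/docs/faq.md", "content": "Q&A"},
]
citations = [_normalize_citation_for_openwebui(c, i) for i, c in enumerate(raw, start=1)]
events = [_to_citation_event(c) for c in citations]
```

Deriving `id` and `token` from the same index in one place keeps the streaming events and the non-streaming list aligned with the inline `[docN]` markers, provided the citation order matches the numbering used in the model output.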
Implementation tasks:
- Add helpers: _extract_citations_from_response, _normalize_citation_for_openwebui, _emit_openwebui_citation_events.
- Patch stream_processor_with_citations to emit citation events as soon as citation data is seen (emit via event_emitter if available and/or yield event dicts for generator-style).
- Patch non-stream logic to attach response["openwebui_citations"].
- Add tests and a short README example.
Tasks and milestones
- Add AZURE_AI_OPENWEBUI_CITATIONS config option (default true)
- Helper functions for extraction/normalization/event emitting
- Patch stream logic: emit native event(s) as soon as possible
- Update docs with usage and example
Questions or areas for feedback
- Is event_emitter preferred (server push) or direct yield (generator-style) for streaming SSE events? (Both can be supported.)
- Should we emit both structured events and keep markdown fallback by default, or make structured output exclusive when enabled?
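To make the first question concrete, one way to support both delivery paths is sketched below. It assumes the emitter is an async callable that accepts a `{"type": ..., "data": ...}` dict, and that the config option is read from an environment variable; the real pipeline may expose it differently. `citations_enabled` is a hypothetical helper name.

```python
import asyncio
import os
from typing import Any, AsyncIterator, Awaitable, Callable, Dict, List, Optional

EventEmitter = Callable[[Dict[str, Any]], Awaitable[None]]


def citations_enabled() -> bool:
    """Read AZURE_AI_OPENWEBUI_CITATIONS, defaulting to true as proposed (env lookup is an assumption)."""
    value = os.getenv("AZURE_AI_OPENWEBUI_CITATIONS", "true")
    return value.strip().lower() in ("1", "true", "yes")


async def _emit_openwebui_citation_events(
    events: List[Dict[str, Any]],
    event_emitter: Optional[EventEmitter] = None,
) -> AsyncIterator[Dict[str, Any]]:
    """Push events through event_emitter when present; otherwise yield them to the caller."""
    if not citations_enabled():
        return
    for event in events:
        if event_emitter is not None:
            # Assumption: the emitter takes the inner {"type", "data"} dict.
            await event_emitter(event["event"])
        else:
            # Generator-style pipelines receive the wrapped {"event": {...}} dict.
            yield {"event": event["event"]}


async def _demo():
    events = [{"event": {"type": "citation", "data": {"id": "doc1", "token": "doc1"}}}]
    pushed: List[Dict[str, Any]] = []

    async def fake_emitter(event: Dict[str, Any]) -> None:
        pushed.append(event)

    yielded_with_emitter = [e async for e in _emit_openwebui_citation_events(events, fake_emitter)]
    yielded_without = [e async for e in _emit_openwebui_citation_events(events)]
    return pushed, yielded_with_emitter, yielded_without


pushed, yielded_with_emitter, yielded_without = asyncio.run(_demo())
```

In this sketch the emitter takes precedence and nothing is yielded when it is present, which avoids delivering the same citation twice. Whether the markdown fallback should also run in that case is the open question above.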
Additional context or references
- Feature/citation event discussion: "Return citations from pipelines", open-webui/pipelines#229
- Example format: see the comment on open-webui/pipelines#229
- OpenWebUI event emitter documentation: https://docs.openwebui.com/features/plugin/development/events
- Bug context: "When using Azure AI Search, openwebui does not show citations/sources", #64
- Current implementation: _format_citation_section() and stream_processor_with_citations in pipelines/azure/azure_ai_foundry.py. Search for 'citation' to locate the relevant areas in the codebase.