ChatQnA Example with OpenAI-Compatible Endpoint #2091

Status: Open. Wants to merge 53 commits into base: main.

Conversation

edlee123 (Contributor)

Description

Allows ChatQnA to be used with thousands of OpenAI-compatible endpoints, e.g. OpenRouter.ai, Hugging Face, and Denvr, and improves the developer experience so OPEA can be spun up quickly even in low-resource environments.

Key Changes Made:

  • Created ChatQnA/docker_compose/intel/cpu/xeon/README_endpoint_openai.md: instructions to spin up the example.
  • Created ChatQnA/docker_compose/intel/cpu/xeon/compose_endpoint_openai.yaml: replaces vLLM with an OpenAI-like endpoint.
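The endpoint swap in the compose file might look roughly like the following sketch. This is an illustration only: the service name, image tag, and environment variable names are assumptions, not the actual contents of compose_endpoint_openai.yaml.

```yaml
# Hypothetical excerpt: instead of running a local vLLM container, the LLM
# microservice is pointed at a remote OpenAI-compatible API via environment
# variables supplied at deploy time.
services:
  llm:
    image: opea/llm-textgen:latest          # illustrative image tag
    environment:
      OPENAI_API_BASE: ${REMOTE_ENDPOINT}   # e.g. https://openrouter.ai/api/v1
      OPENAI_API_KEY: ${OPENAI_API_KEY}     # key for the hosted endpoint
      LLM_MODEL_ID: ${LLM_MODEL_ID}         # e.g. anthropic/claude-3.7-sonnet
```

Because only environment variables change, the same ChatQnA pipeline can target any OpenAI-compatible provider without a local GPU-backed serving container.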

Also:

  • Fixed the align_generator function to properly detect and skip chunks whose content is null in OpenAI-like endpoint responses. Previously, the raw null JSON was shown in the UI.
  • Added better error handling and debug logging for easier troubleshooting of endpoint issues.
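The null-chunk fix described above can be sketched roughly as follows. Function, variable, and field names here are assumptions based on the OpenAI streaming chunk format, not the exact code in chatqna.py:

```python
import json

def align_generator_sketch(raw_chunks):
    """Yield only SSE chunks that carry real content.

    OpenAI-compatible endpoints stream lines like
        data: {"choices": [{"delta": {"content": "..."}}]}
    and the first/last chunks often carry "content": null.
    """
    for line in raw_chunks:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip malformed chunks instead of surfacing raw JSON
        choices = chunk.get("choices") or []
        content = choices[0].get("delta", {}).get("content") if choices else None
        if content is None:
            continue  # previously these null chunks showed up in the UI
        yield content

# Simulated stream: a null "role" chunk, two content chunks, then [DONE].
stream = [
    'data: {"choices": [{"delta": {"role": "assistant", "content": null}}]}',
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": " world"}}]}',
    'data: [DONE]',
]
print("".join(align_generator_sketch(stream)))  # -> Hello world
```

The key point is filtering on the parsed content field rather than forwarding every raw chunk downstream, so null and malformed chunks never reach the UI.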

Issues

N/A

Type of change

List the type of change as below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

N/A

Tests

Tested with the following endpoints and models:

  • OpenRouter.ai: anthropic/claude-3.7-sonnet
  • Denvr: meta-llama/Llama-3.1-70B-Instruct
  • Hugging Face Inference Endpoint: microsoft/phi-4

edlee123 and others added 30 commits June 24, 2025 18:08
…w null json. Also improved exception handling and logging

Signed-off-by: Ed Lee <[email protected]>
Integrate MultimodalQnA set_env to ut scripts.
Add README.md for UT scripts.

Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: Ed Lee <[email protected]>
…nt (opea-project#1996)

Signed-off-by: Mustafa <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Ed Lee <[email protected]>
Signed-off-by: Yi Yao <[email protected]>
Co-authored-by: Copilot <[email protected]>
Signed-off-by: Ed Lee <[email protected]>
Signed-off-by: ZePan110 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Ed Lee <[email protected]>
…archQnA and Translation (opea-project#2038)

update secrets token name for ProductivitySuite, RerankFinetuning, SearchQnA and Translation
Fix shellcheck issue

Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: Ed Lee <[email protected]>
…rkflowExecAgent (opea-project#2039)

update secrets token name for InstructionTuning, MultimodalQnA and WorkflowExecAgent
Fix shellcheck issue

Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: Ed Lee <[email protected]>
Copilot AI review requested due to automatic review settings, June 24, 2025 23:34

github-actions bot commented Jun 24, 2025

Dependency Review

✅ No vulnerabilities or license issues found.

Scanned Files

None

Copilot AI (Contributor) left a comment

Pull Request Overview

This pull request introduces an OpenAI-compatible endpoint for ChatQnA, updates the deployment documentation, and includes improvements in error handling and logging.

  • Added new Docker Compose file (compose_endpoint_openai.yaml) to support OpenAI-like endpoints.
  • Updated README files for clearer deployment instructions and configuration details.
  • Fixed the align_generator function in chatqna.py to better handle and filter null content chunks.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| CodeGen/docker_compose/intel/cpu/xeon/README.md | Updated docker compose command and environment variable documentation; notes a markdown table formatting issue. |
| ChatQnA/docker_compose/intel/cpu/xeon/compose_endpoint_openai.yaml | Added new compose file for OpenAI-compatible endpoint integration. |
| ChatQnA/docker_compose/intel/cpu/xeon/README_endpoint_openai.md | New documentation with detailed instructions for deploying ChatQnA using the new endpoint. |
| ChatQnA/chatqna.py | Improved logging and error handling in input/output alignment and generator functions. |
Comments suppressed due to low confidence (1)

CodeGen/docker_compose/intel/cpu/xeon/README.md:111

  • The table row for LLM_ENDPOINT appears to be broken into two columns due to an unintended pipe character. Please merge the content into a single cell to ensure the URL displays correctly.
| `LLM_ENDPOINT`                          | Internal URL for the LLM serving endpoint (used by `codegen-llm-server`). Configured in `compose.yaml`.             | `http://codegen-vllm                           | tgi-server:9000/v1/chat/completions` |

edlee123 requested a review from letonghan, July 2, 2025 05:09
edlee123 (Contributor, Author) commented Jul 2, 2025

Hi @yao531441 @letonghan if either of you can, I'm looking for one more reviewer please :)
