-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Fix/Update Qdrant: set field_schema to KEYWORD & add encoding="utf-8" for file opening #1940
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughWhitespace-only edits to two binary file open statements in eval preparation; explicit UTF-8 encoding added when opening Vega-Lite schema JSON in two chart pipeline initializers; Qdrant payload index configured using Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 💡 Knowledge Base configuration:
You can enable these sources in your CodeRabbit configuration. 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
wren-ai-service/src/pipelines/generation/chart_generation.py (1)
141-141
: UTF-8 read is good; tiny PEP8 nit on spacingStyle nit: add a space after the comma before encoding for readability.
-with open("src/pipelines/generation/utils/vega-lite-schema-v5.json", "r",encoding="utf-8") as f: +with open("src/pipelines/generation/utils/vega-lite-schema-v5.json", "r", encoding="utf-8") as f:wren-ai-service/src/pipelines/generation/chart_adjustment.py (1)
168-168
: UTF-8 read looks good; minor spacing nitSame spacing nit as the generation pipeline.
-with open("src/pipelines/generation/utils/vega-lite-schema-v5.json", "r",encoding="utf-8") as f: +with open("src/pipelines/generation/utils/vega-lite-schema-v5.json", "r", encoding="utf-8") as f:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
wren-ai-service/eval/preparation.py
(2 hunks)wren-ai-service/src/pipelines/generation/chart_adjustment.py
(1 hunks)wren-ai-service/src/pipelines/generation/chart_generation.py
(1 hunks)wren-ai-service/src/providers/document_store/qdrant.py
(2 hunks)
🔇 Additional comments (1)
wren-ai-service/src/providers/document_store/qdrant.py (1)
4-4
: Reuse existingrest
alias for PayloadSchemaType; dropmodels
importReplace
models.PayloadSchemaType.KEYWORD
withrest.PayloadSchemaType.KEYWORD
and remove thefrom qdrant_client.http import models
import (pinned qdrant-client 1.11.0 already supportsPayloadSchemaType.KEYWORD
).
Remove the encoding="utf-8" parameter from the open() calls at lines 92-93 and 213-214. Binary mode (i.e. using "rb") is mandatory for orjson.loads to function properly and to prevent a TypeError. An alternative would be to open the file in text mode if a string is used.
@richedperson1 it seems we already provide "keyword" as PayloadSchemaType? |
This PR introduces two improvements:
In the Qdrant provider, the creation of the payload index for the "project_id" field has been updated to explicitly set the field schema using models.PayloadSchemaType.KEYWORD. This change ensures that the data type is correctly interpreted by Qdrant, leading to more reliable indexing and search operations.
In the chart generation pipeline, the JSON file loading has been updated to enforce encoding="utf-8". This ensures consistent behavior when reading configuration files across various environments.
Both improvements are aimed at enhancing compatibility and reliability of the document
Summary by CodeRabbit