-
Notifications
You must be signed in to change notification settings - Fork 4.8k
feat: qdrant generic client #3377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Move all imports to top of file - Remove unnecessary inline comments and blocks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit but you can address later those new line end of files
- Add proper type for vectors_config using VectorParams - Remove duplicate code in add_documents/aadd_documents - Create reusable create_point_from_document utility
- Prefix all functions in utils.py with underscore - Update all imports and function calls in client.py
- Use union types directly instead of type aliases in TypedDict - Remove unused QuantizationConfig and InitFromType aliases
- Create custom exception for client type mismatches - Replace generic TypeError with specific exception - Update all tests to expect new exception type
- Add constants.py with DEFAULT_VECTOR_PARAMS - Update utils.py to use constant instead of inline definition - Add Final type annotation for immutability
- Change DEFAULT_VECTOR_PARAMS from 1536 to 384 dimensions - Matches BAAI/bge-small-en-v1.5 embedding model output - Standardize 384-dim as default
if self.client.collection_exists(collection_name): | ||
raise ValueError(f"Collection '{collection_name}' already exists") | ||
|
||
params = _get_collection_params(kwargs) | ||
self.client.create_collection(**params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this process may happen in multiple threads or processes in parallel, what do you think of skipping self.client.collection_exists
and catching as an exception of self.client.create_collection
?
Or even doesn't raise that at all, and swallow that exception and treat this as "get or create collection".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me dig into the qdrant docs a little more to be sure, but I think I'm aligned with you here + on the other comments
if await self.client.collection_exists(collection_name): | ||
raise ValueError(f"Collection '{collection_name}' already exists") | ||
|
||
params = _get_collection_params(kwargs) | ||
await self.client.create_collection(**params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same parallel execution comment.
if self.client.collection_exists(collection_name): | ||
return self.client.get_collection(collection_name) | ||
|
||
params = _get_collection_params(kwargs) | ||
self.client.create_collection(**params) | ||
|
||
return self.client.get_collection(collection_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we should catch exceptions here to prevent race conditions between self.client.collection_exists
and self.client.get_collection
.
Implements Qdrant vector database client for RAG functionality.
Changes