Skip to content

feat: qdrant generic client #3377

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 34 commits into
base: main
Choose a base branch
from

Conversation

greysonlalonde
Copy link
Contributor

@greysonlalonde greysonlalonde commented Aug 21, 2025

Implements Qdrant vector database client for RAG functionality.

Changes

  • Add QdrantClient class with sync/async collection and document operations
  • Add type definitions and utility functions
  • Add score normalization from [-1, 1] to [0, 1] range
  • Include BAAI/bge-small-en-v1.5 as default embedding
  • Extract shared constants and utilities

Copy link
Collaborator

@lorenzejay lorenzejay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit but you can address later those new line end of files

- Add proper type for vectors_config using VectorParams
- Remove duplicate code in add_documents/aadd_documents
- Create reusable create_point_from_document utility
- Prefix all functions in utils.py with underscore
- Update all imports and function calls in client.py
- Use union types directly instead of type aliases in TypedDict
- Remove unused QuantizationConfig and InitFromType aliases
- Create custom exception for client type mismatches
- Replace generic TypeError with specific exception
- Update all tests to expect new exception type
- Add constants.py with DEFAULT_VECTOR_PARAMS
- Update utils.py to use constant instead of inline definition
- Add Final type annotation for immutability
- Change DEFAULT_VECTOR_PARAMS from 1536 to 384 dimensions
- Matches BAAI/bge-small-en-v1.5 embedding model output
- Standardize 384-dim as default
@greysonlalonde greysonlalonde marked this pull request as ready for review August 22, 2025 07:01
Comment on lines +113 to +117
if self.client.collection_exists(collection_name):
raise ValueError(f"Collection '{collection_name}' already exists")

params = _get_collection_params(kwargs)
self.client.create_collection(**params)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this process may happen in multiple threads or processes in parallel, what do you think of skipping self.client.collection_exists and catching as an exception of self.client.create_collection?

Or even doesn't raise that at all, and swallow that exception and treat this as "get or create collection".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me dig into the qdrant docs a little more to be sure, but I think I'm aligned with you here + on the other comments

Comment on lines +153 to +157
if await self.client.collection_exists(collection_name):
raise ValueError(f"Collection '{collection_name}' already exists")

params = _get_collection_params(kwargs)
await self.client.create_collection(**params)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same parallel execution comment.

Comment on lines +195 to +201
if self.client.collection_exists(collection_name):
return self.client.get_collection(collection_name)

params = _get_collection_params(kwargs)
self.client.create_collection(**params)

return self.client.get_collection(collection_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should catch exceptions here to prevent race conditions between self.client.collection_exists and self.client.get_collection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants