Open
Description
Summary
This RFC proposes adding support for the latest pgvector features to the vecs Python client. These include new vector types (halfvec, sparsevec), enhanced indexing capabilities, and additional vector functions (binary_quantize, hamming_distance, etc.).
Rationale
Recent advancements in pgvector—such as new vector types, improved indexing, and new functions—are currently missing from the vecs client. Integrating these features will ensure feature parity, enabling efficient storage, diverse similarity metrics, and extended vector operations, which will support a broader range of use cases.
Design
Proposed Additions
- Vector Types:
  - halfvec: Half-precision vectors for reduced storage and faster operations.
  - sparsevec: Sparse vectors that store only non-zero values to optimize memory usage.
- Indexing Enhancements:
  - bit Type Indexing: Add support for indexing vectors stored as the bit type.
  - L1 Distance with HNSW: Add support for using L1 distance with HNSW indexing for similarity searches.
- New Functions:
  - binary_quantize: Converts a vector into a binary form based on a threshold.
  - hamming_distance: Calculates Hamming distance for binary vectors.
  - jaccard_distance: Computes the Jaccard distance between vectors.
  - l2_normalize: Normalizes vectors to unit length.
  - subvector: Extracts a subvector from the main vector.
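To make the intended semantics of the new functions concrete, here is a rough pure-Python sketch of what each one computes. It is neither the vecs API nor pgvector's implementation: the Python signatures, the strictly-positive threshold in binary_quantize, the 1-based start index in subvector, and the handling of all-zero inputs are assumptions made for illustration and should be checked against the pgvector documentation.

```python
import math
from typing import List, Sequence


def binary_quantize(vec: Sequence[float]) -> List[int]:
    """Map each component to 1 if it is strictly positive, else 0 (assumed threshold: zero)."""
    return [1 if x > 0 else 0 for x in vec]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def jaccard_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """Return 1 - |A and B| / |A or B| over the set bits of two bit vectors."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if both == 0:
        # Assumption: no shared set bits (including two all-zero inputs) gives distance 1.
        return 1.0
    return 1.0 - both / either


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        # Assumption: a zero vector is returned unchanged rather than raising.
        return [0.0] * len(vec)
    return [x / norm for x in vec]


def subvector(vec: Sequence[float], start: int, count: int) -> List[float]:
    """Extract `count` components beginning at `start` (assumed 1-based, SQL style)."""
    if start < 1 or count < 1:
        raise ValueError("start and count must be positive")
    return list(vec[start - 1:start - 1 + count])
```

For example, under these assumptions binary_quantize([0.5, -1.0, 0.0, 2.0]) yields the bits 1, 0, 0, 1, and subvector([1, 2, 3, 4, 5], 2, 3) yields the second through fourth components. In the client itself these would more likely be thin wrappers over the corresponding SQL functions than Python reimplementations.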
Examples
For instance:
Creating a halfvec vector:
from vecs import halfvec
vec = halfvec([1.0, 2.0, 3.0])
sobir-git