-
Notifications
You must be signed in to change notification settings - Fork 41
Description
Bug report
- I confirm this is a bug with Supabase, not with my own application.
- I confirm I have searched the Docs, GitHub Discussions, and Discord.
Describe the bug
I have created a table on my GCP hosted Postgres with pgvector enabled.
I have upserted some documents with an adapter and also indexed my table with HNSW.
Then I try to run my service which searches these records using the same adapter and ASSUMING the indexing I did in the DB population persists with vecs.
However the query does not work I get this error back from my DB:
sqlalchemy.exc.DatabaseError: (pg8000.exceptions.DatabaseError) {'S': 'ERROR', 'V': 'ERROR', 'C': '42601', 'M': 'syntax error at or near "$1"', 'P': '28', 'F': 'scan.l', 'L': '1240', 'R': 'scanner_yyerror'}
[SQL: set local ivfflat.probes = %s::INTEGER]
[parameters: (10,)]
I looked in the query method and it seems to always execute this code:
with self.client.Session() as sess:
with sess.begin():
# index ignored if greater than n_lists
sess.execute(
text("set local ivfflat.probes = :probes").bindparams(probes=probes)
)
if self.client._supports_hnsw():
sess.execute(
text("set local hnsw.ef_search = :ef_search").bindparams(
ef_search=ef_search
)
)
which seems to be the problem cause I dont have that indexing setup why does it always execute this?
Can this be avoided somehow or is this a bug?
Shouldnt it only use the indexing method that we have previously set up, or specify which one we are using instead of running this blindly?
Update
I think the error comes actually from the syntax in the SQL commands here:
with self.client.Session() as sess:
with sess.begin():
# index ignored if greater than n_lists
sess.execute(
text("set local ivfflat.probes = :probes").bindparams(probes=probes)
)
if self.client._supports_hnsw():
sess.execute(
text("set local hnsw.ef_search = :ef_search").bindparams(
ef_search=ef_search
)
)
if len(cols) == 1:
return [str(x) for x in sess.scalars(stmt).fetchall()]
return sess.execute(stmt).fetchall() or []
inside the qyery method of the Collection class.
This syntax is not accepted by my db at least this version Postgres v8
To Reproduce
Create a collection and upserted some documents in a Postgres DB with an adapter and also index the tale with HNSW like this:
docs = vx.get_or_create_collection(
name="my_embeddings",
adapter=embedding_adapter,
)
docs.upsert(records=records)
docs.create_index(
method=IndexMethod.hnsw,
measure=IndexMeasure.cosine_distance,
index_arguments=IndexArgsHNSW(m=16, ef_construction=64),
)
Then in a separate script re-access the collection and query it:
collection = vx.get_or_create_collection(
name="my_embeddings",
adapter=embedding_adapter,
)
results = collection.query(
data=query,
limit=top_k,
ef_search=200, # HNSW parameter
skip_adapter=False, # use adapter to convert text -> vector
include_metadata=True,
include_value=True,
)
using the same adapter.
You should get this error back from my DB if the ivfflat has not be set as indexing method :
sqlalchemy.exc.DatabaseError: (pg8000.exceptions.DatabaseError) {'S': 'ERROR', 'V': 'ERROR', 'C': '42601', 'M': 'syntax error at or near "$1"', 'P': '28', 'F': 'scan.l', 'L': '1240', 'R': 'scanner_yyerror'}
[SQL: set local ivfflat.probes = %s::INTEGER]
[parameters: (10,)]
Update
When I monkey patched the method like this:
with self.client.Session() as sess:
with sess.begin():
# index ignored if greater than n_lists
sess.execute(text(f"set local ivfflat.probes = {probes}"))
if self.client._supports_hnsw():
sess.execute(text(f"set local hnsw.ef_search = {ef_search}"))
if len(cols) == 1:
return [str(x) for x in sess.scalars(stmt).fetchall()]
return sess.execute(stmt).fetchall() or []
it worked it should be an easy fix
Expected behavior
Just for the query to work with my indexing and return the results. My modified code fixes the issue.
Screenshots
If applicable, add screenshots to help explain your problem.
System information
- OS: WSL
- Version of supabase-js: vecs=0.4.5
Additional context
Add any other context about the problem here.