
Commit e8312dc

VectorDBBench 1.0 (#543)
This commit marks the milestone release of VectorDBBench 1.0, introducing a wide range of new features, major enhancements, and updated benchmarks. Key changes include:

- UI: Introduce a brand-new homepage and navigation bar. The new design integrates powerful front-end pages for intuitive test result analysis and visualization.
- Cases: Add new label-filter test cases. These allow testing search performance with metadata filters using expressions like color == "red". Initial support includes Milvus, Zilliz Cloud, Elasticsearch Cloud, Qdrant Cloud, Pinecone, and OpenSearch (AWS).
- Cases: Implement new streaming test cases. These cases are designed to measure search performance while data is actively being inserted, simulating real-world "read-while-writing" scenarios.
- Dataset: Add the new BioASQ dataset. This dataset is 1024-dimensional and comes in 1M and 10M sizes, enriching the diversity of our test data.
- Custom Dataset: Enhance the custom dataset functionality. Users now have more flexible configuration options to better simulate their own data distributions and schemas.
- New Results: Re-run and update all benchmark results for `Milvus`, `ZillizCloud`, `ElasticCloud`, `QdrantCloud`, `Pinecone`, and `OpenSearch(AWS)` to reflect their latest performance on the new test cases.
1 parent be65129 commit e8312dc
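For context on the label-filter cases: the filter is an ordinary metadata expression evaluated alongside the vector search. Below is a minimal, hypothetical sketch of what such a filtered search looks like against Milvus via pymilvus; the collection name, field names, and index parameters are invented for illustration, and VectorDBBench issues the equivalent calls through its own client layer rather than this exact code.

```python
from pymilvus import Collection, connections

# Assumed setup: a local Milvus instance with a pre-built collection.
connections.connect(host="localhost", port="19530")
coll = Collection("vdbbench_demo")  # hypothetical collection name

# ANN search restricted to entities whose `color` label equals "red",
# mirroring the new label-filter test cases.
results = coll.search(
    data=[[0.1] * 1024],                                # one 1024-dim query vector (BioASQ-sized)
    anns_field="emb",                                   # assumed vector field name
    param={"metric_type": "L2", "params": {"ef": 64}},  # assumed HNSW search params
    limit=10,
    expr='color == "red"',                              # the metadata filter expression
)
```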

File tree

79 files changed (+34,583 / -3,546 lines)


.github/workflows/pull_request.yml

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@ on:
   pull_request:
     branches:
       - main
+      - vdbbench_*
 
 jobs:
   build:

README.md

Lines changed: 13 additions & 30 deletions
@@ -426,52 +426,35 @@ The standard benchmark results displayed here include all 15 cases that we curre
 
 All standard benchmark results are generated by a client running on an 8 core, 32 GB host, which is located in the same region as the server being tested. The client host is equipped with an `Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz` processor. Also all the servers for the open-source systems tested in our benchmarks run on hosts with the same type of processor.
 ### Run Test Page
-![image](https://github.com/zilliztech/VectorDBBench/assets/105927039/f3135a29-8f12-4aac-bbb3-f2f55e2a2ff0)
-This is the page to run a test:
 1. Initially, you select the systems to be tested - multiple selections are allowed. Once selected, corresponding forms will pop up to gather necessary information for using the chosen databases. The db_label is used to differentiate different instances of the same system. We recommend filling in the host size or instance type here (as we do in our standard results).
 2. The next step is to select the test cases you want to perform. You can select multiple cases at once, and a form to collect corresponding parameters will appear.
 3. Finally, you'll need to provide a task label to distinguish different test results. Using the same label for different tests will result in the previous results being overwritten.
 Now we can only run one task at the same time.
+![image](fig/run_test_select_db.png)
+![image](fig/run_test_select_case.png)
+![image](fig/run_test_submit.png)
+
 
 ## Module
 ### Code Structure
 ![image](https://github.com/zilliztech/VectorDBBench/assets/105927039/8c06512e-5419-4381-b084-9c93aed59639)
 ### Client
-Our client module is designed with flexibility and extensibility in mind, aiming to integrate APIs from different systems seamlessly. As of now, it supports Milvus, Zilliz Cloud, Elastic Search, Pinecone, Qdrant Cloud, Weaviate Cloud, PgVector, Redis, and Chroma. Stay tuned for more options, as we are consistently working on extending our reach to other systems.
+Our client module is designed with flexibility and extensibility in mind, aiming to integrate APIs from different systems seamlessly. As of now, it supports Milvus, Zilliz Cloud, Elastic Search, Pinecone, Qdrant Cloud, Weaviate Cloud, PgVector, Redis, Chroma, etc. Stay tuned for more options, as we are consistently working on extending our reach to other systems.
 ### Benchmark Cases
-We've developed an array of 15 comprehensive benchmark cases to test vector databases' various capabilities, each designed to give you a different piece of the puzzle. These cases are categorized into three main types:
+We've developed lots of comprehensive benchmark cases to test vector databases' various capabilities, each designed to give you a different piece of the puzzle. These cases are categorized into four main types:
 #### Capacity Case
 - **Large Dim:** Tests the database's loading capacity by inserting large-dimension vectors (GIST 100K vectors, 960 dimensions) until fully loaded. The final number of inserted vectors is reported.
 - **Small Dim:** Similar to the Large Dim case but uses small-dimension vectors (SIFT 500K vectors, 128 dimensions).
 #### Search Performance Case
 - **XLarge Dataset:** Measures search performance with a massive dataset (LAION 100M vectors, 768 dimensions) at varying parallel levels. The results include index building time, recall, latency, and maximum QPS.
-- **Large Dataset:** Similar to the XLarge Dataset case, but uses a slightly smaller dataset (10M-768dim, 5M-1536dim).
-- **Medium Dataset:** A case using a medium dataset (1M-768dim, 500K-1536dim).
+- **Large Dataset:** Similar to the XLarge Dataset case, but uses a slightly smaller dataset (10M-1024dim, 10M-768dim, 5M-1536dim).
+- **Medium Dataset:** A case using a medium dataset (1M-1024dim, 1M-768dim, 500K-1536dim).
+- **Small Dataset:** For development (100K-768dim, 50K-1536dim).
 #### Filtering Search Performance Case
-- **Large Dataset, Low Filtering Rate:** Evaluates search performance with a large dataset (10M-768dim, 5M-1536dim) under a low filtering rate (1% vectors) at different parallel levels.
-- **Medium Dataset, Low Filtering Rate:** This case uses a medium dataset (1M-768dim, 500K-1536dim) with a similar low filtering rate.
-- **Large Dataset, High Filtering Rate:** It tests with a large dataset (10M-768dim, 5M-1536dim) but under a high filtering rate (99% vectors).
-- **Medium Dataset, High Filtering Rate:** This case uses a medium dataset (1M-768dim, 500K-1536dim) with a high filtering rate.
-For a quick reference, here is a table summarizing the key aspects of each case:
-
-| Case No. | Case Type | Dataset Size | Filtering Rate | Results |
-|----------|-----------|--------------|----------------|---------|
-| 1 | Capacity Case | SIFT 500K vectors, 128 dimensions | N/A | Number of inserted vectors |
-| 2 | Capacity Case | GIST 100K vectors, 960 dimensions | N/A | Number of inserted vectors |
-| 3 | Search Performance Case | LAION 100M vectors, 768 dimensions | N/A | Index building time, recall, latency, maximum QPS |
-| 4 | Search Performance Case | Cohere 10M vectors, 768 dimensions | N/A | Index building time, recall, latency, maximum QPS |
-| 5 | Search Performance Case | Cohere 1M vectors, 768 dimensions | N/A | Index building time, recall, latency, maximum QPS |
-| 6 | Filtering Search Performance Case | Cohere 10M vectors, 768 dimensions | 1% vectors | Index building time, recall, latency, maximum QPS |
-| 7 | Filtering Search Performance Case | Cohere 1M vectors, 768 dimensions | 1% vectors | Index building time, recall, latency, maximum QPS |
-| 8 | Filtering Search Performance Case | Cohere 10M vectors, 768 dimensions | 99% vectors | Index building time, recall, latency, maximum QPS |
-| 9 | Filtering Search Performance Case | Cohere 1M vectors, 768 dimensions | 99% vectors | Index building time, recall, latency, maximum QPS |
-| 10 | Search Performance Case | OpenAI generated 500K vectors, 1536 dimensions | N/A | Index building time, recall, latency, maximum QPS |
-| 11 | Search Performance Case | OpenAI generated 5M vectors, 1536 dimensions | N/A | Index building time, recall, latency, maximum QPS |
-| 12 | Filtering Search Performance Case | OpenAI generated 500K vectors, 1536 dimensions | 1% vectors | Index building time, recall, latency, maximum QPS |
-| 13 | Filtering Search Performance Case | OpenAI generated 5M vectors, 1536 dimensions | 1% vectors | Index building time, recall, latency, maximum QPS |
-| 14 | Filtering Search Performance Case | OpenAI generated 500K vectors, 1536 dimensions | 99% vectors | Index building time, recall, latency, maximum QPS |
-| 15 | Filtering Search Performance Case | OpenAI generated 5M vectors, 1536 dimensions | 99% vectors | Index building time, recall, latency, maximum QPS |
-
+- **Int-Filter Cases:** Evaluates search performance with int-based filter expressions (e.g. "id >= 2,000").
+- **Label-Filter Cases:** Evaluates search performance with label-based filter expressions (e.g., "color == 'red'"). The test includes randomly generated labels to simulate real-world filtering scenarios.
+#### Streaming Cases
+- **Insertion-Under-Load Case:** Evaluates search performance while maintaining a constant insertion workload. VectorDBBench applies a steady stream of insert requests at a fixed rate to simulate real-world scenarios where search operations must perform reliably under continuous data ingestion.
 
 Each case provides an in-depth examination of a vector database's abilities, providing you a comprehensive view of the database's performance.
 
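The new streaming case above reduces to pacing inserts at a fixed rate while searches run. The sketch below shows one way such a fixed-rate insertion loop can be written; it reuses the NUM_PER_BATCH / TIME_PER_BATCH knobs from the updated config, and `insert_batch` is a placeholder callable rather than VectorDBBench's actual runner code.

```python
import time
from typing import Callable


def steady_insert_stream(
    insert_batch: Callable[[int, int], None],  # placeholder: writes one batch of rows
    total_batches: int,
    num_per_batch: int = 100,     # mirrors config.NUM_PER_BATCH
    time_per_batch: float = 1.0,  # mirrors config.TIME_PER_BATCH (seconds)
) -> None:
    """Hypothetical sketch: issue one insert batch per interval to hold a constant ingest rate."""
    for batch_id in range(total_batches):
        start = time.perf_counter()
        insert_batch(batch_id, num_per_batch)
        # Sleep off whatever remains of the interval so the rate stays fixed
        # even when an individual insert finishes quickly.
        elapsed = time.perf_counter() - start
        time.sleep(max(0.0, time_per_batch - elapsed))
```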

fig/homepage/bar-chart.png (79.3 KB)
fig/homepage/concurrent.png (202 KB)
fig/homepage/custom.png (73.8 KB)
fig/homepage/label_filter.png (120 KB)
fig/homepage/qps.png (72 KB)
fig/homepage/run_test.png (545 KB)
fig/homepage/streaming.png (42.7 KB)
fig/homepage/table.png (168 KB)
fig/run_test_select_case.png (250 KB)
fig/run_test_select_db.png (249 KB)
fig/run_test_submit.png (48.9 KB)

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ dependencies = [
     "psutil",
     "polars",
     "plotly",
-    "environs<14.1.0",
+    "environs",
     "pydantic<v2",
     "scikit-learn",
     "pymilvus", # with pandas, numpy, ujson

vectordb_bench/__init__.py

Lines changed: 14 additions & 27 deletions
@@ -18,37 +18,16 @@ class config:
     DEFAULT_DATASET_URL = env.str("DEFAULT_DATASET_URL", AWS_S3_URL)
     DATASET_LOCAL_DIR = env.path("DATASET_LOCAL_DIR", "/tmp/vectordb_bench/dataset")
     NUM_PER_BATCH = env.int("NUM_PER_BATCH", 100)
+    TIME_PER_BATCH = 1  # 1s. for streaming insertion.
+    MAX_INSERT_RETRY = 5
+    MAX_SEARCH_RETRY = 5
+
+    LOAD_MAX_TRY_COUNT = 10
 
     DROP_OLD = env.bool("DROP_OLD", True)
     USE_SHUFFLED_DATA = env.bool("USE_SHUFFLED_DATA", True)
 
-    NUM_CONCURRENCY = env.list(
-        "NUM_CONCURRENCY",
-        [
-            1,
-            5,
-            10,
-            15,
-            20,
-            25,
-            30,
-            35,
-            40,
-            45,
-            50,
-            55,
-            60,
-            65,
-            70,
-            75,
-            80,
-            85,
-            90,
-            95,
-            100,
-        ],
-        subcast=int,
-    )
+    NUM_CONCURRENCY = env.list("NUM_CONCURRENCY", [1, 5, 10, 20, 30, 40, 60, 80], subcast=int)
 
     CONCURRENCY_DURATION = 30
 
@@ -68,21 +47,29 @@ class config:
 
     CAPACITY_TIMEOUT_IN_SECONDS = 24 * 3600  # 24h
     LOAD_TIMEOUT_DEFAULT = 24 * 3600  # 24h
+    LOAD_TIMEOUT_768D_100K = 24 * 3600  # 24h
     LOAD_TIMEOUT_768D_1M = 24 * 3600  # 24h
     LOAD_TIMEOUT_768D_10M = 240 * 3600  # 10d
     LOAD_TIMEOUT_768D_100M = 2400 * 3600  # 100d
 
     LOAD_TIMEOUT_1536D_500K = 24 * 3600  # 24h
     LOAD_TIMEOUT_1536D_5M = 240 * 3600  # 10d
 
+    LOAD_TIMEOUT_1024D_1M = 24 * 3600  # 24h
+    LOAD_TIMEOUT_1024D_10M = 240 * 3600  # 10d
+
     OPTIMIZE_TIMEOUT_DEFAULT = 24 * 3600  # 24h
+    OPTIMIZE_TIMEOUT_768D_100K = 24 * 3600  # 24h
     OPTIMIZE_TIMEOUT_768D_1M = 24 * 3600  # 24h
     OPTIMIZE_TIMEOUT_768D_10M = 240 * 3600  # 10d
     OPTIMIZE_TIMEOUT_768D_100M = 2400 * 3600  # 100d
 
     OPTIMIZE_TIMEOUT_1536D_500K = 24 * 3600  # 24h
     OPTIMIZE_TIMEOUT_1536D_5M = 240 * 3600  # 10d
 
+    OPTIMIZE_TIMEOUT_1024D_1M = 24 * 3600  # 24h
+    OPTIMIZE_TIMEOUT_1024D_10M = 240 * 3600  # 10d
+
     def display(self) -> str:
         return [
             i
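Since these settings are read through `environs`, they can be overridden from the environment before the benchmark starts. A small sketch follows (values chosen purely for illustration); the overrides must be in place before `vectordb_bench` is imported, because `config` reads the environment at import time.

```python
import os

# Hypothetical overrides; environs parses them with env.list(..., subcast=int),
# env.int, and env.bool respectively.
os.environ["NUM_CONCURRENCY"] = "1,10,50"
os.environ["NUM_PER_BATCH"] = "200"
os.environ["DROP_OLD"] = "false"

import vectordb_bench  # noqa: E402  (imported after the overrides on purpose)

print(vectordb_bench.config.NUM_CONCURRENCY)  # -> [1, 10, 50]
```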

vectordb_bench/backend/assembler.py

Lines changed: 19 additions & 6 deletions
@@ -1,7 +1,8 @@
 import logging
 
-from vectordb_bench.backend.clients import EmptyDBCaseConfig
+from vectordb_bench.backend.clients import DB, EmptyDBCaseConfig
 from vectordb_bench.backend.data_source import DatasetSource
+from vectordb_bench.backend.filter import FilterOp
 from vectordb_bench.models import TaskConfig
 
 from .cases import CaseLabel
@@ -10,6 +11,13 @@
 log = logging.getLogger(__name__)
 
 
+class FilterNotSupportedError(ValueError):
+    """Raised when a filter type is not supported by a vector database."""
+
+    def __init__(self, db_name: str, filter_type: FilterOp):
+        super().__init__(f"{filter_type} Filter test is not supported by {db_name}.")
+
+
 class Assembler:
     @classmethod
     def assemble(cls, run_id: str, task: TaskConfig, source: DatasetSource) -> CaseRunner:
@@ -39,25 +47,30 @@ def assemble_all(
         runners = [cls.assemble(run_id, task, source) for task in tasks]
         load_runners = [r for r in runners if r.ca.label == CaseLabel.Load]
         perf_runners = [r for r in runners if r.ca.label == CaseLabel.Performance]
+        streaming_runners = [r for r in runners if r.ca.label == CaseLabel.Streaming]
 
         # group by db
-        db2runner = {}
+        db2runner: dict[DB, list[CaseRunner]] = {}
         for r in perf_runners:
             db = r.config.db
             if db not in db2runner:
                 db2runner[db] = []
             db2runner[db].append(r)
 
-        # check dbclient installed
-        for k in db2runner:
-            _ = k.init_cls
+        # check
+        for db, runners in db2runner.items():
+            db_instance = db.init_cls
+            for runner in runners:
+                if not db_instance.filter_supported(runner.ca.filters):
+                    raise FilterNotSupportedError(db.value, runner.ca.filters.type)
 
         # sort by dataset size
         for _, runner in db2runner.items():
-            runner.sort(key=lambda x: x.ca.dataset.data.size)
+            runner.sort(key=lambda x: (x.ca.dataset.data.size, 0 if x.ca.filters.type == FilterOp.StrEqual else 1))
 
         all_runners = []
         all_runners.extend(load_runners)
+        all_runners.extend(streaming_runners)
        for v in db2runner.values():
             all_runners.extend(v)
 
