Owner

@Akhil-Pathivada Akhil-Pathivada commented Dec 1, 2025

Summary

VectorDBBench supports custom datasets for performance tests (search-only benchmarks) but not for streaming tests (concurrent insertion + search benchmarks). As a result, users cannot evaluate streaming performance on their domain-specific or proprietary datasets, which is a critical capability for production workload simulation.

This PR adds support for running streaming performance tests with custom datasets, following the same architectural pattern as custom performance tests.

Changes

  • Backend: Extended StreamingPerformanceCase to accept custom dataset configurations
  • Frontend - Run Test Page: Added new "Custom Streaming Test" cluster displaying custom streaming datasets as individual checkboxes
  • Frontend - Custom Page: Added dedicated streaming dataset management section with create/edit/delete functionality
  • Configuration: Added case_type field to differentiate streaming and performance datasets in custom_case.json
  • UI/UX: Simplified streaming dataset form by removing unnecessary fields (metric_type, scalar_labels, label_percentages)
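
For reviewers, here is a minimal sketch of how the new case_type field could be used to separate streaming entries from performance entries when reading custom_case.json. Only the existence of case_type comes from this PR. The values "streaming" and "performance", the fallback for entries without the field, the sample entries, and the helper name group_by_case_type are assumptions made for illustration and may not match the actual code in getCustomConfig.py.

```python
import json
from collections import defaultdict

# Assumption: entries written before this PR have no `case_type` and are
# treated as performance datasets. The literal values are hypothetical.
DEFAULT_CASE_TYPE = "performance"


def group_by_case_type(entries):
    """Group custom case entries by their `case_type` field."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.get("case_type", DEFAULT_CASE_TYPE)].append(entry)
    return dict(groups)


# Hypothetical file content; the real schema may contain more fields.
sample = json.loads("""
[
  {"name": "docs-perf", "dataset_config": {"dim": 768}},
  {"name": "docs-stream", "case_type": "streaming", "dataset_config": {"dim": 768}}
]
""")

groups = group_by_case_type(sample)
print([entry["name"] for entry in groups["streaming"]])
```

Defaulting a missing field to the performance type is one way to keep existing configuration files valid without a migration step.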

New Files

  • vectordb_bench/frontend/components/custom/displayCustomStreamingCase.py

Modified Files

  • vectordb_bench/backend/cases.py
  • vectordb_bench/frontend/components/custom/getCustomConfig.py
  • vectordb_bench/frontend/config/dbCaseConfigs.py
  • vectordb_bench/frontend/components/run_test/caseSelector.py
  • vectordb_bench/frontend/pages/custom.py
  • vectordb_bench/custom/custom_case.json

User Workflow

  1. Navigate to /custom page
  2. Add new streaming dataset in "Streaming Test Datasets" section
  3. Configure dataset parameters (name, path, dimensions, file names, etc.)
  4. Save configuration
  5. Navigate to /run_test page
  6. Select custom streaming dataset from "Custom Streaming Test" cluster
  7. Configure streaming parameters (insert_rate, search_stages, concurrencies)
  8. Run test
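
To make step 7 concrete, the sketch below validates the three streaming parameters and shows how search stages could translate into row counts. It assumes insert_rate is rows per second, search_stages are ascending fractions of the dataset that has been inserted when a search round starts, and concurrencies are positive client counts. The StreamingParams class and its stage_row_counts method are hypothetical and are not part of VectorDBBench.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamingParams:
    """Hypothetical container for the streaming parameters named in step 7."""

    insert_rate: int        # assumed unit: rows inserted per second
    search_stages: tuple    # assumed: fractions of the dataset, e.g. (0.5, 0.8)
    concurrencies: tuple    # assumed: number of concurrent search clients

    def __post_init__(self):
        if self.insert_rate <= 0:
            raise ValueError("insert_rate must be positive")
        if not self.search_stages or any(not 0 < s <= 1 for s in self.search_stages):
            raise ValueError("search_stages must be fractions in (0, 1]")
        if list(self.search_stages) != sorted(self.search_stages):
            raise ValueError("search_stages must be in ascending order")
        if not self.concurrencies or any(c <= 0 for c in self.concurrencies):
            raise ValueError("concurrencies must be positive integers")

    def stage_row_counts(self, dataset_size):
        """Return the inserted-row count at which each search stage would start."""
        return [round(dataset_size * s) for s in self.search_stages]


params = StreamingParams(insert_rate=500, search_stages=(0.5, 0.8), concurrencies=(5, 10))
print(params.stage_row_counts(1_000_000))
```

Checking these values before the run starts avoids discovering a bad stage list only after a long insertion phase.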

Testing

  • Custom streaming dataset can be created and saved
  • Custom streaming dataset appears in "Custom Streaming Test" cluster
  • Streaming test runs successfully with custom dataset
  • Multiple training files are loaded correctly
  • UI properly displays all fields and matches performance dataset styling

Screenshots

[Add screenshots of the /custom page and /run_test page showing the new feature]

@Akhil-Pathivada Akhil-Pathivada force-pushed the feature/custom-streaming-test-cluster branch from 2fb70b1 to c332f90 Compare December 1, 2025 17:46
@Akhil-Pathivada Akhil-Pathivada changed the title from "feat: Add Custom streaming test" to "feat: Add Custom Dataset support for Streaming Tests" Dec 1, 2025
@Akhil-Pathivada Akhil-Pathivada force-pushed the feature/custom-streaming-test-cluster branch 2 times, most recently from 53bf728 to e67bf79 Compare December 1, 2025 17:51
@Akhil-Pathivada Akhil-Pathivada force-pushed the feature/custom-streaming-test-cluster branch 2 times, most recently from 2ea9fd8 to 07ee9ca Compare December 1, 2025 18:45
@Akhil-Pathivada Akhil-Pathivada force-pushed the feature/custom-streaming-test-cluster branch from 07ee9ca to c1c36a1 Compare December 1, 2025 18:47
@Akhil-Pathivada Akhil-Pathivada marked this pull request as ready for review December 4, 2025 16:23