Skip to content

[MOD-10236] Add serialization to SVS index #716

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 53 commits into
base: main
Choose a base branch
from

Conversation

lerman25
Copy link
Collaborator

@lerman25 lerman25 commented Jul 1, 2025

This PR refactors the serialization logic for vector indexes by introducing a clear abstraction layer via the Serializer base class. Both HNSWSerializer and SVSSerializer now inherit from this interface, enabling:

  • Unified structure for save/load operations across index types
  • Encoding version support for backward/forward compatibility
  • Cleaner separation of shared vs. index-specific serialization logic
  • Easier extension for future index types

Each index (e.g., HNSWIndex, SVSIndex) implements its own saveIndexFields method to handle template- and implementation-specific data.


This PR also introduces support for saving/loading SVS indexes and validating their internal consistency.
Specifically:

  • Implements the Serializer interface for SVS, enabling saveIndex() and loadIndex() methods. (Using a commit created by @rfsaliev )
  • Adds a checkIntegrity() method for runtime validation of index structure and metadata.

@CLAassistant
Copy link

CLAassistant commented Jul 1, 2025

CLA assistant check
All committers have signed the CLA.

@lerman25 lerman25 marked this pull request as draft July 1, 2025 11:58
@lerman25 lerman25 force-pushed the omer-add-save-load-check branch from a3ee719 to 7af26da Compare July 1, 2025 12:15
@lerman25 lerman25 marked this pull request as ready for review July 2, 2025 10:55
@lerman25 lerman25 requested a review from meiravgri July 2, 2025 11:18
Copy link
Collaborator

@meiravgri meiravgri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!
main comments are regarding visibility :)
missing review for svs_extensions.h and svs_utils.h, will go over with @rfsaliev

lerman25 added a commit that referenced this pull request Jul 3, 2025
@alonre24 alonre24 changed the title [MOD-7022] Add serialization to SVS index [MOD-10236] Add serialization to SVS index Jul 6, 2025
@lerman25 lerman25 force-pushed the omer-add-save-load-check branch from d7602a1 to 3fb87dd Compare July 7, 2025 09:09
Copy link
Collaborator

@rfsaliev rfsaliev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO a lot of unnecessary code duplication.

Copy link

codecov bot commented Jul 7, 2025

Codecov Report

Attention: Patch coverage is 90.15152% with 39 lines in your changes missing coverage. Please review.

Project coverage is 96.54%. Comparing base (dca6041) to head (265d238).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/VecSim/algorithms/svs/svs_serializer_impl.h 78.84% 22 Missing ⚠️
src/VecSim/algorithms/hnsw/hnsw_serializer_impl.h 94.91% 9 Missing ⚠️
src/VecSim/algorithms/svs/svs_serializer.h 45.45% 6 Missing ⚠️
src/VecSim/algorithms/svs/svs_serializer.cpp 95.00% 1 Missing ⚠️
src/VecSim/index_factories/svs_factory.cpp 85.71% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #716      +/-   ##
==========================================
- Coverage   96.84%   96.54%   -0.31%     
==========================================
  Files         122      125       +3     
  Lines        7393     7578     +185     
==========================================
+ Hits         7160     7316     +156     
- Misses        233      262      +29     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Collaborator

@meiravgri meiravgri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!!!
Had a few comments :)
Also, we need to implment a complemantry logic to serialize SVSIndex from the tiered index, and a tiered index ctor that will recieve an svs index loaded from disk
But IMO should be done in a seperate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants