[MOD-10236] Add serialization to SVS index #716

Open: wants to merge 53 commits into base: main

Commits (53)
faddfd0
generalize
lerman25 Jul 3, 2025
a72055f
remove serializer.cpp from cmake
lerman25 Jul 5, 2025
27ec92d
prepare merge with rafik commit
lerman25 Jul 5, 2025
8bbf95e
[SVS] Implement Save/Load + test
rfsaliev Jun 30, 2025
448c4df
seperate hnsw_serializer to h and cpp
lerman25 Jul 6, 2025
528e436
remove get version impl
lerman25 Jul 6, 2025
0105552
save impl
lerman25 Jul 6, 2025
963093b
add load
lerman25 Jul 6, 2025
4b9640b
change camelcase
lerman25 Jul 6, 2025
5154cdf
for mat
lerman25 Jul 6, 2025
47d833e
generalzie saveIndexFields
lerman25 Jul 6, 2025
bcbd3ce
format
lerman25 Jul 6, 2025
b7ec512
compare metadata on load
lerman25 Jul 6, 2025
05c65b7
Add checkIntegrity with error
lerman25 Jul 6, 2025
e264e09
checkIntegrity
lerman25 Jul 6, 2025
a1cd915
remove duplicate verification in compare meta data
lerman25 Jul 6, 2025
51cdfd6
format
lerman25 Jul 6, 2025
9ed7730
svs serializetion version testing
lerman25 Jul 7, 2025
d2b44fe
Revert "svs serializetion version testing"
lerman25 Jul 7, 2025
987029a
common serializer test
lerman25 Jul 7, 2025
f0210a3
remove changes_num from metadata
lerman25 Jul 7, 2025
e4131b3
Add location c'tor
lerman25 Jul 7, 2025
88b0a19
Add location ctor and to test
lerman25 Jul 7, 2025
d1b6f9d
Remove outdated comment from serializer header
lerman25 Jul 7, 2025
08d850f
Enhance documentation for loadIndex function in SVSIndex
lerman25 Jul 7, 2025
be924ff
Add comments
lerman25 Jul 7, 2025
7798585
format + remove test
lerman25 Jul 7, 2025
55c4fb3
enable tests
lerman25 Jul 7, 2025
3fb87dd
serializer test
lerman25 Jul 7, 2025
0b2b8e3
format
lerman25 Jul 7, 2025
7d3538c
reset SVS to master
lerman25 Jul 7, 2025
322c012
add logging to test_svs
lerman25 Jul 7, 2025
e8b48b8
format
lerman25 Jul 7, 2025
4dba2ca
remove duplicate NewIndexImpl
lerman25 Jul 7, 2025
194e3f5
expose loadIndex in VecSimIndex, add BUILD_TEST gurad
lerman25 Jul 7, 2025
8f6c0e5
remove string ctor from SVSIndex
lerman25 Jul 7, 2025
15bb0bd
format
lerman25 Jul 7, 2025
029387f
Merge remote-tracking branch 'origin/main' into omer-add-save-load-check
lerman25 Jul 7, 2025
58ea170
fix BUILD_TEST in svs_factory
lerman25 Jul 8, 2025
fdacc76
document loadIndex
lerman25 Jul 8, 2025
c3cbee9
move loadIndex to serializer
lerman25 Jul 8, 2025
5e57b0c
remove excess declarations
lerman25 Jul 8, 2025
c1d11ca
remove extra ;
lerman25 Jul 8, 2025
0692601
compatable -> compatible
lerman25 Jul 8, 2025
99493e0
remove redundant params from test
lerman25 Jul 8, 2025
982b6d3
Merge remote-tracking branch 'origin' into omer-add-save-load-check
lerman25 Jul 21, 2025
2d6b00c
remove comments from threadpool_handle
lerman25 Jul 21, 2025
e29178d
remove error context comments
lerman25 Jul 21, 2025
2d64e12
add checkIntegrity
lerman25 Jul 21, 2025
8980513
update checkIntegrity and format
lerman25 Jul 21, 2025
f03f52b
move loadIndex to SVSSerializer
lerman25 Jul 22, 2025
b591807
update bindings
lerman25 Jul 22, 2025
265d238
format
lerman25 Jul 22, 2025
5 changes: 4 additions & 1 deletion src/VecSim/CMakeLists.txt
@@ -52,6 +52,9 @@ if (TARGET svs::svs)
 endif()

 if(VECSIM_BUILD_TESTS)
-    add_library(VectorSimilaritySerializer utils/serializer.cpp)
+    add_library(VectorSimilaritySerializer
+        algorithms/hnsw/hnsw_serializer.cpp
+        algorithms/svs/svs_serializer.cpp
+    )
     target_link_libraries(VectorSimilarity VectorSimilaritySerializer)
 endif()
5 changes: 3 additions & 2 deletions src/VecSim/algorithms/hnsw/hnsw.h
@@ -25,6 +25,7 @@
 #ifdef BUILD_TESTS
 #include "hnsw_serialization_utils.h"
 #include "VecSim/utils/serializer.h"
+#include "hnsw_serializer.h"
 #endif

 #include <deque>
@@ -85,7 +86,7 @@ class HNSWIndex : public VecSimIndexAbstract<DataType, DistType>,
                   public VecSimIndexTombstone
 #ifdef BUILD_TESTS
     ,
-                  public Serializer
+                  public HNSWSerializer
 #endif
 {
 protected:
@@ -2324,5 +2325,5 @@ HNSWIndex<DataType, DistType>::getHNSWElementNeighbors(size_t label, int ***neig
 }

 #ifdef BUILD_TESTS
-#include "hnsw_serializer.h"
+#include "hnsw_serializer_impl.h"
 #endif
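
The hunks above make `HNSWIndex` inherit from `HNSWSerializer` only when `BUILD_TESTS` is defined. A minimal sketch of that conditional-inheritance pattern follows. `SerializerSketch`, `IndexSketch`, `saveToString`, and the `"v4:"` text header are hypothetical names and formats chosen for illustration; they are not part of the PR, and the real serializer writes binary data.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Assumption: the build system normally defines BUILD_TESTS; it is forced on
// here so the sketch compiles on its own.
#ifndef BUILD_TESTS
#define BUILD_TESTS
#endif

#ifdef BUILD_TESTS
// Hypothetical stand-in for a serializer base class.
class SerializerSketch {
public:
    virtual ~SerializerSketch() = default;

    // Writes a version header, then delegates the payload to the derived class.
    void save(std::ostream &out) const {
        out << "v4:";
        saveImpl(out);
    }

private:
    virtual void saveImpl(std::ostream &out) const = 0;
};
#endif

// Hypothetical index type: the serializer base is present only in test builds.
class IndexSketch
#ifdef BUILD_TESTS
    : public SerializerSketch
#endif
{
public:
    explicit IndexSketch(int size) : size_(size) {}
    int size() const { return size_; }

private:
    int size_;
#ifdef BUILD_TESTS
    void saveImpl(std::ostream &out) const override { out << size_; }
#endif
};

#ifdef BUILD_TESTS
// Convenience helper: serialize into a string instead of a file.
std::string saveToString(const IndexSketch &index) {
    std::ostringstream out;
    index.save(out);
    return out.str();
}
#endif
```

Keeping the base class behind the macro means production builds carry no serialization interface at all, which is consistent with the `VECSIM_BUILD_TESTS` guard in the CMake change.
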
2 changes: 1 addition & 1 deletion src/VecSim/algorithms/hnsw/hnsw_multi.h
@@ -64,7 +64,7 @@ class HNSWIndex_Multi : public HNSWIndex<DataType, DistType> {
     HNSWIndex_Multi(std::ifstream &input, const HNSWParams *params,
                     const AbstractIndexInitParams &abstractInitParams,
                     const IndexComponents<DataType, DistType> &components,
-                    Serializer::EncodingVersion version)
+                    HNSWSerializer::EncodingVersion version)
         : HNSWIndex<DataType, DistType>(input, params, abstractInitParams, components, version),
           labelLookup(this->maxElements, this->allocator) {}

@@ -7,38 +7,34 @@
  * GNU Affero General Public License v3 (AGPLv3).
  */

-#include <fstream>
-#include <string>
+#include "hnsw_serializer.h"

-#include "VecSim/utils/serializer.h"

-// Persist index into a file in the specified location.
-void Serializer::saveIndex(const std::string &location) {

-    // Serializing with the latest version.
-    EncodingVersion version = EncodingVersion_V4;

-    std::ofstream output(location, std::ios::binary);
-    writeBinaryPOD(output, version);
-    saveIndexIMP(output);
-    output.close();
-}

-Serializer::EncodingVersion Serializer::ReadVersion(std::ifstream &input) {
+HNSWSerializer::HNSWSerializer(EncodingVersion version) : m_version(version) {}

+HNSWSerializer::EncodingVersion HNSWSerializer::ReadVersion(std::ifstream &input) {
     input.seekg(0, std::ifstream::beg);

     // The version number is the first field that is serialized.
-    EncodingVersion version = EncodingVersion_INVALID;
+    EncodingVersion version = EncodingVersion::INVALID;
     readBinaryPOD(input, version);
-    if (version <= EncodingVersion_DEPRECATED) {

+    if (version <= EncodingVersion::DEPRECATED) {
         input.close();
         throw std::runtime_error("Cannot load index: deprecated encoding version: " +
-                                 std::to_string(version));
-    } else if (version >= EncodingVersion_INVALID) {
+                                 std::to_string(static_cast<int>(version)));
+    } else if (version >= EncodingVersion::INVALID) {
         input.close();
         throw std::runtime_error("Cannot load index: bad encoding version: " +
-                                 std::to_string(version));
+                                 std::to_string(static_cast<int>(version)));
     }
     return version;
 }

+void HNSWSerializer::saveIndex(const std::string &location) {
+    EncodingVersion version = EncodingVersion::V4;
+    std::ofstream output(location, std::ios::binary);
+    writeBinaryPOD(output, version);
+    saveIndexIMP(output);
+    output.close();
+}

+HNSWSerializer::EncodingVersion HNSWSerializer::getVersion() const { return m_version; }
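
The version check in `ReadVersion` can be exercised without touching the file system. The sketch below is a rough approximation under stated assumptions, not the PR's implementation: the numeric enum values are made up (only the ordering DEPRECATED < valid versions < INVALID is taken from the diff), the `readBinaryPOD`/`writeBinaryPOD` helpers are simplified re-implementations, and `readVersion`, `encodeVersion`, and `decodeVersion` are hypothetical names that operate on in-memory streams instead of `std::ifstream`. It also shows why the diff adds `static_cast<int>`: a scoped enum does not convert implicitly to an integer, so `std::to_string(version)` would not compile.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Assumption: these numeric values are illustrative only.
enum class EncodingVersion : int { DEPRECATED = 2, V3 = 3, V4 = 4, INVALID = 5 };

// Simplified stand-ins for the project's binary POD helpers.
template <typename T>
void writeBinaryPOD(std::ostream &out, const T &value) {
    out.write(reinterpret_cast<const char *>(&value), sizeof(T));
}

template <typename T>
void readBinaryPOD(std::istream &in, T &value) {
    in.read(reinterpret_cast<char *>(&value), sizeof(T));
}

// Reads the leading version field and rejects deprecated or out-of-range values.
EncodingVersion readVersion(std::istream &input) {
    input.seekg(0, std::ios::beg);
    EncodingVersion version = EncodingVersion::INVALID;
    readBinaryPOD(input, version);
    if (!input) {
        throw std::runtime_error("Cannot load index: truncated version field");
    }
    if (version <= EncodingVersion::DEPRECATED) {
        // Scoped enums need an explicit cast before std::to_string.
        throw std::runtime_error("Cannot load index: deprecated encoding version: " +
                                 std::to_string(static_cast<int>(version)));
    }
    if (version >= EncodingVersion::INVALID) {
        throw std::runtime_error("Cannot load index: bad encoding version: " +
                                 std::to_string(static_cast<int>(version)));
    }
    return version;
}

// Hypothetical helpers for round-tripping a version through memory.
std::string encodeVersion(EncodingVersion version) {
    std::ostringstream out(std::ios::binary);
    writeBinaryPOD(out, version);
    return out.str();
}

EncodingVersion decodeVersion(const std::string &bytes) {
    std::istringstream in(bytes, std::ios::binary);
    return readVersion(in);
}
```

A round trip such as `decodeVersion(encodeVersion(EncodingVersion::V4))` returns the value that was written, while a header holding `DEPRECATED` or `INVALID` raises `std::runtime_error`.
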