HNSW index update with CDC #21917

Draft: wants to merge 246 commits into base: main
33bf455
insert/delete/replace cdc sql
cpegeric Apr 24, 2025
8ada8bd
add hnswsync
cpegeric Apr 25, 2025
2d85277
get batch
cpegeric Apr 25, 2025
30db246
tabledef
cpegeric Apr 25, 2025
1cfebd2
add param
cpegeric Apr 25, 2025
4165d42
add param and comments
cpegeric Apr 25, 2025
a25bf32
add cdc delete
cpegeric Apr 28, 2025
a91f165
function
cpegeric Apr 28, 2025
a1ce79c
add hnsw function
cpegeric Apr 28, 2025
035ba59
cdc param
cpegeric Apr 29, 2025
4bf9ed9
cdc param
cpegeric Apr 29, 2025
e11c07e
merge fix
cpegeric Apr 29, 2025
5a412a2
bug fix
cpegeric Apr 29, 2025
1b9b50a
support hnswsync
cpegeric Apr 29, 2025
78d842f
add sqlexecutor
cpegeric Apr 30, 2025
fcb5ed2
txn
cpegeric May 1, 2025
64d5cb1
check rollback error and skip
cpegeric May 1, 2025
ab9d98c
cleanup
cpegeric May 1, 2025
797d78a
merge fix
cpegeric May 9, 2025
07b5f65
Merge branch 'main' into cdc_sqlexecutor
cpegeric May 14, 2025
0838cb0
rename Hnsw to VectorIndex
cpegeric May 14, 2025
28cdb7c
refactor code
cpegeric May 14, 2025
cf25e24
refactoring
cpegeric May 14, 2025
df53b4f
add remove() and contains() and load from view
cpegeric May 14, 2025
509137a
unload
cpegeric May 14, 2025
fdeb4c1
add cdc sync
cpegeric May 14, 2025
f08aa29
CdcSync
cpegeric May 14, 2025
7f791a7
error checking
cpegeric May 14, 2025
cc15bbf
add dimension as function argument
cpegeric May 14, 2025
7fd2758
load metadata
cpegeric May 14, 2025
00050c1
update
cpegeric May 14, 2025
81a964e
update
cpegeric May 14, 2025
a7c6fc6
update
cpegeric May 15, 2025
f6c64b1
update
cpegeric May 15, 2025
af57e76
Merge branch 'main' into cdc_sqlexecutor
cpegeric May 15, 2025
1bb32a0
add txn
cpegeric May 15, 2025
b435ee4
runTxn
cpegeric May 15, 2025
827a42d
destroy
cpegeric May 15, 2025
6109470
update
cpegeric May 15, 2025
7f190b9
bug fix save to file
cpegeric May 15, 2025
62aa87e
bug fix with init index capacity
cpegeric May 15, 2025
e2278a3
bug fix dirty
cpegeric May 15, 2025
6e682b9
update Len and Capacity
cpegeric May 15, 2025
ed49fd1
update
cpegeric May 16, 2025
1e5b6a8
view
cpegeric May 16, 2025
36bc11f
bug fix view
cpegeric May 16, 2025
3d70d9a
add tests
cpegeric May 22, 2025
3d94c96
Merge branch 'main' into cdc_sqlexecutor
cpegeric May 22, 2025
3967bb6
update
cpegeric May 22, 2025
6db0cb6
add tests
cpegeric May 22, 2025
d123d51
add sync test
cpegeric May 23, 2025
35ca423
update tests
cpegeric May 23, 2025
dd2ecdb
delete all tests
cpegeric May 23, 2025
5356fd9
update tests
cpegeric May 23, 2025
69dffe1
add test
cpegeric May 23, 2025
48a3f59
add test
cpegeric May 23, 2025
4a0db76
update
cpegeric May 23, 2025
03b7f3b
remove InsertMeta
cpegeric May 23, 2025
fc450b1
start with empty
cpegeric May 23, 2025
862cd46
delete 2 files
cpegeric May 23, 2025
fd6eee9
shuffle test
cpegeric May 23, 2025
8c51bbe
update shuffle
cpegeric May 23, 2025
3e5f164
create/drop index
cpegeric May 27, 2025
2f00775
remove new cdc sql syntax
cpegeric May 27, 2025
4deaa70
Merge branch 'main' into cdc_sqlexecutor_cleanup
mergify[bot] May 27, 2025
9006006
update sca
cpegeric May 27, 2025
543ad89
Merge branch 'cdc_sqlexecutor_cleanup' of github.com:cpegeric/matrixo…
cpegeric May 27, 2025
c2a30cb
fix sca test
cpegeric May 27, 2025
04f097d
check nil vector
cpegeric May 27, 2025
d21748a
fix sca
cpegeric May 27, 2025
5747b92
bug fix
cpegeric May 27, 2025
ea5e8e6
add test
cpegeric Jun 2, 2025
390e907
clear cache after cdc sync
cpegeric Jun 2, 2025
66eef84
add bvt test
cpegeric Jun 2, 2025
eddf530
fix test
cpegeric Jun 2, 2025
3f355c7
update unittest
cpegeric Jun 2, 2025
53e2f96
bug fix check channel closed
cpegeric Jun 2, 2025
89efa85
fix sca
cpegeric Jun 2, 2025
f6609de
fix sca
cpegeric Jun 2, 2025
1575bd9
fix sca
cpegeric Jun 2, 2025
94549ac
check vector dimension when cast from string
cpegeric Jun 3, 2025
e34c04f
bypass dimension check when width is max dimension
cpegeric Jun 3, 2025
1f626f3
fix bvt
cpegeric Jun 3, 2025
22eee6b
test atomicBatch
cpegeric Jun 4, 2025
ecfeab8
check errors
cpegeric Jun 4, 2025
d9a6023
test sendX
cpegeric Jun 4, 2025
ea82266
update bvt with manual pitr and cdc task
cpegeric Jun 4, 2025
6539fff
increase sleep
cpegeric Jun 4, 2025
917356b
update
cpegeric Jun 4, 2025
397f518
add tests
cpegeric Jun 5, 2025
e74e66d
fix bvt test on multi-cn env
cpegeric Jun 5, 2025
8f5f976
add more tests
cpegeric Jun 5, 2025
598d60f
cleanup and remove stderr
cpegeric Jun 5, 2025
50be8db
update and comments
cpegeric Jun 5, 2025
b9ea603
Merge branch 'main' into cdc_sqlexecutor_cleanup
cpegeric Jun 5, 2025
9144730
debug
cpegeric Jun 6, 2025
e00b89a
fix sca
cpegeric Jun 6, 2025
fe08c31
Merge branch 'main' into cdc_sqlexecutor_cleanup
cpegeric Jun 9, 2025
6da9b68
add test
cpegeric Jun 9, 2025
b381acd
add license
cpegeric Jun 9, 2025
644fbfa
performance
cpegeric Jun 9, 2025
1f7654c
update
cpegeric Jun 9, 2025
8b6975c
update
cpegeric Jun 9, 2025
b8c3c15
fix thread safe
cpegeric Jun 10, 2025
a964de2
cleanup
cpegeric Jun 10, 2025
8b962d0
never unload when insertAll
cpegeric Jun 10, 2025
dc4cbbd
merge fix
cpegeric Jun 11, 2025
3d7be05
better message
cpegeric Jun 11, 2025
ae92606
Merge branch 'main' into cdc_sqlexecutor_cleanup
cpegeric Jun 12, 2025
652783c
fix bvt -- drop pitr
cpegeric Jun 12, 2025
c0f162b
cleanup
cpegeric Jun 12, 2025
4ce4a5d
take timing
cpegeric Jun 13, 2025
ba61a0f
support values as input
cpegeric Jun 19, 2025
67feed9
Comment on composite primary key
cpegeric Jun 19, 2025
1b8735b
comments
cpegeric Jun 19, 2025
a12d40d
add sql writer
cpegeric Jun 19, 2025
acb4387
update
cpegeric Jun 20, 2025
edc33bb
update
cpegeric Jun 20, 2025
0a1f61c
update
cpegeric Jun 20, 2025
c393b7c
update
cpegeric Jun 20, 2025
dabe24b
update
cpegeric Jun 20, 2025
56bc1df
update ivfflat
cpegeric Jun 20, 2025
5e0a2bd
update ivfflat
cpegeric Jun 20, 2025
56b6a28
update
cpegeric Jun 20, 2025
04097a4
add db
cpegeric Jun 20, 2025
9210d80
reset
cpegeric Jun 20, 2025
5aa4712
empty
cpegeric Jun 20, 2025
09de3b1
hnsw sql writer
cpegeric Jun 20, 2025
5a664e6
update hnsw
cpegeric Jun 20, 2025
77037ae
template
cpegeric Jun 20, 2025
5200881
update
cpegeric Jun 20, 2025
fa50232
add test
cpegeric Jun 20, 2025
cfe6a0b
update test
cpegeric Jun 20, 2025
0b04c8f
ivf
cpegeric Jun 20, 2025
da1d2ec
update
cpegeric Jun 20, 2025
581a07d
version
cpegeric Jun 20, 2025
09dc52d
update
cpegeric Jun 20, 2025
e04b63c
update
cpegeric Jun 20, 2025
17057d7
bug fix
cpegeric Jun 20, 2025
d437339
index sinker with IndexSqlWriter
cpegeric Jun 23, 2025
5c648f3
remove comment
cpegeric Jun 23, 2025
53b6438
rename hsnw to index
cpegeric Jun 23, 2025
8069eb0
delete if vector is nil
cpegeric Jun 23, 2025
4afbe23
rename file
cpegeric Jun 23, 2025
4772c2f
support multi-indexes
cpegeric Jun 23, 2025
12429c2
cleanup
cpegeric Jun 23, 2025
b32d846
cleanup
cpegeric Jun 23, 2025
dd93d28
use constant
cpegeric Jun 23, 2025
173589a
rename file
cpegeric Jun 23, 2025
a44aacf
todo
cpegeric Jun 23, 2025
9aa078c
bvt test
cpegeric Jun 23, 2025
01fc88d
cleanup
cpegeric Jun 23, 2025
a39fa82
cleanup
cpegeric Jun 23, 2025
613544e
sca
cpegeric Jun 23, 2025
1a38d8b
add license
cpegeric Jun 23, 2025
be20232
bug fix
cpegeric Jun 23, 2025
622a888
more comments
cpegeric Jun 23, 2025
95a9a5c
delete sql
cpegeric Jun 23, 2025
4f06629
cleanup
cpegeric Jun 23, 2025
3c34919
cleanup
cpegeric Jun 23, 2025
d60145c
bug fix delete row only have 1 column pk
cpegeric Jun 24, 2025
84d945c
bug fix pre-defined column name
cpegeric Jun 24, 2025
789fa58
hardcode composite primary key column to varbinary
cpegeric Jun 24, 2025
504b616
bug fix
cpegeric Jun 24, 2025
e97612c
disable fulltext and ivfflat
cpegeric Jun 24, 2025
3e5ead4
bug fix
cpegeric Jun 24, 2025
f64f6de
Merge branch 'cdc_sqlexecutor_cleanup' into cdc_fulltext
cpegeric Jun 24, 2025
2806c24
only enable hnsw
cpegeric Jun 24, 2025
9ba390e
add async option
cpegeric Jun 25, 2025
ef894ea
skip async with DML
cpegeric Jun 25, 2025
1de75a6
catalog.IsIndexAsync
cpegeric Jun 25, 2025
9f83f62
async
cpegeric Jun 25, 2025
4d9cca5
Merge branch 'cdc_fulltext' into cdc_sqlexecutor_cleanup
cpegeric Jun 25, 2025
4e527c4
update
cpegeric Jun 25, 2025
99ce5a6
fix sca
cpegeric Jun 25, 2025
0f89628
fix merge
cpegeric Jun 25, 2025
14b0e7f
Merge branch 'main' into cdc_fulltext
cpegeric Jun 26, 2025
5143b81
Merge branch 'main' into cdc_fulltext
cpegeric Jun 27, 2025
72231a7
add cdc util
cpegeric Jun 27, 2025
7f7e096
create/delete cdc task
cpegeric Jun 27, 2025
884102e
update
cpegeric Jun 27, 2025
82b76f8
update
cpegeric Jun 27, 2025
959b527
update
cpegeric Jun 27, 2025
662fd57
Merge branch 'cdc_fulltext' into cdc_sqlexecutor_cleanup
cpegeric Jun 27, 2025
c16cf90
truncate table
cpegeric Jun 27, 2025
b423bcc
truncate table
cpegeric Jun 27, 2025
20fb6e5
update
cpegeric Jun 27, 2025
5caab53
cleanup
cpegeric Jun 27, 2025
43e9116
update
cpegeric Jun 27, 2025
a4a753d
update
cpegeric Jun 27, 2025
42a9adc
hnsw disable alter reindex
cpegeric Jun 27, 2025
2ed6051
alter reindex
cpegeric Jun 27, 2025
164693e
sca
cpegeric Jun 27, 2025
7757192
bug fix
cpegeric Jun 27, 2025
38775df
bug fix
cpegeric Jun 27, 2025
852b8c2
update
cpegeric Jun 30, 2025
d7241ae
use pitr_name
cpegeric Jun 30, 2025
b6639d2
add check pitr before create
cpegeric Jun 30, 2025
49a4cef
update
cpegeric Jun 30, 2025
f09eb13
update
cpegeric Jun 30, 2025
22e91b3
update
cpegeric Jun 30, 2025
04393a1
consumer
cpegeric Jun 30, 2025
3400935
license
cpegeric Jun 30, 2025
11c74ff
update
cpegeric Jun 30, 2025
7d106fb
update
cpegeric Jun 30, 2025
d4e19e6
use transaction from DataRetriever
cpegeric Jul 1, 2025
1704875
update watermark
cpegeric Jul 1, 2025
4c4c3a8
update
cpegeric Jul 1, 2025
7b02d81
update
cpegeric Jul 1, 2025
16600d3
statement option
cpegeric Jul 1, 2025
ab06863
statement option
cpegeric Jul 1, 2025
e83058f
snapshot
cpegeric Jul 1, 2025
952b57a
run
cpegeric Jul 1, 2025
06af88a
update
cpegeric Jul 1, 2025
c72197d
move to idxcdc
cpegeric Jul 1, 2025
1d2b203
update
cpegeric Jul 1, 2025
ffa5da7
update
cpegeric Jul 1, 2025
fc4b7d2
update idxcdc
cpegeric Jul 1, 2025
3b0ca17
update
cpegeric Jul 1, 2025
fab81f1
tail use insert, snapshot use upsert
cpegeric Jul 1, 2025
268ab6e
update
cpegeric Jul 1, 2025
d06cb28
update
cpegeric Jul 1, 2025
76b2c62
mock retriever
cpegeric Jul 1, 2025
98439c8
flush at the end
cpegeric Jul 1, 2025
e0aca74
update test
cpegeric Jul 2, 2025
352655d
update
cpegeric Jul 2, 2025
2567793
add test
cpegeric Jul 2, 2025
ed0c94b
merge fix watermarkUpdater
cpegeric Jul 2, 2025
ccf384c
update
cpegeric Jul 2, 2025
9afc4df
Merge branch 'main' into cdc_fulltext
cpegeric Jul 3, 2025
0b6cfd9
add cnUUID
cpegeric Jul 3, 2025
97113c4
remove unneccessary code
cpegeric Jul 3, 2025
7259ddc
update
cpegeric Jul 3, 2025
5aa0926
api
cpegeric Jul 3, 2025
55258af
merge fix
cpegeric Jul 8, 2025
7a6ba7e
bug fix cdc
cpegeric Jul 8, 2025
a5ed284
bvt test
cpegeric Jul 8, 2025
839a494
fix drop index
cpegeric Jul 8, 2025
6ed579a
fix sca
cpegeric Jul 8, 2025
381665e
Merge branch 'main' into cdc_fulltext
cpegeric Jul 9, 2025
346f32c
Merge branch 'main' into cdc_sqlexecutor_cleanup
cpegeric Jul 9, 2025
9edbd11
bug fix thread id
cpegeric Jul 9, 2025
e744672
Merge branch 'cdc_sqlexecutor_cleanup' into cdc_fulltext
cpegeric Jul 9, 2025
f1b1962
rename idxcdc to iscp
cpegeric Jul 22, 2025
51a0044
merge fix function id
cpegeric Jul 22, 2025
44c63ff
fix sca
cpegeric Jul 22, 2025
41 changes: 38 additions & 3 deletions pkg/catalog/secondary_index_utils.go
@@ -85,6 +85,7 @@ const (
 	HnswEfConstruction = "ef_construction"
 	HnswQuantization = "quantization"
 	HnswEfSearch = "ef_search"
+	Async = "async"
 )

 /* 1. ToString Functions */
@@ -133,6 +134,12 @@ func IndexParamsToStringList(indexParams string) (string, error) {
 		res += fmt.Sprintf(" %s '%s' ", IndexAlgoParamOpType, opType)
 	}

+	if val, ok := result[Async]; ok {
+		if val == "true" {
+			res += fmt.Sprintf(" %s ", Async)
+		}
+	}
+
 	return res, nil
 }

@@ -179,10 +186,16 @@ func fullTextIndexParamsToMap(def *tree.FullTextIndex) (map[string]string, error
 	// fulltext index here
 	if def.IndexOption != nil {
 		parsername := strings.ToLower(def.IndexOption.ParserName)
-		if parsername != "ngram" && parsername != "default" && parsername != "json" && parsername != "json_value" {
-			return nil, moerr.NewInternalErrorNoCtx(fmt.Sprintf("invalid parser %s", parsername))
+		if len(parsername) > 0 {
+			if parsername != "ngram" && parsername != "default" && parsername != "json" && parsername != "json_value" {
+				return nil, moerr.NewInternalErrorNoCtx(fmt.Sprintf("invalid parser %s", parsername))
+			}
+			res["parser"] = parsername
+		}
+
+		if def.IndexOption.Async {
+			res[Async] = "true"
 		}
-		res["parser"] = parsername
 	}
 	return res, nil
 }
@@ -224,6 +237,10 @@ func indexParamsToMap(def interface{}) (map[string]string, error) {
 			} else {
 				res[IndexAlgoParamOpType] = metric.OpType_L2Distance // set l2 as default
 			}
+
+			if idx.IndexOption.Async {
+				res[Async] = "true"
+			}
 		case tree.INDEX_TYPE_HNSW:
 			if idx.IndexOption.HnswM < 0 {
 				return nil, moerr.NewInternalErrorNoCtx("invalid M. hnsw.M must be > 0")
@@ -265,6 +282,10 @@ func indexParamsToMap(def interface{}) (map[string]string, error) {
 			} else {
 				res[IndexAlgoParamOpType] = metric.OpType_L2Distance // set l2 as default
 			}
+
+			if idx.IndexOption.Async {
+				res[Async] = "true"
+			}
 		default:
 			return nil, moerr.NewInternalErrorNoCtx("invalid index alogorithm type")
 		}
@@ -281,6 +302,20 @@ func DefaultIvfIndexAlgoOptions() map[string]string {
 	return res
 }

+func IsIndexAsync(indexAlgoParams string) (bool, error) {
+	if len(indexAlgoParams) > 0 {
+		param, err := IndexParamsStringToMap(indexAlgoParams)
+		if err != nil {
+			return false, err
+		}
+		v, ok := param[Async]
+		if ok {
+			return v == "true", nil
+		}
+	}
+	return false, nil
+}
+
 //------------------------[END] IndexAlgoParams------------------------

 // ------------------------[START] Aliaser------------------------
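
Note on the new IsIndexAsync helper: the sketch below is a standalone approximation of its control flow, meant only to illustrate which inputs report async as true. It is not the code from this PR. The names isIndexAsync, paramsStringToMap, and asyncKey are hypothetical stand-ins, and the sketch assumes the index params string is a JSON object with string values, which the diff itself does not show.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// asyncKey mirrors the value of the Async constant added in the diff.
const asyncKey = "async"

// paramsStringToMap is a hypothetical stand-in for IndexParamsStringToMap.
// Assumption: the params string is a JSON object whose values are strings.
func paramsStringToMap(s string) (map[string]string, error) {
	var m map[string]string
	if err := json.Unmarshal([]byte(s), &m); err != nil {
		return nil, err
	}
	return m, nil
}

// isIndexAsync approximates the flow of IsIndexAsync: an empty params string
// or a missing "async" key means false; only the exact value "true" means true.
func isIndexAsync(indexAlgoParams string) (bool, error) {
	if len(indexAlgoParams) == 0 {
		return false, nil
	}
	params, err := paramsStringToMap(indexAlgoParams)
	if err != nil {
		return false, err
	}
	// A missing key yields the empty string, which compares unequal to "true".
	return params[asyncKey] == "true", nil
}

func main() {
	inputs := []string{
		``,
		`{"op_type":"vector_l2_ops"}`,
		`{"async":"true"}`,
		`{"async":"false"}`,
	}
	for _, in := range inputs {
		ok, err := isIndexAsync(in)
		fmt.Printf("%q -> async=%v err=%v\n", in, ok, err)
	}
}
```

Under these assumptions, only a params string that carries the async key with the exact value "true" is treated as async; a malformed string surfaces the parse error instead of silently returning false.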