feat(tiered-storage): support multi-tier KV cache migration#209
feat(tiered-storage): support multi-tier KV cache migration#209shaohuaxi wants to merge 8 commits into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fd0da23160
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
fd0da23 to
c17fe5f
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c17fe5f45f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (capacity <= 0) { | ||
| LOG_WITH_GR(DEBUG, "no positive capacity for source type of [%s]; skip migration", src_name.c_str()); | ||
| continue; |
There was a problem hiding this comment.
Honor group quota when type quota is absent
Automatic migration now requires a positive per-storage-type entry in quota_config; if an instance group only has the regular group quota.capacity (a valid existing configuration, since quota_config is optional), every migration strategy hits this branch and is skipped even when the group water level exceeds trigger_threshold. This makes water-level migration silently never start for groups that rely on the existing group-level quota.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Review Summary
This is a large, well-structured feature that adds a complete hot→cold migration pipeline across 41 files, layered sensibly (config → storage primitive → manager → reclaimer integration → admin API). The commit decomposition and test coverage are good for a first-pass review. The existing Codex review already captured 18 inline issues; the notes below cover distinct problems not yet raised.
New findings
Correctness
SubmitNonBlocking(CacheLocationCopyRequest)double-queues every copy task and silently drops the result future — likely never the correct behavior. (See inline onschedule_plan_executor.cc:562.)MigrationMarkMethod::mark_timeout_ms_defaults to0, which causes every mark to expire instantly on the first cleanup sweep when the field is omitted from JSON. (See inline onmigration_strategy.h:70.)- In
BOTHmode the admin API drops blocks rejected at admission (not just blocks rejected byBatchSubmit) instead of falling back to mark for them. (See inline onadmin_service_impl.cc:623.)
Defensive / API contract
DataStorageManager::Copydoes not validatesrc_uris.size() == dst_uris.size()before delegating to the backend, leaving that contract unenforced for future non-dummy backends. (See inline ondata_storage_manager.cc.)MigrationCompletedEventhardcodes"method": "copy"even in event contexts that will be shared with the mark path. (See inline onmigration_event.h:70.)
Operational risk
KEEP_BOTH+ Copy without a matching reclaim strategy produces un-reclaimable source replicas with no config-time warning. (See inline onmigration_strategy.cc:90.)
🤖 Generated by Qoder
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: deb7b967e9
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const auto select_result = data_storage_selector_->SelectCacheWriteDataStorageBackend( | ||
| request_context, instance_info->instance_group_name()); | ||
| RETURN_IF_EC_NOT_OK_WITH_LOG(WARN, select_result.ec, "select storage backend failed"); |
There was a problem hiding this comment.
Don't require default storage for tiered writes
When every key being written has a valid tiered target, this still selects the normal/default write backend before looking at tiered_targets and returns on any selection error. If the hot/default candidates are disabled or over quota—the exact condition that can trigger mark-based migration—StartWriteCache fails before it can allocate on the cold target, so marked blocks cannot be drained to the tiered storage even though no default allocation is needed.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b12adea978
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
b12adea to
715f5bc
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 715f5bc7b8
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
c23d077 to
715f5bc
Compare
715f5bc to
d4ab700
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d4ab700b2b
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
d4ab700 to
90f4881
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 90f4881a6d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
90f4881 to
808e649
Compare
Summary