[ENH] Start of a control interface for GC. #5218
This introduces an endpoint for garbage collecting collections from the command line. The goal is to inject a collection provided on the command line in the list of collections to clean up.
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving:
- Testing, Bugs, Errors, Logs, Documentation
- System Compatibility
- Quality
// This is a placeholder service. The garbage collector currently only exposes a health service.
service GarbageCollector {}
message KickoffGarbageCollectionRequest {
  string collection = 1;
[CompanyBestPractice]
The variable name 'collection' in the proto field and related code doesn't specify the identifier type being used. According to our naming convention guideline, specify whether this is a collection ID, name, or other identifier type for better readability and maintainability.
Consider renaming to be more specific:
- `collection_id` if using UUID/ID
- `collection_name` if using string name
- `collection_identifier` if the type varies
Affected files: idl/chromadb/proto/garbage_collector.proto:6, rust/garbage_collector/src/lib.rs:42
rust/garbage_collector/src/lib.rs
Outdated
    &self,
    req: Request<KickoffGarbageCollectionRequest>,
) -> Result<Response<KickoffGarbageCollectionResponse>, Status> {
    Err(Status::not_found("resource not found"))
[BestPractice]
The implementation returns a generic "resource not found" error, which gives callers no useful information. Consider returning an unimplemented status with a more descriptive message that indicates this is a placeholder implementation.
- Err(Status::not_found("resource not found"))
+ Err(Status::unimplemented("garbage collection endpoint not yet implemented"))
1 Job Failed: PR checks / Python tests / test-cluster-rust-frontend (3.9, chromadb/test/property/test_add.py). No logs available for this step.
Summary: 1 successful workflow, 1 failed workflow
Last updated: 2025-08-08 15:55:37 UTC
Add Manual Garbage Collection Endpoint and CLI for GC Control

This PR introduces a control interface enabling manual initiation of garbage collection for specific collections by UUID via a new gRPC endpoint and a command-line client. The implementation modifies the garbage collector service from a placeholder to a working endpoint that receives a collection_id and schedules that collection for immediate cleanup. Supporting changes include proto definition updates, Rust code for the GC controller/server, a new CLI tool, integration in the Kubernetes (Tiltfile) workflow, thread-safe manual collection tracking, and dependency additions.

Key Changes
• Introduced

Affected Areas
• garbage collector Rust crate (src/lib.rs, src/

This summary was automatically generated by @propel-code-bot
#[tokio::main]
async fn main() {
    let args = std::env::args().skip(1).collect::<Vec<_>>();
    if args.len() != 2 {
        eprintln!("USAGE: chroma-manual-gc [HOST] [COLLECTION_UUID]");
        std::process::exit(13);
    }
    let gcservice = Channel::from_shared(args[0].clone())
        .expect("could not create channel")
        .connect()
        .await
        .expect("could not connect to gc service");
    let mut client = GarbageCollectorClient::new(gcservice);
    client
        .kickoff_garbage_collection(KickoffGarbageCollectionRequest {
            collection_id: args[1].clone(),
        })
        .await
        .expect("could not kickoff gc");
}
[BestPractice]
This new command-line tool has a few issues:
- It's not registered as a binary in `rust/garbage_collector/Cargo.toml`, so it won't be built. You'll need to add a `[[bin]]` section for it.
- The error handling relies on `.expect()`, which will cause the tool to panic on errors like network issues or an invalid host. This isn't very user-friendly.
- `Channel::from_shared()` expects a valid URI (e.g., `http://localhost:50055`). If a user provides just `localhost:50055`, it will panic. The tool should prepend `http://` to the host argument.

Here's a suggestion that addresses the error handling and URI issues to make the tool more robust:
#[tokio::main]
async fn main() {
    let args: Vec<String> = std::env::args().skip(1).collect();
    if args.len() != 2 {
        eprintln!("USAGE: chroma-manual-gc [HOST:PORT] [COLLECTION_UUID]");
        std::process::exit(1);
    }
    let host = &args[0];
    let collection_id = &args[1];
    let endpoint = format!("http://{}", host);
    let channel = match Channel::from_shared(endpoint) {
        Ok(channel) => channel,
        Err(e) => {
            eprintln!("Invalid host format '{}': {}", host, e);
            std::process::exit(1);
        }
    };
    let gcservice = match channel.connect().await {
        Ok(service) => service,
        Err(e) => {
            eprintln!("Could not connect to gc service at '{}': {}", host, e);
            std::process::exit(1);
        }
    };
    let mut client = GarbageCollectorClient::new(gcservice);
    if let Err(e) = client
        .kickoff_garbage_collection(KickoffGarbageCollectionRequest {
            collection_id: collection_id.to_string(),
        })
        .await
    {
        eprintln!("Failed to kickoff gc: {}", e);
        std::process::exit(1);
    }
    println!("Successfully kicked off garbage collection for collection {}", collection_id);
}
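One caveat with the suggested code: it prepends `http://` unconditionally, so an argument that already carries a scheme would become `http://http://...`. A minimal sketch of a more tolerant normalization follows. The helper name `normalize_endpoint` is hypothetical and not part of the PR, and it uses only the standard library so it can be tried in isolation:

```rust
/// Hypothetical helper (not part of the PR): returns an endpoint string with
/// an explicit scheme, leaving inputs that already have one untouched.
fn normalize_endpoint(host: &str) -> String {
    let host = host.trim();
    if host.starts_with("http://") || host.starts_with("https://") {
        // Already a full URI; do not prepend a second scheme.
        host.to_string()
    } else {
        // Bare HOST:PORT; assume plaintext HTTP, as the suggestion does.
        format!("http://{}", host)
    }
}

fn main() {
    // Example inputs are illustrative only.
    for input in ["localhost:50055", "http://localhost:50055"] {
        println!("{} -> {}", input, normalize_endpoint(input));
    }
}
```

The result could then be passed to `Channel::from_shared` in place of the `format!` call, which keeps both `HOST:PORT` and full-URI invocations working.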
fn manual_garbage_collection_request(
    &self,
    collection_id: CollectionUuid,
) -> Result<(), GarbageCollectCollectionError> {
    tracing::event!(Level::INFO, name = "manual garbage collection", collection_id =? collection_id);
    let mut manual_collections = self.manual_collections.lock();
    manual_collections.insert(collection_id);
    Ok(())
}
[CriticalError]
While the new endpoint correctly adds collection IDs to the `manual_collections` set, the core logic to actually process these collections seems to be missing. The `handle` implementation for `GarbageCollectMessage` fetches collections to GC from `sysdb` but doesn't appear to use the `manual_collections` set.

To complete this feature, the `GarbageCollectMessage` handler needs to be updated to:
- Read the collection IDs from `self.manual_collections`.
- Fetch the necessary `CollectionToGcInfo` for these IDs from `sysdb`.
- Add these collections to the list of collections to be garbage collected in the current run.

You might need to add a new method to `SysDb` to fetch `CollectionToGcInfo` for a specific list of collection IDs, as a suitable method doesn't seem to exist yet.
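The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `CollectionId`, `GcInfo`, `Collector`, and the `lookup` closure are hypothetical stand-ins for `CollectionUuid`, `CollectionToGcInfo`, the garbage collector component, and a per-ID `SysDb` fetch, and `std::sync::Mutex` is used so the example has no external dependencies.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

// Hypothetical stand-ins for the PR's types (assumptions, not Chroma's API).
type CollectionId = String;

#[derive(Debug, Clone, PartialEq)]
struct GcInfo {
    collection_id: CollectionId,
}

struct Collector {
    manual_collections: Mutex<HashSet<CollectionId>>,
}

impl Collector {
    /// Records a manual request, mirroring what the new endpoint does.
    fn request(&self, id: CollectionId) {
        self.manual_collections.lock().unwrap().insert(id);
    }

    /// Step 1: take every requested ID out of the set, leaving it empty so
    /// the same request is not replayed on the next run.
    fn drain_manual(&self) -> Vec<CollectionId> {
        let mut guard = self.manual_collections.lock().unwrap();
        guard.drain().collect()
    }

    /// Steps 2 and 3: look up each manual ID and append it to the scheduled
    /// list, skipping collections that are already scheduled.
    fn merge_manual<F>(&self, mut scheduled: Vec<GcInfo>, lookup: F) -> Vec<GcInfo>
    where
        F: Fn(&CollectionId) -> Option<GcInfo>,
    {
        let mut seen: HashSet<CollectionId> =
            scheduled.iter().map(|c| c.collection_id.clone()).collect();
        for id in self.drain_manual() {
            // `insert` returns false when the ID was already scheduled.
            if seen.insert(id.clone()) {
                if let Some(info) = lookup(&id) {
                    scheduled.push(info);
                }
            }
        }
        scheduled
    }
}

fn main() {
    let collector = Collector {
        manual_collections: Mutex::new(HashSet::new()),
    };
    collector.request("a".to_string());
    collector.request("b".to_string());

    // "a" is already scheduled by the regular pass in this example.
    let scheduled = vec![GcInfo { collection_id: "a".to_string() }];
    let merged = collector.merge_manual(scheduled, |id| {
        Some(GcInfo { collection_id: id.clone() })
    });
    println!("{:?}", merged);
}
```

Draining under the lock and releasing it before any lookup keeps the critical section short. Whether a failed lookup should drop the request or re-queue it is a design choice this sketch leaves open; here it is dropped.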
    _: &ComponentContext<GarbageCollector>,
) {
    if let Err(err) = self.manual_garbage_collection_request(req.collection_id) {
        tracing::event!(Level::ERROR, name = "manual compaction failed", error =? err);
[BestPractice]
The log message here seems to have a copy-paste error from a compaction-related component. It should refer to "garbage collection" instead of "compaction".
- tracing::event!(Level::ERROR, name = "manual compaction failed", error =? err);
+ tracing::event!(Level::ERROR, name = "manual garbage collection failed", error =? err);
Description of changes
This introduces an endpoint for garbage collecting collections from the
command line. The goal is to inject a collection provided on the
command line in the list of collections to clean up.
Test plan
CI
Migration plan
N/A
Observability plan
I plan to add the endpoint and then observe the world as I use it.
Documentation Changes
N/A