-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Cache submodule into git db #16246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache submodule into git db #16246
Conversation
|
r? @weihanglo rustbot has assigned @weihanglo. Use |
|
It would help reviewers if you break this down into atomic commits. For example, by making a commit for extracting See https://doc.crates.io/contrib/process/working-on-cargo.html#submitting-a-pull-request |
This comment has been minimized.
This comment has been minimized.
70350b1 to
ffd78e6
Compare
This comment has been minimized.
This comment has been minimized.
|
Done, the "extract code into update_db" part is a single commit now. I think the remains part needs to be in the same commit |
src/cargo/sources/git/utils.rs
Outdated
| .with_context(|| { | ||
| let name = child.name().unwrap_or(""); | ||
| format!("failed to fetch submodule `{name}` from {child_remote_url}",) | ||
| })?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should have this also on update_db and copy_to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried and the testsuite failed. Should I also adjust testsuite more? I do try to keep it unchanged as much as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on the failure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about put it only after db.copy_to? I tested and it do pass the testsuite without further modification. The original fetch was replaced by (gitsource create, update_db and db.copy_to), so put the error context to after db.copy_to seem enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aeda351
Done, and adjust the testsuite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See an alternative in #16246 (comment)
src/cargo/sources/git/utils.rs
Outdated
| SourceId::from_url(&format!("git+{child_remote_url}#{head}"))?, | ||
| gctx, | ||
| RemoteKind::GitDependency, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What impact does this have on our progress bars?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly the same? The old fetch will have a single progress bar, which now changes into a progress bar for update_db and a brief progress bar after for db.copy. This progress bars behavior should be the same as when update git source. Most of the time I don't even notice the brief one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ea4eb86 should fix when submodule database exist but still print "updating submodule"
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution!
We should have tests for this new caching behavior, especially for nested submodules. See https://doc.crates.io/contrib/process/working-on-cargo.html#making-a-change for how, and also mind the atomic commit pattern (adding test first capturing the existing behavior, and the fix commit showing behavior change through test diffs)
src/cargo/sources/git/utils.rs
Outdated
| &child_remote_url, | ||
| &reference, | ||
| let mut source = GitSource::new( | ||
| SourceId::from_url(&format!("git+{child_remote_url}#{head}"))?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of constructing URL manually, we have a SourceId::for_git for the purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a474768, I notice the GitSource created by for_git always has precise=None, result to locked_rev being Deferred, so fetch_db always try to fetch if online. Is it appropriate to use with_git_precise for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! Yeah would be good if we could add a comment why we need to have with_git_precise there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
02b8bf9 Done.
src/cargo/sources/git/utils.rs
Outdated
| let name = child.name().unwrap_or(""); | ||
| format!("failed to fetch submodule `{name}` from {child_remote_url}",) | ||
| })?; | ||
| guard.mark_ok()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The checkout is marked ready then we recurse into submodules. I feel like that means the checkout is not ready yet.
Would this situation happens? _User cancels the operation before recursing to nested submodules. They later they dependency on this submodule directly and Cargo assumes it is fresh so never recurse into nested submodules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I overlook when inline db.copy_to, my bad. But now the code almost the same as db.copy_to, with db.copy has additional check for freshness, should I just use db.copy_to instead for reduce code dups?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, if that works (does look like will work).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I need to push this myself? #16246 (review) If you can do it, please do. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am offering reorder commits, not writing a fix. Would appreciate if you can help write the fix :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
02b8bf9 Done.
|
@rustbot author |
|
Reminder, once the PR becomes ready for a review, use |
8705c44 to
a474768
Compare
|
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
Sorry for the late response, I force-pushed the commits. About new tests, if I am not mistaking, it should be only about the git db cache, not the correctness of git repo checkouts, as it should be already covered? Currently I can't find api for manipulating the cached git db, or example about testing the git cache db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks like. Let me know if you plan to reorganize the commits, or I can do that for you. Thanks!
src/cargo/sources/git/utils.rs
Outdated
| let name = child.name().unwrap_or(""); | ||
| format!("failed to fetch submodule `{name}` from {child_remote_url}",) | ||
| })?; | ||
| guard.mark_ok()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, if that works (does look like will work).
src/cargo/sources/git/utils.rs
Outdated
| &child_remote_url, | ||
| &reference, | ||
| let mut source = GitSource::new( | ||
| SourceId::from_url(&format!("git+{child_remote_url}#{head}"))?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! Yeah would be good if we could add a comment why we need to have with_git_precise there.
If you can do it, please do. Thanks! |
a474768 to
fcb41ba
Compare
fetch each submodules using git database then checkout
fcb41ba to
cf022a5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this PR try to resolve?
My attempt to continue #10279.
This:
GitSource::fetch_dbGitSource::fetch_dbanddb.copy_todb.copy_toalready recursive.Fixes #7987.
How to test and review this PR?
I tested using the original pull method:
and confirmed that the time to do the second update is negligible.
Also test if it can fetch submodule offline using the downloaded git db