forked from git/git
Backfill fixes and edges #2088
Open
newren wants to merge 3 commits into gitgitgadget:master from newren:backfill-fixes-and-edges
+153
−10
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9,7 +9,7 @@ git-backfill - Download missing objects in a partial clone | |
| SYNOPSIS | ||
|
Derrick Stolee wrote on the Git mailing list:
On 4/15/2026 7:58 PM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> Add an extra --[no-]include-edges flag to allow grabbing blobs from
> edge commits. Since the point of backfill is to prevent on-demand blob
> loading and these are common commands, default to --include-edges.
I like this option and your motivation for including it.
> @@ -116,6 +117,8 @@ static int do_backfill(struct backfill_context *ctx)
> /* Walk from HEAD if otherwise unspecified. */
> if (!ctx->revs.pending.nr)
> add_head_to_pending(&ctx->revs);
> + if (ctx->include_edges)
> + ctx->revs.edge_hint = 1;
This would still work if...
> .revs = REV_INFO_INIT,
> + .include_edges = 1,
...this was initialized to -1 to allow for "no user option".
We don't need this change unless we were deciding to make a
config option that specified a different default. That seems
like overkill right now, so this doesn't need a change. Just
something that I like to think about.
I also like how your tests don't just verify the backfill
behavior but the ultimate behavior of 'git log' and friends
after the fact.
Thanks,
-Stolee
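The tri-state idea in the comment above can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions: `resolve_include_edges`, `DEFAULT_INCLUDE_EDGES`, and the `config_value` parameter are hypothetical names, and the patch adds no config setting. Note that the existing `if (ctx->include_edges)` check would still pass with -1, since any nonzero value is true in C.

```c
#include <stdbool.h>

/* Hypothetical built-in default, mirroring ".include_edges = 1" in the patch. */
#define DEFAULT_INCLUDE_EDGES 1

/*
 * Resolve a tri-state option value:
 *   -1 = not given, 0 = explicitly disabled, 1 = explicitly enabled.
 * cli_value models the command-line flag; config_value models a
 * hypothetical config setting (an assumption, not part of the patch).
 */
static bool resolve_include_edges(int cli_value, int config_value)
{
	if (cli_value >= 0)
		return cli_value != 0;      /* the command line wins */
	if (config_value >= 0)
		return config_value != 0;   /* then the (hypothetical) config */
	return DEFAULT_INCLUDE_EDGES != 0;  /* finally the built-in default */
}
```

With both inputs left at -1 the built-in default applies, which matches the behavior of the patch as posted.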
|
||
| -------- | ||
| [synopsis] | ||
| -git backfill [--min-batch-size=<n>] [--[no-]sparse] | ||
| +git backfill [--min-batch-size=<n>] [--[no-]sparse] [--[no-]include-edges] [<revision-range>] | ||
|
|
||
| DESCRIPTION | ||
| ----------- | ||
|
|
@@ -43,7 +43,7 @@ smaller network calls than downloading the entire repository at clone | |
| time. | ||
|
|
||
| By default, `git backfill` downloads all blobs reachable from the `HEAD` | ||
| -commit. This set can be restricted or expanded using various options. | ||
| +commit. This set can be restricted or expanded using various options below. | ||
|
|
||
| THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR MAY CHANGE IN THE FUTURE. | ||
|
|
||
|
|
@@ -63,7 +63,23 @@ OPTIONS | |
| current sparse-checkout. If the sparse-checkout feature is enabled, | ||
| then `--sparse` is assumed and can be disabled with `--no-sparse`. | ||
|
|
||
| -You may also specify the commit limiting options from linkgit:git-rev-list[1]. | ||
| +`--include-edges`:: | ||
| +`--no-include-edges`:: | ||
| +Include blobs from boundary commits in the backfill. Useful in | ||
| +preparation for commands like `git log -p A..B` or `git replay | ||
| +--onto TARGET A..B`, where A..B normally excludes A but you need | ||
| +the blobs from A as well. `--include-edges` is the default. | ||
|
|
||
| +`<revision-range>`:: | ||
| +Backfill only blobs reachable from commits in the specified | ||
| +revision range. When no _<revision-range>_ is specified, it | ||
| +defaults to `HEAD` (i.e. the whole history leading to the | ||
| +current commit). For a complete list of ways to spell | ||
| +_<revision-range>_, see the "Specifying Ranges" section of | ||
| +linkgit:gitrevisions[7]. | ||
| ++ | ||
| +You may also use commit-limiting options understood by | ||
| +linkgit:git-rev-list[1] such as `--first-parent`, `--since`, or pathspecs. | ||
|
|
||
| SEE ALSO | ||
| -------- | ||
|
|
||
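To make the `--include-edges` wording concrete, here is a toy sketch of what "boundary commit" means for a range `A..B`. It is not Git's implementation: it assumes a strictly linear history in which commit `i` has parent `i - 1`, and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model (not Git's data structures): a linear history where
 * commit i has parent i - 1 and commit 0 is the root.
 */

/* Is `commit` in the range a..b, i.e. reachable from b but not from a? */
static bool in_range(int commit, int a, int b)
{
	return commit > a && commit <= b;
}

/* Is `commit` an edge of a..b: excluded itself, but the parent of an included commit? */
static bool is_edge(int commit, int a, int b)
{
	return !in_range(commit, a, b) && in_range(commit + 1, a, b);
}

/* Count the commits whose blobs a backfill of a..b would cover in this model. */
static size_t commits_to_backfill(int a, int b, bool include_edges)
{
	size_t n = 0;

	for (int c = 0; c <= b; c++)
		if (in_range(c, a, b) || (include_edges && is_edge(c, a, b)))
			n++;
	return n;
}
```

For `a = 2` and `b = 4`, the range holds commits 3 and 4, and commit 2 is the edge. A diff of commit 3 against its parent needs the blobs of commit 2, which is why including edges helps commands like `git log -p A..B`.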
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -26,7 +26,7 @@ | |
| #include "path-walk.h" | ||
|
|
||
| static const char * const builtin_backfill_usage[] = { | ||
| -N_("git backfill [--min-batch-size=<n>] [--[no-]sparse]"), | ||
| +N_("git backfill [--min-batch-size=<n>] [--[no-]sparse] [--[no-]include-edges] [<revision-range>]"), | ||
| NULL | ||
| }; | ||
|
|
||
|
|
@@ -35,6 +35,7 @@ struct backfill_context { | |
| struct oid_array current_batch; | ||
| size_t min_batch_size; | ||
| int sparse; | ||
| +int include_edges; | ||
| struct rev_info revs; | ||
| }; | ||
|
|
||
|
|
@@ -78,6 +79,28 @@ static int fill_missing_blobs(const char *path UNUSED, | |
| return 0; | ||
|
Derrick Stolee wrote on the Git mailing list:
On 4/15/2026 7:58 PM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
>
> Some rev-list options accepted by setup_revisions() are silently
> ignored or actively counterproductive when used with 'git backfill',
> because the path-walk API has its own tree-walking logic that bypasses
> the mechanisms these options rely on:
>
> * -S/-G (pickaxe) and --diff-filter work by computing per-commit
> diffs in get_revision_1() and filtering commits whose diffs don't
> match. Since backfill's goal is to download all blobs reachable
> from commits in the range, filtering out commits based on diff
> content would silently skip blobs -- the opposite of what users
> want.
>
> * --follow disables path pruning (revs->prune) and only makes
> sense for tracking a single file through renames in log output.
> It has no useful interaction with backfill.
>
> * -L (line-log) computes line-level diffs to track the evolution
> of a function or line range. Like pickaxe, it filters commits
> based on diff content, which would cause blobs to be silently
> skipped.
I think these make a lot of sense, especially because these
computations require downloading missing blobs in order to find
the diffs that justify some of the choices of commit filtering.
> * --diff-merges controls how merge commit diffs are displayed.
> The path-walk API walks trees directly and never computes
> per-commit diffs, so this option would be silently ignored.
I think there are a few other "format" based options that were
silently ignored on purpose, because there's no output. Perhaps
we should change the use of options like this to a warning instead
of a failure?
> * --filter (object filtering, e.g. --filter=blob:none) is used by
> the list-objects traversal but is completely ignored by the
> path-walk API, so it would silently do nothing.
This is correct to remove because while it doesn't work with
path-walk right now, it might in the future. We don't want the
filter to mess with the functionality of 'git backfill' that sets
its own scope for which blobs to download.
> Rather than letting users think these options are being honored,
> reject them with a clear error message.
I agree that the majority of these should be hard failures. As
mentioned, some could be soft warnings. That could be an
adjustment to make in the future, so is not blocking for this
patch.
> +static void reject_unsupported_rev_list_options(struct rev_info *revs)
> +{
> + if (revs->diffopt.pickaxe)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + (revs->diffopt.pickaxe_opts & DIFF_PICKAXE_REGEX) ? "-G" : "-S");
> + if (revs->diffopt.filter || revs->diffopt.filter_not)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + "--diff-filter");
> + if (revs->diffopt.flags.follow_renames)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + "--follow");
> + if (revs->line_level_traverse)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + "-L");
> + if (revs->explicit_diff_merges)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + "--diff-merges");
> + if (revs->filter.choice)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + "--filter");
> +}
> +
My only nit-pick suggestion is to make the translated string a
macro so it can be more obvious that it is repeated exactly.
Thanks,
-Stolee
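The macro nit above could look roughly like the following standalone sketch. Everything here is an assumption made for illustration: `BACKFILL_INCOMPATIBLE_FMT`, `struct toy_revs`, and both helper functions are hypothetical, the struct is a simplified stand-in for the `rev_info` fields the patch inspects, and translation marking is left out so the example compiles on its own.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical macro name; the message text is the one repeated in the patch. */
#define BACKFILL_INCOMPATIBLE_FMT "'%s' cannot be used with 'git backfill'"

/* Simplified stand-in for the rev_info fields checked by the patch. */
struct toy_revs {
	int pickaxe;
	int pickaxe_regex;
	int diff_filter;
	int follow_renames;
	int line_level_traverse;
	int explicit_diff_merges;
	int object_filter;
};

/* Return the first unsupported option that is set, or NULL if none is. */
static const char *first_unsupported_option(const struct toy_revs *revs)
{
	if (revs->pickaxe)
		return revs->pickaxe_regex ? "-G" : "-S";
	if (revs->diff_filter)
		return "--diff-filter";
	if (revs->follow_renames)
		return "--follow";
	if (revs->line_level_traverse)
		return "-L";
	if (revs->explicit_diff_merges)
		return "--diff-merges";
	if (revs->object_filter)
		return "--filter";
	return NULL;
}

/* Format the shared message into buf; returns 0 when nothing is rejected. */
static int format_rejection(const struct toy_revs *revs, char *buf, size_t len)
{
	const char *opt = first_unsupported_option(revs);

	if (!opt)
		return 0;
	snprintf(buf, len, BACKFILL_INCOMPATIBLE_FMT, opt);
	return 1;
}
```

Keeping the format string in one place makes it obvious that every rejection uses the same wording, which is the point of the suggestion. The real function calls `die()` instead of returning, so the order of the checks decides which option gets reported first.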
|
||
| } | ||
|
|
||
| +static void reject_unsupported_rev_list_options(struct rev_info *revs) | ||
| +{ | ||
| +if (revs->diffopt.pickaxe) | ||
| +die(_("'%s' cannot be used with 'git backfill'"), | ||
| +(revs->diffopt.pickaxe_opts & DIFF_PICKAXE_REGEX) ? "-G" : "-S"); | ||
| +if (revs->diffopt.filter || revs->diffopt.filter_not) | ||
| +die(_("'%s' cannot be used with 'git backfill'"), | ||
| +"--diff-filter"); | ||
| +if (revs->diffopt.flags.follow_renames) | ||
| +die(_("'%s' cannot be used with 'git backfill'"), | ||
| +"--follow"); | ||
| +if (revs->line_level_traverse) | ||
| +die(_("'%s' cannot be used with 'git backfill'"), | ||
| +"-L"); | ||
| +if (revs->explicit_diff_merges) | ||
| +die(_("'%s' cannot be used with 'git backfill'"), | ||
| +"--diff-merges"); | ||
| +if (revs->filter.choice) | ||
| +die(_("'%s' cannot be used with 'git backfill'"), | ||
| +"--filter"); | ||
| +} | ||
|
|
||
| static int do_backfill(struct backfill_context *ctx) | ||
| { | ||
| struct path_walk_info info = PATH_WALK_INFO_INIT; | ||
|
|
@@ -94,6 +117,8 @@ static int do_backfill(struct backfill_context *ctx) | |
| /* Walk from HEAD if otherwise unspecified. */ | ||
| if (!ctx->revs.pending.nr) | ||
| add_head_to_pending(&ctx->revs); | ||
| +if (ctx->include_edges) | ||
| +ctx->revs.edge_hint = 1; | ||
|
|
||
| info.blobs = 1; | ||
| info.tags = info.commits = info.trees = 0; | ||
|
|
@@ -121,12 +146,15 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit | |
| .min_batch_size = 50000, | ||
| .sparse = -1, | ||
| .revs = REV_INFO_INIT, | ||
| +.include_edges = 1, | ||
| }; | ||
| struct option options[] = { | ||
| OPT_UNSIGNED(0, "min-batch-size", &ctx.min_batch_size, | ||
| N_("Minimum number of objects to request at a time")), | ||
| OPT_BOOL(0, "sparse", &ctx.sparse, | ||
| N_("Restrict the missing objects to the current sparse-checkout")), | ||
| +OPT_BOOL(0, "include-edges", &ctx.include_edges, | ||
| +N_("Include blobs from boundary commits in the backfill")), | ||
| OPT_END(), | ||
| }; | ||
| struct repo_config_values *cfg = repo_config_values(the_repository); | ||
|
|
@@ -144,6 +172,7 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit | |
|
|
||
| if (argc > 1) | ||
| die(_("unrecognized argument: %s"), argv[1]); | ||
| +reject_unsupported_rev_list_options(&ctx.revs); | ||
|
|
||
| repo_config(repo, git_default_config, NULL); | ||
|
|
||
|
|
||
Derrick Stolee wrote on the Git mailing list: