Open
Conversation
Some rev-list options accepted by setup_revisions() are silently
ignored or actively counterproductive when used with 'git backfill',
because the path-walk API has its own tree-walking logic that bypasses
the mechanisms these options rely on:
* -S/-G (pickaxe) and --diff-filter work by computing per-commit
diffs in get_revision_1() and filtering commits whose diffs don't
match. Since backfill's goal is to download all blobs reachable
from commits in the range, filtering out commits based on diff
content would silently skip blobs -- the opposite of what users
want.
* --follow disables path pruning (revs->prune) and only makes
sense for tracking a single file through renames in log output.
It has no useful interaction with backfill.
* -L (line-log) computes line-level diffs to track the evolution
of a function or line range. Like pickaxe, it filters commits
based on diff content, which would cause blobs to be silently
skipped.
* --diff-merges controls how merge commit diffs are displayed.
The path-walk API walks trees directly and never computes
per-commit diffs, so this option would be silently ignored.
* --filter (object filtering, e.g. --filter=blob:none) is used by
the list-objects traversal but is completely ignored by the
path-walk API, so it would silently do nothing.
Rather than letting users think these options are being honored,
reject them with a clear error message.
Signed-off-by: Elijah Newren <newren@gmail.com>
302aff0 (backfill: accept revision arguments, 2026-03-26) added support for passing revision arguments to 'git backfill' but documented them only with a prose sentence: You may also specify the commit limiting options from git-rev-list(1). No other command that accepts revision arguments documents them this way. Commands like log, shortlog, and replay define a formal <revision-range> entry and include rev-list-options.adoc. Commands like bundle, fast-export, and filter-branch, which pass arguments through to the revision machinery without including the full options file, still define a formal <git-rev-list-args> entry explaining what is accepted. Add a formal <revision-range> entry in the synopsis and OPTIONS section, following the convention used by other commands, and mention that commit-limiting options from git-rev-list(1) are also accepted. Signed-off-by: Elijah Newren <newren@gmail.com>
Commit 302aff0 (backfill: accept revision arguments, 2026-03-26) added support for accepting revision arguments to backfill. This allows users to do things like git backfill --remotes ^v2.3.0 and then run many commands without triggering on-demand downloads of blobs. However, if they have topics based on v2.3.0, they will likely still trigger on-demand downloads. Consider, for example, the command git log -p v2.3.0..topic This would still trigger on-demand blob loadings after the backfill command above, because the commit(s) with A as a parent will need to diff against the blobs in A. In fact, multiple commands need blobs from the lower boundary of the revision range: * git log -p A..B # After backfill A..B * git replay --onto TARGET A..B # After backfill TARGET^! A..B * git checkout A && git merge B # After backfill A...B Add an extra --[no-]include-edges flag to allow grabbing blobs from edge commits. Since the point of backfill is to prevent on-demand blob loading and these are common commands, default to --include-edges. Signed-off-by: Elijah Newren <newren@gmail.com>
Author
|
/submit |
|
Submitted as pull.2088.git.1776297482.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
| @@ -78,6 +78,28 @@ static int fill_missing_blobs(const char *path UNUSED, | |||
| return 0; | |||
There was a problem hiding this comment.
Derrick Stolee wrote on the Git mailing list (how to reply to this email):
On 4/15/2026 7:58 PM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
>
> Some rev-list options accepted by setup_revisions() are silently
> ignored or actively counterproductive when used with 'git backfill',
> because the path-walk API has its own tree-walking logic that bypasses
> the mechanisms these options rely on:
>
> * -S/-G (pickaxe) and --diff-filter work by computing per-commit
> diffs in get_revision_1() and filtering commits whose diffs don't
> match. Since backfill's goal is to download all blobs reachable
> from commits in the range, filtering out commits based on diff
> content would silently skip blobs -- the opposite of what users
> want.
>
> * --follow disables path pruning (revs->prune) and only makes
> sense for tracking a single file through renames in log output.
> It has no useful interaction with backfill.
>
> * -L (line-log) computes line-level diffs to track the evolution
> of a function or line range. Like pickaxe, it filters commits
> based on diff content, which would cause blobs to be silently
> skipped.
I think these make a lot of sense, especially because these
computations require downloading missing blobs in order to find
the diffs that justify some of the choices of commit filtering.
> * --diff-merges controls how merge commit diffs are displayed.
> The path-walk API walks trees directly and never computes
> per-commit diffs, so this option would be silently ignored.
I think there are a few other "format" based options that were
silently ignored on purpose, because there's no output. Perhaps
we should change the use of options like this to a warning instead
of a failure?
> * --filter (object filtering, e.g. --filter=blob:none) is used by
> the list-objects traversal but is completely ignored by the
> path-walk API, so it would silently do nothing.
This is correct to remove because while it doesn't work with
path-walk right now, it might in the future. We don't want the
filter to mess with the functionality of 'git backfill' that sets
its own scope for which blobs to download.
> Rather than letting users think these options are being honored,
> reject them with a clear error message.
I agree that the majority of these should be hard failures. As
mentioned, some could be soft warnings. That could be an
adjustment to make in the future, so is not blocking for this
patch.
> +static void reject_unsupported_rev_list_options(struct rev_info *revs)
> +{
> + if (revs->diffopt.pickaxe)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + (revs->diffopt.pickaxe_opts & DIFF_PICKAXE_REGEX) ? "-G" : "-S");
> + if (revs->diffopt.filter || revs->diffopt.filter_not)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + "--diff-filter");
> + if (revs->diffopt.flags.follow_renames)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + "--follow");
> + if (revs->line_level_traverse)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + "-L");
> + if (revs->explicit_diff_merges)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + "--diff-merges");
> + if (revs->filter.choice)
> + die(_("'%s' cannot be used with 'git backfill'"),
> + "--filter");
> +}
> +
My only nit-pick suggestion is to make the translated string a
macro so it can be more obvious that it is repeated exactly.
Thanks,
-Stolee
| @@ -9,7 +9,7 @@ git-backfill - Download missing objects in a partial clone | |||
| SYNOPSIS | |||
There was a problem hiding this comment.
Derrick Stolee wrote on the Git mailing list (how to reply to this email):
On 4/15/2026 7:58 PM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
>
> 302aff09223f (backfill: accept revision arguments, 2026-03-26) added
> support for passing revision arguments to 'git backfill' but documented
> them only with a prose sentence:
>
> You may also specify the commit limiting options from
> git-rev-list(1).
>
> No other command that accepts revision arguments documents them this
> way. Commands like log, shortlog, and replay define a formal
> <revision-range> entry and include rev-list-options.adoc. Commands like
> bundle, fast-export, and filter-branch, which pass arguments through to
> the revision machinery without including the full options file, still
> define a formal <git-rev-list-args> entry explaining what is accepted.
>
> Add a formal <revision-range> entry in the synopsis and OPTIONS section,
> following the convention used by other commands, and mention that
> commit-limiting options from git-rev-list(1) are also accepted.
Thanks for your attention to detail here. I like this version.
-Stolee| @@ -9,7 +9,7 @@ git-backfill - Download missing objects in a partial clone | |||
| SYNOPSIS | |||
There was a problem hiding this comment.
Derrick Stolee wrote on the Git mailing list (how to reply to this email):
On 4/15/2026 7:58 PM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> Add an extra --[no-]include-edges flag to allow grabbing blobs from
> edge commits. Since the point of backfill is to prevent on-demand blob
> loading and these are common commands, default to --include-edges.
I like this option and your motivation for including it.
> @@ -116,6 +117,8 @@ static int do_backfill(struct backfill_context *ctx)
> /* Walk from HEAD if otherwise unspecified. */
> if (!ctx->revs.pending.nr)
> add_head_to_pending(&ctx->revs);
> + if (ctx->include_edges)
> + ctx->revs.edge_hint = 1;
This would still work if...
> .revs = REV_INFO_INIT,
> + .include_edges = 1,
...this was initialized to -1 to allow for "no user option".
We don't need this change unless we were deciding to make a
config option that specified a different default. That seems
like overkill right now, so this doesn't need a change. Just
something that I like to think about.
I also like how your tests don't just verify the backfill
behavior but the ultimate behavior of 'git log' and friends
after the fact.
Thanks,
-Stolee
|
Derrick Stolee wrote on the Git mailing list (how to reply to this email): On 4/15/2026 7:57 PM, Elijah Newren via GitGitGadget wrote:
> This topic fixes a few minor issues in git backfill (from ds/backfill-revs
> this cycle), although some might see the third patch as more feature than
> fix, and the first two patches are pretty minor and probably do not merit
> consideration before the release this late in the cycle.
>
> Overview:
>
> * Patch 1: As a wise man once said, "Sending arbitrary command-line
> arguments to setup_revisions() creates an opportunity for behavior you
> are not expecting. For instance, can users...supply --first-parent? What
> happens if they add an --author filter?" ;-) I think these particular
> cases might work, but other rev-list options don't make sense, so let's
> error on ones that don't.
I know that --first-parent was one of the options I _did_ want to
include as a potential option (it helps focus the set to a "core" of
commits and we can get more on-demand off the core if needed). Yes,
--author is a little silly, but it didn't seem necessary to block it.
I agree with the reasons you gave to block _most_ of the options you
blocked. The output-formatting options don't need to be a hard failure,
but that could be a later improvement. For now, I think your change is
entirely positive so doesn't need change.
> * Patch 2: Making documentation more consistent with other commands
> * Patch 3: Tweak the ranges so we actually prevent on-demand blob
> downloading better with a new --[no-]include-edges flag.
I gave notes on every patch, but no meaningful changes are required.
Thanks for helping to polish this feature!
-Stolee
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This topic fixes a few minor issues in
git backfill(from ds/backfill-revs this cycle), although some might see the third patch as more feature than fix, and the first two patches are pretty minor and probably do not merit consideration before the release this late in the cycle.Overview:
cc: Derrick Stolee stolee@gmail.com