Ryan Lundy

Reputation: 210190

Why does git rebase with no arguments work the way that it does?

Every so often I'll perform a git rebase with no arguments and discover that instead of rebasing against my configured upstream, Git has used the --fork-point option and rebased against...something else, causing my commits to disappear.

This is in line with what the documentation for git rebase says:

If <upstream> is not specified, the upstream configured in branch.<name>.remote and branch.<name>.merge options will be used (see git-config[1] for details) and the --fork-point option is assumed.

My question is: Why does it work this way? Personally, I feel that if I run git rebase with no options, I want to rebase against the configured upstream branch. If I wanted to rebase against something else, I'd say so.

Apparently the developers of Git think otherwise, so I'm wondering whether someone can explain the thinking in a way that'll help me remember this distinction.

Upvotes: 2

Views: 1788

Answers (3)

adam.hendry

Reputation: 5653

In general, the signature of git rebase is

git rebase --onto <newbase> <upstream> <branch>

where the commits between the merge base of <branch> and <upstream> (i.e. the value returned by the plumbing command git merge-base) and <branch> are replayed onto <newbase>. (The merge base is so called because it is the would-be <base> commit of a 3-way merge between <branch> and <upstream>; "3" in "3-way" because such merges involve the two tip commits, <branch> and <upstream>, plus the base <base>.) In this context, we can think of the merge base as the "fork point".

Typically, the user checks out the branch they want to rebase and specifies the upstream in the command with

git rebase <upstream>

which gets interpreted as

git rebase --onto <upstream> <upstream> HEAD

If the branch to rebase has a tracking branch, rebase can be called by itself without options or arguments and it will be replayed onto the tracking branch (e.g. origin/branch) as

git rebase --onto <origin/branch> <origin/branch> HEAD

In this case though, rebase replays the commits between HEAD and git merge-base --fork-point of HEAD and origin/branch, which is slightly different from the plain git merge-base of HEAD and origin/branch. AFAIK, this helps when the branch was started from an earlier tip of the remote branch and the remote branch was later rewound or rebased onto a different base.
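To see the difference between the two merge bases, here is a minimal sketch that builds a throwaway pair of repositories and rewrites the upstream after the client has branched from it. The repository names (server, client), the file names and the "Example" identity are made up for illustration; the only Git behavior relied on is git merge-base with and without --fork-point.

```shell
#!/bin/sh
# Sketch: compare the plain merge base with the fork point after the
# upstream branch has been rewritten. All names here are hypothetical.
set -e
cd "$(mktemp -d)"

# "Remote" repo with two commits on master: a, then b.
git init -q server
cd server
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo a > a; git add a; git commit -q -m "a"
echo b > b; git add b; git commit -q -m "b"
cd ..

# Client clones it and commits d on top of b.
git clone -q server client
cd client
git config user.name "Example"
git config user.email "example@example.com"
old_tip=$(git rev-parse origin/master)
echo d > d; git add d; git commit -q -m "d"

# The upstream rewrites its tip commit; the client fetches the new tip.
cd ../server
git commit -q --amend -m "b (rewritten)"
cd ../client
git fetch -q origin

# Plain merge base: the last commit shared by both histories (a).
plain=$(git merge-base origin/master HEAD)
# Fork point: the old upstream tip (b), recovered from the reflog
# of origin/master.
fork=$(git merge-base --fork-point origin/master HEAD)

echo "plain merge base: $plain"
echo "fork point:       $fork"
echo "commits after plain merge base: $(git rev-list --count "$plain"..HEAD)"
echo "commits after fork point:       $(git rev-list --count "$fork"..HEAD)"
```

In this setup the plain merge base is the commit for a, so git rebase origin/master considers both the old b and d, while the fork point is the old b, so a bare git rebase replays only d.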

Whenever upstream is specified, --fork-point is not used by default, so my best guess is that the Git developers wanted --fork-point to be default when a remote upstream is specified and they didn't have a way to differentiate between a local upstream and a remote upstream. A solution for them would then be to introduce git rebase by itself, switching the default option from --no-fork-point to --fork-point in that case, and only make it work when a tracking branch is set for the branch.

Still, I think letting the command run successfully without options is dangerous (as it may be run by accident before options and arguments are entered). If you'd like to run git rebase by itself and have it run as it does when you specify an upstream, you have to explicitly supply --no-fork-point (i.e. git rebase --no-fork-point). Again, this only works when the branch has a tracking branch.

Upvotes: 1

ElpieKay

Reputation: 30878

I ran a test with two local repos.

#simulate a remote repo
git init server
cd server
> a
git add .
git commit -m 'a'
git branch new
> b
git add .
git commit -m 'b'
git checkout new
> c
git add .
git commit -m 'c'
git checkout master
cd -

And the graph by git log --oneline --graph --decorate is:

* 912e28a (HEAD -> master) b
| * 8c71449 (new) c
|/
* bd95493 a

The other repo that simulates a local one:

git clone server -- client
cd client
git reset HEAD^ --hard
> d
git add .
git commit -m 'd'

And the graph:

* 7173a5f (HEAD -> master) d
| * 912e28a (origin/master, origin/HEAD) b
|/
| * 8c71449 (origin/new) c
|/
* bd95493 a

origin/master is the upstream branch of master. Now run git rebase and the graph becomes:

* 3bc57c5 (HEAD -> master) d
* 912e28a (origin/master, origin/HEAD) b
| * 8c71449 (origin/new) c
|/
* bd95493 a

This is the same result as running git rebase origin/master or git rebase origin/master master.

Then make another try with the upstream branch changed:

#go back to the graph before the rebase
git reset 7173a5f --hard
#change the upstream
git config branch.master.merge refs/heads/new
git status
#try rebase again, without arguments
git rebase

The graph turns to:

* b3bfd13 (HEAD -> master) d
* 8c71449 (origin/new) c
| * 912e28a (origin/master, origin/HEAD) b
|/
* bd95493 a

as if git rebase origin/new or git rebase origin/new master had been run.

So I think it does behave as you've been expecting, rebasing the current branch against its upstream. I wonder which commits were gone in your case.

Merge commits disappear without -p or --preserve-merges (replaced by -r or --rebase-merges in newer Git versions). If the upstream already has equivalent commits, the matching commits on the current branch are dropped too.

Upvotes: 1

Mark Adelsberger

Reputation: 45679

The phrasing of the question ("rebased against... something else") suggests that you either aren't sure of, or don't buy into, what fork-point is supposed to do. If we start from a clear picture of what it does (and if we assume it works properly and doesn't have destructive side-effects), then the motivation for using it as the default might seem more clear.

(I'll be honest, I'd never bothered to figure out what fork-point does prior to looking at this question. The best explanation seems to be in the git merge-base documentation - https://git-scm.com/docs/git-merge-base.)

Before getting into the weeds, two "big picture" observations:

1) I don't think the git developers would agree that using fork-point means you're "rebasing against something else"; I think they'd probably say that you're still rebasing against the upstream, but are being more selective about which commits to rewrite on the new base.

2) While fork-point may not always be useful, it's not clear to me in what scenario it would cause your commits to disappear[1]. It would be interesting to see any minimal test cases that show problems (since above I mentioned "assume no destructive side-effects" as a condition of thinking this is a reasonable default).

So getting into it...

What is fork-point trying to do?

The short answer is, if a commit was previously part of the upstream and was removed from the upstream via history edit, then --fork-point tells rebase to leave that commit behind on the grounds that (1) it's not really part of the branch, (2) it's been rejected from the upstream already, and (3) we probably aren't trying to undo someone's history rewrite.

It does this by inferring information from the reflogs. Because the reflogs are local and temporary, this is a bit of a kludge; your attempt to rebase could get a different result from someone else's attempt to perform the same rebase in a seemingly-identical clone of the repo.

And if your reflogs seem to suggest that what is in fact your commit was once part of the upstream, that could cause a problem.

So why use it as a default?

I guess the bottom line is, the developers assume that if a commit was previously in the upstream and was removed from the upstream, that removal was definitive. Whether that is "usually true" may depend on the developer.

One strange thing: the example used in the merge-base docs (cited above) seems like an odd one. It shows the upstream as origin/master, which implies that at some point the remote's master was rewritten. That is an upstream rebase situation, which is generally discouraged.

For unrelated reasons, I'm in the habit of always specifying my upstream when I rebase. This means, depending on how you look at it, that I miss out on the benefit of and/or am never subjected to the risk of a default fork-point option.


[1] I did come up with one potential case, but I'm not sure if it's the case you're running into. Suppose you're on master and start developing a feature, and only after committing do you realize that you forgot to create the feature branch. You create the branch "in place" and then rewind master to un-mingle your changes. Rebasing the feature onto master with fork-point could then do the wrong thing, because the master reflog shows your commits as having once been part of master.

One solution that might help with this case would be, after rewinding master, you could do a force rebase to regenerate your branch from fresh commits that are not in the master reflog.
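The footnote scenario can be reproduced with a small sketch like the one below. It assumes the feature branch is set up to track the local master branch, so that a bare git rebase picks master as the upstream; the branch name feature, the file names and the "Example" identity are hypothetical.

```shell
#!/bin/sh
# Sketch of the footnote case: work committed on master, branch created
# "in place", master rewound, then a bare `git rebase` on the branch.
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "base"
base=$(git rev-parse HEAD)

# Feature work committed on master by mistake.
echo work > work.txt; git add work.txt; git commit -q -m "feature work"
work=$(git rev-parse HEAD)

# Create the branch in place (tracking master), then rewind master.
git branch -q --track feature master
git reset -q --hard HEAD^
git checkout -q feature

# The master reflog still remembers the "feature work" commit, so the
# fork point is that commit rather than the plain merge base.
plain=$(git merge-base master feature)
fork=$(git merge-base --fork-point master feature)

# Bare rebase: upstream is master and --fork-point is assumed.
git rebase -q
after=$(git rev-parse HEAD)

echo "base commit:         $base"
echo "feature work commit: $work"
echo "plain merge base:    $plain"
echo "fork point:          $fork"
echo "feature after rebase: $after"
```

In this sketch the fork point is the "feature work" commit itself, so the bare rebase finds nothing to replay and leaves feature pointing at the base commit, which is the disappearing-commit symptom. Running git rebase --no-fork-point in the last step instead should leave feature untouched, since it is already based on master.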

Upvotes: 1
