Reputation: 10534
I know that I can use git merge-base to determine the common ancestor when performing a git merge, but it looks like this is not true for git rebase.
Here is my setup before the rebase:
master branch: A--Y--(C)
dev branch: A-----C--D
(C) is the result of my rebasing A--C onto A--Y: same content, but a different commit message.
git merge-base master dev will return A, and if I do git merge dev, I will see both (C) and C in my history.
If I run git rebase master, the outcome is:
A--Y--(C)--(D), where (D) is D after the rebase.
Does git rebase consider (C) as the common ancestor? (This feels pretty hard to do in code.) I am guessing it still uses A, but when it cherry-picks C and D to append them to the end of master, C ends up as a no-op?
Upvotes: 4
Views: 2434
Reputation: 490178
Let's start with this: what git rebase does is to copy some commits. In your case, it appears to copy commit C but not commit D. (I think you are asking why, but are assuming that it has something to do with the merge base, and that is probably not correct.)
The set of commits that git rebase chooses to copy is largely, but not entirely, determined by the result of:
    git rev-list <upstream>..HEAD
where <upstream> is the argument you pass to git rebase, or the configured upstream. For instance, with git rebase master, the <upstream> in this sequence is master. Your current branch is dev, so HEAD refers to dev. So the set of commits to copy is that listed by:
    git rev-list master..dev
(although there are additional options added; see below).
If I read your graph right, the input graph is:
    ...--A--Y   <-- master
          \
           C--D   <-- dev
so that the output of this git rev-list is commits C and D.
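To see this selection for yourself, here is a minimal sketch that builds a throwaway repository with the graph as drawn above and lists the candidate commits. The commit() helper, the file names, and the temporary directory are all made up for illustration; they are not part of Git or of the question.

```shell
set -e
# Throwaway repository; nothing here touches an existing checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the name, whatever init.defaultBranch says
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: commit <file> <subject> writes one line and commits it.
commit() {
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$2"
}

commit a.txt A            # shared starting point
git checkout -q -b dev
commit c.txt C
commit d.txt D
git checkout -q master
commit y.txt Y
git checkout -q dev       # HEAD now refers to dev

# Commits reachable from dev but not from master: the candidates for copying.
count=$(git rev-list --count master..dev)
to_copy=$(git log --reverse --format=%s master..dev | paste -sd ' ' -)
echo "$count commits to copy: $to_copy"
```

git log is used only to print subjects instead of hashes; it accepts the same master..dev range as git rev-list.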
Rebase then goes on to:
1. git checkout the tip of the target branch (commit Y here);
2. git cherry-pick on commits C and D; and, finally
3. git checkout -B dev HEAD (not quite literally but the same effect: get back on dev, after moving dev to point to the final copied commit).
What I think you are asking about directly (which I think is not what you should be asking) is which commit is used as a merge base during cherry-picking. The answer here is that the merge base of a cherry-pick is the parent commit of that cherry-pick. So for the git cherry-pick C step, the merge base, if a merge is required, is commit A: C's parent. For the git cherry-pick D step, the merge base, if a merge is required, is commit C: D's parent.
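The three steps can be imitated by hand. This is only a rough sketch of the effect, assuming a scratch repository with the graph as I drew it; the real git rebase also handles reflogs, conflicts, and the commit filtering described below. The commit() helper and file names are invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: commit <file> <subject> writes one line and commits it.
commit() {
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$2"
}

commit a.txt A
git checkout -q -b dev
commit c.txt C
commit d.txt D
git checkout -q master
commit y.txt Y

# Record the commits to copy, oldest first, before any branch moves.
todo=$(git rev-list --reverse master..dev)
first=$(echo "$todo" | head -n 1)
# The merge base used when picking a commit is simply that commit's parent.
base_of_first=$(git log -1 --format=%s "$first^")

git checkout -q --detach master         # step 1: go to the tip of the target
for c in $todo; do
  git cherry-pick "$c" >/dev/null       # step 2: copy each commit in order
done
git checkout -q -B dev HEAD             # step 3: move dev here and get back on it

history=$(git log --reverse --format=%s dev | paste -sd ' ' -)
echo "merge base when picking C: $base_of_first"
echo "dev is now: $history"
```

The copies keep the original subjects but get new hashes, since their parents changed.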
Now, I mentioned "additional options" above. These refer to the commit selection process. In particular, git rebase has two ways to toss commits off of its "to copy" list. (It also generates the list in the reverse of the usual backwards order, so that it does the cherry-picking in the forwards order instead.)
The main way that it eliminates commits from the to-copy list is to use git patch-id. The git patch-id command computes an ID number, which one hopes is "unique enough", based on the patch found by comparing a commit to its immediate parent, throwing away line numbers and some other items that might affect a cherry-picked commit. The result should be the same ID if the commit was already cherry-picked, but different if not.
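A small sketch of what "same patch ID" means in practice: two commits with different messages and different parents, but the same change, should produce the same ID. The repository, the commit() and patch_id() helpers, and the file names are all assumptions made up for this illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: commit <file> <content> <subject>.
commit() {
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit a.txt a A
git checkout -q -b dev
commit c.txt "same change" "C"             # the change as made on dev
git checkout -q master
commit y.txt y Y
commit c.txt "same change" "C, reworded"   # same change, new message and parent

# Hypothetical helper: git patch-id prints "<patch-id> <commit-id>"; keep the first field.
patch_id() {
  git show "$1" | git patch-id | cut -d ' ' -f 1
}

id_dev_c=$(patch_id dev)
id_master_c=$(patch_id master)
id_y=$(patch_id master~1)

echo "C on dev:    $id_dev_c"
echo "C on master: $id_master_c"
echo "Y:           $id_y"
```

The first two IDs should match while the third differs, which is what lets rebase treat the master-side commit as an already-picked copy.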
Hence, after listing C and D, Git goes on to compare their patch-IDs to the patch-ID of commit Y.
There are two commands that do this sort of thing, one meant for users, and the other being git rev-list. The user-facing command is git cherry, but I think here, the computer-oriented git rev-list is actually much easier to explain.
The interesting thing here is how Git knows to compare all the patch IDs of commits C and D to the patch ID of commit Y, when the initial graph is the one we showed above. Or, if the graph were:
    ...--A--E--F--G   <-- master
          \
           C--D   <-- dev
the two sets of patch-IDs to be compared would be those for (C, D) vs those for (E, F, G). If we had:
              H--I
             /    \
    ...--A--G------J--K   <-- master
          \
           \     D
            \   / \
             C     F   <-- dev
              \   /
               E
the sets to compare would be (C, D, E, F) vs (G, H, I, J, K). And this, I think, makes the whole concept much clearer: the way this works is that git rev-list can examine both branches down to the point where they meet, and collect a set of commits for each "side".
The way we get git rev-list to do this is:
    git rev-list --left-right master...dev
(note the three dots here, and the order of dev and master matters only to determine which commits are "left side", on master but not on dev, and which are "right side", on dev but not on master).
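Here is a sketch of the two sides on the simple A--Y / C--D graph, again in a scratch repository with an invented commit() helper. I use git log with the %m placeholder, which prints the left/right mark next to each subject; it takes the same --left-right and three-dot arguments as git rev-list.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: commit <file> <subject> writes one line and commits it.
commit() {
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$2"
}

commit a.txt A
git checkout -q -b dev
commit c.txt C
commit d.txt D
git checkout -q master
commit y.txt Y

# "<" marks commits only on master (left), ">" marks commits only on dev (right).
# Sorting just makes the order stable, since all commits share a timestamp here.
sides=$(git log --left-right --format='%m %s' master...dev | LC_ALL=C sort | paste -sd ' ' -)
echo "$sides"
```

Commit A is on both branches, so it appears on neither side.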
This three-dot notation, along with this --left-right idea, is how Git figures out which commits to put in which set, in order to compute the complete set of patch-IDs. If the patch ID of a left side commit is the same as the patch ID of a right side commit, those commits, though they are different in some way, represent the same change: they are cherries that have already been cherry-picked.
The git rebase command skips over these pre-picked cherries. It only cherry-picks the remaining commits, whichever those are. The git rebase code runs git rev-list --right-only --cherry-pick --no-merges <upstream>...HEAD (three dots, since --right-only and --cherry-pick apply to a symmetric difference) to get its list of commits to copy in step 2. Then it runs those three steps. If the patch ID of either C or D matches the patch ID for commit Y, that commit is never copied at all. In this case, it would seem that commit D's patch ID may match commit Y's.
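As a sketch of the cherry detection in action, the following uses the layout described in the question, where master already holds a reworded copy of C. It is a scratch repository with an invented commit() helper and the made-up subject C2 for the copy; the exact rev-list invocation inside git rebase has varied across Git versions, so treat this as an approximation of the selection rather than the literal internals.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: commit <file> <content> <subject>.
commit() {
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit a.txt a A
git checkout -q -b dev
commit c.txt "same change" C
commit d.txt d D
git checkout -q master
commit y.txt y Y
commit c.txt "same change" C2     # same patch as C, different message
git checkout -q dev

# Right-side (dev-only) commits whose patch ID has no match on the left side.
kept=$(git log --right-only --cherry-pick --no-merges --format=%s master...dev | paste -sd ' ' -)
echo "commits rebase would copy: $kept"

# The real rebase should agree: C is skipped, only D is copied on top of C2.
git rebase -q master 2>/dev/null
history=$(git log --reverse --format=%s dev | paste -sd ' ' -)
echo "dev after rebase: $history"
```

Running git cherry -v master dev before the rebase gives the user-facing view of the same comparison, marking the already-applied commit with "-" and the remaining one with "+".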
I also said that there are two ways that commits can be eliminated. This cherry detection is one of them, and I suspect it's the reason you see the result I think you see. The other is what Git calls --fork-point, which is a bit complicated. I'm not going to cover it here. It only omits "early" commits: that is, it finds a point along the C--D chain (this would make more sense if the chain were longer) that it thinks it should drop, and only copies commits after that point. Since commit C gets copied, it cannot be the fork-point code causing commit D to get omitted.
Upvotes: 7