Nimesh Doshi

Reputation: 11

Inconsistent results for similar git rebase cases

I have observed that when changes are made to the same file in different branches, and we then rebase and merge these branches one after the other into the mainline, it sometimes passes and sometimes fails. The results are not consistent. For example, Case(1): I have a file in mainline, test.txt:

a
b
c
d

I have created two branches, b1 and b2, from mainline. In branch b1, I edited and committed these changes to test.txt:

a
b1
c

And in branch b2, test.txt:

a
b
c1
d

Then, I did 'git rebase mainline' on b1 and 'git merge b1' on mainline. So far good. Now, when I do 'git rebase mainline' on branch b2, it FAILS and asks me to resolve the conflict first.

Case(2): I have a file test2.txt in mainline:

MEMBER xe-1/1/1 REMPORT xe-1/1/1
MEMBER xe-2/2/2 REMPORT xe-2/2/2
MEMBER xe-3/3/3 REMPORT xe-3/3/3
MEMBER xe-4/4/4 REMPORT xe-4/4/4

MEMBER xe-11/11/11 REMPORT xe-11/11/11
MEMBER xe-21/21/21 REMPORT xe-21/21/21
MEMBER xe-31/31/31 REMPORT xe-31/31/31
MEMBER xe-41/41/41 REMPORT xe-41/41/41

I have created branches c1 and c2 from mainline. In c1, I edited and committed test2.txt as:

MEMBER xe-1/1/1 REMPORT xe-1/1/1:1
MEMBER xe-2/2/2 REMPORT xe-2/2/2:1

MEMBER xe-11/11/11 REMPORT xe-11/11/11
MEMBER xe-21/21/21 REMPORT xe-21/21/21
MEMBER xe-31/31/31 REMPORT xe-31/31/31
MEMBER xe-41/41/41 REMPORT xe-41/41/41

And in branch c2, I edited and committed test2.txt as:

MEMBER xe-1/1/1 REMPORT xe-1/1/1
MEMBER xe-2/2/2 REMPORT xe-2/2/2
MEMBER xe-3/3/3 REMPORT xe-3/3/3
MEMBER xe-4/4/4 REMPORT xe-4/4/4

MEMBER xe-11/11/11 REMPORT xe-11/11/11:1
MEMBER xe-21/21/21 REMPORT xe-21/21/21:1
MEMBER xe-31/31/31 REMPORT xe-31/31/31:1
MEMBER xe-41/41/41 REMPORT xe-41/41/41:1

Then, I did 'git rebase mainline' on c1 and 'git merge c1' on mainline. So far good. Now, when I do 'git rebase mainline' on c2, it PASSES, and the final content on mainline after 'git merge c2' looks like this:

MEMBER xe-1/1/1 REMPORT xe-1/1/1:1
MEMBER xe-2/2/2 REMPORT xe-2/2/2:1

MEMBER xe-11/11/11 REMPORT xe-11/11/11:1
MEMBER xe-21/21/21 REMPORT xe-21/21/21:1
MEMBER xe-31/31/31 REMPORT xe-31/31/31:1
MEMBER xe-41/41/41 REMPORT xe-41/41/41:1

which is what I expected, but I am trying to understand why Case(1) failed while Case(2) passed. What algorithm does 'git rebase' follow here?

Upvotes: 0

Views: 96

Answers (1)

torek

Reputation: 488183

To properly understand what is going on here, you need to know several things:

  1. Rebase is essentially repeated cherry-pick operations.
  2. Cherry-picking a (single) commit is really a three-way merge. More precisely, it is the "to merge" form of merge: merging as an action, a verb. (Git's git merge implements many things, including the "to merge" step followed by a commit that produces a merge commit, i.e., merge treated as an adjective. The cherry-pick operation never produces a merge commit, but does make use of the merge-as-a-verb process.)
  3. Merging as a verb, doing a three-way merge, involves comparing a (single) merge base commit to two commits-of-interest.

The merge action makes the most sense when you consider how it was originally designed. Suppose that there are two people modifying something; for simplicity, assume the something in this case is just a single file. Let's call the two people A (Alice) and B (Bob).

They start with a common base version of that file. Alice makes some change(s) to the file, and Bob makes some changes to the file. Eventually, someone (Alice, Bob, or maybe even a third person C, Carol) must combine their changes.

In Git, to combine these changes, we have Git figure out which file they both started with, and then compare that base version to both of their latest versions. The base version is the version of the file that went into the base commit:

             o--o--A   <-- Alice
            /
...--o--o--*
            \
             o--o--B   <-- Bob

Git can simply run two git diff commands:

git diff --find-renames <hash-of-*> <hash-of-A> > /tmp/alice
git diff --find-renames <hash-of-*> <hash-of-B> > /tmp/bob

Git can then extract the contents of commit *, apply Alice's changes, apply Bob's changes, and use the final result as the merged result. Note that this handles all changes to all files (one file at a time).

If Alice and Bob touch the same lines in any one file, there is, of course, a merge conflict. This means we have to define what it means for lines to be "the same", but it tends to be pretty clear. Had you been doing a simple merge, this is what you would see in your "case 1": Alice changed:

a
b
c
d

to:

a
b1
d

so she deleted two lines, b and c, and added one line, b1, after a and before d.

Bob, meanwhile, changed the same original input to:

a
b
c1
d

That is, Bob kept b, deleted c, and added c1 between the kept b and the kept d. These changes overlap, so they obviously conflict.
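
The overlap can be checked mechanically. The following is a minimal Python sketch, not Git's actual algorithm: it uses the standard-library difflib module to find which lines of the base each side changed, and it assumes that two changed regions conflict when they overlap or touch. The function names changed_regions and regions_conflict are made up for this illustration.

```python
from difflib import SequenceMatcher


def changed_regions(base, other):
    """List (start, end, replacement) for each span of `base` that `other` changes.

    start and end are half-open line indexes into `base`.
    """
    matcher = SequenceMatcher(a=base, b=other, autojunk=False)
    return [
        (i1, i2, other[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def regions_conflict(first, second):
    """Assumption for this sketch: regions that overlap or touch are a conflict."""
    start1, end1, _ = first
    start2, end2, _ = second
    return start1 <= end2 and start2 <= end1


base = ["a", "b", "c", "d"]
alice = ["a", "b1", "d"]
bob = ["a", "b", "c1", "d"]

alice_regions = changed_regions(base, alice)  # Alice replaces base lines 1..2 (b, c)
bob_regions = changed_regions(base, bob)      # Bob replaces base line 2 (c)

conflict = any(
    regions_conflict(a, b) for a in alice_regions for b in bob_regions
)
print(alice_regions, bob_regions, conflict)
```

Both sides rewrite base line c, so the sketch reports a conflict, which mirrors what the two diffs show.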

The above is all fine and good for a git merge that does the merge-as-a-verb followed by the merge-as-an-adjective commit (or in this case, stops with a merge conflict and makes you finish the job). But what about cherry-picking?

To understand rebase and its cherry-picking, start, once again, by drawing the commit graph. We have some series of commits, to which the name mainline points:

...--A--B--C   <-- mainline (HEAD)

You then create a new branch b1 pointing to commit C:

...--A--B--C   <-- mainline, b1 (HEAD)

then change a file and commit the new snapshot:

...--A--B--C   <-- mainline
            \
             D   <-- b1 (HEAD)

You also create a branch b2 pointing to commit C, and change a file and commit the new snapshot:

             E   <-- b2 (HEAD)
            /
...--A--B--C   <-- mainline
            \
             D   <-- b1

(The name HEAD is always attached to the current branch, in these drawings.)

"Then, I did 'git rebase mainline' on b1 and 'git merge b1' on mainline"

That is, you ran git checkout b1 && git rebase mainline. This does nothing at all, because commit D on b1 is already in the right place. Then you did git checkout mainline && git merge b1. This does a fast-forward, which is not actually a merge at all: it really just amounts to checking out commit D while moving the name mainline. The result is this graph:
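
The fast-forward condition can be illustrated with a toy model of the graph. This is a hedged Python sketch, not how Git stores commits: the parents and branches dictionaries and the helper names are assumptions for illustration, and commit A is treated as having no parents even though the real history continues before it. It assumes a fast-forward is possible exactly when the current commit is an ancestor of the target.

```python
def ancestors(parents, commit):
    """Return `commit` plus every commit reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def can_fast_forward(parents, current, target):
    """True when `current` is already part of `target`'s history."""
    return current in ancestors(parents, target)


# Toy version of the graph drawn above.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["C"]}
branches = {"mainline": "C", "b1": "D", "b2": "E"}

# git checkout mainline && git merge b1: C is an ancestor of D,
# so the name mainline simply moves to D and no new commit is made.
if can_fast_forward(parents, branches["mainline"], branches["b1"]):
    branches["mainline"] = branches["b1"]

print(branches)
```

After the move, mainline points to D, and D is not an ancestor of E, so b2 cannot be fast-forwarded onto in the same way.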

             E   <-- b2
            /
...--A--B--C
            \
             D   <-- b1, mainline (HEAD)

"So far good. Now, when I do 'git rebase mainline' on branch b2, it FAILS and asks me to resolve the conflict first."

If you git checkout b2 && git rebase mainline, Git finds that commit E does not come after commit D. It comes, instead, after commit C. So Git must now copy commit E to a new commit. This copying is a git cherry-pick.

Cherry-pick is often described as "diff the commit against its parent to get a patch, then apply that patch to the current commit". For simple cases, this is close enough, and in fact git cherry-pick at one time did just that. For more complex cases, however, what git cherry-pick does is run the "to merge" action, but with the merge base set to the parent of the commit being picked.

In this case, the parent of E is C, so this is the cherry-pick "merge base". Git now runs two git diffs: one compares the merge base C to D, and the other compares the merge base C to E. It's now quite clear that this is the same situation we had when we were merging Alice's and Bob's changes. The changes overlap, so Git raises a merge conflict.

The process for resolving the merge conflict is the same as for git merge: you have all three versions of each conflicted file available in Git's index, plus Git's best attempt to do the merge on its own, stored in the work-tree. This work-tree version has merge conflict markers. You must resolve the conflict—any way you like—and update the index so that it has the final version of the file. For instance, you can just edit the work-tree file into shape, then use git add to copy the work-tree version into the index.

You then run git rebase --continue to finish the cherry-pick that the rebase is doing, and let the rebase continue cherry-picking more commits if there are any. In this case, there are none, so rebase finishes its work by moving the branch name to point to the copied commit:

             E   [abandoned]
            /
...--A--B--C
            \
             D   <-- b1, mainline
              \
               E'  <-- b2 (HEAD)

You now have all the tools to understand your second rebase. Look at which commits get copied. Note which commit acts as the merge base. Run the two git diff commands, to compare the merge base commit to the two other commits. Do the changes in the two diff outputs conflict? If not, what result do you get when applying both changes to the merge-base version of the file? Is that the same as what Git got?
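
If you want to check your answer, the following Python sketch works through the second case. It is a simplified line-based three-way merge built on the standard-library difflib module, not Git's real merge code. It assumes that changed regions conflict when they overlap or touch, it treats even identical changes on both sides as a conflict, and the names member, changed_regions, and three_way_merge are invented for this illustration.

```python
from difflib import SequenceMatcher


def changed_regions(base, other):
    """List (start, end, replacement) for each span of `base` that `other` changes."""
    ops = SequenceMatcher(a=base, b=other, autojunk=False).get_opcodes()
    return [(i1, i2, other[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def three_way_merge(base, ours, theirs):
    """Apply both sides' changes to `base`; raise ValueError if regions collide."""
    ours_regions = changed_regions(base, ours)
    theirs_regions = changed_regions(base, theirs)
    for start1, end1, _ in ours_regions:
        for start2, end2, _ in theirs_regions:
            if start1 <= end2 and start2 <= end1:
                raise ValueError(
                    f"conflict: base lines {start1}:{end1} and {start2}:{end2}"
                )
    merged = list(base)
    # Apply from the bottom up so that earlier line indexes stay valid.
    all_regions = sorted(
        ours_regions + theirs_regions, key=lambda region: region[0], reverse=True
    )
    for start, end, replacement in all_regions:
        merged[start:end] = replacement
    return merged


def member(port, suffix=""):
    return f"MEMBER {port} REMPORT {port}{suffix}"


first = ["xe-1/1/1", "xe-2/2/2", "xe-3/3/3", "xe-4/4/4"]
second = ["xe-11/11/11", "xe-21/21/21", "xe-31/31/31", "xe-41/41/41"]

# Merge base: the original mainline version, the parent of the c2 commit.
base = [member(p) for p in first] + [""] + [member(p) for p in second]
# "Ours": mainline after merging c1. "Theirs": the commit on c2.
c1 = [member(p, ":1") for p in first[:2]] + [""] + [member(p) for p in second]
c2 = [member(p) for p in first] + [""] + [member(p, ":1") for p in second]

merged = three_way_merge(base, c1, c2)
print("\n".join(merged))
```

In this model, c1 changes only the first four lines and c2 changes only the last four, with the unchanged blank line between them, so the regions do not touch and both changes apply cleanly. Feeding the first case's files to the same function raises the conflict instead.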

Upvotes: 1
