cp.engr

Reputation: 2479

Git rebase commits not repeatable

When I have 2 branches pointing at the same commit, and then rebase them both onto the same new base commit, why do the rebased branches diverge?

I expected that they'd replay in the same way, and end up pointing at the same new commit.

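# Run in a fresh, empty repository whose default branch is named master.
touch a; touch b; touch c
git add a
git commit -m 'a'
git add b
git commit -m 'b'
# Start branch-01 from the first commit; the untracked file c carries over.
git checkout -b branch-01 HEAD^
git add c
git commit -m 'c'
# branch-02 now points at the same commit as branch-01.
git checkout -b branch-02
# Rebase each branch separately onto master.
git rebase master branch-01
git rebase master branch-02
git log --all --graph --decorate --pretty=oneline --abbrev-commit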

Upvotes: 0

Views: 62

Answers (2)

cp.engr

Reputation: 2479

Why the Branches Diverged

The hash of a Git commit is computed over its metadata as well as its content. That metadata includes not only an Author and an AuthorDate but also a Committer and a CommitterDate. You can see all four by running, e.g.:

git show --pretty=fuller branch-01 branch-02

Each rebase (or cherry-pick) command updates the committer date in the new commit(s) according to the current time. Since the two rebases in the question were performed at different times, their CommitterDates differ, thus their metadata differ, thus their commit hashes differ.
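One way to check this for yourself is to pin the committer date with the GIT_COMMITTER_DATE environment variable while copying the same commit several times. The following is only a sketch: the throwaway repository, the identity, the dates, and the copy_with_date helper are all assumptions made for illustration, and cherry-pick stands in for rebase since both copy commits the same way.

```shell
# Sketch only: throwaway repository, placeholder identity, arbitrary pinned dates.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name  "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m 'base'
git branch -M master                 # make sure the base branch is called master
git checkout -q -b exp1
echo c > c
git add c
git commit -q -m 'c'

# Hypothetical helper: copy exp1's commit onto master under a new branch
# name ($1), with the committer date pinned to the given value ($2).
copy_with_date() {
    git checkout -q -b "$1" master
    GIT_COMMITTER_DATE="$2" git cherry-pick exp1 >/dev/null
}

copy_with_date copy1 '2001-01-01 00:00:00 +0000'
copy_with_date copy2 '2002-02-02 00:00:00 +0000'   # different committer date
copy_with_date copy3 '2001-01-01 00:00:00 +0000'   # same committer date as copy1

hash1=$(git rev-parse copy1)
hash2=$(git rev-parse copy2)
hash3=$(git rev-parse copy3)

# Compare AuthorDate and CommitDate of the three copies side by side.
git show -s --pretty=fuller copy1 copy2 copy3
```

With these inputs, copy1 and copy3 should end up with the same hash, because every field of the two commits matches, while copy2 gets a different one. Pinning the date is meant as a demonstration, not as a recommended workflow.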

How To Move Branches/Tags Together

torek correctly notes that

if you want to move two or more names while copying some chain of commits, you'll be OK using git rebase to move the first name, but you will have to do something else—such as run git branch -f—to move the remaining names, so that they point to the commit copies made during the one rebase.

About Author vs Committer

From the question "Difference between author and committer in Git?":

The author is the person who originally wrote the code. The committer, on the other hand, is assumed to be the person who committed the code on behalf of the original author. This is important in Git because Git allows you to rewrite history, or apply patches on behalf of another person. The FREE online Pro Git book explains it like this:

You may be wondering what the difference is between author and committer. The author is the person who originally wrote the patch, whereas the committer is the person who last applied the patch. So, if you send in a patch to a project and one of the core members applies the patch, both of you get credit — you as the author and the core member as the committer.

Upvotes: 0

torek

Reputation: 487725

To explain what happened, try this as an experiment:

$ git checkout -b exp1 master
<modify some file; git add; all the usual stuff here>
$ git commit -m commit-on-exp1

At this point you have an experimental branch named exp1 with one commit that's not on master:

...--A--B   <-- master
         \
          C1   <-- exp1

Now we'll make an exp2 branch pointing to commit B, and copy commit C1 to a new commit C2 on branch exp2:

$ git checkout -b exp2 master
$ git cherry-pick exp1

The result is:

          C2   <-- exp2
         /
...--A--B   <-- master
         \
          C1   <-- exp1

Now let's repeat with exp3, creating it so that it points to commit B and then copying exp1 again:

$ git checkout -b exp3 master
$ git cherry-pick exp1

Do you expect exp3 to point to commit C2? If so, why? Why did exp2 point to C2 rather than to C1 like exp1?

The issue here is that commits C1 and C2 (and now C3 on exp3) are not bit-for-bit identical. It's true they have the same snapshot, the same author, the same log message, and even the same parent (all three have B as their one parent). But all three have different committer date-and-time-stamps, so they are different commits. (Use git show --pretty=fuller to show both date-and-time-stamps. Cherry-pick, and hence rebase too, copies the original author information including date-and-time, but because it's a new commit, uses the current date-and-time for the committer timestamp.)
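To see exactly which field differs, you can compare the two commits field by field and dump the raw commit objects. This is a rough sketch of the exp1/exp2 experiment; the repository, identity, and file name are placeholders, and the committer dates are pinned to arbitrary values only so that the difference shows up even when both commands run within the same second.

```shell
# Sketch only: throwaway repository, placeholder identity and file name.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name  "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m 'B'
git branch -M master
git checkout -q -b exp1
echo change > file
git add file
GIT_COMMITTER_DATE='2001-01-01 00:00:00 +0000' git commit -q -m 'commit-on-exp1'   # C1

git checkout -q -b exp2 master
GIT_COMMITTER_DATE='2002-02-02 00:00:00 +0000' git cherry-pick exp1 >/dev/null    # C2

# Everything except the committer date: tree, parent, author, author date, subject.
fields='%T %P %an <%ae> %at %s'
same1=$(git log -1 --format="$fields" exp1)
same2=$(git log -1 --format="$fields" exp2)

# Committer timestamps, in seconds since the epoch.
ct1=$(git log -1 --format=%ct exp1)
ct2=$(git log -1 --format=%ct exp2)

git cat-file -p exp1    # raw commit object behind C1
git cat-file -p exp2    # raw commit object behind C2
```

The two cat-file dumps should match line for line except for the committer line, and that one line is enough to give C1 and C2 different hashes.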


When you use git rebase, in general, you have Git copy commits, as if by cherry-pick. At the end of the copying, Git then moves the branch name so that it points to the last copied commit:

...--A--B   <-- mainline
      \
       C--D--E   <-- sidebranch

becomes:

          C'-D'-E'  <-- sidebranch
         /
...--A--B   <-- mainline
      \
       C--D--E

Here C' is the copy of C that's changed to use B as its parent (and perhaps has a different source snapshot than C), D' is the copy of D, and E' is the copy of E. There was only one name pointing to E; that name is now moved, so there is no name pointing to E.
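The single-name case can be reproduced in a scratch repository. This is a sketch under assumptions: the repository, identity, and file contents are made up, and only the branch names come from the diagram.

```shell
# Sketch only: throwaway repository, placeholder identity and file contents.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name  "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m 'A'
git branch -M mainline
git checkout -q -b sidebranch
for name in C D E; do
    echo "$name" > "$name"
    git add "$name"
    git commit -q -m "$name"
done
git checkout -q mainline
git commit -q --allow-empty -m 'B'

old_tip=$(git rev-parse sidebranch)            # the original E
git rebase -q mainline sidebranch
new_tip=$(git rev-parse sidebranch)            # the copy E'

named_by=$(git branch --contains "$old_tip")   # branches that still reach E
obj_type=$(git cat-file -t "$old_tip")         # the old object itself still exists
git reflog sidebranch                          # the reflog still remembers the old tip
```

After the rebase no branch should contain the original E any more, although the commit object is still in the repository and can be found through the reflog.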

But if you have two names pointing to E originally, one of those two names still points to E:

          C'-D'-E'  <-- sidebranch
         /
...--A--B   <-- mainline
      \
       C--D--E   <-- other-side-branch

If you ask Git to copy C-D-E again, it does that—but the new copies are not C'-D'-E' because they have new date-and-time stamps. So you end up with what you saw.

Hence, if you want to move two or more names while copying some chain of commits, you'll be OK using git rebase to move the first name, but you will have to do something else—such as run git branch -f—to move the remaining names, so that they point to the commit copies made during the one rebase.
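A minimal sketch of that sequence, reusing the branch names from the diagrams. The throwaway repository, the identity, and the file contents are assumptions for illustration.

```shell
# Sketch only: throwaway repository, placeholder identity and file contents.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name  "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m 'A'
git branch -M mainline
git checkout -q -b sidebranch
for name in C D E; do
    echo "$name" > "$name"
    git add "$name"
    git commit -q -m "$name"
done
git branch other-side-branch            # second name pointing at E
git checkout -q mainline
git commit -q --allow-empty -m 'B'

# Copy C-D-E exactly once; this moves only the name sidebranch.
git rebase -q mainline sidebranch

# Point the second name at the copies instead of rebasing a second time.
git branch -f other-side-branch sidebranch

git log --all --graph --decorate --oneline
```

Note that git branch -f refuses to move the branch that is currently checked out. Here the rebase leaves sidebranch checked out, so moving other-side-branch is allowed.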

(I've always wanted to have a fancier version of git rebase that can do this automatically, but it's clearly a hard problem in general.)

Upvotes: 3
