user5047085

How/why does git rebase avoid a merge commit?

The git rebase command uses a merge under the hood, so I am wondering how/why a git rebase avoids a merge commit whereas git merge usually creates a new commit if there is a diff between branches.

Upvotes: 7

Views: 6896

Answers (5)

Schwern

Reputation: 165606

rebase is not a merge in the sense of merging a branch. Instead, a rebase is a series of git cherry-picks. A cherry-pick takes a commit and copies it as if it had been written on top of some other commit. It's as if you took the diff of a commit and applied it somewhere else.

Consider a repository like this. A, B, C, etc... represent commit IDs.

A - B - C - D [master]
         \
          E - F - G [feature]

If master needs E (maybe it's a critical bug fix) but not the rest of feature, you can check out master and run git cherry-pick E. Git will do a merge between D and E with C as the merge base, resulting in E1. This is why you might get merge conflicts during a rebase. But unlike a normal merge, E1 will have only one parent.

A - B - C - D - E1 [master]
         \
          E - F - G [feature]
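
To make "a merge between D and E with C as the merge base" concrete, here is a toy three-way merge in Python. It is only a sketch under loud assumptions: snapshots are plain dicts of path to content, the function name three_way_merge and the file names are made up for illustration, and it compares whole files, whereas Git merges at the level of lines within files.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots (dict of path -> content) against a common base.

    Hypothetical helper for illustration; not how Git is implemented.
    Returns (merged_snapshot, list_of_conflicting_paths).
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (or both deleted the file)
            result = o
        elif o == b:      # only "theirs" changed this path
            result = t
        elif t == b:      # only "ours" changed this path
            result = o
        else:             # both sides changed it differently
            conflicts.append(path)
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


# Snapshots at C (E's parent, used as the merge base), D (master), and E.
C = {"app.py": "v1", "README": "hello"}
D = {"app.py": "v1", "README": "hello, world"}    # master edited README
E = {"app.py": "v1 + bugfix", "README": "hello"}  # E fixed app.py

E1, conflicts = three_way_merge(base=C, ours=D, theirs=E)
print(E1, conflicts)
```

Here E1 ends up with master's README and E's bug fix. If D and E had both changed app.py in different ways, that path would land in conflicts instead, which is the situation where a cherry-pick or rebase stops and asks you to resolve it.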

A rebase is like doing this for all your commits, except that instead of moving master it moves feature. Here's what git rebase master looks like, step by step.

Again, consider your repository.

A - B - C - D [master]
         \
          E - F - G [feature]

First, as before, it does a cherry-pick on E and puts it on top of master. But instead of moving master, it creates a new branch with no label.

              E1
             /
A - B - C - D [master]
         \
          E - F - G [feature]

Then it cherry-picks F and puts it on top of E1.

              E1 - F1
             /
A - B - C - D [master]
         \
          E - F - G [feature]

Then it cherry-picks G and puts it on top of F1 as G1.

              E1 - F1 - G1
             /
A - B - C - D [master]
         \
          E - F - G [feature]

With the cherry-picking complete it moves the feature branch label to G1.

              E1 - F1 - G1 [feature]
             /
A - B - C - D [master]
         \
          E - F - G

And yes, the old commits are still there. If nothing references them, like a tag or another branch, they will be garbage collected in a few weeks. Meanwhile you can still access them with git reflog. See the Maintenance and Data Recovery chapter of Pro Git for more.
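
The walkthrough above can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the Commit class, the history, commits_to_replay and rebase helpers, and the convention of naming each copy by appending "1" are invented, and a commit is reduced to a name plus a single parent with no file contents at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    name: str
    parent: Optional["Commit"] = None  # toy model: at most one parent


def history(tip):
    """Commit names from the root up to tip."""
    names = []
    while tip is not None:
        names.append(tip.name)
        tip = tip.parent
    return list(reversed(names))


def commits_to_replay(branch_tip, upstream_tip):
    """Commits reachable from branch_tip but not from upstream_tip, oldest first."""
    upstream = set(history(upstream_tip))
    todo = []
    c = branch_tip
    while c is not None and c.name not in upstream:
        todo.append(c)
        c = c.parent
    return list(reversed(todo))


def rebase(branch_tip, upstream_tip):
    """Copy each commit onto upstream_tip, one at a time, like repeated cherry-picks."""
    new_tip = upstream_tip
    for c in commits_to_replay(branch_tip, upstream_tip):
        new_tip = Commit(c.name + "1", parent=new_tip)  # a copy with ONE parent
    return new_tip


A = Commit("A"); B = Commit("B", A); C = Commit("C", B); D = Commit("D", C)
E = Commit("E", C); F = Commit("F", E); G = Commit("G", F)
branches = {"master": D, "feature": G}

old_tip = branches["feature"]
branches["feature"] = rebase(branches["feature"], branches["master"])  # move the label last

print(history(branches["feature"]))
print(history(old_tip))  # the original commits still exist, just unlabeled
```

No step in the loop ever creates a commit with two parents, which is why no merge commit appears; only the branch label moves at the end.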

Upvotes: 10

zzxyz

Reputation: 2981

A lot of people have answered how rebase avoids a merge commit, but I think why is a bit interesting. There are tons of things you can do with it, so this is just my opinion on the most important one. (Others with different workflows will naturally disagree :)

In a centralized repository system, such as svn or perforce, commits are done straight to the server. So if Bob and Jim are both working on their local copies of the main branch...they are both working on the main branch. End of story. After Bob and Jim do their commits, we see them, in order...in one branch.

One of the very few potential drawbacks to git is that a remote repository is effectively an implicit branch. It behaves like one, it shows up in git log like one, etc. So if Bob and Jim both make changes to their local copy, the last one to push has to pull...and do a merge from the remote. And that shows up as if Bob and Jim had both been working on their own branches. Sometimes this behavior is perfect. Sometimes it is undesirable.

git pull --rebase (or git fetch; git rebase) is how you can tell git "I know you consider this a branch, but I don't. I just want to do my commit on top of the last one like a normal person and have what I consider an accurate branch history." I personally don't care much about a "clean" history.

Upvotes: 0

Dietrich Epp

Reputation: 213837

You are correct that a rebase is basically a merge under the hood. In fact, you can make your own rebase command using the merge command, and all you need to do is modify some commit metadata afterwards.

Let’s say this is your history:

A - B - C (master)
 \
  D - E (dev)

If you want to rebase dev onto master, you can do it by first merging D and then E,

A - B - C - D'- E'
 \         /   /
  + - - - D - E

And then you modify the commits to remove the links from D and E,

A - B - C - D"- E" (master)
 \
  D - E (dev)

So you can see that git rebase is nothing more than some straightforward history manipulation (only changing metadata) on top of a merge. The reason you don’t see merge commits is simply a question of metadata—each commit has pointers to its parent commits, and if you delete all but one of the pointers, you erase a commit’s connection to those parents, turning a merge commit into a non-merge commit.

More complicated rebases can be done this way, too. Note that this is not how rebase actually works, it’s just a way that you can manually rebase commits without using the rebase or cherry-pick commands. The important part is that a merge commit is nothing more than a commit with two or more parents, and you can change any commit’s parents just by modifying the metadata.
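
As a rough illustration of the "it's only metadata" point, here is a toy Python model in which a commit is just a name plus a tuple of parent pointers. The Commit class and the is_merge and drop_extra_parents helpers are hypothetical names invented for this sketch, and file contents are not modeled.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Commit:
    name: str
    parents: Tuple["Commit", ...] = ()


def is_merge(commit):
    """A merge commit is simply a commit with two or more parents."""
    return len(commit.parents) >= 2


def drop_extra_parents(commit, new_name):
    """Return a copy of the commit that keeps only its first parent."""
    return replace(commit, name=new_name, parents=commit.parents[:1])


A = Commit("A")
B = Commit("B", (A,))
C = Commit("C", (B,))
D = Commit("D", (A,))
E = Commit("E", (D,))

# Step 1: merge D, then E, on top of C (two parents each).
D_merge = Commit("D'", (C, D))
E_merge = Commit("E'", (D_merge, E))

# Step 2: rewrite the parent pointers so each commit has a single parent.
D2 = drop_extra_parents(D_merge, 'D"')
E2 = replace(E_merge, name='E"', parents=(D2,))

print(is_merge(D_merge), is_merge(E_merge))
print(is_merge(D2), is_merge(E2))
```

The rewritten commits are new objects; D' and E' are not changed in place, which mirrors the fact that changing a commit's parents in Git produces a commit with a different ID.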

Upvotes: -2

torek

Reputation: 490098

I like to split the concept of merging, in Git, into two parts:

  • The verb to merge means to combine changes. Many commands use this.

  • The noun a merge, derived from the adjective form merge commit where the adjective "merge" modifies the noun "commit", means a commit with two or more parents. Only git merge builds such commits (well, git merge or things that run git merge, including git pull).

What this means is that git rebase uses the verb form, the to merge operation, which generally occurs via repeated git cherry-pick.[1] Git cherry-picks a commit by performing a merge with the commit's parent[2] as the merge base, the commit itself as the "other" commit, and the current HEAD commit playing its normal role. zzxyz's comment is thus correct: each cherry-pick operation can result in merge conflicts, which a human must resolve.

The result of each cherry-pick operation is a copy of each commit, as Schwern noted in his answer. After copying the commits of interest, Git moves the branch name so that it points to the final copied commit—the tip of the new branch in its new position in the commit graph. The original commits remain for at least a little while due to ORIG_HEAD—which rebase sets before moving the branch name—and due to various reflogs (for HEAD itself and/or for the branch name that git rebase just changed).

The last thing worth noting here is this phrase the commits of interest in the preceding paragraph. When rebase lays out the commits it will copy, it deliberately discards merges by default. There are arguments to be made on either side of this ("keep merges" vs "discard merges", that is), but that's what Git does today. So after rebasing, the copies never include any merge commits, even if the process of copying commits involves the to merge verb.
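
The "discard merges" selection can be pictured with a tiny Python sketch. The dict layout, the commit IDs, and the commits_to_copy name are all invented for illustration, and a real rebase does more than filter a list.

```python
def commits_to_copy(candidates):
    """Keep only non-merge commits (fewer than two parents), preserving order."""
    return [c for c in candidates if len(c["parents"]) < 2]


candidates = [
    {"id": "E", "parents": ["C"]},
    {"id": "M", "parents": ["E", "X"]},  # a merge commit: two parents
    {"id": "G", "parents": ["M"]},
]

todo = [c["id"] for c in commits_to_copy(candidates)]
print(todo)
```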


[1] An interactive git rebase -i, or a git rebase run with -m or -s <strategy>, really uses git cherry-pick. A non-interactive rebase without -m or -s uses git format-patch and git am -3, which is technically inferior in several ways, but is backwards compatible with ancient Git 1.5. (This describes Git before version 2.26, which made the cherry-pick-based merge backend the default.)

[2] This is what makes cherry-picking a merge difficult: a merge commit, which by definition is a commit with two or more parents, does not have a parent; it has two or more. The git cherry-pick command will let you pick it anyway, but to do so, you must specify (with -m) which parent the command should treat as the merge base during the merge-as-a-verb process.

Upvotes: 3

bredikhin

Reputation: 9045

You can think of it this way: git rebase takes the new commits from your branch and sets them aside, then rewinds your branch to match the updated original one, and finally reapplies your commits on top of the result; git merge simply creates a new (merge) commit that combines the two branches and has both tips as parents. Note that when rebasing, all the commits that were not present on the original branch get new hashes, so if you have already pushed the branch you will have to force-push it (which is not always easy if multiple people are collaborating on the same branch). Rebasing, however, lets you avoid merge commits and keep a cleaner history.
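
Why the hashes change can be shown with a small Python sketch. It assumes a deliberately simplified commit format: the commit_id function is made up, and real Git hashes an object header plus the tree, parent(s), author, committer, and message. The key point survives the simplification: the parent's ID is part of what gets hashed.

```python
import hashlib


def commit_id(parent_id, message):
    """Toy commit ID: SHA-1 over the parent ID and the message (simplified)."""
    payload = f"parent {parent_id}\n\n{message}\n".encode()
    return hashlib.sha1(payload).hexdigest()


c = commit_id("", "C")
d = commit_id(c, "D")

e_old = commit_id(c, "E")   # E as originally written on top of C
e_new = commit_id(d, "E")   # the same change replayed on top of D

f_old = commit_id(e_old, "F")
f_new = commit_id(e_new, "F")

print(e_old != e_new, f_old != f_new)
```

Because each ID feeds into the next one, every commit after the first rebased commit gets a new ID as well, even though its message and change are the same.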

Upvotes: 2
