Reputation: 1399
After a lot of searching on the internet, I know this has been discussed at length, and I think I get it now. But I want to confirm: is the major difference between git-rebase and git-merge the linear, clean history you get with git-rebase, as against the traceability of git-merge? Or is there anything else the team will miss out on if we always use git-merge?
There was also a recommendation that if the team is not aware of the intricacies of rebase, it should go for merge.
Upvotes: 0
Views: 900
Reputation: 487725
There are many items to consider. I'm not sure this will be a complete list ("complete" is hard).
Merge is conceptually simple, yet full of tricky corner cases. To understand it well, you must understand how Git's commit graph works.
There are several parts to each merge:
- Finding the merge base commit of the two commits being merged.
- Identifying those two commits: L ("left" or "local" or --ours) and R ("right" or "remote" or "other" or --theirs). These commits contain complete snapshots of the source, as all Git commits do.
- Turning the snapshots into two change-sets by comparing the merge base to L and the merge base to R.
- Combining these two change-sets. Where the base-to-L change affects file F, if base-to-R does not affect F at all, we're fine. If base-to-R does affect F, are the changes to F in different areas of F, or are they in the same location? If they are in the same location, are they the same change? If so, take one copy of the change. If they're in different locations, combine both changes.
In other words, if only one of L or R affects F, take the L or R version wholesale. Otherwise, combine the changes. If any of these changes conflict, declare a merge conflict. Repeat for all files in the two change-sets, and keep all totally-unmodified files the same as they are in the base.
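To make the combining rule concrete, here is a minimal sketch in Python. It is a toy model, not Git's actual merge code: snapshots are plain dicts mapping a path to a list of lines, every snapshot is assumed to contain the same files, and edits are assumed to replace lines in place (no insertions or deletions). The names `merge_lines` and `merge_snapshots` are made up for this illustration.

```python
def merge_lines(base, left, right):
    """Combine two in-place edits of one file, line by line (toy model)."""
    merged, conflict = [], False
    for b, l, r in zip(base, left, right):
        if l == r:            # same change on both sides, or no change at all
            merged.append(l)
        elif l == b:          # only the right side touched this line
            merged.append(r)
        elif r == b:          # only the left side touched this line
            merged.append(l)
        else:                 # both sides changed the same line differently
            conflict = True
            merged.append(l)  # placeholder; real Git writes conflict markers
    return merged, conflict


def merge_snapshots(base, left, right):
    """Merge-as-a-verb over whole snapshots: returns (result, conflicted paths)."""
    result, conflicts = {}, []
    for path, b in base.items():
        l, r = left[path], right[path]
        if l == b:            # L did not touch the file: take R wholesale
            result[path] = r
        elif r == b:          # R did not touch the file: take L wholesale
            result[path] = l
        else:                 # both touched it: combine the two changes
            result[path], conflict = merge_lines(b, l, r)
            if conflict:
                conflicts.append(path)
    return result, conflicts


base = {"a.txt": ["1", "2", "3"], "b.txt": ["x"]}
left = {"a.txt": ["1 L", "2", "3"], "b.txt": ["x"]}
right = {"a.txt": ["1", "2", "3 R"], "b.txt": ["y"]}
print(merge_snapshots(base, left, right))
# ({'a.txt': ['1 L', '2', '3 R'], 'b.txt': ['y']}, [])
```

Here `b.txt` is taken wholesale from R because L never touched it, while `a.txt` is combined because both sides changed it in different places. If both sides had changed the first line of `a.txt` differently, the path would show up in the conflict list instead.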
The result is the combined set of changes, and as long as there were no conflicts, Git can now make a new commit. The new commit can be a merge commit, i.e., record both parents L and R, so that the commit graph records the act of merging. This is the normal case for a normal (real) merge.
Besides the above, Git has what it calls fast-forward merges (which are not merges at all), and also squash merges (which do the merge work of combining changes, but then make an ordinary, non-merge commit). This means that aside from merge itself being conceptually simple (but with all the above corner case issues), you need to recognize that Git has both merge-as-a-verb, which is this change-set-combining thing, and merge-as-a-noun, where "a merge" means "a merge commit": a commit that records its two parents.
Before you can understand rebase, you need to understand cherry-picking. This also requires understanding how Git's commit graph works. Each cherry-pick copies a commit, by computing a change-set from the commit's snapshot. The change-set is simply the diff of the parent of the commit, vs the commit itself ("what changed in this commit?"). Git then applies this change-set to some other commit, using the same merge process (merge as a verb) as for git merge. The merge base of a cherry-pick is a little bit confusing, but in practice this mostly just works.
Rebase consists of automated cherry-picking of some set of commits to copy, followed by a branch-label movement, so as to abandon the original (pre-copy) commits in favor of the new (copied) commits.
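The copy-then-move-the-label idea can be sketched with a toy model in Python. Everything here is an assumption made for illustration: a commit ID is just a SHA-1 over the parent ID plus a description of the change (real Git hashes much more, including the tree, author, and timestamps), and `commit_id` and `replay` are hypothetical helper names.

```python
import hashlib


def commit_id(parent, change):
    """Toy commit ID: depends on the parent ID and the change (not Git's real format)."""
    return hashlib.sha1(f"{parent}\n{change}".encode()).hexdigest()


def replay(changes, onto):
    """Copy each change, in order, on top of `onto`; return the new commit IDs."""
    tip, copies = onto, []
    for change in changes:
        tip = commit_id(tip, change)  # same change, different parent: a new commit
        copies.append(tip)
    return copies


changes = ["add parser", "fix parser bug"]
old_base = commit_id("", "initial commit")
new_base = commit_id(old_base, "upstream work")

branches = {"feature": replay(changes, old_base)[-1]}  # label on the original tip
originals = replay(changes, old_base)
copies = replay(changes, new_base)                     # the automated cherry-picks

branches["feature"] = copies[-1]  # move the label; the originals are now abandoned
print(originals != copies)
```

The changes are identical, but because each copy has a different parent, every copy gets a different ID. Nothing in the toy deletes the originals; they simply stop being reachable from the `feature` label, which is the sense in which rebase abandons them.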
Thus, to properly understand rebase, you need to understand cherry-pick, which means you need to understand merge. You also need to know how Git's branch labels work, which is something you need to know to understand git reset and git branch -f, and what it means to force-push. To understand how Git chooses the set of commits to copy, you need to understand Git's commit graphs, but you need this to understand how git merge finds the merge base, so although that might seem difficult and/or scary, it's something you already know by this point.
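Both graph questions (where is the merge base, and which commits would a rebase copy) can be sketched over a tiny hand-written graph. This is a simplified Python model, assuming a graph stored as a dict from commit name to list of parents; the function names are invented for this example, and real git rebase applies further filtering (for instance, by default it does not copy merge commits).

```python
def ancestors(graph, commit):
    """All commits reachable from `commit` by following parent links, itself included."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(graph[c])
    return seen


def merge_bases(graph, left, right):
    """Common ancestors that are not themselves ancestors of another common ancestor."""
    common = ancestors(graph, left) & ancestors(graph, right)
    return {
        c for c in common
        if not any(c in ancestors(graph, other) - {other} for other in common)
    }


def commits_to_copy(graph, upstream, branch):
    """Commits reachable from `branch` but not from `upstream` (simplified)."""
    return ancestors(graph, branch) - ancestors(graph, upstream)


# A <- B <- C   (main)
#       \
#        D <- E (feature)
graph = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["D"]}

print(merge_bases(graph, "C", "E"))      # {'B'}
print(commits_to_copy(graph, "C", "E") == {"D", "E"})  # True
```

Merging `feature` into `main` would use `B` as the merge base, and rebasing `feature` onto `main` would copy `D` and `E`. When one tip is itself the merge base, as with `merge_bases(graph, "B", "E")`, there is nothing to combine, which is the situation Git handles with a fast-forward.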
In the end, it's not really all that difficult: there are just several very large graph-theory-related hurdles that you have to clear nearly simultaneously. It is, however, undeniably true that git rebase is more complex than git merge, if only because git rebase uses most of git merge via git cherry-pick.
Because rebase copies commits (and then abandons the originals in favor of the new copies), it's usually best to avoid doing this anywhere people might care about the hash IDs of the original commits. (Commits are uniquely identified by their hash IDs.) As a short-cut, you can use a very simple concept: if a commit is unpublished, only you know its hash ID, so therefore no one else can possibly care about its hash ID.
In a typical setup, this means if you have not used git push to publish your commits, it is safe to use git rebase on them; but if you have used git push, it's not so safe: now everyone else who has them (because you pushed them) has to know how to handle the copy-and-abandon thing.
Upvotes: 1