Will the history be gone after rebasing in git?

Question

Suppose I have a master branch with three commits:

A--B----C

I create a new branch, dev, and make three commits there:

A--B----C------C1------C2
         \
          E---F----G

Now I merge dev into master like this:

git merge dev --squash

I think the new diagram looks like this, but I'm not sure:

A--B----C------C1------C2--------M----
         \                      /    
          E---F----G-----------/

After that, I need to merge master into dev to get commits C1 and C2,

so it will look like this:

A--B----C------C1------C2--------M----
         \                      / \   
          E---F----G-----------/----m2 ---H--I

dev and master will then continue in parallel.

I want to know:

Will the individual commit history for commits E, F and G still be there in the dev branch after squashing?
What will the diagram look like if I have to use rebase instead of merge, with both branches still continuing in parallel?
If the answer to the first question is yes, will the individual commits be gone if I use the rebase command?

torek · Accepted Answer

Let me re-draw the second diagram with some labels:

A--B--C-----C1----C2    <-- master
       \
        E--F----G    <-- dev

(I assume C1 and C2 just represent a few more commits that occurred on master while E, F, and G were being added to dev).

If you now are on master and run git merge dev --squash (and then commit the result: --squash suppresses the final commit step), you get this, which is very different from what you drew:

A--B--C-----C1----C2---M    <-- master
       \
        E--F----G    <-- dev

You will get what you drew if you run git merge without --squash. The --squash flag will, to quote the documentation:

Produce the working tree and index state as if a real merge happened (except for the merge information), but do not actually make a commit or move the HEAD, nor record $GIT_DIR/MERGE_HEAD to cause the next git commit command to create a merge commit.

In other words, a squash merge is not a merge at all (this is why I have, er, "issues" with some of git's terminology :-) ... some checkouts are checkouts, and some are not; some merges are merges, and some are not; branches are just branch tips except when they're not, and so on!). A "squash merge" does use the underlying merge code to figure out what the contents of the next commit should be, but if and when you make the actual commit (the one labeled M above), it's just an ordinary non-merge commit, that takes the work done—the changes made—in commits E through G and copies those changes into whatever was the latest version in C2. Think of it as "someone sends you a diff, comparing version C to version G, and then you manually insert all those changes into version C2 and commit that (which should probably be called C3 but we called it M instead)".
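To check this on a throwaway repository, here is a minimal sketch. It assumes git is installed and rebuilds the graph above in a temporary directory; the file names master.txt and dev.txt, the commit helper, and the user identity are made up for the illustration.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master
git config user.name "Example"               # placeholder identity for this scratch repo
git config user.email "example@example.com"

# Hypothetical helper: append a line to a file and commit it with that line as the message
commit() { echo "$1" >> "$2"; git add "$2"; git commit -q -m "$1"; }

for c in A B C; do commit "$c" master.txt; done
git checkout -q -b dev
for c in E F G; do commit "$c" dev.txt; done
git checkout -q master
for c in C1 C2; do commit "$c" master.txt; done

git merge --squash dev >/dev/null            # stages the combined changes, makes no commit
git commit -q -m "M"                         # the ordinary commit labeled M

# How many parents does M have, and which commits are on dev but not on master?
parents_of_m=$(( $(git log -1 --format=%P master | wc -w) ))
dev_only=$(git log --format=%s master..dev | xargs)

echo "parents of M: $parents_of_m"
echo "commits only on dev: $dev_only"
```

A single parent means M is not a merge commit, and master..dev still listing G, F and E means the three individual commits remain on dev, untouched by the squash.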

If you want a real merge, do a merge without --squash. That will in fact give you what you drew, which I will re-draw here:

A--B--C-----C1----C2---M    <-- master
       \              /
        E--F----G----/      <-- dev

(The label dev still points to commit G.)
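For comparison, the same scratch setup with a real merge could look like the sketch below. As before, the file names, the commit helper and the identity are assumptions for the illustration, not anything from the question.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

# Hypothetical helper: append a line to a file and commit it
commit() { echo "$1" >> "$2"; git add "$2"; git commit -q -m "$1"; }

for c in A B C; do commit "$c" master.txt; done
git checkout -q -b dev
for c in E F G; do commit "$c" dev.txt; done
git checkout -q master
for c in C1 C2; do commit "$c" master.txt; done

git merge -q -m "M" dev                      # a real merge: M records both parents

parents_of_m=$(( $(git log -1 --format=%P master | wc -w) ))
first_parent=$(git log -1 --format=%s master^1)
second_parent=$(git log -1 --format=%s master^2)
dev_tip=$(git log -1 --format=%s dev)

echo "M has $parents_of_m parents: $first_parent (first), $second_parent (second)"
echo "dev still points to: $dev_tip"
```

Here M has two parents, C2 and G, and the dev label has not moved.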

Next, if you want to pick up the changes in C1 and C2 and put them on dev, you can do one of two things:

git checkout dev && git merge master

or:

git checkout dev && git merge --no-ff master

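The difference here is whether you get the commit m2 you drew. Without --no-ff, git observes that dev's commit G is already an ancestor of M, so the result of merging dev and master is simply M. It does a "fast-forward" instead of making a commit: it just moves the dev label to point to M: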

A--B--C-----C1----C2---M    <-- dev, master
       \              /
        E--F----G----/
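The fast-forward case can be checked with a short sketch like this one (scratch repository; file names, helper and identity are assumptions for the illustration):

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

# Hypothetical helper: append a line to a file and commit it
commit() { echo "$1" >> "$2"; git add "$2"; git commit -q -m "$1"; }

for c in A B C; do commit "$c" master.txt; done
git checkout -q -b dev
for c in E F G; do commit "$c" dev.txt; done
git checkout -q master
for c in C1 C2; do commit "$c" master.txt; done
git merge -q -m "M" dev                      # real merge on master

git checkout -q dev
git merge -q master                          # no --no-ff: git can fast-forward

dev_tip=$(git rev-parse dev)
master_tip=$(git rev-parse master)
echo "dev:    $dev_tip"
echo "master: $master_tip"
```

Both labels end up on the same commit hash, so no m2 was created.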

But with --no-ff, the merge must produce a new, different merge commit (the same tree, but a different commit nonetheless, with different parent-age):

A--B--C-----C1----C2---M        <-- master
       \              / \
        E--F----G----/---m2     <-- dev

Here merge M has C2 as its first parent and G as its second parent. This indicates that it's a merge of G (the "non-first" parent) into master (the first parent). Merge m2, on the other hand, has G as its first parent and M, the commit master pointed to when you ran the merge, as its second parent.

In other words, the trees (working directories) are the same, and even the commit messages and timestamps could be the same, but the parents are different, and that alone is sufficient to guarantee that these will be different commits.
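A minimal sketch of the --no-ff case, again on a scratch repository with made-up file names, helper and identity, compares the two merge commits directly:

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

# Hypothetical helper: append a line to a file and commit it
commit() { echo "$1" >> "$2"; git add "$2"; git commit -q -m "$1"; }

for c in A B C; do commit "$c" master.txt; done
git checkout -q -b dev
for c in E F G; do commit "$c" dev.txt; done
git checkout -q master
for c in C1 C2; do commit "$c" master.txt; done
git merge -q -m "M" dev                      # merge commit M on master

git checkout -q dev
git merge -q --no-ff -m "m2" master          # forces a new merge commit m2 on dev

m=$(git rev-parse master)
m2=$(git rev-parse dev)
tree_m=$(git rev-parse 'master^{tree}')
tree_m2=$(git rev-parse 'dev^{tree}')
first_parent_m=$(git log -1 --format=%s master^1)
first_parent_m2=$(git log -1 --format=%s dev^1)

echo "M  = $m (tree $tree_m, first parent $first_parent_m)"
echo "m2 = $m2 (tree $tree_m2, first parent $first_parent_m2)"
```

The two commits have identical trees but different hashes, and their first parents (C2 for M, G for m2) record which branch each merge was made on.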

The title of your question is about rebasing, rather than merging. Rebase is a different operation entirely. Let's say you're on branch dev and run git rebase master.

To do a rebase, what git does is (in essence—there's some optimization applied, and some corner cases with merge commits) simply "cherry pick" each commit along some branch, and apply those "cherries" to some other branch. Let's go back to the first diagram:

A--B--C--C1--C2    <-- master
       \
        E--F--G    <-- dev

The idea here is to "pick" the commits E, then F, then G and apply them in sequence. Let's start by copying E to E', applied on top of C2:

                E'
               /
A--B--C--C1--C2    <-- master
       \
        E--F--G    <-- dev

Next we copy F to F' like this:

                E'--F'
               /
A--B--C--C1--C2    <-- master
       \
        E--F--G    <-- dev

There's one more commit to copy, G to G':

                E'--F'--G'
               /
A--B--C--C1--C2    <-- master
       \
        E--F--G    <-- dev

We're all done except for one thing: we need a label for the new chain of commits. Let's call it dev! Oops, wait, we already have a label dev. Well, let's just peel off the old dev label, and paste it on the new G':

                E'--F'--G'   <-- dev
               /
A--B--C--C1--C2    <-- master
       \
        E--F--G

Voila, git rebase is done. That's what it does: copy the old commits to new ones.
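The copying can be observed with a sketch like the following. It assumes git is installed and uses a scratch repository; the file names, the commit helper and the identity are invented for the illustration.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

# Hypothetical helper: append a line to a file and commit it
commit() { echo "$1" >> "$2"; git add "$2"; git commit -q -m "$1"; }

for c in A B C; do commit "$c" master.txt; done
git checkout -q -b dev
for c in E F G; do commit "$c" dev.txt; done
git checkout -q master
for c in C1 C2; do commit "$c" master.txt; done

old_g=$(git rev-parse dev)                   # hash of the original G
git checkout -q dev
git rebase -q master                         # copy E, F, G on top of C2
new_g=$(git rev-parse dev)                   # hash of the copy G'

dev_log=$(git log --format=%s dev | xargs)
echo "old G: $old_g"
echo "new G: $new_g"
echo "dev history, newest first: $dev_log"
```

The commit messages are unchanged, but G' has a different hash from G, and dev's history now runs through C2 and C1 before reaching C.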

What happens to the old ones? Well, they're what I call "abandoned". They are still there in your repository, but the only remaining references to them are in the "reflogs". The reflog entries will keep them around for a default expiration timeout of 30 days or so. (It's configurable, and there are actually two different timeouts, but it's best to just think of this as "a month or so".) Thus, after about a month, git becomes free to garbage-collect the abandoned commits.
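Until that happens, the abandoned chain can be found again through the reflog. The sketch below is one way to do it on a scratch repository; the branch name old-dev, the file names, the helper and the identity are hypothetical. (As far as I know, the two timeouts are the gc.reflogExpire and gc.reflogExpireUnreachable settings.)

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

# Hypothetical helper: append a line to a file and commit it
commit() { echo "$1" >> "$2"; git add "$2"; git commit -q -m "$1"; }

for c in A B C; do commit "$c" master.txt; done
git checkout -q -b dev
for c in E F G; do commit "$c" dev.txt; done
git checkout -q master
for c in C1 C2; do commit "$c" master.txt; done

git checkout -q dev
old_g=$(git rev-parse dev)                   # remember the original G for comparison
git rebase -q master                         # dev now points at G'

# dev@{1} is the value dev had one reflog entry ago, i.e. before the rebase finished
from_reflog=$(git rev-parse 'dev@{1}')
git branch old-dev 'dev@{1}'                 # give the abandoned chain a label again
old_log=$(git log --format=%s old-dev | xargs)

echo "reflog says dev used to be: $from_reflog"
echo "old-dev history, newest first: $old_log"
```

Once a branch name points at the old G again, the original E, F and G are no longer abandoned and will not be garbage-collected.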

Sometimes rebasing is "bad", in that it creates more work for other people who are sharing your git repo (either directly, or more likely, at some central repository you push your work up to). Suppose someone else has commits E through G, and they did more work and made commit H for instance on top of G. Then you go and copy E through G, and abandon the original E through G. You've now made their work, commit H, require yet more work: they may have to rebase their H on your G', for instance.
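As a rough sketch of that extra work, the following simulates the other person inside a single scratch repository: the branch named other stands in for their clone, and all names (other, other.txt, the helper, the identity) are assumptions for the illustration. It uses git rebase --onto to move only H from the old G to the new G'.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

# Hypothetical helper: append a line to a file and commit it
commit() { echo "$1" >> "$2"; git add "$2"; git commit -q -m "$1"; }

for c in A B C; do commit "$c" master.txt; done
git checkout -q -b dev
for c in E F G; do commit "$c" dev.txt; done
git checkout -q master
for c in C1 C2; do commit "$c" master.txt; done

git checkout -q -b other dev                 # the "other person" starts from G
commit H other.txt                           # and adds H on top of it

old_g=$(git rev-parse dev)                   # the original G
git checkout -q dev
git rebase -q master                         # you rebase: dev now points at G'

# Their follow-up work: replay everything after the old G (just H) onto the new dev
git rebase -q --onto dev "$old_g" other

other_log=$(git log --format=%s other | xargs)
echo "other history, newest first: $other_log"
```

After this, H sits on top of G' and no longer depends on the abandoned E, F and G.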

If nobody else has your commits E through G, you can be sure that you have not made any extra work for the nonexistent other people. A rebase will only affect you, so most of the "bad" goes away.

In the end, there's no single answer to rebase vs merge (and if merge, whether and when to use --no-ff) vs "squash merge and abandon original development chain" vs whatever-else-you-can-come-up-with. The whole idea of making commits in git, and manipulating them with merges and cherry-picking and rebasing and so on, is to make it easier to maintain the work. All the fiddling with commit graphs is pointless unless it makes maintaining the real work easier.
