francesco

Reputation: 7539

git rebase squash messes up branches history

I have a git repository with a long history and various branches, all of which have been merged into master. The earliest commits, however, form a linear chain starting at the root commit, without any branching.

I would like to squash together some of these first commits. To do that I did

git rebase -i --root

then in the editor I selected the commits to squash and changed the commit message for the squashed commits. git rebase seems to work fine, but to my surprise, git log shows that the history has been messed up and the record of branch merges is lost: several commits now appear duplicated, and branches are unmerged.

Although it is difficult to replicate the full problem, here I provide a minimal example that shows a similar issue. Consider a git local repository created by the following commands

git init .
touch foo.txt
git add foo.txt
for (( i=1; i<4; i++ )); do echo hello $i >>foo.txt ; git commit -a -m $i; done
git checkout -B test
for (( i=1; i<4; i++ )); do echo hello $i >>foo.txt; git commit -a -m test$i; done
git checkout master
git merge test
for (( i=4; i<5; i++ )); do echo hello $i >>foo.txt ; git commit -a -m $i; done

This creates a repository with three commits on master, then a new branch test with three more commits, which is merged into master (a simple fast-forward), and finally one last commit on master. The log of the repository is (Author/Date removed)

git log --graph --all

* commit ba326e1ada9525fd2b1b275c597ad189b7cf3ddf (HEAD -> master)
| Author: 
| Date:   
| 
|     4
| 
* commit b739356170f946f3d1b3c3cb3299f4379212ecc6 (test)
| Author: 
| Date:   
| 
|     test3
| 
* commit 303e95fd23beb9c5a96c631667dfd2be7e924390
| Author: 
| Date:   
| 
|     test2
| 
* commit dd67e3ca5f12e35c736483bcc8335347dc78da87
| Author: 
| Date:   
| 
|     test1
| 
* commit 2420498ae794fa900dd9fee8296b7561e08028b8
| Author: 
| Date:   
| 
|     3
| 
* commit 2d43d89d2b75566f37a99b7cce294cddcd3fbf7a
| Author: 
| Date:   
| 
|     2
| 
* commit 487e5158e461479f848eb7313dc87090f197f4a7
  Author: 
  Date:   

      1

Now, I squash the commits "2" and "3" into a single one. I do

git rebase -i --root

and in the editor I enter

pick 487e515 1
pick 2d43d89 2
squash 2420498 3
pick dd67e3c test1
pick 303e95f test2
pick b739356 test3
pick ba326e1 4

In the commit message for the squashed commit I enter "squash 2 3". git rebase seems to finish well:

[detached HEAD b5d1682] squash 2 3
 Date: Mon May 13 12:01:19 2019 +0200
 1 file changed, 2 insertions(+)
Successfully rebased and updated refs/heads/master.

However, to my surprise the history is a mess now

git log --graph --all

* commit 463cbb99f553d555fadbe05a3cc90aeb46b0bbce  (HEAD -> master)
| Author: 
| Date:   
| 
|     4
| 
* commit 194e78fc74d4eaea62a8b903baa6379dfc8578d3
| Author: 
| Date:   
| 
|     test3
| 
* commit 8f5e0be6dd8288790b2893e9b3fa97e5e1021134
| Author: 
| Date:   
| 
|     test2
| 
* commit 1060aea0f3da5dfee8e37adc68d82d77e7c02ba4
| Author: 
| Date:   
| 
|     test1
| 
* commit b5d168283be820fed3b9862576c44c054402ab50
| Author: 
| Date:   
| 
|     squash 2 3
|   
| * commit b739356170f946f3d1b3c3cb3299f4379212ecc6 (test)
| | Author: 
| | Date:   
| | 
| |     test3
| | 
| * commit 303e95fd23beb9c5a96c631667dfd2be7e924390
| | Author: 
| | Date:   
| |
| |     test2
| | 
| * commit dd67e3ca5f12e35c736483bcc8335347dc78da87
| | Author: 
| | Date:   
| | 
| |     test1
| | 
| * commit 2420498ae794fa900dd9fee8296b7561e08028b8
| | Author: 
| | Date:   
| | 
| |     3
| | 
| * commit 2d43d89d2b75566f37a99b7cce294cddcd3fbf7a
|/  Author: 
|   Date:  
|   
|       2
| 
* commit 487e5158e461479f848eb7313dc87090f197f4a7
  Author: 
  Date:   

      1

git rebase has split the merged test branch off again: master keeps its full (rewritten) history, while test still points at the old commits. As a result, I have also lost the information that test was merged into master.

I expected a history like that:

git log --graph --all

* commit ba326e1ada9525fd2b1b275c597ad189b7cf3ddf (HEAD -> master)
| Author: 
| Date:   
| 
|     4
| 
* commit b739356170f946f3d1b3c3cb3299f4379212ecc6 (test)
| Author: 
| Date:   
| 
|     test3
| 
* commit 303e95fd23beb9c5a96c631667dfd2be7e924390
| Author: 
| Date:   
| 
|     test2
| 
* commit dd67e3ca5f12e35c736483bcc8335347dc78da87
| Author: 
| Date:   
| 
|     test1
| 
* commit 2420498ae794fa900dd9fee8296b7561e08028b8
| Author: 
| Date:   
| 
|     squash 2 3
|
* commit 487e5158e461479f848eb7313dc87090f197f4a7
  Author: 
  Date:   

      1

Is this a bug in git rebase? I am using git 2.19.2.

If I try without the branch test, or if I delete the branch test before rebasing, it works fine and produces the desired history, but I still lose the information on the test branch.

Moreover, in more complex cases, where test is merged without a simple fast-forward, deleting the test branch before squashing with git rebase apparently loses the merged branch history entirely.

The following example illustrates this point.

git init .
touch foo.txt
git add foo.txt
for (( i=1; i<4; i++ )); do echo hello $i >>foo.txt ; git commit -a -m $i; done
git checkout -B test
for (( i=1; i<4; i++ )); do echo hello $i >>foo.txt; git commit -a -m test$i; done
git checkout master
touch foo2.txt
git add foo2.txt
git commit -a -m 4
git merge test
git branch -D test

Now the log history is

git log --graph --all

*   commit e905116fed7f4d52c65da46ab6172ae7a08e824a (HEAD -> master)
|\  Merge: 77cc483 584a037
| | Author: 
| | Date:   
| | 
| |     Merge branch 'test'
| | 
| * commit 584a03768822ebd92d4feee20ebe238fffd89c25
| | Author: 
| | Date:   
| | 
| |     test3
| | 
| * commit 7a2ad09b5c39e5076cd22fa957c2d539e37c0861
| | Author: 
| | Date:   
| | 
| |     test2
| | 
| * commit 11ef4fea5ba207a637a85d8e8456f48d0c7bd7ab
| | Author: 
| | Date:   
| | 
| |     test1
| | 
* | commit 77cc4833cf5aac84aca9737945fd79a7632019ac
|/  Author: 
|   Date:   
|   
|       4
| 
* commit 081792ccf9b4714ab4bce23e4e7b126647eeead8
| Author: 
| Date:   
| 
|     3
| 
* commit 971f217200f7e485308b861033b5b31b7ae69d1a
| Author: 
| Date:  
| 
|     2
| 
* commit cef186ae9ad0e316d82c62c2082381747f25a443
  Author: 
  Date:  

      1

Squashing commits 2 and 3 together as above produces a repository with the following log

git log --graph --all

* commit 3a4010551d50a47a9db6d53a4597770fa2517d92 (HEAD -> master)
| Author: 
| Date:   
| 
|     test3
| 
* commit bbbc747780c847b3d40ed7557ed514e5e4dd9fc2
| Author: 
| Date:  
| 
|     test2
| 
* commit 6f54fd90555fbe6730fcfe7d85761b6477380214
| Author: 
| Date:   
| 
|     test1
| 
* commit 71fa7371198a0cbbf4793dc27ffb27ac65d15096
| Author: 
| Date:  
| 
|     4
| 
* commit 3032069c375ef37b42af75c97759d7a821f1139f
| Author: 
| Date:   
| 
|     s 2 3
| 
* commit cef186ae9ad0e316d82c62c2082381747f25a443
  Author:
  Date:  

      1

The log shows that I have lost the history of branching and merging of the test branch. If I do not delete the test branch after merging it into master, I obtain a result similar to the first example, with a split-off test branch.

Upvotes: 2

Views: 1782

Answers (2)

francesco

Reputation: 7539

After realising that git rebase is not the right tool, I have devised a solution based on git filter-branch. The idea is to alter commit "2" by adding the content of commit "3" to it via a patch. Commit "3" then becomes empty, so it can be eliminated.

Consider the second example.

git diff 971f217200f7e485308b861033b5b31b7ae69d1a \
         081792ccf9b4714ab4bce23e4e7b126647eeead8 \
         >patch.txt
git filter-branch --tree-filter '\
   if [ "$GIT_COMMIT" = "971f217200f7e485308b861033b5b31b7ae69d1a" ]
   then
       git apply /localdirectory/patch.txt
   fi' \
   --prune-empty -- --all

Since git filter-branch works in a temporary subdirectory .git-rewrite/t, the git apply command requires a full path for the patch file.
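The same idea can be tried out without copying hashes by hand. The following is only a sketch under some assumptions: it rebuilds the second example in a scratch directory, looks up commits "2" and "3" by their subject lines, and stores the patch in a temporary file so that its path is absolute. The variable names (SQUASH_INTO, SQUASH_PATCH, repo, patch_file) and the placeholder identity are made up for this illustration; FILTER_BRANCH_SQUELCH_WARNING only matters on newer git versions, which otherwise print a warning and pause before running filter-branch.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Rebuild the second example in a scratch directory (hypothetical location).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master        # use the branch name from the question
git config user.name  "Example"                # placeholder identity for this sketch
git config user.email "example@example.invalid"

touch foo.txt
git add foo.txt
for i in 1 2 3; do echo "hello $i" >>foo.txt; git commit -q -a -m "$i"; done
git checkout -q -B test
for i in 1 2 3; do echo "hello $i" >>foo.txt; git commit -q -a -m "test$i"; done
git checkout -q master
touch foo2.txt
git add foo2.txt
git commit -q -a -m 4
git merge -q --no-edit test
git branch -q -D test

# Look up commits "2" and "3" by subject instead of hard-coding hashes.
c2=$(git log -1 --format=%H --grep='^2$' master)
c3=$(git log -1 --format=%H --grep='^3$' master)

# mktemp returns an absolute path, which is what the tree filter needs
# because it runs inside .git-rewrite/t.
patch_file=$(mktemp)
git diff "$c2" "$c3" >"$patch_file"

# Exported so that the filter, which filter-branch evaluates later, can see them.
export SQUASH_INTO="$c2" SQUASH_PATCH="$patch_file"
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --tree-filter '
    if [ "$GIT_COMMIT" = "$SQUASH_INTO" ]; then
        git apply "$SQUASH_PATCH"
    fi' --prune-empty -- --all >/dev/null

commits=$(git rev-list --count master)
merges=$(git rev-list --merges --count master)
echo "commits on master: $commits, merge commits: $merges"
```

If the rewrite worked as intended, commit "3" is gone while the merge commit is still present, which can be checked with git log --graph --all as above.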

This leaves some old refs. After checking that the git log is correct, a cleanup can be done

git update-ref -d refs/original/refs/heads/master

This gives a repository with log

git log --graph --all

*   commit e905116fed7f4d52c65da46ab6172ae7a08e824a (HEAD -> master)
|\  Merge: 77cc483 584a037
| | Author: 
| | Date:   
| | 
| |     Merge branch 'test'
| | 
| * commit 584a03768822ebd92d4feee20ebe238fffd89c25
| | Author: 
| | Date:   
| | 
| |     test3
| | 
| * commit 7a2ad09b5c39e5076cd22fa957c2d539e37c0861
| | Author: 
| | Date:   
| | 
| |     test2
| | 
| * commit 11ef4fea5ba207a637a85d8e8456f48d0c7bd7ab
| | Author: 
| | Date:   
| | 
| |     test1
| | 
* | commit 77cc4833cf5aac84aca9737945fd79a7632019ac
|/  Author: 
|   Date:   
|   
|       4
| 
* commit 971f217200f7e485308b861033b5b31b7ae69d1a
| Author: 
| Date:  
| 
|     2
| 
* commit cef186ae9ad0e316d82c62c2082381747f25a443
  Author: 
  Date:  

      1

At this point the task is essentially done. Still, it could be useful to change the message for the commit "2". This can be done with the msg-filter, for instance:

git filter-branch --msg-filter 'sed "s/^2/2 and 3 together/"' -- --all
git update-ref -d refs/original/refs/heads/master

In case there are more branches, like in the first example, one needs to eliminate also the corresponding old refs, e.g.:

git update-ref -d refs/original/refs/heads/test
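With many branches, deleting the backups one by one gets tedious. A minimal sketch of a loop that removes everything under refs/original in one pass is shown below. To keep the example self-contained it does not run filter-branch; it creates the two backup refs by hand in a scratch repository (an assumption made only for this illustration, as are the names repo and remaining).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch repository with a single commit (hypothetical location and identity).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name  "Example"
git config user.email "example@example.invalid"
echo hello >foo.txt
git add foo.txt
git commit -q -m 1

# Stand-in for the backups that a real filter-branch run would leave behind.
git update-ref refs/original/refs/heads/master HEAD
git update-ref refs/original/refs/heads/test HEAD

# List every ref below refs/original and delete each one.
git for-each-ref --format='%(refname)' refs/original |
while read -r ref; do
    git update-ref -d "$ref"
done

remaining=$(git for-each-ref refs/original | wc -l | tr -d ' ')
echo "backup refs left: $remaining"
```

As with the single-ref commands above, this should only be run after checking that the rewritten log is correct, since the backups are the easiest way back to the old history.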

Upvotes: 1

kowsky

Reputation: 14449

tl;dr: rebase only rebases a single branch.

A rebase is only performed on the current branch (or the one specified as an argument). Thus, your results are just as expected: you only rebased the master branch, and the test branch remains the same, i.e. its tip still points to your old commits.

If you read the git rebase doc carefully, you will see that it always talks about the current branch.

If you want other branches pointing to the rebased commits, you will have to reset them. In this case, checking out the test branch and using git reset --hard 194e78fc74d4eaea62a8b903baa6379dfc8578d3 would lead to the result you expected. (As always, be careful using reset --hard, as it will delete uncommitted changes.)

I do not understand what information you fear to lose, since all changes of branch test are present in the rebased history of master.
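To make this concrete, here is a sketch that rebuilds the first example in a scratch directory, performs the squash non-interactively, and then moves test onto the rewritten history. Several things in it are assumptions for illustration only: the scratch location, the placeholder identity, the use of GIT_SEQUENCE_EDITOR with GNU sed as a stand-in for the editor session, and fixup instead of squash so that no commit message prompt appears. It also uses git branch -f rather than checkout plus reset --hard, which moves the branch without touching the working tree, and it finds the new commit by its subject line instead of a hard-coded hash.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Rebuild the first example in a scratch directory (hypothetical location).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master        # use the branch name from the question
git config user.name  "Example"                # placeholder identity for this sketch
git config user.email "example@example.invalid"

touch foo.txt
git add foo.txt
for i in 1 2 3; do echo "hello $i" >>foo.txt; git commit -q -a -m "$i"; done
git checkout -q -B test
for i in 1 2 3; do echo "hello $i" >>foo.txt; git commit -q -a -m "test$i"; done
git checkout -q master
git merge -q test
echo "hello 4" >>foo.txt
git commit -q -a -m 4

# Stand-in for the editor session: turn the third todo line (commit "3")
# into a fixup of commit "2". Assumes GNU sed for the -i option.
GIT_SEQUENCE_EDITOR='sed -i -e 3s/^pick/fixup/' GIT_EDITOR=true \
    git rebase -q -i --root

# Only master was rewritten, so test is no longer part of its history.
if git merge-base --is-ancestor test master; then before=merged; else before=diverged; fi

# Find the rewritten "test3" on master by subject and move test there.
new_tip=$(git log -1 --format=%H --grep='^test3$' master)
git branch -f test "$new_tip"

if git merge-base --is-ancestor test master; then after=merged; else after=diverged; fi
echo "before: $before, after: $after"
```

Looking up the commit by subject only works here because the subjects are unique; in a real repository it is safer to read the hash from git log and pass it explicitly.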

Upvotes: 2
