Reputation: 439
There are cases where I want to split an existing commit into two for the sake of code review or a clearer version history. When splitting a commit, though, inconsistencies tend to appear, or the new parent commit has to be modified to keep the build working. As a result, I end up doing a lot of manual work to track and preserve the final code that I had written.
Is it possible to run a rebase without modifying the tree hash of the final commit?
For example, say we have 3 commits, A, B, and C:
This is commit A
# test_file.py
# this is commit A
does_stuff()
This is a new commit B
# test_file.py
# this is commit B
does_stuff()
This is a new commit C
# test_file.py
# this is commit C
does_stuff()
where the original tree looked like
... -- C -- A
But we want to "split" A into two commits:
... -- C -- B -- A
When we create B using an interactive rebase, A's comment will also get updated to show # this is commit B (assuming there's a few other changes other than just this one file). However, we want to keep the comment as is. Note: A will have a different overall hash due to having a different parent, but its tree hash should remain identical to before.
Upvotes: 3
Views: 309
Reputation: 489173
It's not possible to modify any commit. Instead, you copy some existing set of commits to a new set of commits. This is actually good news, because it means the original commits remain available.
> When we create B using an interactive rebase, A's comment will also get updated to show # this is commit B (assuming there's a few other changes other than just this one file).
In the specific example you showed, it shouldn't: you should get a merge conflict.
For other cases, of course, you're right.
> However, we want to keep the comment as is. Note: A will have a different overall hash due to having a different parent, but its tree hash should remain identical to before.
Remember, you started with:
... -- C -- A
which I would draw as:
...--C--A <-- branchname (HEAD)
to indicate that some existing branch named branchname points to commit A, and HEAD is attached to branchname.
You then ran git rebase -i <hash-of-C> or similar. This gives you a list of things to do, and you choose to "edit" A. Git now:
Detaches HEAD at the target of the rebase:
       A   <-- branchname
      /
...--C   <-- HEAD
Copies commit A (using an exact copy / fast-forward in this case, so that it re-uses A itself; you can disable this with --no-ff if you like, although in the end it makes no difference):
       A   <-- HEAD, branchname
      /
...--C
or (using --no-ff to force a copy):
       A   <-- branchname
      /
...--C--A'   <-- HEAD
At this point you would make some changes and run git add and git commit --amend to shove the current commit out of the way and make HEAD point to the new commit B, whose parent is C. Let's say you did not use --no-ff; the result is then:
       A   <-- branchname
      /
...--C--B   <-- HEAD
(If you did use --no-ff, there's an additional A' hanging out without a name; it will get garbage-collected in about a month. Then we'd have to call the next copy A" to tell them apart, so let's assume you didn't use --no-ff.)
Now you want to get the files from commit A, and the commit message from commit A, and make a new commit. Since branchname still points to the original commit A, just do that:
$ git checkout branchname -- . # assumes you're at the top level of your repo
$ git commit -C branchname # or -c if you want to edit it again
Now you have:
       A   <-- branchname
      /
...--C--B--A'   <-- HEAD
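
One caveat worth checking before relying on the tree hash: git checkout <tree-ish> -- . runs in overlay mode by default, so it never removes files. If B contains a file that A does not, that file stays in the index and the new commit's tree will not match A's. The sketch below illustrates the difference in a throwaway repository. It assumes Git 2.23 or later (for git restore), and the file names are hypothetical. In the scenario above, the equivalent would be git restore --source=branchname --staged --worktree -- . (or git checkout --no-overlay branchname -- . on Git 2.22 or later).

```shell
# Throwaway repository; all file names here are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "kept" > kept.txt
git add kept.txt
git commit -q -m "A"
a_commit=$(git rev-parse HEAD)
a_tree=$(git rev-parse "HEAD^{tree}")

# A later commit adds a file that A does not have.
echo "extra" > extra.txt
git add extra.txt
git commit -q -m "B"

# Overlay mode (checkout's default): extra.txt stays in the index.
git checkout "$a_commit" -- .
overlay_tree=$(git write-tree)

# No-overlay mode (restore's default): extra.txt is removed,
# so the index matches A's tree exactly.
git restore --source="$a_commit" --staged --worktree -- .
restored_tree=$(git write-tree)

echo "A:        $a_tree"
echo "checkout: $overlay_tree"
echo "restore:  $restored_tree"
```

If the split only moves changes around without adding files in B that A lacks, both forms give the same result.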
At this point you finish off the rebase with git rebase --continue. Since there are no commits left to copy (you have finished copying the last commit, A, as far as rebase is concerned), this does the last step of a rebase, which is to peel the branch name off the original commit chain and move it to point to the same commit as HEAD, while re-attaching HEAD:
       A   <-- ORIG_HEAD
      /
...--C--B--A'   <-- branchname (HEAD)
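
The whole sequence up to this point can be sketched as a script against a throwaway repository. This is only an illustration under some assumptions: the second file (other.txt) is hypothetical, and GIT_SEQUENCE_EDITOR with sed is used purely so the todo list can be changed from "pick" to "edit" without opening an editor. In real use you would edit the todo list by hand.

```shell
# Throwaway repository reproducing the question's layout: ...--C--A
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "you@example.com"
git config user.name "Example"

printf '# this is commit C\ndoes_stuff()\n' > test_file.py
git add test_file.py
git commit -q -m "C"

printf '# this is commit A\ndoes_stuff()\n' > test_file.py
echo "other change" > other.txt    # hypothetical second file
git add test_file.py other.txt
git commit -q -m "A"

branchname=$(git symbolic-ref --short HEAD)
orig_commit=$(git rev-parse HEAD)
orig_tree=$(git rev-parse "HEAD^{tree}")

# Mark A as "edit" non-interactively (scripting convenience only).
GIT_SEQUENCE_EDITOR="sed -i.bak 's/^pick/edit/'" git rebase -i HEAD~1

# Turn the commit we stopped at into B: same parent (C), different content.
printf '# this is commit B\ndoes_stuff()\n' > test_file.py
git add test_file.py
git commit -q --amend -m "B"

# Recreate A on top of B: files and message from the original A,
# which the branch name still points to while the rebase is in progress.
git checkout "$branchname" -- .
git commit -q -C "$branchname"

git rebase --continue
new_tree=$(git rev-parse "HEAD^{tree}")
echo "original tree: $orig_tree"
echo "new tree:      $new_tree"
```

The new tip has a different commit hash from the original A, because its parent is now B, but the two tree hashes printed at the end should be the same.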
As a side effect, rebase sets ORIG_HEAD to remember where branchname used to point, so it's easy to make sure everything worked correctly and you ended up in the desired state (this should print nothing if the two trees are identical):
git diff ORIG_HEAD HEAD
and if that's wrong you can git reset --hard ORIG_HEAD, resulting in:
       A   <-- branchname (HEAD)
      /
...--C--B--A'   <-- ORIG_HEAD
Note that other commands, including git reset, set ORIG_HEAD (which is why they swapped here). Eventually one of these two commits will be abandoned entirely except for reflog entries, and when those expire, the unreachable commits will truly go away. By default, reflog entries for unreachable commits expire once they are 30 days old.
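
Since the goal is an identical tree rather than an identical commit hash, you can also compare the tree IDs directly instead of reading a diff. Here is a minimal sketch: same_tree is a hypothetical helper name, not a Git command, and the throwaway repository exists only to give it two commits that share a tree.

```shell
# same_tree REV1 REV2: succeeds when both commits record the same tree.
# Returns 2 if either revision cannot be resolved.
same_tree() {
    t1=$(git rev-parse --verify "$1^{tree}") || return 2
    t2=$(git rev-parse --verify "$2^{tree}") || return 2
    [ "$t1" = "$t2" ]
}

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "content" > file.txt
git add file.txt
git commit -q -m "first"
# An empty commit gets a new commit hash but re-uses its parent's tree.
git commit -q --allow-empty -m "second"

if same_tree HEAD HEAD~1; then
    echo "trees match"
else
    echo "trees differ"
fi
```

After a rebase like the one above, the check would be same_tree ORIG_HEAD HEAD.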
Upvotes: 3