Mezuzza

Reputation: 439

Is it possible to modify a parent commit without changing the files in the current commit?

There are cases where I want to split an existing commit into two for the sake of code review or a clearer version history. When splitting a commit, though, inconsistencies tend to appear, or the new parent commit has to be modified to keep the build working. I then end up doing a lot of manual work to track and preserve the final code I had written.

Is it possible to run a rebase without modifying the tree hash of the final commit?

Example: we have 3 commits, A, B, and C.

This is commit A

# test_file.py
# this is commit A
does_stuff()

This is a new commit B

# test_file.py
# this is commit B
does_stuff()

This is a new commit C

# test_file.py
# this is commit C
does_stuff()

where the original tree looked like

... -- C -- A

But we want to "split" A into two commits

... -- C -- B -- A

When we create B using an interactive rebase, A's comment will also get updated to show # this is commit B (assuming there's a few other changes other than just this one file). However, we want to keep the comment as is. Note: A will have a different overall hash due to having a different parent, but its tree hash should remain identical to before.

Upvotes: 3

Views: 309

Answers (1)

torek

Reputation: 489173

It's not possible to modify any commit. Instead, you copy some existing set of commits to a new set of commits. This is actually good news, because it means the original commits remain available.
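As a minimal sketch of that point, the following script amends a commit in a throwaway repository and shows that the "modified" commit is really a new object, while the original one still exists. The file name, commit messages, and identity settings are placeholders chosen for illustration.

```shell
#!/bin/sh
set -e

# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "does_stuff()" > test_file.py
git add test_file.py
git commit -q -m "original"
old=$(git rev-parse HEAD)

# "Modify" the commit: Git writes a new commit and moves HEAD to it.
git commit -q --amend -m "amended"
new=$(git rev-parse HEAD)

echo "old: $old"
echo "new: $new"

# The original commit object is still in the object database.
old_type=$(git cat-file -t "$old")
echo "type of old object: $old_type"
```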

When we create B using an interactive rebase, A's comment will also get updated to show # this is commit B (assuming there's a few other changes other than just this one file).

In the specific example you showed, it shouldn't: you should get a merge conflict.

For other cases, of course, you're right.

However, we want to keep the comment as is. Note: A will have a different overall hash due to having a different parent, but its tree hash should remain identical to before.

Remember, you started with:

... -- C -- A

which I would draw as:

...--C--A   <-- branchname (HEAD)

to indicate that the existing branch named branchname points to commit A, and HEAD is attached to branchname.

You then ran git rebase -i <hash-of-C> or similar. This gives you a list of things to do, and you choose to "edit" A. Git now:

  • detaches HEAD to the target of the rebase:

           A   <-- branchname
          /
    ...--C   <-- HEAD
    
  • copies commit A (using an exact copy / fast-forward, in this case, so that it re-uses A itself; you can disable this if you like with --no-ff, although in the end it makes no difference):

           A   <-- HEAD, branchname
          /
    ...--C
    

    or:

           A   <-- branchname
          /
    ...--C--A'   <-- HEAD
    

    (using --no-ff to force a copy).

At this point you would make some changes and run git add and git commit --amend to shove the current commit out of the way and make HEAD point to the new commit B, whose parent is C. Let's say you did not use --no-ff; the result is then:

       A   <-- branchname
      /
...--C--B   <-- HEAD

(If you did use --no-ff, there's an additional A' hanging out without a name; it will get garbage-collected in about a month. Then we'd have to call the next copy A" to tell them apart, so let's assume you didn't use --no-ff.)

Now you want to get the files from commit A, and the commit message from commit A, and make a new commit. Since branchname still points to original commit A, just do that:

$ git checkout branchname -- .     # assumes you're at the top level of your repo
$ git commit -C branchname         # or -c if you want to edit it again

Now you have:

       A   <-- branchname
      /
...--C--B--A'  <-- HEAD
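One caveat about the git checkout branchname -- . step: it overlays A's files onto the current index and work tree, so a file that exists in B but not in A is left in place, and the new commit's tree would then differ from A's. The sketch below reproduces that case in a throwaway repository and uses git read-tree --reset -u as one way to make the index and work tree match A exactly. The extra file only_in_b.py is hypothetical, and B is built here with a plain detached-HEAD commit rather than a real rebase, purely to keep the example short.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branchname
git config user.name "Example"
git config user.email "example@example.com"

# ...--C--A   <-- branchname
printf '# this is commit C\ndoes_stuff()\n' > test_file.py
git add test_file.py
git commit -q -m "C"
printf '# this is commit A\ndoes_stuff()\n' > test_file.py
git add test_file.py
git commit -q -m "A"

# Stand-in for B on a detached HEAD at C; it carries a file A lacks.
git checkout -q --detach HEAD~1
printf '# this is commit B\ndoes_stuff()\n' > test_file.py
echo "scratch" > only_in_b.py
git add test_file.py only_in_b.py
git commit -q -m "B"

a_tree=$(git rev-parse 'branchname^{tree}')

# Overlay checkout: test_file.py is restored, only_in_b.py stays.
git checkout -q branchname -- .
overlay_tree=$(git write-tree)

# Exact match: index and work tree become identical to A's tree.
git read-tree --reset -u branchname
exact_tree=$(git write-tree)

echo "A's tree:      $a_tree"
echo "after overlay: $overlay_tree"
echo "after reset:   $exact_tree"
```

If B only ever contains a subset of A's files, the plain checkout is enough; the read-tree variant matters only when B introduces paths that A does not have.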

At this point you finish off the rebase with git rebase --continue. Since there are no commits left to copy—you have finished copying the last commit, A, as far as rebase is concerned—this does the last step of a rebase, which is to peel the branch name off the original commit chain and move it to point to the same commit as HEAD, while re-attaching HEAD:

       A   <-- ORIG_HEAD
      /
...--C--B--A'  <-- branchname (HEAD)
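The whole sequence up to this point can be replayed in a throwaway repository. This is a sketch under a few assumptions: the helper.py file and the commit contents are made up for illustration, and GIT_SEQUENCE_EDITOR with sed is used only to mark A as "edit" without opening an editor. At the end it compares tree hashes, which is the property the question asks about.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branchname
git config user.name "Example"
git config user.email "example@example.com"
export GIT_EDITOR=true   # never open an interactive editor

# ...--C--A   <-- branchname (HEAD)
printf '# this is commit C\ndoes_stuff()\n' > test_file.py
git add test_file.py
git commit -q -m "C"
printf '# this is commit A\ndoes_stuff()\n' > test_file.py
echo "def helper(): pass" > helper.py
git add test_file.py helper.py
git commit -q -m "A"
orig_a=$(git rev-parse HEAD)

# Start the rebase and turn the first "pick" into "edit".
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -q -i HEAD~1

# Stopped at A: amend it into B (keeps helper.py, different comment).
printf '# this is commit B\ndoes_stuff()\n' > test_file.py
git add test_file.py
git commit -q --amend -m "B"

# Recreate A's files and message on top of B, then finish the rebase.
git checkout -q branchname -- .
git commit -q -C branchname
git rebase --continue

old_tree=$(git rev-parse "$orig_a^{tree}")
new_tree=$(git rev-parse 'HEAD^{tree}')
new_a=$(git rev-parse HEAD)
parent_subject=$(git log -1 --format=%s HEAD~1)
echo "old tree: $old_tree"
echo "new tree: $new_tree"
```

The commit hash changes because the parent changed, but the two tree hashes printed at the end should be identical.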

As a side effect, rebase sets ORIG_HEAD to remember where branchname used to point, so it's easy to make sure everything worked correctly and you ended up in the desired state:

git diff ORIG_HEAD HEAD

and if that's wrong you can git reset --hard ORIG_HEAD, resulting in:

       A   <-- branchname (HEAD)
      /
...--C--B--A'  <-- ORIG_HEAD

Note that other commands, including git reset, also set ORIG_HEAD (which is why the two labels swapped here). Eventually one of these two commits will be abandoned entirely except for reflog entries, and when those expire, the unreachable commits will truly go away. By default, reflog entries for unreachable commits expire once they are 30 days old.
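To see the reflog keeping an abandoned commit reachable, and to check whether the expiry has been changed from its default, something like the following can be run. It is a small sketch in a throwaway repository with placeholder file names and messages; gc.reflogExpireUnreachable is the setting that controls the 30-day period.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "does_stuff()" > test_file.py
git add test_file.py
git commit -q -m "first"
old=$(git rev-parse HEAD)

# Replace the commit; the old one is no longer on any branch.
git commit -q --amend -m "replacement"

# HEAD's reflog still records where HEAD used to point.
git reflog
previous=$(git rev-parse 'HEAD@{1}')

# Prints the configured value, or a note if the default applies.
expiry=$(git config --get gc.reflogExpireUnreachable || echo "unset (30-day default applies)")
echo "gc.reflogExpireUnreachable: $expiry"
```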

Upvotes: 3
