Atihska

Reputation: 5126

How to git revert merge commits done by cherry-picking

I want to revert these 3 commits 295e8ce6e63e39b87548f2f52b3eb0d5139999ef, fb690eaec7938abbd614b0b07a5364998c09f7a5 and 92ed5ce3d6b4cb70e56f74f17e80df960311b647. I was able to revert the last one 295e8ce6e63e39b87548f2f52b3eb0d5139999ef but unable to revert the bottom 2. I tried the following:

$ git revert fb690eaec7938abbd614b0b07a5364998c09f7a5 -m 1
$ git revert 92ed5ce3d6b4cb70e56f74f17e80df960311b647 -m 1

And I get

Already up to date!
On branch feature
Your branch is up to date with 'origin/feature'.

nothing to commit, working tree clean

But my git log does not show reverts for the last 2 commits; it remains the same:

commit d81838a9dc973acbe03865dce0533bd41e77860e (HEAD -> feature, origin/feature)
Author: abc
Date:   Thu Nov 5 14:08:59 2020 -0800

    Revert "merge PR #173"
    
    This reverts commit 295e8ce6e63e39b87548f2f52b3eb0d5139999ef.

commit 295e8ce6e63e39b87548f2f52b3eb0d5139999ef
Author: xyz
Date:   Fri Aug 28 22:30:12 2020 -0700

    merge PR #173

commit fb690eaec7938abbd614b0b07a5364998c09f7a5
Merge: 3cef115 92ed5ce
Author: abc
Date:   Thu Nov 5 12:29:19 2020 -0800

    Merging changes from e6a35ad0b2363932ac190ec602a7fd0c8bf9f04f

commit 92ed5ce3d6b4cb70e56f74f17e80df960311b647 (temp)
Author: xyz
Date:   Wed Sep 2 17:32:07 2020 -0700

    readjust null values for double data type from v5

Upvotes: 1

Views: 3045

Answers (1)

torek

Reputation: 488003

The behavior you are seeing is almost certainly normal. (I cannot be sure without access to the repository in question.)

I can reproduce something very similar, where git revert does nothing:

$ git revert 6af46d5b31c241a666c90fb77ae54baac842b251
On branch master
nothing to commit, working tree clean

Note that I am not reverting a merge commit here. The fact that you're reverting a merge doesn't change the result of the revert, but it does explain how you ended up in a situation where the reverts you are asking for do nothing. There is nothing to do, so Git does nothing.

But what is that explanation? Why is no more reverting necessary? Well, for that, we have to see what revert is, and how it works. It's actually directly related to git cherry-pick: in fact, both git cherry-pick and git revert use the same code.

Understanding git revert

Let's look first at the goal of a git revert, then at how Git accomplishes the revert. Remember that in Git, the verb revert—which literally means turn back, with a French and Latin etymology, descending from revertere—does not mean "return to" but rather means something more along the lines of "undo". In the English language (including the peculiar Americanized version of English, where -ised is spelled -ized), the verb revert is often used with the auxiliary preposition to, as in revert to (some previous state). That's not what Git means by revert, though. Git's meaning is more of a back out: we want Git to take some commit, and treat it as a change to some set of files. Having found out what file(s) changed, in what way, we want Git to now back out those changes: i.e., take what state we have now and apply a reversal of the changes found in the other commit.

So, we want Git to pretend that a commit is a change, but commits aren't actually changes! Each commit holds a full snapshot of every file. Fortunately, every commit has a unique number, or hash ID, and each commit also remembers the hash ID of its previous (parent) commit. So Git can take the string of commits that you'd see in git log output:

... <-F <-G <-H

where H is some hash ID of some commit, and extract not only H's snapshot, but also earlier commit G's snapshot. Git can then compare the G-commit snapshot to the H-commit snapshot. For files that are the same, nothing happened to those files. For files that exist in both commits, but are different, whatever is different in those files is a change. If there are files in G that aren't in H, those files were deleted, and if there are files in H that aren't in G, those files were newly added.

This is always how Git presents commits to you as changes: by walking through the commit graph, made up of each commit's connection to its parent. When you have:

...--F--G--H   <-- somebranch

the name somebranch allows Git to find the latest commit H, and H allows Git to find earlier commit G, which allows Git to find earlier commit F, and so on. From these snapshots, Git can show you changes. Git simply compares the parent commit, such as G, to the child commit such as H.
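To make the parent-vs-child comparison concrete, here is a toy sketch in Python. It is not Git's code: each snapshot is modeled as a plain dictionary mapping a path to that file's content, and the function name diff_snapshots and the file names are made up for illustration.

```python
def diff_snapshots(parent, child):
    """Turn two full snapshots ({path: content}) into a set of changes."""
    added = sorted(p for p in child if p not in parent)
    deleted = sorted(p for p in parent if p not in child)
    modified = sorted(p for p in parent
                      if p in child and parent[p] != child[p])
    return {"added": added, "deleted": deleted, "modified": modified}


# Hypothetical snapshots standing in for commits G (parent) and H (child).
G = {"README": "hello\n", "main.py": "print(1)\n", "old.txt": "x\n"}
H = {"README": "hello\n", "main.py": "print(2)\n", "new.txt": "y\n"}

changes = diff_snapshots(G, H)
print(changes)
# {'added': ['new.txt'], 'deleted': ['old.txt'], 'modified': ['main.py']}
```

README is identical in both snapshots, so it does not show up at all: nothing happened to it.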

If we have a long chain like this:

...--C--D--E--F--G--H   <-- somebranch

and the change we made in E, i.e., the D-vs-E difference, turns out to be a bad idea, we can just tell Git: revert the changes we made in E. Git will now extract both D and E, see what changed, and try to undo that same change. If we added a new file newfile, Git can remove the file newfile entirely. If we added some lines to some existing file, Git can remove those lines from that file. If we removed some lines from some file, Git can put those lines back.

But there's an obvious problem. Suppose we added a new file lib.py with two library routines, in commit E. Suppose that by commit H, lib.py has a dozen library routines. Removing lib.py entirely will break everything. So the back-out-some-changes action that git revert has to perform must be tempered somehow. There are many possible "somehow"s here; the one Git chooses is to use its three way merge engine.

Before we look at how git revert uses Git's merge engine, we must first look at git merge itself.

Merging: to merge as a verb and a merge as a noun

In Git, we typically use git merge—or run git pull to make Git run git merge—to perform the merging action, which often results in a merge commit. The word merge in the phrase a merge commit acts as an adjective, but we often call a merge commit "a merge" for short. Here the word merge acts as a noun.

We produce the snapshot for a merge (or a merge commit, if you prefer) by doing some particular set of actions. These actions are the process of merging. They implement the verb form of merge, to merge (some files and/or some commits). The exact set of actions does of course matter, but most importantly, in order to do this merging action, we need not two but three inputs. We have some original thing (a commit, or a file within a commit) and two modified versions of that original thing, which we call "ours" and "theirs". We have the merge engine, which is the code that produces the final file or files, examine all three of these inputs and figure out what the result should be. See also three way merge in Merge (version control) in Wikipedia and/or Why is a 3-way merge advantageous over a 2-way merge?

When merging separate branches, this all makes a particularly good amount of sense. The three inputs are three commits. We have some series of commits that look something like this:

          I--J   <-- br1 (HEAD)
         /
...--G--H
         \
          K--L   <-- br2

We are on branch br1, at commit J. We run git merge br2 to merge commit L. The common starting point of this process is commit H: the best shared commit, on both branches. The merge engine proceeds by comparing the snapshot in H—what we both started with—to the snapshot in J to see what we changed, and comparing H to L to see what they changed. The merge engine then combines our changes with their changes, applies the combined changes to the snapshot in H, and produces the final snapshot.

When we use git merge to invoke the merge engine, the git merge command itself makes the resulting merge commit M like this:

          I--J
         /    \
...--G--H      M   <-- br1 (HEAD)
         \    /
          K--L   <-- br2

and now we have a merge (noun) produced by merging (verb) commits J and L using commit H as the common starting point.
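The three-input idea can be sketched in a few lines of Python. This is a deliberately crude model, not Git's merge engine: it works on whole files only (real Git also merges line by line within a file), snapshots are plain dictionaries, and the name merge3 and the file names are assumptions made for illustration.

```python
def merge3(base, ours, theirs):
    """Toy whole-file three-way merge over {path: content} snapshots.

    Returns (merged_snapshot, conflicting_paths).
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (or neither touched it)
            pick = o
        elif b == o:      # only they changed this path: take theirs
            pick = t
        elif b == t:      # only we changed this path: keep ours
            pick = o
        else:             # both changed it differently
            conflicts.append(path)
            pick = o      # keep ours as a placeholder
        if pick is not None:   # None means "file absent"
            merged[path] = pick
    return merged, conflicts


# Hypothetical snapshots for the merge base H and the two tips J and L.
H = {"a.txt": "1", "b.txt": "1"}
J = {"a.txt": "2", "b.txt": "1"}                   # we changed a.txt
L = {"a.txt": "1", "b.txt": "1", "c.txt": "new"}   # they added c.txt

M, conflicts = merge3(H, J, L)
print(M, conflicts)
# {'a.txt': '2', 'b.txt': '1', 'c.txt': 'new'} []
```

The snapshot for M carries both our change and theirs, because each path was changed on at most one side relative to H.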

How git cherry-pick uses the merge engine

Imagine for a moment that we have a branch-y situation like this:

          I--J   <-- br1 (HEAD)
         /
...--G--H
         \
          K--L   <-- br2

That is, we're on br1 and doing some work. We realize at this point that we need exactly what we, or someone anyway, did in commit L, because it fixes some bug. But we're not ready to get the effect of commit K, so we don't want to merge br2 completely. Instead, we'd like to examine commit L, see what changed in commit L, and grab the same changes for ourselves.

This is what git cherry-pick does: it compares the snapshots in, say, K and L to each other, to see what changed in L, then it makes the same changes where we are, at commit J, and makes a new commit. But the way it does this is a little sneaky: it uses the same merge engine as git merge.

Suppose we designate commit K as the (fake!) merge base or shared common commit, and use commit J as our current commit because it is. Meanwhile we use commit L as "their" commit. That is, these are the only "interesting" commits:

             J   <-- br1 (HEAD)
   
          K--L   <-- br2

We have Git, through its merge engine, compare K vs L to see what they changed: those are the changes we'd like to add to our commit. Then we also have Git compare K vs J, to see what "we changed": those are differences from K that we would like to keep.

To the extent that these two sets of changes don't conflict, it's safe—well, as safe as Git can automate, anyway—to take the changes from L. To the extent that they do conflict, Git will declare a merge conflict, pause the operation, and allow us to resolve the conflict.

If all goes well, Git makes an ordinary non-merge commit at the end:

          I--J--L'  <-- br1 (HEAD)
         /
...--G--H
         \
          K--L   <-- br2

where L' is the copy of L. If not, Git pauses with a merge conflict. We fix the conflict and run git cherry-pick --continue or git commit to make commit L'.
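Here is the cherry-pick trick as a toy Python sketch, under the same kind of simplification: whole-file snapshots in dictionaries, with a made-up merge3 helper that is not Git's real merge code. The only thing that makes it a cherry-pick is the choice of inputs: base is K (the parent of the picked commit), ours is J, theirs is L.

```python
def merge3(base, ours, theirs):
    """Toy whole-file three-way merge over {path: content} snapshots."""
    merged = {}
    for path in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t or b == t:   # same on both sides, or they changed nothing
            pick = o
        elif b == o:           # only they changed it: take theirs
            pick = t
        else:
            raise ValueError(f"conflict in {path}")
        if pick is not None:   # None means the file is absent
            merged[path] = pick
    return merged


# Hypothetical snapshots: K added unfinished work, L fixed a bug on top.
J = {"app.py": "v2", "util.py": "buggy"}                       # ours
K = {"app.py": "v1", "util.py": "buggy", "feature.py": "wip"}  # fake base
L = {"app.py": "v1", "util.py": "fixed", "feature.py": "wip"}  # theirs

L_prime = merge3(base=K, ours=J, theirs=L)
print(sorted(L_prime.items()))
# [('app.py', 'v2'), ('util.py', 'fixed')]
```

We get the fix from L and keep our own app.py, while feature.py from K stays out: K-vs-L never touched it, so it is not part of "what they changed".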

Revert is just a cherry-pick done backwards

If we have:

             J   <-- br1 (HEAD)
   
          K--L   <-- br2

and compare K-vs-L to see what they changed, and K-vs-J to see what we changed, we get a cherry-pick operation. But what if we have Git pick commit L as the "base" commit, K as "theirs", and J as "ours"? Then the changes we have, that we want to keep, are those from L to J, and the changes from L to K—i.e., the inverse of the changes they actually made—are the changes we'd like to add.

This works whether or not commits K and L are in our own past! If they are, that is, if we have something like this:

...o--o--K--L--o--o--o--J   <-- branch (HEAD)

then L-vs-J is everything we want to keep, and L-vs-K is what we want to "add". It's already the inverse of K-vs-L, so it will "undo" or back out the changes from K to L.

So git revert simply applies the merge engine as before, with the commit being reverted (L) as the common starting point, its parent (K) as "their" commit, and our current commit J as "our" commit. If all goes well, Git makes a new commit on its own:

...o--o--K--L--o--o--o--J--Γ   <-- branch (HEAD)

where commit Γ "undoes" L (gamma looks like an upside-down L).
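As a toy Python sketch (whole-file snapshots in dictionaries, a made-up merge3 helper, not Git's actual implementation), a revert is the same call as a cherry-pick with the roles of K and L swapped: base is L, theirs is K, ours is J.

```python
def merge3(base, ours, theirs):
    """Toy whole-file three-way merge over {path: content} snapshots."""
    merged = {}
    for path in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t or b == t:   # same on both sides, or they changed nothing
            pick = o
        elif b == o:           # only they changed it: take theirs
            pick = t
        else:
            raise ValueError(f"conflict in {path}")
        if pick is not None:   # None means the file is absent
            merged[path] = pick
    return merged


# Hypothetical linear history K -> L -> ... -> J; L broke b.txt.
K = {"a.txt": "1", "b.txt": "1"}
L = {"a.txt": "1", "b.txt": "bad"}
J = {"a.txt": "2", "b.txt": "bad", "c.txt": "later"}

gamma = merge3(base=L, ours=J, theirs=K)   # models: git revert L
print(sorted(gamma.items()))
# [('a.txt', '2'), ('b.txt', '1'), ('c.txt', 'later')]
```

The change to b.txt is backed out, while everything done after L (the edit to a.txt and the new c.txt) is kept.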

When does this do nothing?

Whether it's a cherry-pick or a revert, this action does nothing when the changes we're asking Git to add are already there. Suppose, for instance, we have:

          I--J   <-- br1 (HEAD)
         /
...--G--H
         \
          K--L   <-- br2

as in the original cherry-pick example. Suppose further that commit L fixes a bug in commit K, but that commits I and/or J already fixed the same bug (and maybe did more stuff too). If we run git cherry-pick br2 to ask Git to copy commit L here, Git will compare K-vs-L to see what they changed, and compare K vs J to see what we changed. What we changed already includes everything, and there is nothing to add. The cherry-pick simply stops. Due to a quirk of the implementation—cherry-pick ends by running an internal commit—you also get the messages about there being nothing to commit.

The same goes for a revert: if we ask to revert L, but somewhere in our work, we already achieved that, there's nothing to do. The git revert tries to commit, but there is nothing to commit, and you get the messages you see.
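Both do-nothing cases can be seen in a toy Python model (whole-file snapshots in dictionaries, a hypothetical merge3 helper, not Git's code): when the merge result is identical to "our" snapshot, there is nothing to commit.

```python
def merge3(base, ours, theirs):
    """Toy whole-file three-way merge over {path: content} snapshots."""
    merged = {}
    for path in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t or b == t:   # same on both sides, or they changed nothing
            pick = o
        elif b == o:           # only they changed it: take theirs
            pick = t
        else:
            raise ValueError(f"conflict in {path}")
        if pick is not None:   # None means the file is absent
            merged[path] = pick
    return merged


# Cherry-pick case: L fixes a bug that J has already fixed the same way.
K = {"util.py": "buggy"}
L = {"util.py": "fixed"}
J = {"util.py": "fixed", "app.py": "v2"}
picked = merge3(base=K, ours=J, theirs=L)

# Revert case: later work (J2) already put x.txt back to what it was in K2.
K2 = {"x.txt": "1"}
L2 = {"x.txt": "2"}
J2 = {"x.txt": "1", "y.txt": "more"}
reverted = merge3(base=L2, ours=J2, theirs=K2)

print(picked == J, reverted == J2)
# True True
```

In both cases the merged snapshot equals the one we already have, which is the situation where Git reports a clean working tree and nothing to commit.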

Why is reverting a merge special?

In some ways, reverting—or cherry-picking—a merge isn't special at all. But there is one way it is. Remember that a merge commit has two previous snapshots, and that both git cherry-pick and git revert need to find the parent of the child commit that you name in your git command. A merge doesn't have the parent: it has two parents (or more, in the case of an octopus merge, but we won't go there).

You must therefore name which parent you want Git to use as the merge-base (cherry-pick) or "their" commit (revert). That's the -m 1 in your command line.

The other thing that we have to consider is what's in the merge's snapshot. Remember that a merge commit, like our M in this example:

          I--J
         /    \
...--G--H      M   <-- br1 (HEAD)
         \    /
          K--L   <-- br2

is built by taking the base H and adding all the changes from H-vs-J to all the changes from H-vs-L.

To this, we might add some more commits if we like:

          I--J
         /    \
...--G--H      M--N--O   <-- br1 (HEAD)
         \    /
          K--L   <-- br2

Now suppose we ask Git to revert M, using J as the parent. Git will:

  • treat M as the starting-point for the merge;
  • treat O as "our" commit, hence find M-vs-O as what to keep; and
  • treat J as "their" commit, hence find M-vs-J as what to add.

If we don't add any commits, this simplifies a bit:

          I--J
         /    \
...--G--H      M   <-- br1 (HEAD)
         \    /
          K--L   <-- br2

Git now treats M as both the base commit and our commit:

  • find M-vs-M as what we changed that we should keep (nothing at all);
  • find M-vs-J as what they changed that should be added to this.

Adding M-vs-J, what "they changed", to nothing, of course gives whatever it takes to convert the snapshot in M back to that in J. So the result is:

          I--J
         /    \
...--G--H      M--J'  <-- br1 (HEAD)
         \    /
          K--L   <-- br2

where J' indicates that the snapshot in the new commit matches the snapshot in commit J. We've just undone the entire merge, as far as source snapshots go.

Now, what if commit L was itself the result of an earlier merge? That is, what if instead of the above, we started with:

          I--J
         /    \
...--G--H      M   <-- br1 (HEAD)
        |     /
        : _--L   <-- br2
        ::  /
         :..

where there is some big messy history between H and L, so that L is a merge. No matter how we got there though, the snapshot in L combines all the work from H up to that point. If we now revert M using J as the parent, to get a new commit J', commit J' has undone everything we brought in via L, including any merges made to arrive at L. So reverting any of the merges in the history leading from H up to L now has no effect: their changes are already gone.

That, I think, is what is happening in your case. Git is doing nothing because there is nothing to do.
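This last scenario can be played through in a toy Python model. Everything here is an assumption for illustration: snapshots are whole-file dictionaries, merge3 is a made-up stand-in for Git's merge engine, and P (an inner merge somewhere between H and L) and P1 (its first parent) are hypothetical commits, not ones from your repository.

```python
def merge3(base, ours, theirs):
    """Toy whole-file three-way merge over {path: content} snapshots."""
    merged = {}
    for path in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t or b == t:   # same on both sides, or they changed nothing
            pick = o
        elif b == o:           # only they changed it: take theirs
            pick = t
        else:
            raise ValueError(f"conflict in {path}")
        if pick is not None:   # None means the file is absent
            merged[path] = pick
    return merged


H = {"base.txt": "h"}
J = {"base.txt": "h", "i.txt": "i"}
P1 = {"base.txt": "h"}                    # first parent of inner merge P
P = {"base.txt": "h", "side.txt": "s"}    # inner merge on the way to L
L = {"base.txt": "h", "side.txt": "s", "k.txt": "k"}

M = merge3(base=H, ours=J, theirs=L)             # git merge br2
J_prime = merge3(base=M, ours=M, theirs=J)       # git revert -m 1 M
again = merge3(base=P, ours=J_prime, theirs=P1)  # git revert -m 1 P

print(J_prime == J, again == J_prime)
# True True
```

Reverting M already removed side.txt and k.txt, so the second revert finds that P-vs-P1 has nothing left to take away and produces the same snapshot we already had.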

Upvotes: 4
