git: How do I undo a reverted merge in the distant past (pushed long ago)?

Question

Suppose I am responsible for maintaining a git repository with a layout like so:

--r1--------r2---r3----mx-----r4-------- [release]
    \                 /         \
     \---x1--*------x2 [topic_x] \
      \       \                   \
       \-----------y1 [topic_y]    \
        \       \    \              \
         `---z1------------*---------mz2----z3---- [topic_z]
                  \    \    \                \
                   `---mt1--mt2---------------mt3--- [test]

The important features of this workflow are as follows:

There are a bunch of topic branches, and they get merged into the test branch when they are ready to be tested.
If everything works OK for the functionality related to a topic branch, then that topic branch is merged into release and becomes available in production systems.
release is occasionally merged into topic branches as a simple way to bring them up-to-date.
test, being just a combination of topic branches, is fairly expendable. It can easily be recreated by merging a finite number of topic branches.
test is never merged into anything else.

In the above example graph,

topic_x was successfully released with the mx merge.
topic_y was abandoned.
topic_z is a work-in-progress.

This workflow works pretty well for us. However, reality likes to introduce mistakes.

Mistakes like this one:

----r5----r6----------M--W---a4-- [release]
     \               /
      `-a1---a2---ma3 [topic_a]
                 /
            --mtN-- [test]

Here's what happened:

A developer accidentally merged the test branch into the topic_a branch.
The developer released topic_a, and it created the M merge commit.
When they looked at the stats at some step in the release process, they saw a long list of changes from the test branch. They immediately realized that they had done something quite wrong!
In a panic, this developer performed a git revert M and pushed the result. This created the W commit.
They collected the intended changes from the topic_a branch (commits a1 and a2), and created a patch (a4) that was directly applied to the release branch.

The damage done wasn't immediately apparent. Later we all realized that reverting merge commits is usually a very unwise thing to do; it is better to git reset M^. But it's too late for this repository, and some time passes...

Remember that topic_z? git sure does, just not the way we want it to:

----------M--W---a4--- < 1 yr of commits > ---*---rHEAD [release]
         /                                     \
------ma3 [topic_a]                             \
     /                                           \
--------------------------------------------------mz [topic_z]
   /
mtN-- [test]

release is merged into topic_z (to bring it up-to-date), and commit 'mz' is created. But something bad happens: all of the changes specific to topic_z are deleted by the 'mz' commit! git does this because the M commit already applied topic_z's changes, and the W commit removed them. So git thinks that the most up-to-date form of topic_z's changes is one in which they are all removed!
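
The effect described above can be reproduced in a throwaway repository. The following is a minimal sketch, not the real repository: the branch names mirror the question, but the file names (`base.txt`, `topic.txt`, `more.txt`) and the temporary directory are invented for illustration, and M merges topic_z directly rather than by way of test and topic_a.

```shell
#!/usr/bin/env bash
# Toy reproduction: a reverted merge makes a later merge delete topic work.
set -euo pipefail

repo=$(mktemp -d)                      # throwaway repository (hypothetical)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/release   # name the initial branch 'release'
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "r5"

# Topic work that has not been released on purpose.
git checkout -q -b topic_z
echo feature > topic.txt
git add topic.txt
git commit -q -m "z1: add topic.txt"

# The accident (M) and the panic revert (W) on release.
git checkout -q release
git merge -q --no-ff -m "M: accidental merge" topic_z
git revert -m 1 --no-edit HEAD > /dev/null

# Later, topic_z continues and is brought up to date from release.
git checkout -q topic_z
echo more > more.txt
git add more.txt
git commit -q -m "z2"
git merge -q -m "mz: bring topic_z up to date" release

# z1 is already an ancestor of release, so the merge base is z1 and
# the only change to topic.txt since then is W's deletion of it.
if [ -e topic.txt ]; then
    echo "topic.txt survived"
else
    echo "topic.txt was deleted by mz"
fi
```

The merge completes without conflicts here because topic_z never touched `topic.txt` again after z1; had it done so, git would most likely report a modify/delete conflict instead of deleting the file silently.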

At this point, I really want topic_z back. And not just topic_z, but all other topics that might have been involved in the M and W commits. And I don't want to have to create patches for them manually and reapply them: these topics can be "non-trivial".

How do I update pre-existing topic branches while preserving original changes, such that I can work on them and merge them back into release at a later date?

To clarify: I want to be really sure that work won't spontaneously disappear in the future as a result of merging.

Also, history-changing is acceptable, if necessary. The team is prepared to push or delete all of their current local branches and then re-clone all repository instances after any history rewrites are done. The catch is this: any history rewriting must preserve the entire repository history, not just a single branch. The intent should be to make it as if M and W had never happened.

Here is what I've tried so far:

Attempt #1:

git checkout release
git reset --hard M^
git checkout -b new_release
git cherry-pick a4^..rHEAD
for ((i=1;i<=100;i++)); do git cherry-pick -m 1; git checkout --ours . && git add . && git commit --allow-empty --no-edit; git cherry-pick --continue; done

This produces a 'new_release' branch with source code contents nearly identical to those on release. The results of git diff release new_release fit on a single screen. Nice! However, all of the tree topology is lost, and all of the topic branches are now referencing the wrong commits in the release branch. This approach might provide some useful knowledge, but has too many deal-breakers to be usable.

Attempt #2:

git checkout release
git rebase -p -i M^
# Remove the M and W commits from the list.
# Merge conflicts will be encountered.
for ((i=1;i<=100;i++)); do git rebase --continue; git checkout --theirs . && git add . && git commit --allow-empty --no-edit; done
# OR:
for ((i=1;i<=100;i++)); do git rebase --continue; git checkout --ours . && git add . && git commit --allow-empty --no-edit; done

The intent here is to use a rebase to remove only the M and W commits. For some reason, this results in a bunch of merge conflicts, and they have to be resolved. Using either the --theirs or the --ours strategies, the end result is a release branch that closely resembles the original in tree topology, but has a much larger diff than the cherry-pick approach. It also still lacks the ability to reconstruct all of the relationships between the release and topic branches. Once again, this approach has too many flaws to be usable in its current form.

Note that the merge conflicts were not caused by the mixture of the -p and -i flags on git rebase. Imagine the M and W commits are at the top of the list in the rebase edit list: I can delete them, and rebase will have no choice but to parent everything to the correct commit. While this is not exactly true, because there were a couple other commits at the top of the list, those weren't important either and I deleted them. This rebase unambiguously parents things to the correct commit (M^).

Also note that I tried the -s and -X options with rebase before resorting to the nasty for-loop in bash. They didn't seem to have any effect at all, and allowed plenty of merge conflicts to happen, even with the --ours and --theirs strategies.

chadjoan · Accepted Answer

I have found a way to do this that works for me.

Even better yet: it's a fast-forward change!

In other words: it doesn't rewrite history, so there is no need to bug other team members about recloning repositories and such (though we will probably have to discard some unused branches after this).

First, make a clean copy of my repository (no local changes, stash, wip, etc) and pull in all of the current remote branches:

git clone /repo_url/or/path ~/temp_repo_copy
cd ~/temp_repo_copy
git pull --all; for remote in $(git branch -r | grep -v '>'); do git branch --track "${remote#origin/}" "$remote"; done
git fetch --all
git pull --all

Then create a branch off of release, called 'release_sans_revert', whose sole purpose is to update topic branches without the influence of the undesirable commits from the past:

git checkout release
git reset --hard M^
# If you need to be specific: git reset --hard r6
git branch release_sans_revert
git pull  # Fast-forward 'release' back to HEAD, where it should be.
git checkout release_sans_revert
git cherry-pick W..rHEAD
# If the cherry-pick fails, run this to resolve all of the conflicts:
# (NOTE: you must edit this line before it will run for you.)
for ((i=1;i<=100;i++)); do git cherry-pick -m 1; git checkout --ours . && git add . && git commit --allow-empty --no-edit; git reset  ; git cherry-pick --continue; done
# Be mindful of unanticipated cherry-pick failures that need
# human intervention of some manner.  Typically, such
# failures just mean that the above bash line needs to be
# tuned a bit more.
# Repeat until all cherry-picking is completed.
git diff HEAD release # And verify that we arrived at a state similar/identical to the 'release' branch.
git diff HEAD release -p | git apply -
git reset --soft M^
git add .
git commit  # with a message like so:
# Synchronize 'release_sans_revert' up to 'release'.
#
# This is a squash of many months of commits between
# a4 and rHEAD (inclusive)
# (Commits between approx.  and )
#
# This commit is one step in the process of mitigating the negative
# effects of commits M and
# W.  The former merged test code
# into the 'release' branch, and the latter reverted the former, thus
# instructing git to exterminate all of the test branch changes with
# extreme prejudice, even if they exist in other branches, and even in
# the future.  This is essentially a release branch that can catch
# things up to rHEAD without the
# damaging effects of those two merge/revert commits.

You now have a way to update the topic branches without destroying them. BUT--we aren't done yet. To be able to merge the topic branches back into the release branch, we'll have to "immunize" them from deletion by asserting that they should exist. The process for making such an assertion will look like this:

git checkout topic_z
# z3 below represents the last commit on the topic branch before
# it has any ancestry in common with M or W.
# In other words: it is the last commit on the topic branch before
# everything went all pear-shaped.
# See the first graph in the question.
git reset --hard z3
git branch topic_z_fixed
git pull  # Fast-forward 'topic_z' back to where it was.
git checkout topic_z_fixed
git merge --no-ff release_sans_revert
# There will probably be merge conflicts.
# If so, then this is the part of the process that may require human judgement.
git commit
git merge --no-ff release
# The line above should apply W's changes and try to delete all
# of your topic-specific code.  Don't let it!
# If conflicts, do this:
    git reset .
    git checkout .
    git commit
# If no conflicts: git just deleted changes specific to the topic
# branch.  I haven't tested this path.  What you probably want to do
# is revert the commit that deletes your topic code.

Now you should have a topic branch that will hold its own during merges to/from other branches. Repeat with all other topic branches involved in the original mistake.

This should give you a history like so:

          .-------------------------------mz0'------------------------mz' [topic_z_fixed]
         /                               /                           /
        /         .------------------- r1yr [release_sans_revert]   /
       /         /                                                 /
----------r5---r6----------M--W---a4-- < 1 yr of commits > --*---rHEAD [release]
     /     \              /                                   \
    /       `-a1---a2---ma3 [topic_a]                          \
   /                   /                                        \
--z3-------------------------------------------------------------mz [topic_z]
                     /
                 --mtN-- [test]

Where r1yr is a commit representing a squashed form of a4 and < 1 yr of commits >,
mz0' is the commit that brings the 'topic_z_fixed' branch up-to-date, and
mz' is the commit where we 'merge --no-ff release' and then reset/checkout/commit
to assert that the topic contents should be kept in the repository.
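
Picking z3 by eye from a large graph can be error-prone. As a rough sketch, one way to locate it is to walk the topic branch's first-parent history and stop at the newest commit that does not have M as an ancestor. The helper name `last_clean_commit` and the toy repository below are hypothetical, not part of the procedure above; only `git rev-list --first-parent` and `git merge-base --is-ancestor` are real git commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

# last_clean_commit BRANCH BAD
# Print the newest commit on BRANCH's first-parent chain that does NOT
# have BAD as an ancestor. Returns 1 if every commit descends from BAD.
last_clean_commit() {
    local branch=$1 bad=$2 c
    for c in $(git rev-list --first-parent "$branch"); do
        if ! git merge-base --is-ancestor "$bad" "$c"; then
            echo "$c"
            return 0
        fi
    done
    return 1
}

# --- Toy repository to exercise the helper (all names are invented) ---
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/release
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "r5"

git checkout -q -b topic_z
echo feature > topic.txt
git add topic.txt
git commit -q -m "z1"

git checkout -q release
git merge -q --no-ff -m "M" topic_z
m=$(git rev-parse HEAD)                       # the bad merge
git revert -m 1 --no-edit HEAD > /dev/null    # W

git checkout -q topic_z
echo more > more.txt
git add more.txt
git commit -q -m "z2"
z2=$(git rev-parse HEAD)                      # last commit before the damage
git merge -q -m "mz" release                  # pulls M and W into topic_z

# Should print the hash of z2: mz descends from M, z2 does not.
last_clean_commit topic_z "$m"
```

This assumes the topic branch's own commits sit on its first-parent chain, which holds when release was merged into the topic and not the other way round. It is worth confirming the result against `git log --graph` before running `git reset --hard` on it.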

It should then be possible to make commits on the topic_z_fixed branch and then merge it back into release later on, with no ill effect.
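
To check that claim on something small, here is a toy end-to-end run of the whole procedure. It is a sketch under simplifying assumptions: M merges topic_z directly, release has a single later commit (a4), the file names are invented, and `git merge -s ours release` stands in for the manual `git reset .` / `git checkout .` / `git commit` step, since both record the merge while keeping the topic branch's tree.

```shell
#!/usr/bin/env bash
set -euo pipefail

repo=$(mktemp -d)                      # throwaway repository (hypothetical)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/release
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "r6"
r6=$(git rev-parse HEAD)               # this will be M^

git checkout -q -b topic_z
echo feature > topic.txt
git add topic.txt
git commit -q -m "z3"

# The mistake on release: M, then W, then later work (a4).
git checkout -q release
git merge -q --no-ff -m "M" topic_z
git revert -m 1 --no-edit HEAD > /dev/null
echo patch > a4.txt
git add a4.txt
git commit -q -m "a4"

# release_sans_revert: release's later work replayed on top of M^.
git checkout -q -b release_sans_revert "$r6"
git cherry-pick release > /dev/null    # replays the tip commit, a4

# topic_z_fixed: start at z3, catch up, then mark release as merged
# while keeping our own tree (stand-in for reset/checkout/commit).
git checkout -q -b topic_z_fixed topic_z
git merge -q --no-ff -m "mz0'" release_sans_revert
git merge -q -s ours -m "mz'" release

# Releasing the fixed topic now brings topic.txt back onto release.
git checkout -q release
git merge -q --no-ff -m "release topic_z_fixed" topic_z_fixed
ls
```

Keep in mind that `-s ours` discards everything the other side changed, so it is only safe here because release_sans_revert was merged first and already carried release's real work into the topic branch.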

topic_z was unchanged in this process, so it will, of course, still be messed up. In my case, it was safe to delete the unfixed topic branches entirely. Once you are confident that you no longer need to update any more topic branches, then release_sans_revert can also be deleted, although accidentally deleting it earlier will only cost you the trouble of having to generate it again. In that sense, this solution involves an action that is similar to a history rewrite: that of deleting branches. Fortunately, branch deletion is not quite as severe as rehashing large swathes of commits, though it does come with some small risks.
