alabamajack

Reputation: 662

Git - Revert merge with history of one branch

I have the great job of cleaning up a fu**ed up git repository. In the past, someone merged the whole Linux kernel source into the repository (with all of its 650k commits). I know the commit IDs of the merge and of its parent. Of course, more changes have landed on the master branch since the Linux merge, so at the moment the tree looks similar to this:

-x-x-x-x-LinuxMerge-x-x-x-x-x-x-x-x-x-today

What I want is to revert the LinuxMerge commit, including its history. Is this possible, and if so, how?

Upvotes: 2

Views: 1092

Answers (3)

Mark Adelsberger

Reputation: 45659

I think this question causes some confusion because you phrase it as wanting to "revert" the mistake, and "revert" means something specific in git. I know that what you mean isn't what git means by that word, because under git's meaning a phrase like "revert history" doesn't make sense.

Because you want to undo the change entirely, a history rewrite is the first step. AnoE's answer shows one way to do this, assuming that there is just one ref from which the bad merge is reachable, and that there are no merge commits "between" that ref and the bad merge.

In the event there are multiple refs, you'd need to do something more. For example if you had

x -- x -- x --- ML -- x -- A -- x <--(master)
               /            \
(linux history)              o <--(branch_one)

completing the rebase would give you

            x' -- A' -- x' <--(master)
           /
x -- x -- x --- ML -- x -- A -- o <--(branch_one)
               /
(linux history)

You'd then need to transplant the o commit, with something like

git rebase --onto A' A branch_one

(replacing A and A' with either the commit ID or some other expression that names the appropriate commit).
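As a rough illustration of the two rebases, here is a throwaway-repo sketch. The file names, commit messages, and the use of master~1 to name A' are assumptions that only hold for this toy layout; in a real repository you would look up A' yourself.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

# Last good commit before the bad merge
echo base > own.txt; git add own.txt; git commit -q -m "x"
prev=$(git rev-parse HEAD)

# Unrelated history standing in for the Linux tree (hypothetical content)
git checkout -q --orphan linux
git rm -rqf .
echo kernel > kernel.c; git add kernel.c; git commit -q -m "linux history"

# ML, then A on master, with branch_one forking off A
git checkout -q master
git merge -q --allow-unrelated-histories -m "ML" linux
ml=$(git rev-parse HEAD)
echo a >> own.txt; git commit -q -am "A"
a_old=$(git rev-parse HEAD)
git checkout -q -b branch_one
echo o > feature.txt; git add feature.txt; git commit -q -m "o"
git checkout -q master
echo x >> own.txt; git commit -q -am "x after A"

# Step 1: rewrite master without the merge
git rebase -q --onto "$prev" "$ml" master
a_new=$(git rev-parse master~1)   # A' (true only in this toy layout)

# Step 2: transplant o from the old A onto A'
git rebase -q --onto "$a_new" "$a_old" branch_one

parent_of_o=$(git rev-parse branch_one~1)
merges_left=$(git rev-list --merges --count master branch_one)
echo "merge commits still reachable from master/branch_one: $merges_left"
```

The linux branch itself is left in place here; removing leftover refs is a separate step.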

If there are merges that need to be rewritten, then you have a bigger issue. The rebase command will try to write a linear history by default. You can tell it you want to keep the merge topology with the --preserve-merges option (deprecated and removed in newer Git releases in favor of --rebase-merges), but it may not work properly. If a merge commit had conflicts, you'll have to re-resolve it. Worse, if a merge commit doesn't have conflicts, but was not originally completed using the default merge result, then rebase will not recreate the merge (or any children of it) correctly.

So the only safe way to rebase, then, is in segments, manually reproducing merges as you encounter them.

Another option might be to use git filter-branch instead of rebase; but this is tricky, too. It's only workable if you can script the removal of anything the merge introduced. For example, if the linux history lives in different paths than your own work, so that you could clean up any given commit by removing certain paths, then you could use filter-branch.

(Since this is an option that may or may not be viable for you, for now I won't spell out the detailed steps. The filter-branch documentation can fill in the blanks. Basically you'd use a parent-filter to bypass the merge commit (by re-parenting the following commit onto the first parent commit), plus an index-filter or tree-filter to remove the linux files from the subsequent commits.)
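For what it's worth, a minimal sketch of that parent-filter plus index-filter combination on a toy repository might look like the following. It assumes the foreign content is the single hypothetical path kernel.c, and that only the commits after the merge need rewriting; treat it as an outline to adapt, not a tested recipe for a real kernel-sized history.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning delay

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

echo one > own.txt; git add own.txt; git commit -q -m "prev"
prev=$(git rev-parse HEAD)

# Unrelated history standing in for the Linux tree
git checkout -q --orphan linux
git rm -rqf .
echo kernel > kernel.c; git add kernel.c; git commit -q -m "linux history"

git checkout -q master
git merge -q --allow-unrelated-histories -m "merge" linux
merge=$(git rev-parse HEAD)
echo two >> own.txt; git commit -q -am "post"
post=$(git rev-parse HEAD)
echo three >> own.txt; git commit -q -am "today"

# parent-filter: give the first commit after the merge the merge's first
#                parent as its only parent; leave all other commits alone.
# index-filter:  drop the foreign path from every rewritten commit.
git filter-branch \
  --parent-filter "test \$GIT_COMMIT = $post && echo '-p $prev' || cat" \
  --index-filter 'git rm -q --cached --ignore-unmatch kernel.c' \
  -- "$merge..master" >/dev/null 2>&1

merges_left=$(git rev-list --merges --count master)
commits=$(git rev-list --count master)
files=$(git ls-tree -r --name-only master)
echo "merges: $merges_left, commits: $commits, files: $files"
```

filter-branch keeps a backup of the old tip under refs/original/, which still reaches the merged history and has to go as well before anything can be garbage collected.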

One way or another, once you have the history cleaned up you would still have all that history in your repo's database. At a minimum you need to make sure nothing references that history. Then it would eventually get cleared out by gc (or you could force that to happen sooner).

Mostly that means you have to find any refs that can reach the linux history. Since the rewrite moved "your" refs, this would likely comprise any refs (branches or tags) pulled in with the linux history itself. So you'd just want to delete those.
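One way to find such refs is git for-each-ref --contains, given any commit from the unwanted history. The sketch below simulates the state after a rewrite: master no longer reaches the foreign commits, but a branch and a tag (both hypothetical names) still do.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
echo mine > own.txt; git add own.txt; git commit -q -m "own work"

# Foreign history, plus a tag that came along with it
git checkout -q --orphan linux
git rm -rqf .
echo kernel > kernel.c; git add kernel.c; git commit -q -m "linux history"
linux_commit=$(git rev-parse HEAD)
git tag v-linux
git checkout -q master

# List every ref from which the foreign commit is still reachable
stale_refs=$(git for-each-ref --contains "$linux_commit" --format='%(refname)')
echo "$stale_refs"

# Delete them (review the list first in a real repository)
for ref in $stale_refs; do
  git update-ref -d "$ref"
done

remaining=$(git for-each-ref --contains "$linux_commit" --format='%(refname)')
```

In a real cleanup, a reasonable choice for the commit to test is the second parent of the old merge, which you can name as <merge-id>^2.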

There also will be reflogs that can (indirectly) reach the linux history, and gc can't remove history that's reachable in this way. Honestly at this point the easiest thing to do is probably to re-clone the repo (as a new clone should only get the current refs and their history) and replace origin with the result.

If, for whatever reason, you want to repair an existing repo instead of re-cloning, the next step would be to wipe out the reflogs (I usually just rm -r .git/logs) and then run an aggressive gc (see the gc docs).
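A sketch of that last step, using git reflog expire as an alternative to deleting .git/logs by hand (my substitution, not necessarily better). The toy repository below drops its only ref to a foreign commit and then shows that the commit survives until the reflogs are expired and gc prunes it.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
echo keep > own.txt; git add own.txt; git commit -q -m "own work"

# Foreign history that we then orphan by deleting its branch
git checkout -q --orphan linux
git rm -rqf .
echo kernel > kernel.c; git add kernel.c; git commit -q -m "linux history"
linux_commit=$(git rev-parse HEAD)
git checkout -q master
git branch -q -D linux

# No ref points at it any more, but the HEAD reflog still does
git cat-file -e "$linux_commit" && echo "still in the object database"

# Expire all reflog entries, then prune unreachable objects right away
git reflog expire --expire=now --all
git gc -q --prune=now --aggressive

if git cat-file -e "$linux_commit" 2>/dev/null; then gone=no; else gone=yes; fi
echo "foreign commit removed: $gone"
```

Expiring every reflog also throws away your safety net for recovering the old history, so it makes sense to do this only once you are happy with the rewritten branches.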

Upvotes: 2

AnoE

Reputation: 8345

You can undo this by rebasing.

If you start out from this...

-x-x-x-x-LinuxMerge-x-x-x-x-x-x-x-x-x-today

... then you are probably talking about this instead:

-x-x-x-x-x-x-x-x-x-x-x-x-x-today
        /
-linus-/

Let's label some more commits:

-x-x-x-prev-merg-post-x-x-x-x-x-x-x-today
             /
     -linus-/

So, you want to glue prev and post together, and throw away merg. The command for this is:

git rebase merg today --onto prev

(Note that in the command, we mention merg, not post; this is the typical "+-1" issue with declaring commit ranges in git).

This rebase command will add a new line of commits:

          post'-y-y-y-y-y-y-y-today'
         /
-x-x-x-prev-merg-post-x-x-x-x-x-x-x-today
             /
     -linus-/

And if you just ignore the older stuff, this flattens out to:

-x-x-x-prev-post'-y-y-y-y-y-y-y-today'

The rebase will also change the today branch to point at the commit labeled today' in this ASCII picture.

Note that post' and the y commits (as well as today') will all have different hashes than the originals; they are not the "same" commits.
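To see this on a small scale, the following sketch builds a toy repository with the same shape and runs the rebase. The file names are made up, the branch is called master instead of today, and prev and merg are shell variables holding commit IDs.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

# prev: the last commit before the unwanted merge
echo one > own.txt; git add own.txt; git commit -q -m "prev"
prev=$(git rev-parse HEAD)

# linus: unrelated history standing in for the kernel
git checkout -q --orphan linus
git rm -rqf .
echo kernel > kernel.c; git add kernel.c; git commit -q -m "linus"

# merg, followed by post and today
git checkout -q master
git merge -q --allow-unrelated-histories -m "merg" linus
merg=$(git rev-parse HEAD)
echo two >> own.txt; git commit -q -am "post"
old_tip_parent=$(git rev-parse HEAD)
echo three >> own.txt; git commit -q -am "today"

# Replay everything after merg onto prev and move master to the new tail
git rebase -q --onto "$prev" "$merg" master

merges_left=$(git rev-list --merges --count master)
commits=$(git rev-list --count master)
new_tip_parent=$(git rev-parse master~1)   # post', a new commit
echo "merges: $merges_left, commits: $commits"
```

The old commits are still in the object database at this point; they only stop being reachable from master.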

If no other tags or branches point to the history leading up to linus, then those commits and related objects will be purged eventually by the git garbage collection (which you could force with git gc to make sure).

Upvotes: 1

Enrico Campidoglio

Reputation: 59913

You have a couple of options here.

Option 1: Rewriting history

If you are able to rewrite the history of the master branch without consequences, the quickest way to achieve what you want is to simply remove the merge commit altogether with git rebase --onto:

git checkout master
git rebase --onto <SHA-1-of-the-linux-merge>^ <SHA-1-of-the-linux-merge>

This means: "take every commit on master that comes after the merge commit and replay it on top of the merge commit's first parent". This will effectively remove the merge commit and apply all subsequent commits on top of its first parent. The git rebase documentation explains --onto in more detail.

Option 2: Reverting the merge

If you want to avoid rewriting history, you can always revert the "LinuxMerge" commit by using git revert:

git revert --mainline 1 --no-commit <SHA-1-of-the-linux-merge>

The --mainline option (short form -m) tells Git which parent of the merge commit to treat as the mainline, that is, the side you want to revert to. In this case, you want to revert to the first parent, which is the commit that the Linux kernel was merged into.

From the documentation:

Usually you cannot revert a merge because you do not know which side of the merge should be considered the mainline. This option specifies the parent number (starting from 1) of the mainline and allows revert to reverse the change relative to the specified parent.

Note that reverting a merge this way will cause later merges of the same branch to exclude the commits that were originally brought in by the reverted merge:

Reverting a merge commit declares that you will never want the tree changes brought in by the merge. As a result, later merges will only bring in tree changes introduced by commits that are not ancestors of the previously reverted merge. This may or may not be what you want.

However, in this case it sounds like you won't be merging the Linux kernel again anytime soon.

As for the --no-commit option, it applies the revert to your working tree and index without creating the commit, so you can check for conflicts and inspect the result before committing it yourself.
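Here is a small sketch of Option 2 on a toy repository, with made-up file names. It reverts the merge with --no-commit, looks at what got staged, and then commits. Note that the merge commit and the merged history remain part of the branch; only the files disappear.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
echo one > own.txt; git add own.txt; git commit -q -m "own work"

# Unrelated history standing in for the Linux tree
git checkout -q --orphan linux
git rm -rqf .
echo kernel > kernel.c; git add kernel.c; git commit -q -m "linux history"

git checkout -q master
git merge -q --allow-unrelated-histories -m "LinuxMerge" linux
merge=$(git rev-parse HEAD)
echo two >> own.txt; git commit -q -am "later work"

# Stage the reverse of the merge relative to its first parent
git revert --no-commit -m 1 "$merge"
staged=$(git diff --cached --name-status)
echo "$staged"

# Nothing conflicted, so record the revert
git commit -q -m "Revert LinuxMerge"

merges_in_history=$(git rev-list --merges --count HEAD)
echo "merge commits still in history: $merges_in_history"
```

This keeps everyone's existing clones valid, at the cost of carrying the 650k kernel commits in the repository forever.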

Upvotes: 0
