Brad

Reputation: 3510

Using git rebase interactive to orchestrate series of git cherry-pick?

git cherry-pick allows simple merges to be cherry-picked simply by indicating which of the merge parents should be used as the baseline. For example:

git cherry-pick -m 1 1234abcdef

I have a bunch of commits that I want to cherry pick, a few of which may be merges, and others will not be. I would like to be able to use git rebase to cherry-pick all these commits with something like this:

git rebase -i --onto myBranch myBranch 

and put the pick list into the interactive file:

p 1234
p 3224
... a bunch more picks
p abcde

And, if git rebase encounters a merge in one of those commits, I want to specify the equivalent of cherry-pick's -m 1 option to indicate the change should be picked against the first parent.

I have tried a number of the merge-related options to rebase, but I always end up getting the error:

commit 3c4ffe04532 is a merge but no -m option was given.

(even if I specify -m to rebase.)

I realize I could write a script using cherry-pick, but I like the existing behavior of rebase -i (it runs through the command list and pauses if it gets to something it can't handle). I'd very much like to leverage that logic directly, but I haven't been able to figure out the right way to finesse rebase's pick command to fill in this gap.

Is there a way to get rebase to adopt cherry-pick's -m # behavior for pick?

To state my goal another way and help clarify the question: I want to use git rebase's -i capability to coordinate a series of git cherry-picks so that any merge conflicts in the process can be resolved manually and then the process can be managed with --continue, --abort and/or --skip.

This would be useful because a simple script consisting of:

git cherry-pick -m 1 e1bed15c97f3f
git cherry-pick -m 1 6b5e6060b0e99
....
git cherry-pick -m 1 1a625d6b45faa

is likely to abort with an error like this:

error: could not apply 6b5e6060b0e99... Implement Something... 
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'

d:\src>git cherry-pick -m 1   e1bed15c97f3f
error: Cherry-picking is not possible because you have unmerged files.
hint: Fix them up in the work tree, and then use 'git add/rm <file>'
hint: as appropriate to mark resolution and make a commit.
fatal: cherry-pick failed

Thanks!

Upvotes: 5

Views: 4435

Answers (1)

torek

Reputation: 487885

mnestorov's comment about using --rebase-merges is relevant here; consider it for the actual problem you're trying to solve (which you haven't really described: I'll note below where things seem to have gone off the rails). It is possible you may be doing something that's a little too tough for git rebase as it stands today. But I think you're doing exactly what -r is designed for.

If -r works, you're done. If your Git is old, you may not have the -r / --rebase-merges option. If so, the best answer is to upgrade your Git.

More about rebase

Let's talk more about rebase in general, starting with this:

Is there a way to get rebase to adopt cherry-pick's -m # behavior for pick?

No: if there were, it wouldn't work anyway, at least not in general. Here's why.

When you use the -m option here to "copy" a merge, you copy it into a non-merge, ordinary commit. The -m option makes Git treat the merge commit as if it were an ordinary commit, with a single parent, and the -m flag tells it which parent to call "the" single parent. But the purpose of a merge commit is, in general, to combine work, from two parents.[1]

Meanwhile, the purpose of git rebase is to repeatedly copy some commit(s), subsequently abandoning the original commits in favor of the new copies. It's not possible to copy a merge commit—cherry-pick's -m doesn't do that; it produces an ordinary commit instead—so rebase normally discards merge commits. I'll show how and why this is the right thing, for the way standard rebase works, below.

git rebase -i --onto myBranch myBranch

Note that the argument to --onto defaults to the other argument, the one the git rebase documentation calls upstream, so when the two are the same this is more simply written as:

git rebase -i myBranch

The set of commits to be copied by this action is limited to no more than those produced by:

git log myBranch..HEAD

That is, suppose we have the following, where newer commits are towards the right and we are currently on branch topic:

          G--H   <-- topic (HEAD)
         /
...--E--F--I--J   <-- myBranch

Running git rebase myBranch, with or without --interactive, tells Git: First, list out those commits reachable from HEAD aka topic, minus any commits that are also reachable from myBranch. That causes Git to list out commits G and H internally. These are the candidates for copying.
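To see the candidate list for yourself, here is a minimal sketch that builds the graph above in a throwaway repository. The temporary directory, the `commit` helper, and the one-file-per-commit layout are assumptions made purely for illustration; only the final `git log myBranch..HEAD` comes from the discussion above.

```shell
#!/bin/sh
# Sketch: rebuild the E-F-I-J / G-H graph in a scratch repository.
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/myBranch   # name the initial branch myBranch
git config user.name "Example"
git config user.email "example@example.invalid"

# Hypothetical helper: each commit adds one file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit E; commit F
git checkout -q -b topic      # topic forks off at F
commit G; commit H
git checkout -q myBranch
commit I; commit J
git checkout -q topic

# Commits reachable from HEAD (topic) but not from myBranch, newest first.
candidates=$(git log --format=%s myBranch..HEAD)
echo "$candidates"
```

The list comes out newest first (H, then G); rebase reverses it before copying.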

If these are the commits that will be copied in the end, and other simplifying assumptions hold, the result is going to be:

          G--H   [abandoned]
         /
...--E--F--I--J   <-- myBranch
               \
                G'-H'  <-- topic (HEAD)

where G' and H' are the copies of original commits G and H, with the copies having two important differences each:

  • The parent of G' is J, instead of F. The files within G' contain the same changes to J that G made with respect to F, so the snapshot stored in G' differs from that in G.
  • The parent of H' is G', instead of G, and its snapshot differs in much the same way.

That is, since each commit holds a full snapshot, we need the snapshots in the copied commits to differ from the snapshots in the originals. The difference in the new snapshots is such that comparing G' vs J produces the same diff, more or less, as comparing G vs F. And of course, the linkage—which in Git always goes backwards—is also different, so that the copies come after the last commit in myBranch.
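The two differences can be checked directly. The following is a sketch under the same illustrative assumptions (scratch repository, hypothetical `commit` helper that adds one file per commit): it records the original G, runs the rebase, and compares parents and diffs.

```shell
#!/bin/sh
# Sketch: after rebasing, G' has parent J and carries the same diff as G.
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/myBranch
git config user.name "Example"
git config user.email "example@example.invalid"

# Hypothetical helper: each commit adds one file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit E; commit F
git checkout -q -b topic
commit G; commit H
git checkout -q myBranch
commit I; commit J
git checkout -q topic

old_G=$(git rev-parse topic~1)      # original G, whose parent is F
git rebase -q myBranch
new_G=$(git rev-parse topic~1)      # the copy G'

new_parent=$(git rev-parse "$new_G^")
branch_tip=$(git rev-parse myBranch)
old_diff=$(git diff "$old_G^" "$old_G")   # G vs F
new_diff=$(git diff "$new_G^" "$new_G")   # G' vs J

echo "G  = $old_G"
echo "G' = $new_G"
[ "$new_parent" = "$branch_tip" ] && echo "parent of G' is J"
[ "$old_diff" = "$new_diff" ] && echo "G' makes the same change as G"
```

In this toy case the diffs are byte-for-byte identical because G only adds a file; in real histories the context lines can shift, which is why the text says "more or less".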


[1] An octopus merge, if you have one, combines work from more than two parents, and the rare -s ours merge discards the contents of one parent entirely, so these special cases are even more special; in general, rebase should not be used on these.


What rebase doesn't do on purpose

Suppose that in our original two commits, G and H:

          G--H   <-- topic (HEAD)
         /
...--E--F--I--J   <-- myBranch

the change from G to H is exactly the same as the change from I to J. For instance, both commits fix the spelling of the same misspelled word in the README file, and do nothing else.

When we run git rebase myBranch, Git still lists out commits G and H. But it also looks at commits I and J, and for each commit, Git computes what it calls a patch ID (see the git patch-id documentation). This patch ID tells Git: Commit H is a duplicate of commit J. Git then drops commit H from the list of commits to copy.

So when we say that rebase lists out myBranch..HEAD to get candidate commits to copy, these are just candidates. Some of these candidates are automatically eliminated, on purpose. In this particular case, where only H is eliminated on purpose, the eventual result of the rebase would be:

          G--H   [abandoned]
         /
...--E--F--I--J   <-- myBranch
               \
                G'  <-- topic (HEAD)

Git basically believes that commit H is already applied. So it drops it entirely.
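The spelling-fix scenario can be reproduced as a sketch. The README contents, the scratch repository, and the `commit` helper are assumptions for illustration. The script compares the patch IDs of H and J with git patch-id, then rebases and lists what is left on topic.

```shell
#!/bin/sh
# Sketch: H and J make the same README fix, so rebase drops H.
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/myBranch
git config user.name "Example"
git config user.email "example@example.invalid"

# Hypothetical helper: each commit adds one file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit E
echo "This is a smaple README." > README    # misspelling on purpose
git add README
git commit -q -m F

git checkout -q -b topic
commit G
echo "This is a sample README." > README    # H: fix the spelling
git commit -q -am H

git checkout -q myBranch
commit I
echo "This is a sample README." > README    # J: the very same fix
git commit -q -am J

# The patch ID is the first field of git patch-id's output.
id_H=$(git show topic | git patch-id --stable | cut -d' ' -f1)
id_J=$(git show myBranch | git patch-id --stable | cut -d' ' -f1)
echo "patch ID of H: $id_H"
echo "patch ID of J: $id_J"

git checkout -q topic
git rebase -q myBranch
remaining=$(git log --format=%s myBranch..topic)
echo "commits left on topic: $remaining"
```

Recent Git versions may also print a warning that a previously applied commit was skipped; the exact wording varies by version.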

There's a rather complicated dance that Git also does with something it calls the fork point code. The goal of the fork-point code is to discover commits that were deliberately dropped and automatically drop them during rebasing. This code usually does the right thing, though it can misfire.[2] Neither the patch-ID nor the fork-point code seems to be biting you in this case, but there's one more big special case, which deserves its own section.


[2] The fact that it can misfire makes me think it's not necessarily the right default. That applies to the "already applied upstream" patch-ID case as well. In particular, interactive rebase really should include these commits on its instruction sheet, with the pre-selected action being "drop", and a comment as to why they're being dropped. This is not the case today.


Merges

So far, the pictures we have drawn are simple. But suppose our topic branch commits look like this:

                 I--J
                /    \
            G--H      M--N   <-- topic (HEAD)
           /    \    /
          /      K--L
         /
...--E--F--------------O--P   <-- myBranch

When we run:

git log myBranch..topic

we'll see commits N and M, and then—in some order—commits I through L, with I showing after J but randomly-ordered with respect to K and L, and with K showing after L but randomly-ordered with respect to I and J. Then we'll see commit H, and then G, and that's the end of the list.

(If we add --topo-order, the order of the list gets more constrained. The rebase code internally adds --topo-order. We still don't know whether L or J will come first, but once we get one of those, we'll finish the entire row before going to the other row. Without --topo-order we could see N, M, L, J, K, I, H, G, for instance.)

Here's where your question goes off the rails a bit. The git rebase command will automatically drop merge commit M entirely, for two reasons:

  • cherry-pick (and by extension, the old git format-patch / git am based method) can't copy a merge; and
  • the result of a standard rebase shouldn't copy a merge anyway.

So you won't have a pick command for commit M. To get one, you must be manually inserting your own, and this is a mistake. To see why, let's look at how Git handles this without the pick <hash-of-M> in it, with a regular (non --rebase-merges) rebase.

The sequence starts by listing out the commits to copy. Let's say they come out in this order, after git rebase carefully reverses them[3] while dropping merges: G-H-I-J-K-L-N.

If all goes well during the copying stages, the result will be:

                 I--J
                /    \
            G--H      M--N   [abandoned]
           /    \    /
          /      K--L
         /
...--E--F--O--P   <-- myBranch
               \
                G'-H'-I'-J'-K'-L'-N'  <-- topic (HEAD)

That is, git rebase has flattened the merge away. But the purpose of merge M was to combine the work on the I-J and K-L branches. We don't need that merge, because the process of copying K to K' was:

  • for each change in the H-vs-K commit, make the same change to the contents taken from J';
  • now commit that as new commit K'.

That is, commit K' is based not on H or H', but on J'. It already contains the other branch's work. Likewise, when Git copies L to L', it does so onto a commit that already contains the other branch's work. So there's no need for the branching. The rebase operation simply flattens it away entirely.
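The flattening is easy to observe in a scratch repository. This sketch rebuilds the merge graph above (the branch name `side`, the `commit` helper, and the one-file-per-commit layout are assumptions for illustration), runs a plain rebase, and counts merges afterwards.

```shell
#!/bin/sh
# Sketch: a standard rebase drops merge M and linearizes the rest.
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/myBranch
git config user.name "Example"
git config user.email "example@example.invalid"

# Hypothetical helper: each commit adds one file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit E; commit F
git checkout -q -b topic
commit G; commit H
git checkout -q -b side            # K-L row, forking at H
commit K; commit L
git checkout -q topic
commit I; commit J                 # I-J row
git merge -q --no-ff -m M side     # merge commit M
commit N
git checkout -q myBranch
commit O; commit P
git checkout -q topic

git rebase -q myBranch             # regular rebase, no --rebase-merges

merges=$(git rev-list --merges --count myBranch..topic)
total=$(git rev-list --count myBranch..topic)
echo "merge commits after rebase: $merges"
echo "commits after rebase: $total"
git log --reverse --format=%s myBranch..topic
```

The final log shows the seven copies in a single line of history; whether the I-J row or the K-L row comes first depends on the order Git happened to list them in.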


[3] Remember, Git works backwards, so that the list always comes out with N first. We need N last, so rebase reverses the list.


The --rebase-merges option

This idea of flattening away merges is not always a good one. Sometimes it doesn't work very well. Sure, a series like:

       I--J
      /
...--H
      \
       K--L

usually has relatively few changes on both branches, so that "flattening the branch away" is easy and goes well. But what if the series has a ton of commits in each branch:

       o--o--...(1000 commits)...--o--tip1
      /
...--o
      \
       o--o--....................--o--tip2

In this case, the merge that merges the two tip commits might have a lot of work in it. Flattening the merge away is impractical.

Or, maybe we just like having the merge present. The merge represents something important, and we want future code archaeologists to see it.

Well, "copying" the merge is indeed impossible. Cherry-pick's -m flag doesn't do that. If we "copied" the merge with cherry-pick -m after flattening things:

                 I--J
                /    \
            G--H      M--N   <-- topic
           /    \    /
          /      K--L
         /
...--E--F--O--P   <-- myBranch
               \
                G'-H'-I'-J'-K'-L'  <-- HEAD

we'd just be re-introducing the changes we already got via either I-J, or via K-L. To "copy" the merge correctly, we have to form a branch first:

                 I--J
                /    \
            G--H      M--N   <-- topic
           /    \    /
          /      K--L
         /
...--E--F--O--P   <-- myBranch
               \
                \      I'-J'   <-- temp-label-1
                 \    /
                  G'-H'
                      \
                       K'-L'   <-- temp-label-2, HEAD

Then we have to pick the correct branch tip to be the HEAD commit, and literally run git merge over again to make M':

reset-to temp-label-1
merge temp-label-2

If the merge goes well, we'll now have:

                 I--J
                /    \
            G--H      M--N   <-- topic
           /    \    /
          /      K--L
         /
...--E--F--O--P   <-- myBranch
               \
                \      I'-J'  <-- temp-label-1
                 \    /    \
                  G'-H'     M'  <-- HEAD
                      \    /
                       K'-L'  <-- temp-label-2

We can now pick hash-of-N to make N':

                 I--J
                /    \
            G--H      M--N   <-- topic
           /    \    /
          /      K--L
         /
...--E--F--O--P   <-- myBranch
               \
                \      I'-J'  <-- temp-label-1
                 \    /    \
                  G'-H'     M'-N'  <-- HEAD
                      \    /
                       K'-L'  <-- temp-label-2

and then we're done with this fancy rebase-that-re-does-the-merge, and can move the branch label topic and drop any temporary labels:

                 I--J
                /    \
            G--H      M--N   [abandoned]
           /    \    /
          /      K--L
         /
...--E--F--O--P   <-- myBranch
               \
                \      I'-J'
                 \    /    \
                  G'-H'     M'-N'  <-- topic (HEAD)
                      \    /
                       K'-L'

This is what git rebase --rebase-merges does. To achieve this result, it needs some extra commands and the ability to insert temporary labels. (Note that there will also be a temporary label for H' since the sequence of cherry-picking operations has to reset HEAD there before copying K to K'. You will see all this labeling and resetting in the instruction sheet, which needs to know when to make the various labels and where to move HEAD around.)
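As a sketch of the end result, the script below rebuilds the same merge graph in a scratch repository and rebases it with --rebase-merges. The branch name `side` and the `commit` helper are assumptions for illustration, and the option itself needs a reasonably recent Git (it is not available before Git 2.18).

```shell
#!/bin/sh
# Sketch: --rebase-merges re-creates merge M as M' instead of dropping it.
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/myBranch
git config user.name "Example"
git config user.email "example@example.invalid"

# Hypothetical helper: each commit adds one file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit E; commit F
git checkout -q -b topic
commit G; commit H
git checkout -q -b side            # K-L row, forking at H
commit K; commit L
git checkout -q topic
commit I; commit J                 # I-J row
git merge -q --no-ff -m M side     # merge commit M
commit N
git checkout -q myBranch
commit O; commit P
git checkout -q topic

git rebase -q --rebase-merges myBranch

merges=$(git rev-list --merges --count myBranch..topic)
total=$(git rev-list --count myBranch..topic)
echo "merge commits after rebase: $merges"
echo "commits after rebase: $total"
git log --graph --format=%s myBranch..topic
```

To look at the instruction sheet without editing it, one option that should work is to run the rebase interactively with the sequence editor set to cat, as in `GIT_SEQUENCE_EDITOR=cat git rebase -i --rebase-merges myBranch`; the label, reset, and merge lines appear alongside the usual pick lines.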

Upvotes: 8
