ahmy

Reputation: 4345

Git log list commits excluding cherry-picked from the first branch

For example, I have this commit history:

[Image: git history]

Let's say that at some point release1 was released to production and, a couple of days later, the release2 branch was released.

The cherry-picks were never done with the -x argument.

I want to get all the commits on release2 that had not been released yet at that point, that is, commits 6, 7, 8, and 9, which are marked with green circles.

What would be the easiest way to do this?

Example of the repository on GitHub

Update 1 (git cherry)

$ git cherry -v release1 release2
- 91785f2be363244a1da4459746fce6a6ea28c2b5 4
- e88de1f8b62213089b300e3143c4caf509b5c13b 5
- 9d23e9c7e6f769c3bbb44ad42bc7b708b4b0a9e6 6
- 6fb71e4406e0b127e06f6e5b145de8a9d39db01e 7
- 6276becc413494b7724a4dd8b852234f4b71e913 8
- 6dd3b5b75222c02c921c9c20bcaee83e1c8be746 9

With git cherry, I still get 4 and 5, both of which are from master.

Update 2 (--left-right --cherry-mark)

$ git log --left-right --cherry-mark --pretty=oneline --abbrev-commit  release2...release1
= 6dd3b5b 9
= 6276bec 8
> caa4e2f 5
= 0230a1f 4
= 6fb71e4 7
= 9d23e9c 6
= e88de1f 5
= 91785f2 4

Here everything except one commit 5 (caa4e2f) is cherry-marked, which I have a hard time explaining.

Update 3 (non-empty commits)

In my original attempt, I created each commit with --allow-empty. As pointed out below, this gave every commit the empty tree as its tree.

After recreating the repository with real content, I get the desired result with both answers below: git cherry and --left-right --cherry-mark.

Upvotes: 3

Views: 1219

Answers (2)

alfunx

Reputation: 3140

With the git cherry command you can find commits that have not been applied in an upstream branch. From man git-cherry, the synopsis looks like:

git cherry [<upstream> [<head> [<limit>]]]

and the description

Determine whether there are commits in <head>..<upstream> that are equivalent to those in the range <limit>..<head>.

The equivalence test is based on the diff, after removing whitespace and line numbers. git-cherry therefore detects when commits have been "copied" by means of git-cherry-pick(1), git-am(1) or git-rebase(1).

Outputs the SHA1 of every commit in <limit>..<head>, prefixed with - for commits that have an equivalent in <upstream>, and + for commits that do not.

So, to find commits in release2 that have no equivalent commit in release1, you could do

git cherry release1 release2 | grep '^+' | cut -d' ' -f2
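To try this without touching a real project, here is a minimal sketch that rebuilds a history shaped like the one in the question in a temporary directory, using real file content instead of empty commits. The `commit` helper, the file names, and the "Release Manager" committer name are made up for illustration; only the branch names come from the question.

```shell
#!/bin/sh
# Sketch: throwaway repository with commits 1-9, where 4 and 5 are
# cherry-picked onto release1 and release2 ends at commit 9.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: one commit that adds one small file, subject "$1".
commit() {
    echo "content $1" > "file$1.txt"
    git add "file$1.txt"
    git commit -q -m "$1"
}

commit 1; commit 2; commit 3
main=$(git symbolic-ref --short HEAD)  # "master" or "main", depending on config
git branch release1                     # release1 forks after commit 3
commit 4; commit 5

# Copy 4 and 5 onto release1. A different committer name guarantees the
# copies get new hashes even if the whole script runs within one second.
git checkout -q release1
GIT_COMMITTER_NAME="Release Manager" git cherry-pick "$main~1" >/dev/null
GIT_COMMITTER_NAME="Release Manager" git cherry-pick "$main" >/dev/null

git checkout -q "$main"
commit 6; commit 7; commit 8; commit 9
git branch release2

# Subjects of the release2 commits with no patch-equivalent on release1.
unreleased=$(git cherry -v release1 release2 | grep '^+' | cut -d' ' -f3-)
echo "$unreleased"
```

With `-v`, the third field onward is the commit subject, so the last pipeline prints subjects rather than hashes; drop `-v` and use `-f2` to get the hashes as in the command above.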

Upvotes: 1

torek

Reputation: 487725

Note that your linked repository contains 15 completely empty commits. By "empty commits" I don't just mean the kind of no-change commits that git commit --allow-empty allows, but rather completely empty: each commit has no files at all, just a commit message. (More precisely, every commit has the empty tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904 as its tree.) So every commit in your repository is patch-ID equivalent to every other commit in your repository.

That (mostly) explains why your update-2 section shows the output that it does: commit 6dd3b5b is patch-equivalent to commits 6276bec, 9d23e9c, and so on. The puzzle is why commit caa4e2f is not paired up with another, since it too is patch-equivalent to every other commit. In any case, this is almost an example of a false positive, but here it's really more of a true positive: these commits are all the same, and can therefore be discarded without effect.
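The empty-tree observation is easy to check for yourself. The sketch below does not use the linked repository; it assumes a fresh temporary repository with two `--allow-empty` commits (the commit messages are arbitrary) and compares each commit's tree with the hash Git computes for a tree with no entries.

```shell
#!/bin/sh
# Sketch: commits made with --allow-empty in an otherwise empty repository
# all point at the empty tree.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

# Hash of a tree with no entries (4b825dc6... in a SHA-1 repository).
empty_tree=$(git hash-object -t tree /dev/null)
echo "empty tree: $empty_tree"

# Print "<abbreviated commit> <tree> <subject>" for every commit.
git log --format='%h %T %s'

# Count the distinct trees used across all commits.
distinct_trees=$(git log --format='%T' | sort -u | wc -l | tr -d ' ')
echo "distinct trees: $distinct_trees"
```

Running the same `git log --format='%h %T %s' --all` in your own repository shows at a glance whether every commit shares one tree.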


Git's revision-walking commands (git log and git rev-list) can use git patch-id to mark patch-equivalent commits. This is not foolproof, but if a commit was cherry-picked without manual intervention, the patch IDs of the copy and the original will match.
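You can compute patch IDs by hand to see what Git is comparing. This is only a sketch under made-up assumptions: a temporary repository, a branch called `copy`, and tiny one-file commits. It cherry-picks one commit and prints the patch ID of the original, of the copy, and of an unrelated commit.

```shell
#!/bin/sh
# Sketch: a clean cherry-pick has a new commit hash but the same patch ID.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt; git add base.txt; git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)  # "master" or "main", depending on config
git branch copy

echo "fix" > fix.txt; git add fix.txt; git commit -q -m "fix"

git checkout -q copy
echo "other" > other.txt; git add other.txt; git commit -q -m "unrelated"
git cherry-pick "$main" >/dev/null      # copy "fix" onto the other branch

# Hypothetical helper: patch ID of a single commit.
patch_id() { git show "$1" | git patch-id | cut -d' ' -f1; }

original_id=$(patch_id "$main")
copy_id=$(patch_id copy)
unrelated_id=$(patch_id copy~1)

echo "original:  $original_id"
echo "copy:      $copy_id"
echo "unrelated: $unrelated_id"
```

The first two lines should show the same ID and the third a different one, even though `git rev-parse "$main" copy` prints two different commit hashes.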

To see how to use this, look at the git log or git rev-list documentation (the section about this is shared across both sets of documentation—as is the underlying implementation—so I linked to just one). Jump down to the section on the --left-right option. Note that --left-right only makes sense when combined with symmetric difference, which in this case means using the syntax release1...release2 or release2...release1. Note that there are three, not two, dots, between the two names here.

Using the three dots and --left-right, Git will enumerate commits 5 and 4 that are reachable from release1, and also the two different commits 5 and 4 that are reachable from release2. (Of course, Git will first enumerate commits 9, 8, 7, and 6 that are reachable from release2, before enumerating the 5 and 4 that are reachable from release2 but not release1. As you can see, it's probably a mistake to call these four different commits by the same two names. They have different hash IDs for a reason: they are different commits. Call them 5A and 5B and 4A and 4B if you like, or maybe 5.1 and 5.2 to add the release number, but give them distinct names!)

So far, that's not very much help: you have got Git showing you commits 5.1 and 4.1 from release1, marked with < or > depending on which side of the three dots the name release1 is on. And you have Git showing you commits 9, 8, 7, 6, 5.2, and 4.2 from release2, marked with the other marker. But this is the all-important starting point. You can now switch from --left-right to --left-only or --right-only, to tell Git: don't show me one of the two sets, even though you're still enumerating all the commits in both.

With or without the left or right only options, you can add --cherry-mark, to tell Git: Run git patch-id on every commit in both sets. When a commit in the left-side set has the same patch ID as a commit in the right-side set, mark that commit with an equals sign =. Otherwise, mark the commit with a plus sign + (or, when --left-right is also given, keep its < or > marker).

So, in this case if you run:

git rev-list --left-right --cherry-mark --abbrev-commit release2...release1

you'll see (assuming the hash ID of commit 9 starts with 9999999, etc):

<9999999
<8888888
<7777777
<6666666
=5200000
=4200000
=5100000
=4100000

So the commits you want are exactly those marked with <. Using release1...release2, the commits you want are exactly those marked with >.

Assuming all of the above makes sense, now read the --cherry-pick section.
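Both forms can be tried on a throwaway repository. The sketch below is an illustration under stated assumptions, not the questioner's actual repository: the `commit` helper, the file names, and the "Release Manager" committer name are invented, and every commit adds one small file so that the patch IDs are distinct. It first prints the --cherry-mark view, then uses --cherry-pick with --left-only to omit the matched commits instead of marking them.

```shell
#!/bin/sh
# Sketch: commits 1-9 with 4 and 5 cherry-picked onto release1, then
# compare --cherry-mark (mark matches) with --cherry-pick (omit matches).
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: one commit that adds one small file, subject "$1".
commit() {
    echo "content $1" > "file$1.txt"
    git add "file$1.txt"
    git commit -q -m "$1"
}

commit 1; commit 2; commit 3
main=$(git symbolic-ref --short HEAD)  # "master" or "main", depending on config
git branch release1
commit 4; commit 5

# A different committer name guarantees new hashes for the copies.
git checkout -q release1
GIT_COMMITTER_NAME="Release Manager" git cherry-pick "$main~1" >/dev/null
GIT_COMMITTER_NAME="Release Manager" git cherry-pick "$main" >/dev/null

git checkout -q "$main"
commit 6; commit 7; commit 8; commit 9
git branch release2

# Mark view: "<" for release2-only commits, "=" for patch-equivalent pairs.
marks=$(git rev-list --left-right --cherry-mark release2...release1)
echo "$marks"

# Omit view: only the left side (release2), minus anything patch-equivalent
# to a commit on the right side (release1). Prints subjects, newest first.
wanted=$(git log --cherry-pick --left-only --format=%s release2...release1)
echo "$wanted"
```

The second command is the one to script against: it lists only the release2 commits that have no counterpart on release1, with no markers to filter afterwards.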

Note that if you had to resolve a conflict when cherry-picking, the patch IDs of the original commit and its copy won't match. Such a commit won't be omitted by this process. It's also possible to get a false match. This tends to occur with a fix that just consists of, e.g., moving some opening and closing braces around: the patch ID is formed by stripping not only the line numbers, but also whitespace. So reducing two different (by indentation and line number) brace-changing commits to patch IDs collapses commits that should be different. Hence my statement at the top about this not being foolproof: you can get both false negatives (cherry-pick with merge conflict resolved) and false positives (conflated commits that are actually different).

Upvotes: 5
