u123

Reputation: 16287

git diff/log and the order of start/end commits specified

When running git log or git diff, the order in which the commits are specified seems to matter.

Tip of my branch is: 8ad4ff890446 and base of my branch is: 6e91a6615

When I run git log from tip to base I get:

$ git log --oneline 8ad4ff890446..6e91a6615
d9b526fb ...
a5f4ad92 ...
af96bb6c ...
8d416ba9 ...
c7a37d4f ...

If I reverse the arguments and run git log from base to tip I get:

$ git log --oneline 6e91a6615..8ad4ff890446
2d6055e7 ...
656ac5b3 ...
3a4e2bbc ...

The last call is actually what I would expect: only 3 commits were introduced since base.

Why do I only get 3 commits (correct) when going from base -> tip compared to 5 commits when going from tip -> base?

Upvotes: 1

Views: 689

Answers (2)

torek

Reputation: 488233

TL;DR: git diff cheats; don't think of it as doing what git log does, because it doesn't.


Besides Lasse V. Karlsen's (correct) answer on git log, you need to know that git diff (ab)uses both the two- and three-dot notations in a way that differs from every other Git command.

The syntax I find the most sensible with git diff is this:

git diff <commit1> <commit2>

This compares the stored tree for <commit1> with the stored tree for <commit2>, so it's quite straightforward to see how it works: in effect, Git extracts the two commits, then compares them. The output is a series of instructions: add this line to file 1, remove that line from file 2, and remove file 3 entirely, perhaps. If you apply those instructions to the tree for <commit1> you get the tree for <commit2>.

If you reverse the two commits, the diff gets done with the commits swapped, so you'll get instructions that read: remove that first line from file 1, add the second line to file 2, and create an all-new file3, with all-new contents. These instructions will convert the tree for <commit2> into the tree for <commit1>.
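To make the "instructions" idea concrete, here is a minimal sketch in a throwaway repository. The file names, contents, and the variables c1 and c2 are made up for illustration. It applies the patch from git diff c1 c2 on top of c1 and checks that the result is the tree of c2, then does the same in the reverse direction.

```shell
#!/bin/sh
# Sketch: a diff is a recipe that turns one tree into the other.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Local identity so the commits do not depend on global configuration.
git config user.name "Example"
git config user.email "example@example.invalid"

# commit1: two files
printf 'line 1\nline 2\n' > file1.txt
echo "to be removed" > file3.txt
git add .
git commit -q -m "first"
c1=$(git rev-parse HEAD)

# commit2: one line added to file1.txt, file3.txt deleted
printf 'line 1\nline 2\nline 3\n' > file1.txt
git rm -q file3.txt
git add .
git commit -q -m "second"
c2=$(git rev-parse HEAD)

# Forward: start from commit1, apply "diff c1 c2", expect the tree of commit2.
git checkout -q "$c1"
git diff "$c1" "$c2" | git apply --index
forward_tree=$(git write-tree)

# Reverse: start from commit2, apply "diff c2 c1", expect the tree of commit1.
git checkout -q -f "$c2"
git diff "$c2" "$c1" | git apply --index
reverse_tree=$(git write-tree)

tree1=$(git rev-parse "$c1^{tree}")
tree2=$(git rev-parse "$c2^{tree}")
[ "$forward_tree" = "$tree2" ] && echo "c1 + diff(c1, c2) = tree of c2"
[ "$reverse_tree" = "$tree1" ] && echo "c2 + diff(c2, c1) = tree of c1"
```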

So far so good, but what happens when you run git diff <commit1>..<commit2>? The git log syntax implies that there may be many commits here, but git diff can't show you many commits. The diff code mostly[1] just compares two commits, not some arbitrarily large number of commits.

The secret is that when given the A..B syntax, git diff throws away the .. part entirely. So it really does exactly the same thing as git diff A B. If you reverse the two items, git diff B..A, Git does exactly the same thing as if you run git diff B A.
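You can check this yourself. The following is a small sketch in a scratch repository, where the file name and the variables a and b are hypothetical. It compares the output of the two-dot form with the plain two-argument form, in both directions.

```shell
#!/bin/sh
# Sketch: `git diff A..B` should print the same patch as `git diff A B`.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo one > file.txt
git add file.txt
git commit -q -m "first"
a=$(git rev-parse HEAD)

echo two > file.txt
git commit -q -am "second"
b=$(git rev-parse HEAD)

# Forward direction: A..B versus A B
two_args=$(git diff "$a" "$b")
two_dots=$(git diff "$a..$b")

# Reversed direction: B..A versus B A
rev_args=$(git diff "$b" "$a")
rev_dots=$(git diff "$b..$a")

[ "$two_args" = "$two_dots" ] && echo "A..B matches 'A B'"
[ "$rev_args" = "$rev_dots" ] && echo "B..A matches 'B A'"
```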

The really tricky part is what happens when you use the three-dot (or symmetric difference) syntax, git diff A...B or git log A...B. Once again, git diff and git log differ (if I may use that word here :-) ). With git log and its script-oriented sister command git rev-list, the three-dot syntax works as advertised, producing the set of all commits reachable from exactly one of the two specified commits, excluding commits reachable "simultaneously" from both commits.
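For the git log / git rev-list side, a short sketch may help. It assumes a scratch repository with two commits only on master and one commit only on branch; the branch names and commit messages are illustrative. The --left-right option marks which side each commit in the symmetric difference comes from.

```shell
#!/bin/sh
# Sketch: A...B in git log / git rev-list is the symmetric difference.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Name the initial branch explicitly so the sketch does not depend on
# the configured default branch name.
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"

git commit -q --allow-empty -m "base"
git commit -q --allow-empty -m "master 1"
git commit -q --allow-empty -m "master 2"
git checkout -q -b branch master~2
git commit -q --allow-empty -m "branch 1"

# Commits reachable from exactly one of the two tips ("base" is excluded).
symmetric=$(git rev-list --count master...branch)

# Same set, counted per side: left (master only), then right (branch only).
counts=$(git rev-list --left-right --count master...branch)

# Human-readable view: "<" marks master-only commits, ">" branch-only ones.
git log --left-right --oneline master...branch

echo "symmetric difference: $symmetric commits (left/right: $counts)"
```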

What git diff does instead, though, is to find the merge base of A and B and diff that against commit B. The code it uses to do this is flawed, but works for most cases and most arguments.
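Here is a sketch of that behavior for the simple case of a single merge base. The repository layout and file names are made up, and this does not exercise the corner cases where the merge-base code is said to go wrong. It checks that git diff master...branch matches a diff from the merge base to branch, and that changes made only on master do not show up.

```shell
#!/bin/sh
# Sketch: `git diff A...B` behaves like `git diff $(git merge-base A B) B`.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > file.txt
git add file.txt
git commit -q -m "base"

# One commit that exists only on master.
echo "master change" > master.txt
git add master.txt
git commit -q -m "master only"

# One commit that exists only on branch, forked from the base commit.
git checkout -q -b branch master~1
echo "branch change" > branch.txt
git add branch.txt
git commit -q -m "branch only"

base=$(git merge-base master branch)
three_dots=$(git diff master...branch)
from_base=$(git diff "$base" branch)

[ "$three_dots" = "$from_base" ] && echo "A...B matches merge-base(A, B) vs B"
```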


[1] The word "mostly" is here because git diff will produce a combined diff in some cases, particularly when looking at a merge commit. Combined diffs are documented, but in a rather scattered fashion, and are a bit tricky to describe well. The key sentence about showing only files modified from all parents is only in that second linked section, where it is easy to miss when you're looking for the description about reading the output.

Upvotes: 2

Lasse V. Karlsen

Reputation: 391396

The double dot syntax doesn't mean "going from X to Y", it means this:

The most common range specification is the double-dot syntax. This basically asks Git to resolve a range of commits that are reachable from one commit but aren’t reachable from another.

A very simple example to demonstrate your output would be this:

                             master
                                v
A---B---C---D---E---F---G---H---I
             \
              \-J---K---L
                        ^
                      branch

Here master is ahead of your branch by 5 commits and your branch is ahead of master by 3 commits.

In this case, this command:

git log branch..master

should list the 5 commits E-I, whereas this:

git log master..branch

should list the 3 commits J-L.


If you're on Windows you can open a command prompt and navigate to a temporary or new (empty) folder somewhere and then paste these commands in, executing them:

git init
for %f in (a b c d e f g h i) do (echo %f>test.txt && git add . && git commit -m "%f")
git checkout -b branch HEAD~5
for %f in (j k l) do (echo %f>test.txt && git add . && git commit -m "%f")

Afterwards you can run these commands and inspect the output (if your Git is configured with a default branch name other than master, substitute that name):

git log branch..master
git log master..branch
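If you're not on Windows, a rough POSIX shell equivalent of the steps above could look like this. It is a sketch rather than a translation of record: the temporary directory, the local identity, and the explicit master branch name are assumptions added so that it runs regardless of your Git configuration. It also checks that branch..master selects the same commits as the longer spelling master ^branch.

```shell
#!/bin/sh
# Sketch: rebuild the A..I / J..L history from the diagram and count
# what each double-dot range selects.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"

# Nine commits on master: a through i.
for c in a b c d e f g h i; do
    echo "$c" > test.txt
    git add .
    git commit -q -m "$c"
done

# Fork branch from commit d (five commits behind master), add j, k, l.
git checkout -q -b branch HEAD~5
for c in j k l; do
    echo "$c" > test.txt
    git add .
    git commit -q -m "$c"
done

git log --oneline branch..master   # commits on master but not on branch
git log --oneline master..branch   # commits on branch but not on master

master_only=$(git rev-list --count branch..master)
branch_only=$(git rev-list --count master..branch)

# X..Y is shorthand for "reachable from Y, excluding reachable from X".
dotted=$(git rev-list branch..master)
caret=$(git rev-list master ^branch)

echo "branch..master: $master_only commits, master..branch: $branch_only commits"
```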

Upvotes: 4
