Lorenz Walthert

Reputation: 4639

Discrepancy between git log --stat and git show --stat?

I am working on an R package that is supposed to parse git repository history at a granular level. While validating some of the parsing results against an open source project from GitHub, I encountered something unexpected. In previous validation efforts, I managed to reconstruct the number of lines of each file in several git repositories correctly from the output of git log, so I assumed that git log is a valid way to obtain comprehensive information about changed files. However, I found a commit in the aforementioned project for which git log does not seem to convey all information about changed files: the commit with the hash 184f6c71dee03c66c7adaacb024b70d99075ea75. When resetting HEAD to this commit and running both git log --stat and git show --stat, I get this:

$ git log --stat
commit 184f6c71dee03c66c7adaacb024b70d99075ea75
Merge: 32e47a3 d203300
Author: ***
Date:   Wed Nov 12 10:39:51 2014 +0100

    merge changes from master branch

commit d203300bbe45981dab15b49c3c08deb31ad46466
Merge: 4b63f4e c8ae895
Author: ***
Date:   Wed Nov 12 10:35:36 2014 +0100

[ output truncated ]

commit 32e47a32f3cc60b5705e9df93cdc6b730fae380b
Author: ***
Date:   Tue Nov 11 18:00:55 2014 +0100

    Added the internal class template `MatrixColumnVisitor` to represent
      `VectorVisitor` concept for a column that is a `matrix`. Part of #602

 NEWS.md                                  |   3 +++
 inst/include/dplyr.h                     |   1 +
 inst/include/dplyr/MatrixColumnVisitor.h | 167 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 inst/include/dplyr/VectorVisitorImpl.h   |   4 +++-
 inst/include/dplyr/visitor.h             |  17 ++++++++++++++---
 inst/include/dplyr/white_list.h          |   4 ++++
 tests/testthat/test-filter.r             |  12 ++++++++++++
 7 files changed, 204 insertions(+), 4 deletions(-)

[ output truncated ]

And

$ git show --stat
commit 184f6c71dee03c66c7adaacb024b70d99075ea75
Merge: 32e47a3 d203300
Author: ***
Date:   Wed Nov 12 10:39:51 2014 +0100

    merge changes from master branch

 NEWS.md                                        |  4 ++--
 R/RcppExports.R                                |  4 ++++
 R/src-sql.r                                    |  2 +-
 inst/include/dplyr/NamedListAccumulator.h      | 12 ++++++------
 inst/include/dplyr/Result/LazyGroupedSubsets.h |  4 ++--
 inst/include/dplyr/Result/LazySubsets.h        |  4 ++--
 inst/include/dplyr/Result/Name.h               | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 inst/include/dplyr/Result/all.h                |  1 +
 src/RcppExports.cpp                            | 15 +++++++++++++++
 src/dplyr.cpp                                  | 29 ++++++++---------------------
 src/strings_addresses.cpp                      | 19 +++++++++++++++++++
 tests/testthat/test-joins.r                    |  7 +++++++
 tests/testthat/test-mutate.r                   | 18 ++++++++++++++++++
 13 files changed, 131 insertions(+), 34 deletions(-)

This was surprising because I thought

git log --stat

And

git show --stat

give me the same information. This is not the case, since from the git log output, I conclude that there were no files changed in the commit of interest.

When viewing the commit on GitHub or in the RStudio git tab, I can see that this commit was not empty, i.e. the information shown by git show seems correct, and it appears to me that git log is missing information for that commit.

Any idea why there is this discrepancy? As pointed out, for a large number of commits, I can correctly reproduce the number of lines of each file in a git repository from git log, but not for this one. I am running git 2.9.2 on macOS. Thanks in advance.

Upvotes: 1

Views: 610

Answers (1)

torek

Reputation: 490108

By default, git log skips showing diffs for merge commits, while git show shows combined diffs for merge commits. Adding --cc to the git log options tells git log to show combined diffs (or, with --stat, the stats for them) for merges as well.

Note that combined diffs have limited usefulness. For proper analysis you may want -m, which is an option that both git log and git show accept. It tells the commands to, in effect, split each merge into multiple virtual commits. A merge commit has n parents where n ≥ 2, and -m makes Git turn commit A with parents P1, P2, ..., Pn into commit A-P1 with parent P1; commit A-P2 with parent P2; ...; commit A-Pn with parent Pn, then show (or --stat) each of those commits individually.
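
To see the difference without touching a real project, here is a minimal sketch that builds a throwaway repository with a single two-parent merge and counts the per-file stat lines that each variant reports for it. The file names, branch name `topic`, and the demo identity are made up for illustration; only the `--stat`, `--cc` and `-m` options come from the discussion above.

```shell
# Skip quietly when git is not installed; there is nothing to demonstrate then.
command -v git >/dev/null 2>&1 || { echo "git is required" >&2; exit 0; }
set -e

# Throwaway repository with a local, hypothetical identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > a.txt
git add a.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # name of the initial branch

# One commit on a side branch ...
git checkout -q -b topic
echo topic > b.txt
git add b.txt
git commit -q -m "topic: add b.txt"

# ... and one on the initial branch, so the merge has two real parents.
git checkout -q "$main"
echo main > c.txt
git add c.txt
git commit -q -m "main: add c.txt"
git merge -q --no-ff -m "merge topic" topic

# Count per-file stat lines (they contain '|') for the merge commit only.
# --format= suppresses the commit header so only the stat output remains.
plain=$(git log -1 --stat --format= HEAD | grep -c '|' || true)
combined=$(git log -1 --cc --stat --format= HEAD | grep -c '|' || true)
split=$(git log -1 -m --stat --format= HEAD | grep -c '|' || true)

echo "git log --stat      : $plain file line(s)"
echo "git log --cc --stat : $combined file line(s)"
echo "git log -m --stat   : $split file line(s)"

# With the default format, -m labels each virtual commit with its parent.
git log -1 -m --stat HEAD
```

In this toy repository the plain `git log --stat` reports no files for the merge, while `-m` reports one entry per parent (b.txt relative to the first parent, c.txt relative to the second). The final command prints the merge once per parent, each with a header of the form `commit <hash> (from <parent>)`, which is the marker a parser can use to tell the virtual commits apart.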

Upvotes: 2
