guettli

Reputation: 27806

Show branch in the output of `git log -G foo`

How can I show the branches that contain each commit in the output of git log -G foo?

Up to now it looks like this:

commit a24dc0cd5403b697634976f2f7eef4aa7af61b3d
Author: Thomas Guettler <[email protected]>
Date:   Mon Aug 8 11:16:30 2022 +0200

    use timezone, and two tests for one day.

commit 8ffe418ff64b899958cf9da594852a13dc993673
Author: Thomas Guettler <[email protected]>
Date:   Mon Aug 8 11:11:29 2022 +0200

    removed debug code.

I would like it to look like this:

commit a24dc0cd5403b697634976f2f7eef4aa7af61b3d
Author: Thomas Guettler <[email protected]>
Date:   Mon Aug 8 11:16:30 2022 +0200
Branches: feature-foo

    use timezone, and two tests for one day.

commit 8ffe418ff64b899958cf9da594852a13dc993673
Author: Thomas Guettler <[email protected]>
Date:   Mon Aug 8 11:11:29 2022 +0200
Branches: main feature-foo

    removed debug code.

Upvotes: 6

Views: 302

Answers (3)

FelipeC

Reputation: 9488

Git doesn't have this functionality readily available, but you can do essentially what git log does by running git show on every commit. The reason to do it per commit is that git log might take some time to find every commit, and in the meantime you can already see the output for the ones found so far. Also, you can use less and see them as you scroll.

All we have to do is add the information we want in the format of git show itself:

git log --format='%H' "$@" | while read -r id; do
    IFS=$'\n' read -r -d '' -a branches < <(git branch --format='%(refname:short)' --contains "$id")
    git --no-pager show --format="commit %H%nAuthor: %an <%ae>%nDate: %ad%nBranches: ${branches[*]}%n%n%w(76,4,4)%B" --quiet "$id"
done

To make it a proper command, create an executable script named git-log-branches anywhere in your $PATH; you can then call it as git log-branches. You can also wrap the command inside less < <(...) so that it is paged like other git commands.

#!/bin/bash

less -FRX < <(
...
)

I saw kriegaex's time comparison, but it isn't realistic: the whole point of the question is to search for a regex with -G. If we actually do that in a real repository, a good chunk of the time is spent waiting for input, and the two solutions behave very differently.

In a real repository with 11874 commits, a -G search returns 28 commits, but it takes a while to get them all (~10s). With a pager we don't have to wait: we can see the first commit as soon as it's available. Measured from launch until the first commit is available, the two solutions give very different results:

  • kriegaex: 4.688s
  • felipec: 0.167s

Even with no commit limiting and no pager, kriegaex's version is 11% slower.

I created a test script which compares all the versions in a repository specifically built for this with 1000 commits.

id                  seconds  diff
felipec                5.72    0%
felipec (original)     5.02   14%
kriegaex               6.44  -11%
kriegaex (new)         5.10   12%
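
The diff column appears to be the baseline time (felipec, 5.72 s) divided by each script's time, minus one. That reading is an assumption inferred from the numbers; it is not stated with the table. A minimal sketch that reproduces the column, with pct_diff as a hypothetical helper name:

```shell
# pct_diff BASELINE SECONDS: relative speed versus the baseline, in percent.
# Positive means faster than the baseline, negative means slower.
# (Hypothetical helper; the formula is inferred from the table.)
pct_diff() {
  awk -v base="$1" -v t="$2" 'BEGIN { printf "%.0f\n", (base / t - 1) * 100 }'
}

pct_diff 5.72 5.02   # felipec (original): prints 14
pct_diff 5.72 6.44   # kriegaex:           prints -11
pct_diff 5.72 5.10   # kriegaex (new):     prints 12
```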

Even kriegaex's new version is not as fast as my original version. The only reason I did not submit my original version is that the code is more complicated and the advantage is not that great (only 14% faster), but here it is for reference:

while read -r -d $'\0' commit; do
    id=${commit#commit }
    id=${id%%$'\n'*}
    IFS=$'\n' read -r -d '' -a branches < <(git branch --format='%(refname:short)' --contains "$id")
    echo -e "${commit/BRANCHES/${branches[*]}}\n"
done < <(git log --format="tformat:commit %H%nAuthor: %an <%ae>%nDate: %ad%nBranches: BRANCHES%n%n%w(76,4,4)%B%w()%x00" "$@")

Upvotes: 3

kriegaex

Reputation: 67297

Updated answer

I just had a simple idea which contains both

  • the performance advantage of ~20% of my original answer compared to FelipeC's, gained by not calling both git log (once) and git show (for each commit), and
  • the advantage of FelipeC's better pageability (found commits are processed one by one, not after processing the whole output like in my original answer).

This is achieved by no longer using GNU sed's subshell execution mode but by reading the output line by line, replacing the Git hash in lines "Branches: 8ba14..." by the result of git branch --contains 8ba14.... In this simple solution, I do not even use sed or awk anymore, just Bash's built-in substring functionality:

#!/usr/bin/bash

#less -FRX < <(
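# IFS= preserves the 4-space indentation of the message lines; the extra
# [ -n "$LINE" ] test handles a final line that lacks a trailing newline.
git --no-pager log --quiet --pretty=format:"commit %H%nAuthor:   %an <%ae>%nDate:     %ad%nBranches: %H%n%n%w(76,4,4)%B" "$@" | while IFS= read -r LINE || [ -n "$LINE" ]; do
  if [ "${LINE:0:10}" = "Branches: " ]; then
    echo "Branches: $(git branch --contains "${LINE:10}" --format='%(refname:short)' | tr '\n' ' ')"
  else
    echo "$LINE"
  fi
done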
#)

Uncomment less paging according to your own preference. Thanks to FelipeC for discussing and comparing our original solutions with one another, helping me come up with this improved version. 🙂
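
The line-by-line rewriting can also be tried without a repository. The following is a minimal sketch, not the script itself: annotate_branches and fake_lookup are hypothetical names, and fake_lookup merely stands in for the git branch --contains call.

```shell
# annotate_branches LOOKUP: copy stdin to stdout, replacing the hash on each
# "Branches: <hash>" line with the output of "LOOKUP <hash>".
annotate_branches() {
  lookup=$1
  # IFS= keeps the leading spaces of indented message lines; the extra
  # [ -n "$line" ] test still handles a last line without a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      "Branches: "*) printf 'Branches: %s\n' "$("$lookup" "${line#Branches: }")" ;;
      *)             printf '%s\n' "$line" ;;
    esac
  done
}

# Stand-in (assumption) for:
#   git branch --contains "$1" --format='%(refname:short)' | tr '\n' ' '
fake_lookup() { printf 'main feature-foo'; }

printf 'commit 8ffe418\nBranches: 8ffe418\n\n    removed debug code.\n' |
  annotate_branches fake_lookup
```

Because only unindented lines can start with "Branches: ", an indented commit message line that happens to contain the same text is passed through unchanged.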

Original answer

Out of the box, Git cannot embed a shell script in its pretty-format string. Here is a quick & dirty solution. It kind of mimics the format you want, but without colour and without some of the extra information Git might print in certain situations. But I hope it gives you a clue how to continue and refine it.

Preconditions:

  • You use a UNIX-like shell (I tried Git Bash on Windows).
  • You use GNU sed.

Define this shell function in your profile or directly on the console:

git_log_branches() {
  git log --pretty=format:"commit %H%nAuthor:   %an <%ae>%nDate:     %ad%nBranches: %H%n%n%w(76,4,4)%B" "$@" |
    sed -E "s/^(Branches: )(.*)/echo -n '\1'; git branch --contains \2 --format='%(refname:short)' | tr '\n' ' '/e"
}

The log output would look something like this:

$ git_log_branches -2 HEAD~100
commit d81a845b61f5b98b217722122c6005cb51f9e160
Author:   Alexander Kriegisch <[email protected]>
Date:     Sun Jun 6 13:27:03 2021 +0700
Branches: main openj9-jit openj9-jit-exclude

    Integration test POM (group ID) + UML whitespace cosmetics

commit 25eafcc93340ee2ee6ce05d0ec1a2139e20d45d8
Author:   Florian Lasinger <[email protected]>
Date:     Fri Feb 19 13:57:05 2021 +0100
Branches: main openj9-jit openj9-jit-exclude

    [#92] Dependency artifacts have higher precedence than reactor artifacts

    (cherry picked from commit f32367b3 + additional comment)

The output is not auto-paged. I do not like this solution much, but it is the best I could come up with, playing for a little while.


Update: Like FelipeC mentioned in his answer, of course you can also transform the shell function into a stand-alone shell script, name it git-log-branches and put it anywhere in your PATH, so Git can find it:

#!/usr/bin/bash

git log --pretty=format:"commit %H%nAuthor:   %an <%ae>%nDate:     %ad%nBranches: %H%n%n%w(76,4,4)%B" "$@" |
  sed -E "s/^(Branches: )(.*)/echo -n '\1'; git branch --contains \2 --format='%(refname:short)' | tr '\n' ' '/e"

Then you call it with git log-branches -2 HEAD~100 in order to get the exact same log output as above when calling the corresponding shell function.


I also compared log calls with regard to timing:

# OK, we have a history of 577 commits
$ git log --oneline | wc -l
577

# Generating a standard Git log is really quick!
$ time (git log | wc -l)
5471

real    0m0.100s

# As expected, kriegaex's solution is way slower. This is the
# price you pay if you want the branches for each commit.
$ time (git log-branches | wc -l)
6591

real    0m56.953s

# The performance of FelipeC's solution is in the same order of magnitude,
# just slightly slower. No big deal. Both solutions could be tweaked
# here or there.
$ time (git log-branches2 | wc -l)
6592

real    1m8.982s

Update regarding -G foo: When using -G, of course the filtering in Git itself takes most of the time, and the result is far fewer commits to process. Therefore, measuring overall performance (as opposed to the time to display the first hit) does not really help. My measurements above are more meaningful, because they cover every commit that would be processed in a log with more entries. But FWIW, now the numbers look like this:

$ time (git log -G foo --oneline | wc -l)
19

real    0m0.401s

$ time (git log-branches -G foo | wc -l)
216

real    0m2.683s

$ time (git log-branches2 -G foo | wc -l)
217

real    0m2.691s

Again, both solutions take similar amounts of time, but more than normal git log. No surprises here.

Now let us switch to a bigger OSS project - I chose Eclipse AspectJ - and search for -G foo there. Luckily, out of 8607 commits on the main branch, 707 contain "foo", i.e. not just a handful. So a performance measurement actually says something meaningful about the performance of the two scripts vs. Git itself:

# 8,607 commits in total
$ time (git log --oneline | wc -l)
8607

real    0m0.464s

# 707 commits contain "foo"
$ time (git log -G foo --oneline | wc -l)
707

real    0m55.196s

# Script by kriegaex
$ time (git log-branches -G foo | wc -l)
5450

real    1m15.749s

# Script by FelipeC
$ time (git log-branches2 -G foo | wc -l)
5451

real    1m19.283s

Here we can see that of the 75 (kriegaex) or 79 (FelipeC) seconds needed to process all 707 commits, which IMO is practically the same, 55 seconds are spent by Git alone on filtering the big repository with its 8,607 commits in the branch in question. That is, 55/75 or 73.3% of the time is used by Git, so we are no longer comparing script performance alone. The fewer commits -G finds, the closer this ratio creeps towards 100%. This is why I initially measured without the costly -G option: it only pollutes the result.
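
As a small worked check of these ratios, using the rounded whole seconds from the measurements above (share_pct is a hypothetical helper name):

```shell
# share_pct PART TOTAL: PART as a percentage of TOTAL, one decimal place.
share_pct() {
  awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", part / total * 100 }'
}

share_pct 55 75   # Git's share of kriegaex's run: prints 73.3
share_pct 55 79   # Git's share of FelipeC's run:  prints 69.6
```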

Upvotes: 2

LeGEC

Reputation: 51820

One way to get the branches that contain said commits is to add --simplify-by-decoration :

# to make sense of how branches relate one to another :
git log --graph --oneline --simplify-by-decoration -G foo

# you can also add '-p' to see the content of selected commits, and you will see that
# the commits are not named individually

# to get just the ref names :
git log --format="%D" --simplify-by-decoration -G foo

The above commands will list any ref (local branches, remote branches, tags, stash ...) that appears in your history.
You can add --decorate-refs=refs/heads to list only local branches (or --decorate-refs=whatever/suits/your/needs).


[update]

I just found out git log has a --source option (which has been around since version 1.6.1 ...) :

--source
Print out the ref name given on the command line by which each commit was reached.

So :

# to list local branches :
git log --source --branches -G foo

# or, to list remote branches :
git log --source -G foo $(git branch -r --format="%(refname:short)")

may give you interesting results.

Again, as noted by @kriegaex, this doesn't list all branches that contain each commit, it lets git select one branch name to be displayed, and this choice may not match your expectation.

By giving an explicit list of branches (or tags, or ...) to git log, you can narrow the list of names git can choose from.
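
If one ref name per commit is enough, the name can also be placed in a custom format with the %S placeholder, which prints the ref by which the commit was reached, like --source does. The following is a minimal sketch, assuming a reasonably recent Git (2.21 or later, as far as I know); log_with_source is a hypothetical helper name.

```shell
# log_with_source [git-log options]: one line per commit with the abbreviated
# hash, the ref by which Git reached the commit (%S), and the subject.
# Like --source, this names only ONE ref per commit, not every branch
# that contains it.
log_with_source() {
  git log --branches --format='%h %S %s' "$@"
}

# Example call inside a repository:
#   log_with_source -G foo
```

Replacing --branches with an explicit list of refs narrows the names Git can pick from, as described above.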

Upvotes: 1
