Get commits per branch in git "update" hook

Question

Consider the following pseudo-code, which creates the git log in question:

git checkout master
git checkout -b 123-test-branch-1
git commit -m "#123 b1 c1"
git commit -m "#123 b1 c2"
git push
git checkout master
git checkout -b 456-test-branch-2
git commit -m "#456 b2 c1"
git commit -m "#456 b2 c2"
git push
git checkout 123-test-branch-1
git merge 456-test-branch-2
git commit -m "#123 b1 c3"
git push

In the real world, the update hook in my remote git repository validates the branch name and the commit message formats. Both the branch name and each commit message must contain the issue number; for example, in 123-test-branch-1 and #123 b1 c1 the issue number is 123. When a branch is pushed, the hook extracts the issue number from the branch name and from each commit message and compares them. If they are not equal, the hook exits with an error.
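The comparison described above could look roughly like the following. This is only a sketch: the helper names (issue_from_branch, issue_from_subject, same_issue) are hypothetical, and the patterns assume the issue number is the leading digits of the branch name and follows a leading # in the commit subject, as in the examples.

```shell
#!/bin/sh
# Hypothetical helpers; the real hook's formats may differ.

# Print the leading issue number of a branch name such as
# "123-test-branch-1" (a "refs/heads/" prefix is stripped first).
issue_from_branch() {
    name=${1#refs/heads/}
    printf '%s\n' "$name" | sed -n 's/^\([0-9][0-9]*\)-.*/\1/p'
}

# Print the issue number of a commit subject such as "#123 b1 c1".
issue_from_subject() {
    printf '%s\n' "$1" | sed -n 's/^#\([0-9][0-9]*\).*/\1/p'
}

# Succeed only when both numbers were found and are equal.
same_issue() {
    b=$(issue_from_branch "$1")
    s=$(issue_from_subject "$2")
    [ -n "$b" ] && [ "$b" = "$s" ]
}
```

With these, same_issue 123-test-branch-1 '#456 b2 c1' fails, which is exactly the rejection seen for the merged commits.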

This works great when I push a branch that has only its "own" commits. But in the git log example above, the pushed branch 123-test-branch-1 also contains commits from the merged branch 456-test-branch-2, so the hook compares all commits from both branches against the pushed branch 123-test-branch-1 alone and exits with an error, because the commits from 456-test-branch-2 have issue number 456 where 123 is expected.

To get the commits, I use git log --pretty=%s ${oldRef}..${newRef}, where oldRef and newRef are the "update" hook arguments.

So my question is how to solve this problem: somehow group commits per branch, or keep only the commits from the branch being pushed now (but if 456-test-branch-2 is a local branch that was never pushed and never validated, the hook may then skip invalid commits), or something else.

torek · Accepted Answer

The update hook does not get enough information: it cannot get a "global view" of the incoming hash IDs. A pre- or post-receive hook does,¹ and therefore does get enough information—at least for some purposes.
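To make the difference concrete: an update hook is run once per reference, with the reference name and the old and new hashes as its three arguments, while a pre-receive hook is run once and reads one "old new refname" line per proposed update on standard input. A minimal sketch of collecting that global view follows; the function name created_branches is hypothetical, and it only reports branch names whose old hash is all zeros (i.e., branches being created).

```shell
#!/bin/sh
# Read pre-receive style lines ("<old> <new> <refname>") on stdin and
# print the branch names that this push proposes to create.
created_branches() {
    while read -r old new ref; do
        # Only look at branches, not tags or other namespaces.
        case $ref in
            refs/heads/*) ;;
            *) continue ;;
        esac
        # An all-zero old hash means the name does not exist yet.
        case $old in
            *[!0]*) ;;                    # has a non-zero digit: not a creation
            *) printf '%s\n' "$ref" ;;
        esac
    done
}
```

In a pre-receive hook one might call created_branches with the hook's own stdin; the same loop can just as well collect every updated name before any per-branch check is made.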

The biggest problem lies with new branch creation. Suppose, for instance, an update is delivering the names refs/heads/a and refs/heads/b (drawn as A and B below), where both names are new (their old hashes are the null hash), and refs/heads/a points to commit N2 and refs/heads/b points to commit N3 in this graph fragment:

                 N2   <-- A
                /
...--O--O--O--N1
                \
                 N3   <-- B

where all the O commits are "old" (as in, were reachable from existing branch or tag names before) and the N commits are "new", as in were never reachable before, and are therefore listed by:

git rev-list refs/heads/a refs/heads/b --not \
    $(git for-each-ref --format '%(refname)' |
        grep -Ev '^(refs/heads/a|refs/heads/b)$')

It's clear that these three N commits are "new", but to which branch should you assign N1?

There is no single right answer to this. Commit N1 is on both branches, after all.

In any case, if you are more concerned with merge commits, as in, e.g.:

...O1--O2--N1--N2   <-- A
              /
...-O3--O4--N3    <-- B

you may want to use --first-parent traversals. Here we can assume, based on these two branch-name updates (A moves from O2 to N2, B moves from O4 to N3), that the first parent of N2 is N1 (it's possible, but difficult, to make this happen the other way around), so following --first-parent will "assign" commit N1 to A and not to B. Again, if you are doing this from an update hook, rather than a pre- or post-receive hook, that may be the best you can do, since you do not get the information that both A and B are proposed to be updated.
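As a rough sketch of that traversal for the question's update hook: the function name first_parent_subjects is hypothetical, and it assumes the old hash is not the null hash (branch creation needs separate handling). Skipping the merge commits themselves with --no-merges is also an assumption, made because an auto-generated merge message would not carry the issue number.

```shell
#!/bin/sh
# Print the subjects of commits on the first-parent chain of old..new,
# leaving out the merge commits themselves.
#   $1 = old hash (must not be the all-zero null hash)
#   $2 = new hash
first_parent_subjects() {
    git log --first-parent --no-merges --pretty=%s "$1..$2"
}
```

In an update hook this would be called as first_parent_subjects "$2" "$3", since the hook's arguments are the reference name, the old hash, and the new hash. Note that a fast-forward merge leaves no merge commit, so in that case the other branch's commits would still sit on the first-parent chain.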

¹A post-receive hook is run after dropping all the locks, so it races against other operations that may update reference names. A pre-receive hook gets all the proposed updates and therefore there is a big lock around reference name updates, so it's clearly safer, in some sense, to do this work there.

The drawback is that the pre-receive hook runs while holding a big lock, so anything "slow" it does prohibits parallelism.
