readonly

Reputation: 355984

Prevent people from pushing a git commit with a different author name?

In git, it is up to each user to specify the correct author in their local git config file. When they push to a centralized bare repository, the commits in that repository carry whatever author names they used when committing to their own repository.

Is there a way to enforce that only a set of known authors is used for commits? The "central" repository will be accessible via ssh.

I know that this is complicated by the fact that some people may be pushing commits that were made by others. Of course, you should also only allow people you trust to push to your repositories, but it would be great if there was a way to prevent user error here.

Is there a simple solution to this problem in git?

Upvotes: 26

Views: 11425

Answers (6)

mrts

Reputation: 19033

We use GitLab, so it makes sense for us to validate authors against GitLab group members.

The following script (based on @dsvensson's answer), installed as a pre-receive hook, does exactly that:

from __future__ import print_function
from __future__ import unicode_literals

import sys
import os
import subprocess
import urllib2
import json
import contextlib
import codecs
from itertools import islice, izip

GITLAB_SERVER = 'https://localhost'
GITLAB_TOKEN = 'SECRET'
GITLAB_GROUP = 4
EMAIL_DOMAIN = 'example.com'

def main():
    commits = get_commits_from_push()
    authors = get_gitlab_group_members()
    for commit, author, email in commits:
        if author not in authors:
            die('Unknown author', author, commit, authors)
        if email != authors[author]:
            die('Unknown email', email, commit, authors)

def get_commits_from_push():
    rev_format = '--pretty=format:%an%n%ae'
    zero = '0' * 40
    commits = []
    # Git writes one "<old> <new> <ref>" line to stdin per updated ref
    for update in sys.stdin:
        old, new, branch = update.split()
        # branch delete, let it through
        if new == zero:
            continue
        if old == zero:
            # new branch: only look at commits not yet on any existing branch
            command = ['git', 'rev-list', rev_format, new, '--not', '--branches=*']
        else:
            command = ['git', 'rev-list', rev_format, '{0}..{1}'.format(old, new)]
        output = subprocess.check_output(command)
        commits.extend(item.strip() for item in unicode(output, 'utf-8').split('\n') if item.strip())
    # every commit yields three lines: "commit <sha>", author name, author email
    return izip(islice(commits, 0, None, 3),
            islice(commits, 1, None, 3),
            islice(commits, 2, None, 3))

def get_gitlab_group_members():
    url = '{0}/api/v3/groups/{1}/members'.format(GITLAB_SERVER, GITLAB_GROUP)
    headers = {'PRIVATE-TOKEN': GITLAB_TOKEN}
    request = urllib2.Request(url, None, headers)
    with contextlib.closing(urllib2.urlopen(request)) as response:
        members = json.load(response)
    return dict((member['name'], '{}@{}'.format(member['username'], EMAIL_DOMAIN))
        for member in members)

def die(reason, invalid_value, commit, authors):
    message = []
    message.append('*' * 80)
    message.append("ERROR: {0} '{1}' in {2}"
            .format(reason, invalid_value, commit))
    message.append('-' * 80)
    message.append('Allowed authors and emails:')
    print('\n'.join(message), file=sys.stderr)
    for name, email in authors.items():
        print(u"  '{0} <{1}>'".format(name, email), file=sys.stderr)
    sys.exit(1)

def set_locale(stream):
    return codecs.getwriter('utf-8')(stream)

if __name__ == '__main__':
    # avoid Unicode errors in output
    sys.stdout = set_locale(sys.stdout)
    sys.stderr = set_locale(sys.stderr)

    # you may want to skip HTTPS certificate validation:
    #  import ssl
    #  if hasattr(ssl, '_create_unverified_context'):
    #    ssl._create_default_https_context = ssl._create_unverified_context

    main()

See GitLab custom Git hooks docs for installation instructions.

Only get_gitlab_group_members() is GitLab-specific; the rest of the logic applies to any pre-receive hook (including the handling of branch deletions and creations).
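As a rough illustration of that generic part, here is a minimal Python 3 sketch of how the lines a pre-receive hook gets on stdin could be turned into git rev-list arguments. The function names parse_ref_updates and rev_list_args are made up for this example; the deletion and new-branch handling mirrors the script above.

```python
import io

ZERO_SHA = "0" * 40  # all-zero object name, as compared against in the script above


def parse_ref_updates(stream):
    """Yield (old, new, ref) for every '<old> <new> <ref>' line on stdin."""
    for line in stream:
        parts = line.split()
        if len(parts) == 3:  # ignore blank or malformed lines
            yield tuple(parts)


def rev_list_args(old, new):
    """Return the rev-list arguments for one ref update, or None for a deletion."""
    if new == ZERO_SHA:
        return None  # branch deleted: there are no new commits to check
    if old == ZERO_SHA:
        # new branch: only commits not reachable from existing branches
        return [new, "--not", "--branches=*"]
    return ["{0}..{1}".format(old, new)]


if __name__ == "__main__":
    # Hypothetical input: one new branch and one ordinary update.
    sample = io.StringIO(
        "{0} {1} refs/heads/feature\n{1} {2} refs/heads/master\n".format(
            ZERO_SHA, "a" * 40, "b" * 40
        )
    )
    for old, new, ref in parse_ref_updates(sample):
        print(ref, rev_list_args(old, new))
```

In a real hook the stream would be sys.stdin, and each returned list would be appended to ['git', 'rev-list', rev_format].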

The script is now available on GitHub; please feel free to send pull requests for any mistakes or improvements.

Upvotes: 2

Anders Waldenborg

Reputation: 3035

Use the pre-receive hook (see githooks(5) for details). It receives the old sha and the new sha for each updated ref, so you can easily list the new commits and check that they have a proper author (git rev-list --pretty=format:"%an %ae%n" oldsha..newsha).

Here is an example script:

#!/bin/bash
#
# This pre-receive hook checks that all new commit objects
# have author emails and names with matching entries in the files
# valid-emails.txt and valid-names.txt respectively.
#
# The valid-{emails,names}.txt files should contain one pattern per
# line, e.g:
#
# ^.*@0x63.nu$
# ^[email protected]$
#
# To just ensure names are just letters the following pattern
# could be used in valid-names.txt:
# ^[a-zA-Z ]*$
#


NOREV=0000000000000000000000000000000000000000

while read oldsha newsha refname ; do
    # deleting is always safe
    if [[ $newsha == $NOREV ]]; then
        continue
    fi

    # check everything reachable from $newsha when creating a new branch
    if [[ $oldsha == $NOREV ]]; then
        revs=$newsha
    else
        revs=$oldsha..$newsha
    fi

    git log --pretty=format:"%h %ae %an%n" "$revs" | while read sha email name; do
        # skip the blank separator lines produced by %n
        if [[ ! $sha ]]; then
            continue
        fi
        grep -q -f valid-emails.txt <<<"$email" || {
            echo "Email address '$email' in commit $sha not registered when updating $refname"
            exit 1
        }
        grep -q -f valid-names.txt <<<"$name" || {
            echo "Name '$name' in commit $sha not registered when updating $refname"
            exit 1
        }
    done || exit 1  # the inner loop runs in a subshell, so propagate its failure
done
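For comparison, the pattern check that the script delegates to grep -q -f can be sketched in Python 3 roughly as follows. This is only an approximation: Python's re syntax is not identical to grep's, blank lines are skipped here rather than treated as patterns, and the helper names and the example.com addresses are made up for the example.

```python
import re


def load_patterns(lines):
    """Compile one regular expression per non-empty line."""
    return [re.compile(line.strip()) for line in lines if line.strip()]


def matches_any(value, patterns):
    """Return True if at least one pattern is found somewhere in value."""
    return any(pattern.search(value) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical contents of valid-emails.txt
    email_patterns = load_patterns([r"^.*@example\.com$", ""])
    for email in ("alice@example.com", "alice@example.org"):
        verdict = "ok" if matches_any(email, email_patterns) else "rejected"
        print(email, verdict)
```

In a hook, load_patterns would be fed an open file such as valid-emails.txt instead of a literal list.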

Upvotes: 10

dsvensson

Reputation: 1421

We use the following to prevent accidental unknown-author commits (for example when doing a fast commit from a customer's server or something). It should be placed in .git/hooks/pre-receive and made executable.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import subprocess
from itertools import islice, izip
import sys

old, new, branch = sys.stdin.read().split()

authors = {
    "John Doe": "[email protected]"
}

proc = subprocess.Popen(["git", "rev-list", "--pretty=format:%an%n%ae%n", "%s..%s" % (old, new)], stdout=subprocess.PIPE)
data = [line.strip() for line in proc.stdout.readlines() if line.strip()]

def print_error(commit, author, email, message):
    print "*" * 80
    # report the actual reason (unknown author or unknown email)
    print "ERROR: %s!" % message
    print "-" * 80
    proc = subprocess.Popen(["git", "rev-list", "--max-count=1", "--pretty=short", commit], stdout=subprocess.PIPE)
    print proc.stdout.read().strip()
    print "*" * 80
    raise SystemExit(1)

for commit, author, email in izip(islice(data, 0, None, 3), islice(data, 1, None, 3), islice(data, 2, None, 3)):
    _, commit_hash = commit.split()
    if not author in authors:
        print_error(commit_hash, author, email, "Unknown Author")
    elif authors[author] != email:
        print_error(commit_hash, author, email, "Unknown Email")
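The grouping and lookup steps of the script can be tried without a repository. The Python 3 sketch below assumes the same layout the script relies on (a "commit <sha>" line, then author name, then author email); parse_rev_list, find_violation and the example.com addresses are invented for the illustration.

```python
def parse_rev_list(output):
    """Turn rev-list text into (commit_hash, author, email) triples."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("expected three non-empty lines per commit")
    triples = []
    for header, author, email in zip(lines[0::3], lines[1::3], lines[2::3]):
        _, commit_hash = header.split()  # header looks like "commit <sha>"
        triples.append((commit_hash, author, email))
    return triples


def find_violation(triples, authors):
    """Return (commit_hash, reason) for the first bad commit, or None."""
    for commit_hash, author, email in triples:
        if author not in authors:
            return (commit_hash, "Unknown Author")
        if authors[author] != email:
            return (commit_hash, "Unknown Email")
    return None


if __name__ == "__main__":
    # Hypothetical rev-list output and author table
    sample = (
        "commit 1111111\nJohn Doe\njohn@example.com\n\n"
        "commit 2222222\nJane Roe\njane@example.com\n"
    )
    authors = {"John Doe": "john@example.com"}
    print(find_violation(parse_rev_list(sample), authors))
```

Keeping the check separate from the git call makes it easy to test; note that, like the original, it would misgroup lines if a commit had an empty author name or email.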

Upvotes: 10

webmat

Reputation: 60646

If you want to manage rights to an internet-facing git repo, I suggest you look at Gitosis rather than whipping up your own solution. Identity is provided by private/public key pairs.

Read me pimping it here, too.

Upvotes: 0

davetron5000

Reputation: 24891

git wasn't initially designed to work like svn with a big central repository. Perhaps you can pull from people as needed, and refuse to pull if they have their author set inaccurately?

Upvotes: 0

Armin Ronacher

Reputation: 32563

What you could do is create a bunch of different user accounts, put them all in the same group and give that group write access to the repository. Then you should be able to write a simple incoming hook that checks if the user that executes the script is the same as the user in the changeset.
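A minimal Python 3 sketch of that idea, assuming each pusher logs in over ssh with their own system account: the hook looks up the account it runs as and compares it with the author of each commit. The ACCOUNT_AUTHORS table, the account name jdoe and the helper names are hypothetical.

```python
import getpass

# Hypothetical mapping from server login name to the author identity it may use
ACCOUNT_AUTHORS = {
    "jdoe": ("John Doe", "jdoe@example.com"),
}


def current_account():
    """Login name of the user the hook is running as."""
    return getpass.getuser()


def author_allowed(account, author, email, account_authors):
    """True if the commit author matches the identity registered for account."""
    return account_authors.get(account) == (author, email)


if __name__ == "__main__":
    # A real hook would pass current_account(); a fixed name keeps the demo repeatable.
    account = "jdoe"
    print(author_allowed(account, "John Doe", "jdoe@example.com", ACCOUNT_AUTHORS))
    print(author_allowed(account, "Someone Else", "else@example.com", ACCOUNT_AUTHORS))
```

As the question points out, a strict check like this would also reject pushes that legitimately contain commits made by other people.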

I've never done it because I trust the guys that check code into my repositories, but if there is a way, that's probably the one explained above.

Upvotes: 0
