Reputation: 355984
In git, it is up to each user to specify the correct author in their local git config file. When they push to a centralized bare repository, the commit messages on the repository will have the author names that they used when committing to their own repository.
Is there a way to enforce that only a set of known authors is used for commits? The "central" repository will be accessible via ssh.
I know that this is complicated by the fact that some people may be pushing commits that were made by others. Of course, you should also only allow people you trust to push to your repositories, but it would be great if there was a way to prevent user error here.
Is there a simple solution to this problem in git?
Upvotes: 26
Views: 11425
Reputation: 19033
We use GitLab, so it makes sense for us to validate authors against GitLab group members.
The following script (based on @dsvensson's answer), installed as a pre-receive hook, does exactly that:
from __future__ import print_function
from __future__ import unicode_literals

import sys
import os
import subprocess
import urllib2
import json
import contextlib
import codecs
from itertools import islice, izip

GITLAB_SERVER = 'https://localhost'
GITLAB_TOKEN = 'SECRET'
GITLAB_GROUP = 4
EMAIL_DOMAIN = 'example.com'


def main():
    commits = get_commits_from_push()
    authors = get_gitlab_group_members()
    for commit, author, email in commits:
        if author not in authors:
            die('Unknown author', author, commit, authors)
        if email != authors[author]:
            die('Unknown email', email, commit, authors)


def get_commits_from_push():
    old, new, branch = sys.stdin.read().split()
    rev_format = '--pretty=format:%an%n%ae'
    command = ['git', 'rev-list', rev_format, '{0}..{1}'.format(old, new)]
    # branch delete, let it through
    if new == '0000000000000000000000000000000000000000':
        sys.exit(0)
    # new branch
    if old == '0000000000000000000000000000000000000000':
        command = ['git', 'rev-list', rev_format, new, '--not', '--branches=*']
    output = subprocess.check_output(command)
    commits = [line.strip() for line in unicode(output, 'utf-8').split('\n') if line.strip()]
    return izip(islice(commits, 0, None, 3),
                islice(commits, 1, None, 3),
                islice(commits, 2, None, 3))


def get_gitlab_group_members():
    url = '{0}/api/v3/groups/{1}/members'.format(GITLAB_SERVER, GITLAB_GROUP)
    headers = {'PRIVATE-TOKEN': GITLAB_TOKEN}
    request = urllib2.Request(url, None, headers)
    with contextlib.closing(urllib2.urlopen(request)) as response:
        members = json.load(response)
    return dict((member['name'], '{}@{}'.format(member['username'], EMAIL_DOMAIN))
                for member in members)


def die(reason, invalid_value, commit, authors):
    message = []
    message.append('*' * 80)
    message.append("ERROR: {0} '{1}' in {2}"
                   .format(reason, invalid_value, commit))
    message.append('-' * 80)
    message.append('Allowed authors and emails:')
    print('\n'.join(message), file=sys.stderr)
    for name, email in authors.items():
        print(u" '{0} <{1}>'".format(name, email), file=sys.stderr)
    sys.exit(1)


def set_locale(stream):
    return codecs.getwriter('utf-8')(stream)


if __name__ == '__main__':
    # avoid Unicode errors in output
    sys.stdout = set_locale(sys.stdout)
    sys.stderr = set_locale(sys.stderr)
    # you may want to skip HTTPS certificate validation:
    # import ssl
    # if hasattr(ssl, '_create_unverified_context'):
    #     ssl._create_default_https_context = ssl._create_unverified_context
    main()
See the GitLab custom Git hooks docs for installation instructions.
Only get_gitlab_group_members() is GitLab-specific; the rest of the logic applies to any pre-receive hook (including the handling of branch deletions and creations).
The script is now available on GitHub; please feel free to send pull requests for any mistakes or improvements.
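One thing to keep in mind: a pre-receive hook gets one "old new ref" line on stdin for every ref in the push, while the script above unpacks exactly one. Below is a minimal sketch of handling several lines. It is written for Python 3 (unlike the script above), the helper names parse_ref_updates and rev_list_args are made up for illustration, and the 40-zero object name assumes a SHA-1 repository.

```python
import io

# All-zero object name git uses for "no commit" (assumes a SHA-1 repository).
ZERO_SHA = "0" * 40


def parse_ref_updates(stream):
    """Yield (old, new, ref) for every update line a pre-receive hook reads."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        old, new, ref = line.split(None, 2)
        yield old, new, ref


def rev_list_args(old, new):
    """Return revision arguments for git rev-list, or None for a deletion."""
    if new == ZERO_SHA:
        return None  # ref deleted: nothing to check
    if old == ZERO_SHA:
        # new ref: only look at commits not already on an existing branch
        return [new, "--not", "--branches=*"]
    return ["{0}..{1}".format(old, new)]


if __name__ == "__main__":
    # Hypothetical input; a real hook would pass sys.stdin instead.
    sample = io.StringIO(
        "a" * 40 + " " + "b" * 40 + " refs/heads/master\n"
        + ZERO_SHA + " " + "c" * 40 + " refs/heads/topic\n"
    )
    for old, new, ref in parse_ref_updates(sample):
        print(ref, rev_list_args(old, new))
```

In a real hook, each non-None result would be appended to the git rev-list command, and the author check would run once per ref.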
Upvotes: 2
Reputation: 3035
Use the pre-receive hook (see githooks(5) for details). It receives the old SHA and the new SHA for each updated ref on standard input, so you can easily list the new commits and check that they have a proper author (git rev-list --pretty=format:"%an %ae%n" oldsha..newsha).
Here is an example script:
#!/bin/bash
#
# This pre-receive hook checks that all new commit objects
# have emails and names with matching entries in the files
# valid-emails.txt and valid-names.txt respectively.
#
# The valid-{emails,names}.txt files should contain one pattern per
# line, e.g.:
#
# ^.*@0x63.nu$
# ^[email protected]$
#
# To ensure names consist only of letters and spaces, the following
# pattern could be used in valid-names.txt:
# ^[a-zA-Z ]*$
#
NOREV=0000000000000000000000000000000000000000
while read oldsha newsha refname ; do
    # deleting is always safe
    if [[ $newsha == $NOREV ]]; then
        continue
    fi
    # use "$newsha" alone as the log argument when creating a new branch
    if [[ $oldsha == $NOREV ]]; then
        revs=$newsha
    else
        revs=$oldsha..$newsha
    fi
    echo $revs
    git log --pretty=format:"%h %ae %an%n" $revs | while read sha email name; do
        if [[ ! $sha ]]; then
            continue
        fi
        grep -q -f valid-emails.txt <<<"$email" || {
            echo "Email address '$email' in commit $sha not registered when updating $refname"
            exit 1
        }
        grep -q -f valid-names.txt <<<"$name" || {
            echo "Name '$name' in commit $sha not registered when updating $refname"
            exit 1
        }
    done || exit 1  # the inner loop runs in a subshell, so propagate its failure
done
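For anyone who prefers Python for the hook, the pattern-file check can be sketched roughly as follows. This is only an illustration of the "one pattern per line, any match passes" idea: the function names are hypothetical, and Python's re syntax is not identical to the basic regular expressions grep uses by default, so patterns may need adjusting.

```python
import re


def load_patterns(lines):
    """Compile one regular expression per non-empty line."""
    return [re.compile(line.strip()) for line in lines if line.strip()]


def matches_any(value, patterns):
    """True if at least one pattern matches somewhere in value,
    roughly what `grep -q -f patterns.txt <<<"$value"` reports."""
    return any(pattern.search(value) for pattern in patterns)


if __name__ == "__main__":
    # In a real hook these lines would come from valid-emails.txt and
    # valid-names.txt; inline lists keep the sketch self-contained.
    email_patterns = load_patterns(["^.*@0x63.nu$"])
    name_patterns = load_patterns(["^[a-zA-Z ]*$"])
    print(matches_any("someone@0x63.nu", email_patterns))
    print(matches_any("J0hn Doe", name_patterns))
```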
Upvotes: 10
Reputation: 1421
We use the following to prevent accidental commits from unknown authors (for example, when making a quick commit from a customer's server). It should be placed in .git/hooks/pre-receive and made executable.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import subprocess
from itertools import islice, izip
import sys

old, new, branch = sys.stdin.read().split()

# known authors: name -> email
authors = {
    "John Doe": "[email protected]"
}

proc = subprocess.Popen(["git", "rev-list", "--pretty=format:%an%n%ae%n", "%s..%s" % (old, new)], stdout=subprocess.PIPE)
data = [line.strip() for line in proc.stdout.readlines() if line.strip()]

def print_error(commit, author, email, message):
    print "*" * 80
    print "ERROR: %s!" % message
    print "-" * 80
    proc = subprocess.Popen(["git", "rev-list", "--max-count=1", "--pretty=short", commit], stdout=subprocess.PIPE)
    print proc.stdout.read().strip()
    print "*" * 80
    raise SystemExit(1)

# rev-list prints three non-empty lines per commit: "commit <hash>", author name, author email
for commit, author, email in izip(islice(data, 0, None, 3), islice(data, 1, None, 3), islice(data, 2, None, 3)):
    _, commit_hash = commit.split()
    if author not in authors:
        print_error(commit_hash, author, email, "Unknown Author")
    elif authors[author] != email:
        print_error(commit_hash, author, email, "Unknown Email")
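The grouping and lookup logic can be pulled into plain functions, which makes it testable without a repository. The following is a rough Python 3 sketch of that idea, not the script we actually run: parse_commits and find_violations are hypothetical names, the addresses are placeholders, and it assumes the rev-list output really consists of three non-empty lines per commit (a commit with an empty author name would break that assumption, so the sketch raises instead of guessing).

```python
def parse_commits(output):
    """Group `git rev-list --pretty=format:%an%n%ae` output into
    (commit_hash, author, email) tuples."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("expected three non-empty lines per commit")
    commits = []
    for i in range(0, len(lines), 3):
        header, author, email = lines[i:i + 3]
        if not header.startswith("commit "):
            raise ValueError("unexpected header line: %r" % header)
        commits.append((header.split()[1], author, email))
    return commits


def find_violations(commits, authors):
    """Return (commit_hash, reason, offending_value) for every bad commit."""
    violations = []
    for commit_hash, author, email in commits:
        if author not in authors:
            violations.append((commit_hash, "Unknown Author", author))
        elif authors[author] != email:
            violations.append((commit_hash, "Unknown Email", email))
    return violations


if __name__ == "__main__":
    # Placeholder data standing in for real rev-list output.
    known = {"John Doe": "john@example.com"}
    sample = (
        "commit abc123\nJohn Doe\njohn@example.com\n"
        "commit def456\nMallory\nmallory@example.com\n"
    )
    for violation in find_violations(parse_commits(sample), known):
        print(violation)
```

Collecting all violations before exiting lets the hook report every bad commit in a push at once instead of stopping at the first one.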
Upvotes: 10
Reputation: 60646
If you want to manage rights to an internet facing git repo, I suggest you look at Gitosis rather than whipping up your own. Identity is provided by private/public key pairs.
Read me pimping it here, too.
Upvotes: 0
Reputation: 24891
Git wasn't initially designed to work like svn, with one big central repository. Perhaps you can pull from people as needed, and refuse to pull if they have their author set inaccurately?
Upvotes: 0
Reputation: 32563
What you could do is create a bunch of different user accounts, put them all in the same group and give that group write access to the repository. Then you should be able to write a simple incoming hook that checks if the user that executes the script is the same as the user in the changeset.
I've never done it because I trust the guys that check code into my repositories, but if there is a way, it is probably the one described above.
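To make the idea a bit more concrete, here is an untested sketch in Python of what the comparison inside such a hook might look like. It assumes every pusher logs in over ssh with their own system account, so the hook runs as that user; the ACCOUNTS table, the addresses, and the function names are all invented for illustration. As the question points out, people sometimes push commits made by others, so a strict check like this may be too tight in practice.

```python
import getpass

# Hypothetical mapping from system login name to the author emails
# that account is allowed to push.
ACCOUNTS = {
    "alice": {"alice@example.com"},
    "bob": {"bob@example.com", "bob@users.example.com"},
}


def pushing_user():
    """Login name of the account the hook is running as."""
    return getpass.getuser()


def is_own_commit(login, author_email, accounts):
    """True if author_email is registered for the given login."""
    return author_email in accounts.get(login, set())


def foreign_commits(login, commits, accounts):
    """Return the (commit_hash, author_email) pairs not owned by login."""
    return [(commit_hash, email) for commit_hash, email in commits
            if not is_own_commit(login, email, accounts)]


if __name__ == "__main__":
    # Placeholder commits; a real hook would read them from git rev-list.
    pushed = [("abc123", "alice@example.com"), ("def456", "bob@example.com")]
    print(foreign_commits("alice", pushed, ACCOUNTS))
```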
Upvotes: 0