Nasif Imtiaz Ohi

Reputation: 1713

Parse git log using gitpython

In Python, I want to get the log of all commits for a file in a git repository and parse the information in it (hash, author name, author email, author date, committer name, committer email, commit date, and commit message). Currently, I can get the raw git log either with gitpython or by calling shell commands through subprocess.

Using gitpython:

g=git.Git(path)
loginfo=g.log("--pretty=fuller",'--follow',"<filename>")

Using a subprocess call:

lines = subprocess.check_output(
    ['git', 'log', '--follow', '--pretty=fuller', '<filename>'],
    stderr=subprocess.STDOUT)

However, I then want to parse the raw log, and I cannot find a suitable library or method in gitpython for that. I would also like the dates to be parsed into Python datetime objects. Can you help?

Upvotes: 3

Views: 15733

Answers (1)

azzamsa

Reputation: 2125

You can get the repository's commits (here, the five most recent on master) using:

import git
repo = git.Repo("/home/user/.emacs.d")
commits = list(repo.iter_commits("master", max_count=5))

You can then inspect for yourself what data gitpython exposes on each commit:

dir(commits[0])

Some of them are:

  • author
  • committed_datetime
  • hexsha
  • message
  • stats

For example:

>>> commits[0].author
<git.Actor "azzamsa <[email protected]>">

>>> commits[0].hexsha
'fe4326e94eca2e651bf0081bee02172fedaf0b90'

>>> commits[0].message
'Add ocaml mode\n'

>>> commits[0].committed_datetime
datetime.datetime(1970, 1, 1, 0, 0, 0, tzinfo=<git.objects.util.tzoffset object at 0x7fb4fcd01790>)

(committed_datetime returns a timezone-aware datetime object.)
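
To collect the fields the question asks for, the attributes can be gathered into a plain dictionary. This is a minimal sketch, assuming gitpython's Commit attributes authored_datetime and committer in addition to the ones listed above; commit_to_dict is a hypothetical helper name, not part of gitpython.

```python
def commit_to_dict(commit):
    """Collect the commonly needed fields of a gitpython Commit into a dict.

    commit_to_dict is a hypothetical helper, not part of gitpython. It only
    reads attributes, so any object exposing the same names works.
    """
    return {
        "hash": commit.hexsha,
        "author_name": commit.author.name,
        "author_email": commit.author.email,
        # authored_datetime / committed_datetime are datetime objects
        "author_date": commit.authored_datetime,
        "committer_name": commit.committer.name,
        "committer_email": commit.committer.email,
        "commit_date": commit.committed_datetime,
        # Strip the trailing newline git keeps on the message
        "message": commit.message.strip(),
    }
```

With the commits list from earlier, [commit_to_dict(c) for c in commits] should give one dictionary per commit.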

If you want to check whether a commit touches a given file (useful if you want to collect all commits for that file), you can use:

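def is_exists(filename, sha):
    """Check whether the commit `sha` touches `filename`."""
    # List the paths changed by the commit, one per line
    files = repo.git.show("--pretty=", "--name-only", sha)
    # Compare whole paths to avoid substring matches; always returns a bool
    return filename in files.splitlines()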

Then, to get all commits for a file:

def get_file_commits(filename):
    file_commits = []
    for commit in commits:
        if is_exists(filename, commit.hexsha):
            file_commits.append(commit)

    return file_commits

For example, to get all commits for the 'init.el' file:

initel_file_commits = get_file_commits('init.el')

>>> initel_file_commits
[<git.Commit "fe4326e94eca2e651bf0081bee02172fedaf0b90">, <git.Commit "e4f39891fb484a95ea76e8e07244b908e732e7b3">]

You can verify that the function works correctly, since both commits touch 'init.el':

>>> initel_file_commits[0].stats.files
{'init.el': {'insertions': 1, 'deletions': 0, 'lines': 1}, 'modules/aza-ocaml.el': {'insertions': 28, 'deletions': 0, 'lines': 28}}

>>> initel_file_commits[1].stats.files
{'init.el': {'insertions': 1, 'deletions': 0, 'lines': 1}, 'modules/aza-calfw.el': {'insertions': 65, 'deletions': 0, 'lines': 65}, 'modules/aza-home.el': {'insertions': 0, 'deletions': 57, 'lines': 57}}
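
As an alternative to filtering a pre-fetched list, iter_commits accepts a paths argument that restricts the walk to commits touching the given path. The following is a minimal sketch, assuming a git.Repo object and a branch named master; commits_for_file is a hypothetical helper name. Unlike git log --follow, this is not expected to track the file across renames.

```python
def commits_for_file(repo, filename, rev="master", max_count=None):
    """Return the commits reachable from rev that touch filename.

    commits_for_file is a hypothetical helper; repo is expected to be a
    git.Repo instance (for example git.Repo("/path/to/repo")).
    """
    kwargs = {}
    if max_count is not None:
        # Extra keyword arguments are forwarded to git rev-list
        kwargs["max_count"] = max_count
    return list(repo.iter_commits(rev, paths=filename, **kwargs))
```

For instance, commits_for_file(repo, 'init.el') should return every commit on master that changed init.el, not only those among the five most recent commits.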

Hope it helps.

Upvotes: 8
