Reputation: 1923
In GitPython, I can walk through the diff information for every change in the tree by calling the diff() method between two commit objects. If I call diff() with the create_patch=True keyword argument, a patch string is created for every change (additions, deletions, renames), which I can access through the resulting diff objects and dissect for the changes.
However, the first commit has no parent to compare against.
import git
from git.compat import defenc

repo = git.Repo("path_to_my_repo")
commits = list(repo.iter_commits('master'))
commits.reverse()
for i in commits:
    if not i.parents:
        # First commit, don't know what to do
        continue
    else:
        # Has a parent
        diff = i.diff(i.parents[0], create_patch=True)
        for k in diff:
            try:
                # Get the patch message
                msg = k.diff.decode(defenc)
                print(msg)
            except UnicodeDecodeError:
                continue
You can use the method
diff = repo.git.diff_tree(i.hexsha, '--', root=True)
But this calls git diff-tree on the whole tree with the given arguments and returns a single string, so I cannot get the information for every file separately.
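If the raw string is all there is to work with, one option is to cut it into per-file pieces. The following is only a minimal sketch, assuming the command was run with a patch option such as -p so that every file section starts with a "diff --git" header. split_patch_by_file is a hypothetical helper name, not part of GitPython:

```python
def split_patch_by_file(patch_text):
    """Split multi-file patch text into one chunk per file.

    Assumes each file section begins with a "diff --git " header line.
    Anything before the first header (for example the commit ID that
    git diff-tree prints first) is dropped.
    """
    chunks = []
    current = None  # lines of the file section being collected, if any
    for line in patch_text.splitlines(keepends=True):
        if line.startswith("diff --git "):
            # A new file section starts: flush the previous one.
            if current is not None:
                chunks.append("".join(current))
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        chunks.append("".join(current))
    return chunks
```

Each returned chunk still has to be parsed by hand, so this is a fallback rather than a replacement for the diff objects that diff() returns.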
Maybe there is a way to create a root object of some sort. How can I get the changes introduced by the first commit in a repository?
EDIT
A dirty workaround seems to be comparing to the empty tree by directly using its hash:
EMPTY_TREE_SHA = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
....
if not i.parents:
    diff = i.diff(EMPTY_TREE_SHA, create_patch=True, **diffArgs)
else:
    diff = i.diff(i.parents[0], create_patch=True, **diffArgs)
But this hardly seems like a real solution. Other answers are still welcome.
Upvotes: 9
Views: 3493
Reputation: 358
The OP's proposed solution works, but it has the disadvantage that the diff is inverted (added files are marked as deleted, etc.). However, one can simply reverse the direction of the comparison:
import git
from gitdb.util import to_bin_sha

# repo is the git.Repo and i the root commit from the question's loop
empty_tree = git.Tree(repo, to_bin_sha("4b825dc642cb6eb9a060e54bf8d69288fbee4904"))
diff = empty_tree.diff(i)
Be aware that in a SHA-256 repository the empty tree ID is 6ef19b41225c5369f1c104d45d8d85efa9b057b53b14b4b9b939dd74decc5321 instead.
You can check the object format of the repo with GitPython like so:
def is_sha1(repo):
    object_format = repo.git.rev_parse("--show-object-format")
    return object_format == "sha1"
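Neither ID is arbitrary: each is the hash of an empty tree object, which consists of nothing but the object header "tree 0" followed by a NUL byte. As a rough sketch using only the standard library (empty_tree_id is a made-up helper name, not a GitPython function), both values can be derived instead of hardcoded:

```python
import hashlib


def empty_tree_id(object_format="sha1"):
    """Return the hex ID of git's empty tree for the given object format."""
    # A git object ID is the hash of "<type> <size>\0" plus the content.
    # An empty tree has no content, so only the header is hashed.
    header = b"tree 0\x00"
    if object_format == "sha1":
        return hashlib.sha1(header).hexdigest()
    if object_format == "sha256":
        return hashlib.sha256(header).hexdigest()
    raise ValueError(f"unknown object format: {object_format!r}")


print(empty_tree_id("sha1"))
print(empty_tree_id("sha256"))
```

The string returned by rev_parse("--show-object-format") could be passed straight into such a helper, so the same code covers both repository types.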
Upvotes: 2
Reputation: 600
The short answer is that you can't. GitPython does not seem to support this through its diff API.
Running git show on the commit would work, but GitPython does not offer that through its object API.
On the other hand, you can use the stats functionality in GitPython to get the information you need:
import git
repo = git.Repo(".")
commits = list(repo.iter_commits('master'))
commits.reverse()
print(commits[0])
print(commits[0].stats.total)
print(commits[0].stats.files)
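The stats.files value is a dict that maps each path to its insertion and deletion counts, so it can be walked file by file. A small sketch of that, where summarize_stats is a hypothetical helper and the sample dict is made-up data in the same shape, not output from a real repository:

```python
def summarize_stats(files):
    """Return one summary line per file from a dict shaped like stats.files."""
    lines = []
    for path, counts in sorted(files.items()):
        lines.append(f"{path}: +{counts['insertions']} -{counts['deletions']}")
    return lines


# Made-up sample in the shape of commits[0].stats.files
sample = {
    "setup.py": {"insertions": 10, "deletions": 0, "lines": 10},
    "README.md": {"insertions": 3, "deletions": 0, "lines": 3},
}
for line in summarize_stats(sample):
    print(line)
```

Note that this only gives line counts per file, not the patch text itself.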
This might solve your problem. If it does not, you would probably be better off using pygit2, which is based on libgit2, the library that VSTS, Bitbucket and GitHub use to handle Git on their backends. It is probably more feature complete. Good luck.
Upvotes: 6