bee

Reputation: 253

finding first commit of a file in git

The situation: I have a tar.gz of a release from a GitHub project and would like to work out which commit it was taken from. The release doesn't appear to have been tagged, nor is it obvious from the commit messages themselves.

I can calculate the SHA-1 of the files, but how do I work out which commit they belong to?

Calling git wizards!

Upvotes: 11

Views: 2314

Answers (2)

Adam Dymitruk

Reputation: 129526

This method can be tricky because of file attributes such as permissions. Check what the repo stores for them and make sure the extracted files match. Then commit the extracted files to the repository and look at the hash of the resulting tree:

git show -s --pretty=format:%T HEAD

Now walk all commits in the repo and see if any of them have a tree of the same hash.

git log --all --format=%H

will give you all the commit hashes. Now pipe these through git show to print each commit's tree hash, and grep for yours:

git log --all --format=%H \
  | xargs -n 1 git show -s --pretty='format:%H %T' \
  | grep <hash of your tree>

If the tar contained exactly the same structure including permissions, the output will show the SHA1s of the commits that have the same tree.

Searching for the top level tree SHA1 will be FAST.
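The search above can probably be done in a single git log call, because %T (the tree hash) can sit in the same format string as %H. A minimal sketch, assuming you run it inside the repository; find_commits_by_tree is a hypothetical helper name, not a git command:

```shell
# Sketch: list every commit whose top-level tree matches a given tree hash.
# find_commits_by_tree is a hypothetical helper name, not part of git.
find_commits_by_tree() {
    # %H is the commit hash, %T the hash of that commit's top-level tree.
    git log --all --format='%H %T' |
        awk -v tree="$1" '$2 == tree { print $1 }'
}

# Example usage, after committing the extracted tarball contents:
#   find_commits_by_tree "$(git rev-parse 'HEAD^{tree}')"
```

Note that the temporary commit holding the tarball contents will show up in the output too, since --all includes it; any other hash listed is a candidate for the release commit.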

Upvotes: 2

Borealid

Reputation: 98459

Since the git-stored hash doesn't just include the file contents (and, in theory, hash collisions happen anyhow), in order to be really sure you've got the right version of the file you need to compare the contents.

for rev in $(git log --format=%H -- /path/to/file); do
   # Compare the blob stored at this revision with the outside copy.
   if git diff --quiet "$rev":/path/to/file my-current-file; then
      echo "$rev"
   fi
done

In English: iterate over the revisions that changed the file, in reverse chronological order (newest first). For each such revision, diff the version of the file there with the outside-the-tree file. If the two files are identical, print the revision hash.
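If the diff loop is slow on a long history, a cheaper first pass is to compare blob hashes instead of contents. This is a different technique from the diff above: git hash-object computes the hash git would assign to the outside file, and git rev-parse reports the hash stored at each revision. A rough sketch, assuming no line-ending or other filters alter the file on the way in; find_commits_by_blob is a hypothetical helper name:

```shell
# Sketch: print each commit that changed the file and left it with the
# same blob hash as an outside copy.
# find_commits_by_blob is a hypothetical helper name, not part of git.
find_commits_by_blob() {
    repo_path="$1"    # path of the file relative to the repository root
    local_file="$2"   # copy extracted from the tarball
    # hash-object computes the blob hash without writing to the repo.
    want=$(git hash-object "$local_file") || return 1
    git log --all --format=%H -- "$repo_path" |
    while read -r rev; do
        # Skip revisions where the file does not exist (e.g. deletions).
        have=$(git rev-parse --verify --quiet "$rev:$repo_path") || continue
        if [ "$have" = "$want" ]; then
            echo "$rev"
        fi
    done
}

# Example usage (both paths are placeholders):
#   find_commits_by_blob path/to/file my-current-file
```

A match only says the file had that content from the printed commit until the next commit that changed it, so treat the result as a range to narrow down, and confirm with the diff if you want to be really sure.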

If you want to do this for the whole tarball, you can do the same but diff the whole tree instead of a single file (and omit the file path as an argument to git log) - use whatever tolerant diff options you like.
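One way to sketch the whole-tree comparison is to export each commit with git archive into a scratch directory and compare it to the unpacked tarball with diff -r. This is only an illustration under assumptions: find_commits_by_tree_diff is a hypothetical helper name, the comparison is strict (swap in more tolerant diff options as needed), and exporting every commit will be slow on a large history.

```shell
# Sketch: print each commit whose exported tree is identical to a directory.
# find_commits_by_tree_diff is a hypothetical helper name, not part of git.
find_commits_by_tree_diff() {
    extracted="$1"    # directory holding the unpacked tarball
    scratch=$(mktemp -d) || return 1
    git log --all --format=%H |
    while read -r rev; do
        # Start from an empty directory for every revision.
        rm -rf "$scratch/tree"
        mkdir "$scratch/tree"
        # Export the committed files of this revision, no .git directory.
        git archive "$rev" | tar -xf - -C "$scratch/tree"
        if diff -r -q "$scratch/tree" "$extracted" >/dev/null 2>&1; then
            echo "$rev"
        fi
    done
    rm -rf "$scratch"
}

# Example usage, run inside the repository (the path is a placeholder):
#   find_commits_by_tree_diff /path/to/unpacked-release
```

Keep in mind that git archive honours export-ignore attributes, so files marked that way are left out of the export before the comparison.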

Upvotes: 2
