user2664470

Reputation: 801

Git: get a hash of the current state of the working tree?

I would like to ensure that my executable is built from the most up-to-date version of the code.

For example, I can take the current git commit at the time of compile and bake it into the executable; then when the executable is run, it compares this with the current git commit and if they don't match it complains that the code has been modified and that it is out of date.

However, sometimes I recompile without making a commit, after making small changes to the code. Then this method doesn't work, as it only accounts for committed changes.

Is there any convenient way to programmatically get a hash of the current commit PLUS the state of the working directory, using git or otherwise?

Also, is there a name for this practice?

Upvotes: 14

Views: 3693

Answers (2)

dinvlad

Reputation: 1294

It is possible to create and store a tree containing most of the changes in the current working tree, including all staged, unstaged and untracked files, while respecting .gitignore. Roughly, one needs to run:

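#!/bin/sh
# Print the hash of a tree built from every changed or untracked file.
{   git diff-index --name-only HEAD       # tracked files that differ from HEAD
    git ls-files -o --exclude-standard    # untracked files, minus ignored ones
} \
| while IFS= read -r path; do
    # Regular file: store its content as a blob and emit an ls-tree style line.
    # The path is passed as an argument so a '%' in a file name cannot break printf.
    test -f "$path" && printf '100644 blob %s\t%s\n' "$(git hash-object -w "$path")" "$path"
    # Directory (submodule): record the commit it currently has checked out.
    test -d "$path" && printf '160000 commit %s\t%s\n' "$(cd "$path" && git rev-parse HEAD)" "$path"
done | sed 's,/,\\,g' | git mktree --missing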

The first command, git diff-index, lists all tracked files that differ from HEAD.

Then we find the untracked ones, but exclude the ignored.

We then pipe the output of these two commands into a loop that constructs git mktree input for all the files.

The output of that loop goes through sed, which replaces every slash in the paths with a backslash, because git mktree doesn't recursively construct trees. The exact paths don't matter here: we only want a hash, and the flattened tree is not meant to be checked out again.

Finally, we pass this ls-tree-formatted output to mktree, which constructs the specified tree and stores it in Git, outputting the hash to us.

With a bit of extra effort one can also keep information about permissions and possibly even file deletions. After all, this is what Git does when you do an actual commit.

One can argue that all these hoops are useful when you do want to store your changes for future reference but don't want to pollute the history with unnecessary commits for every little change. As such, it may be useful for internal testing with micro-releases: you can log the local hash as the actual version of your code instead of just the non-descriptive -dirty flag, and so see exactly which code failed when you forgot to tag or commit each working version. Some may consider this a bad habit and say you should instead commit after every successful build, however small. It's hard to argue with that, but then again it's all about convenience.

Upvotes: 6

David

Reputation: 3113

If all you want to do is determine whether there are any uncommitted modifications, that's easy; just run git diff --quiet HEAD and check whether the return code is non-zero.
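A minimal sketch of that check wrapped in a shell function. The name is_dirty is made up for illustration. Keep in mind that git diff only looks at tracked files, so a new untracked file is not noticed by this test.

```shell
#!/bin/sh
# is_dirty (hypothetical helper): exits 0 when tracked files differ from HEAD.
# git diff --quiet exits 1 when there are differences and 0 when there are none,
# so the status is inverted here. A git error (for example, when run outside a
# repository) also makes this report "dirty".
is_dirty() {
    ! git diff --quiet HEAD
}
```

A build script could then do something like: if is_dirty; then echo "uncommitted changes" >&2; fi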

If you actually need a hash of the changes, so that two users with the same starting commit and the same local modifications will get the same hash, that's trickier. My first thought is to pipe the output of git diff HEAD into sha1sum, and concatenate it to the commit hash, but the output of git diff might vary for different Git versions and config options.
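A rough sketch of the diff-plus-sha1sum idea, assuming sha1sum is installed. The function name state_hash and the commit-diffhash output layout are my own choices for illustration. The --no-ext-diff and --no-color flags remove two sources of variation, but they do not make the diff output identical across all Git versions and configurations, and untracked files are not covered.

```shell
#!/bin/sh
# state_hash (hypothetical helper): prints "<commit>-<sha1 of git diff HEAD>".
state_hash() {
    commit=$(git rev-parse HEAD) || return 1
    # Hash the textual diff of tracked files against HEAD.
    # sha1sum prints "<hash>  -", so keep only the first field.
    diff_hash=$(git diff --no-ext-diff --no-color HEAD | sha1sum | cut -d' ' -f1)
    printf '%s-%s\n' "$commit" "$diff_hash"
}
```

With a clean working tree the second half is simply the SHA-1 of empty input, so the result still changes as soon as any tracked file is modified.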

Alternatively, you could use git add -u . && git write-tree to get an honest-to-goodness Git tree object for the current working tree. But that's a destructive operation; it clobbers any partially-staged changes that were already in your index.
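One way to avoid clobbering the real index is to point Git at a throwaway index file through the GIT_INDEX_FILE environment variable. The sketch below is an assumption about how that could be wired up, not something tested across Git versions; the function name worktree_tree is made up. It uses git add -u like the command above, so untracked files are left out (git add -A would include them).

```shell
#!/bin/sh
# worktree_tree (hypothetical helper): prints the hash of a tree object for HEAD
# plus all modifications to tracked files, without touching the real index.
worktree_tree() {
    tmp_dir=$(mktemp -d) || return 1
    (
        # All git commands in this subshell use a temporary index file.
        GIT_INDEX_FILE=$tmp_dir/index
        export GIT_INDEX_FILE
        git read-tree HEAD &&   # populate the temporary index from HEAD
        git add -u &&           # stage changes to tracked files into it
        git write-tree          # write the tree object and print its hash
    )
    status=$?
    rm -rf "$tmp_dir"
    return $status
}
```

Because the staging happens in the temporary index, partially staged changes in the real index stay as they were. The tree and blob objects are still written to the repository's object store.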

Upvotes: 4
