Can the git tree hash be used to uniquely identify a specific state in a subdirectory?

Question

I would like to use the hashes used in the git tree object to uniquely identify the state of a subdirectory under git, specifically as an index to an artifact repo for build avoidance. Something like this:

# Get hash for source tree in /
git cat-file -p HEAD^{tree}: | awk '$4 == "" { print $3 }'

My question is whether those hashes are truly persistent. Naive experimentation suggests that it works, but I wonder if anyone is doing this for real.

torek · Accepted Answer

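The ^{tree} here is not actually doing you any good: the rev:path syntax already resolves the commit to its tree before looking up the path, so HEAD:path names the same object.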

There's an even simpler method:

git rev-parse HEAD:/

e.g.:

$ git cat-file -p HEAD:t | awk '$4 == "t9604" { print $3 }'
846893f0b08b1fa03a6383c9a4deade32c16e929
$ git rev-parse HEAD:t/t9604
846893f0b08b1fa03a6383c9a4deade32c16e929

The SHA-1 of a tree object is a checksum of the "contents" of the tree, which is to say, its tree-ness, plus a list of all the names in the tree along with their modes, types, and SHA-1s. Subdirectories appear in that list as entries carrying their own tree SHA-1s, so a change anywhere beneath the directory changes its hash as well. So, yes: it identifies the state of the directory. It might identify it too strongly, as it will change if any of the blobs contained in the tree so much as changes mode (+x or -x).
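
To see both properties in one place, here is a minimal sketch that builds a throwaway repository and records the tree ID of a subdirectory three times. The directory name sub, the file name, the commit helper, and the demo identity are all made up for illustration. It uses git update-index --chmod so that the mode flip reaches the index even on filesystems where git ignores the executable bit.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Small helper so the demo does not depend on a configured identity.
commit() {
    git -c user.name=Demo -c user.email=demo@example.com commit -q -m "$1"
}

mkdir sub
echo 'hello' > sub/file.txt
git add sub
commit 'add sub/file.txt'
before=$(git rev-parse HEAD:sub)

# Flip only the executable bit in the index; the file contents are unchanged.
git update-index --chmod=+x sub/file.txt
commit 'make file executable'
after=$(git rev-parse HEAD:sub)

# Put the mode back: the tree is byte-for-byte what it was at first.
git update-index --chmod=-x sub/file.txt
commit 'drop executable bit again'
restored=$(git rev-parse HEAD:sub)

echo "before:   $before"
echo "after:    $after"
echo "restored: $restored"
```

The before and restored IDs should match, since the tree contents are identical, while after should differ even though no file contents changed. Note that HEAD:sub describes the committed state only; uncommitted edits in the working tree do not affect it.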

If you want to make sure that HEAD:path names a tree (not a blob) you can use git cat-file -t on the result of the rev-parse (but that's probably not needed here).
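
A sketch of that check, wrapped in a small function. The name tree_id is hypothetical, as are the demo repository and its paths; the function prints the object ID only when the path exists at HEAD and is a tree.

```shell
#!/bin/sh

# tree_id PATH: print the tree ID of PATH at HEAD.
# Fails (non-zero status, no output) if PATH is missing or is not a tree.
# tree_id is an illustrative helper name, not a git command.
tree_id() {
    obj=$(git rev-parse --verify -q "HEAD:$1") || return 1
    [ "$(git cat-file -t "$obj")" = "tree" ] || return 1
    printf '%s\n' "$obj"
}

# Throwaway repository to try it on; all names are made up.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir sub
echo 'hello' > sub/file.txt
git add sub
git -c user.name=Demo -c user.email=demo@example.com commit -q -m 'add sub/file.txt'

dir_key=$(tree_id sub) && echo "sub is a tree: $dir_key"
tree_id sub/file.txt || echo "sub/file.txt is not a tree"
tree_id missing || echo "missing does not exist at HEAD"
```

For the build-avoidance case in the question, the printed ID could serve as the lookup key into the artifact store, with a failure from tree_id treated as "no key, build normally".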
