Unzip git tree object

Question

I've written a small Groovy utility that can unzip Git blob objects, and it works: I can see the content of the blobs. The same works for commits.

The problem is with trees. When I unpack them, I get: tree 29100644 a�⛲��CK�)�wZ��S�. As you can see, the content after the object size is unreadable. It looks like this content is kept in a different format.

Here is my code:

   import java.util.zip.InflaterOutputStream

   // input: path to the object file to inflate
   ByteArrayOutputStream result = new ByteArrayOutputStream()
   InflaterOutputStream byteWriter = new InflaterOutputStream(result)
   byteWriter.write(new File(input).bytes)
   byteWriter.close()
   println result

I tried something similar in Ruby and the result was the same. So I think the problem is the format of the file, which doesn't seem to be zlib-compressed.

VonC · Accepted Answer

Your unzipping works; the tree content just isn't meant to be a readable string, if I follow the article "Git tree objects, how are they stored?":

The general format is:

First 4 bytes declaring the object type. In our case, those four bytes are “tree”, ASCII-encoded.

Then comes a space, the object size and a NUL byte,

and then the entries, separated by nothing.

The exact format is the following. All capital letters are “non-terminals” that I’ll explain shortly.

tree ZN(A FNS)*

where:

N is the NUL character

Z is the size of the object's content in bytes (everything after the first NUL), written as ASCII decimal digits

A is the unix access code (file mode), ASCII encoded, for example 100644 for a vanilla file.

F is the filename (I’m not sure about the encoding; it’s definitely ASCII-compatible), NUL-terminated.

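S is the SHA-1 hash of the entry pointed to, stored as 20 raw bytes rather than 40 hex characters. This binary part is what shows up as unreadable characters when the tree is printed as text.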
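
As a rough illustration of that layout, here is a minimal Java sketch (the same java.util.zip classes are usable from Groovy) that inflates an object file and walks the entries. The names GitTreeReader, inflate, parseTree and indexOf are made up for this example and are not part of Git or the JDK. Decoding filenames as UTF-8 is an assumption, since the encoding isn't pinned down above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Hypothetical helper for illustration only.
class GitTreeReader {

    // Inflate the zlib stream stored in an object file.
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported zlib stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not a zlib stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Parse "tree <size>NUL" followed by entries "<mode> <name>NUL<20 raw bytes>".
    // Returns one "<mode> <hex sha> <name>" line per entry.
    static List<String> parseTree(byte[] object) {
        int headerEnd = indexOf(object, (byte) 0, 0);
        String header = new String(object, 0, headerEnd, StandardCharsets.US_ASCII);
        if (!header.startsWith("tree ")) {
            throw new IllegalArgumentException("not a tree object: " + header);
        }
        List<String> lines = new ArrayList<>();
        int pos = headerEnd + 1;
        while (pos < object.length) {
            int space = indexOf(object, (byte) ' ', pos);
            int nul = indexOf(object, (byte) 0, space + 1);
            if (nul + 21 > object.length) {
                throw new IllegalArgumentException("entry is missing its 20-byte hash");
            }
            String mode = new String(object, pos, space - pos, StandardCharsets.US_ASCII);
            // Assumption: filenames are decoded as UTF-8.
            String name = new String(object, space + 1, nul - space - 1, StandardCharsets.UTF_8);
            StringBuilder hex = new StringBuilder();
            for (int k = nul + 1; k <= nul + 20; k++) {
                hex.append(String.format("%02x", object[k] & 0xff));
            }
            lines.add(mode + " " + hex + " " + name);
            pos = nul + 21; // skip the NUL and the 20 hash bytes
        }
        return lines;
    }

    // Index of the first occurrence of target at or after from.
    static int indexOf(byte[] data, byte target, int from) {
        for (int k = from; k < data.length; k++) {
            if (data[k] == target) return k;
        }
        throw new IllegalArgumentException("malformed object");
    }

    public static void main(String[] args) throws Exception {
        byte[] raw = Files.readAllBytes(Paths.get(args[0]));
        parseTree(inflate(raw)).forEach(System.out::println);
    }
}
```

The output in the question, which starts with tree 29100644 a, appears to fit this layout: a 6-byte mode, a space, a 1-byte name, a NUL and 20 hash bytes add up to 29, so that tree would hold a single file named a.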

Here’s an example.
Say we have a directory with two files, called test and test2. The SHA of the directory is f0e12ff4a9a6ba281d57c7467df585b1249f0fa5. You can see the SHA-hashes of the entries in the output of

$ git cat-file -p f0e12ff4a9a6ba281d57c7467df585b1249f0fa5
100644 blob 9033296159b99df844df0d5740fc8ea1d2572a84    test
100644 blob a7f8d9e5dcf3a68fdd2bfb727cde12029875260b    test2
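
Going the other way can make the example concrete. The following Java sketch assembles the raw tree object from the two entries listed above and hashes it with SHA-1, which is how Git derives an object id from the uncompressed header plus content. GitTreeWriter and its methods are hypothetical names. The sketch assumes the listing is complete and that the entries are stored in the order shown; if so, the printed id should equal the directory's SHA, but that has not been verified here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only.
class GitTreeWriter {

    // Convert a 40-character hex id into the 20 raw bytes stored in a tree.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // One entry: "<mode> <name>NUL" followed by the raw 20-byte id.
    static byte[] entry(String mode, String name, String hexId) {
        byte[] head = (mode + " " + name + "\0").getBytes(StandardCharsets.UTF_8);
        byte[] id = hexToBytes(hexId);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(head, 0, head.length);
        out.write(id, 0, id.length);
        return out.toByteArray();
    }

    // Whole object: "tree <size>NUL" followed by the concatenated entries.
    static byte[] buildTree(byte[]... entries) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        for (byte[] e : entries) {
            body.write(e, 0, e.length);
        }
        byte[] header = ("tree " + body.size() + "\0").getBytes(StandardCharsets.US_ASCII);
        byte[] content = body.toByteArray();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(header, 0, header.length);
        out.write(content, 0, content.length);
        return out.toByteArray();
    }

    // SHA-1 of the uncompressed object, as lowercase hex.
    static String sha1Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] tree = buildTree(
            entry("100644", "test", "9033296159b99df844df0d5740fc8ea1d2572a84"),
            entry("100644", "test2", "a7f8d9e5dcf3a68fdd2bfb727cde12029875260b"));
        System.out.println(tree.length + " bytes, id " + sha1Hex(tree));
    }
}
```

The two entries take 32 and 33 bytes, so the header built here reads tree 65 and the whole object is 73 bytes before compression.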
