Reputation: 15
I want to list the file names and their sizes in a Git repository for a specific commit.
Does Git store this data with the blob object in the commit's tree?
Upvotes: 0
Views: 38
Reputation: 489588
The short answer is "No, but you don't (need to) care."
Each entry in a tree object holds a mode, a single pathname component, and the hash of the object it names, but not that object's size. (The entry does not store the directory path leading up to the component; that path is implied by the series of path components accumulated on the way to the tree object in the first place.) The size of a blob object is in the first few bytes of the object itself, since all objects are encoded beginning with a NUL-terminated header of the form <typename, space, size in ASCII decimal, NUL>.
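To make the header layout concrete, here is a minimal Python sketch that decodes it. It assumes you already have the bytes of a *loose* object (a file under `.git/objects/`), which is zlib-deflated; packed objects are stored differently and are not handled here. The function name `parse_loose_object` is just something I made up for illustration.

```python
import zlib


def parse_loose_object(compressed: bytes):
    """Decode a loose Git object into (type name, declared size, body).

    Assumes `compressed` is the full content of one loose object file.
    """
    raw = zlib.decompress(compressed)
    # The header is everything up to the first NUL byte: b"<type> <size>".
    header, _, body = raw.partition(b"\0")
    type_name, _, size_text = header.partition(b" ")
    size = int(size_text)  # the size is written in ASCII decimal
    if size != len(body):
        raise ValueError("header size does not match body length")
    return type_name.decode("ascii"), size, body
```

Note that the size is available as soon as the header has been decoded; the body is only returned here so the two can be checked against each other.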
If, however, you can read any of this to begin with (that is, if you have code that can parse a commit object, locate its tree objects, and extract tree and blob IDs and path names), then you have everything you need to read the size header from each blob as well. Being able to read the tree and blob objects means you have the repository itself, at least to the interesting depth: it might be a shallow clone, but it is deep enough to contain the commit, its trees, and its blobs. This means you can find both the path names, by traversing the tree objects, and the blob sizes, by reading the blob headers.
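In practice you rarely need to do that traversal by hand, because `git ls-tree` with `-r` (recurse) and `-l` (long format) walks the trees and looks up each blob's size for you. The sketch below is one way to wrap it from Python, assuming `git` is on your `PATH`; the helper names `parse_ls_tree_long` and `list_files_with_sizes` are mine, not part of Git.

```python
import subprocess


def parse_ls_tree_long(output: bytes):
    """Parse the output of `git ls-tree -r -l -z` into (path, size) pairs."""
    entries = []
    # With -z each record ends in NUL and paths are not quoted.
    for record in output.split(b"\0"):
        if not record:
            continue
        # Record layout: "<mode> <type> <hash> <size>\t<path>"
        meta, _, path = record.partition(b"\t")
        mode, obj_type, obj_hash, size = meta.split()
        if obj_type != b"blob":
            continue  # e.g. submodule entries, whose size is shown as "-"
        entries.append((path.decode("utf-8", "surrogateescape"), int(size)))
    return entries


def list_files_with_sizes(commit: str, repo: str = "."):
    """Run git in `repo` and return (path, size) for every file in `commit`."""
    result = subprocess.run(
        ["git", "-C", repo, "ls-tree", "-r", "-l", "-z", commit],
        check=True,
        capture_output=True,
    )
    return parse_ls_tree_long(result.stdout)
```

Calling `list_files_with_sizes("HEAD")` inside a repository should give you one `(path, size)` tuple per tracked file in that commit. The parsing is kept separate from the subprocess call so it can be checked without a repository.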
Upvotes: 1