Abhisek

Reputation: 5264

How does Git store a commit in terms of tree and blob objects?

When someone commits, how are the tree objects and blob objects laid out for that commit?

Example

Suppose I have a tree structure like the following

.  
|____dir1  
| |____file_dir1  
| |____newdir  
| | |____file_newdir  
|____dir2  
| |____file_dir2  
|____file1  
|____file2  
|____file3  

According to this, Git creates a blob for every file present in the tree structure. The link also says that, apart from creating blobs, it creates a tree object.

Now the question is whether a single tree object is created or several. If several, then intuitively it may be creating 3 tree objects per commit for the above project structure, since there are three directories in it, with each tree object pointing to the blob objects for its files (note that each blob corresponds to one file in the repository).

And if each blob corresponds to a file, why is it not simply called a file? Why "blob"?

Upvotes: 0

Views: 171

Answers (1)

Mr_and_Mrs_D

Reputation: 34076

  • One tree for each directory. The tree object referenced by the commit is the root directory, and it contains pointers to blobs and to the other trees (its subdirectories). The root counts too, so the example needs trees for the root, dir1, dir1/newdir and dir2.
  • Git reuses blobs and trees when nothing has changed. At some point it will also run gc, which (among other things) packs objects and stores deltas instead of whole blobs.
  • A "blob" object is nothing but a chunk of binary data. A file has a filename, but the name is stored in the tree entry rather than in the blob, so many identical files may refer to the same blob.
  • As mentioned, Git reuses blobs for identical files and at some point packs loose objects into packfiles (blobs are compressed with zlib to begin with). Git is very efficient; it was built with space and time efficiency in mind.
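To make the points above concrete, here is a minimal Python sketch that computes object IDs the way Git's default SHA-1 object format does, without touching a real repository. The helper names (`git_object_id`, `write_tree`), the file contents, and the choice of mode `100644` for every file are assumptions made for illustration; only the directory layout comes from the question.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object as Git does: sha1("<kind> <size>\\0" + body)."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def write_tree(node: dict, store: dict) -> str:
    """Hash a directory. `node` maps name -> bytes (file) or dict (subdir).

    Every object ID produced is recorded in `store` with its kind, so
    identical content collapses into a single entry.
    """
    entries = []
    for name, child in node.items():
        if isinstance(child, dict):
            oid = write_tree(child, store)       # subdirectory -> its own tree
            mode = b"40000"
            sort_key = name.encode() + b"/"      # Git sorts dirs as "name/"
        else:
            oid = git_object_id("blob", child)   # file content only, no name
            store[oid] = "blob"
            mode = b"100644"                     # assumed: regular file
            sort_key = name.encode()
        entries.append((sort_key, mode, name.encode(), bytes.fromhex(oid)))
    # A tree entry is "<mode> <name>\0" followed by the raw 20-byte object ID.
    body = b"".join(m + b" " + n + b"\0" + raw for _, m, n, raw in sorted(entries))
    oid = git_object_id("tree", body)
    store[oid] = "tree"
    return oid


# Layout from the question; the contents are made up, and three of the
# files deliberately share the same bytes.
project = {
    "dir1": {"file_dir1": b"one\n", "newdir": {"file_newdir": b"two\n"}},
    "dir2": {"file_dir2": b"one\n"},
    "file1": b"one\n",
    "file2": b"three\n",
    "file3": b"four\n",
}

store = {}
root = write_tree(project, store)
trees = sum(1 for kind in store.values() if kind == "tree")
blobs = sum(1 for kind in store.values() if kind == "blob")
print(f"root tree {root}: {trees} trees, {blobs} blobs")
```

With these made-up contents the six files produce only four distinct blobs, because a blob's ID depends on its bytes and not on any filename, while the four directories (root included) produce four trees. In a real repository the commit object would then point at the root tree's ID.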

See also Git for Computer Scientists and chapter 10, referenced in the comments.

Upvotes: 2
