maxmelbin

Reputation: 2095

How does Git handle deltas of files?

From what I could figure out, Git stores the content of each version of a file in a blob, which is referenced from a tree along with the filename, etc. If three files have exactly the same content, all three reference the same blob. A new version of a file gets a new blob containing the entire content of the file.

Now, since Git doesn't store deltas but stores the entire contents of the file for each version in a separate blob object, will this result in increased storage? Is this a major issue to consider when deciding whether to use Git for a project? Also, is my understanding of how Git handles versions correct?

Upvotes: 0

Views: 139

Answers (1)

Michael Slade

Reputation: 13877

At one level, git stores objects as simple files indexed by their hashes. If a new version of a file is checked in, it gets a new hash and thus a new blob with a completely new file.
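To make the "indexed by their hashes" part concrete, here is a minimal sketch of how a blob's id can be computed, assuming a repository that uses the default SHA-1 object format. The helper name git_blob_id is made up for this illustration and is not part of git. It shows that identical content always maps to the same blob, while any change produces a different one.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Hypothetical helper: compute the id git would assign to a blob.

    Assumes the default SHA-1 object format. The hash covers a small
    header ("blob <size>" followed by a NUL byte) plus the raw content.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


version_1 = b"hello world\n"
copy_of_version_1 = b"hello world\n"      # same content, different file name
version_2 = b"hello world, again\n"       # edited content

# Identical content hashes to the same id, so only one blob is stored.
same_blob = git_blob_id(version_1) == git_blob_id(copy_of_version_1)

# Edited content hashes to a different id, so a new blob is created.
new_blob = git_blob_id(version_1) != git_blob_id(version_2)

print(same_blob, new_blob)
```

The file name is not part of the hash; it lives in the tree object, which is why several files with the same content can share a blob.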

At a level under that, git can combine and compress blobs into "packs". Blobs in these packs that are similar can be stored using delta compression. Apart from requiring the user to occasionally type

git repack -a -d

or similar, the process is transparent to the user and the storage structure.
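As a rough illustration of why delta compression helps, the sketch below compares compressing a new version of a file on its own against compressing it relative to the previous version. This is only an analogy: it uses zlib's preset-dictionary feature, not git's actual pack or delta format, and the sample data and variable names are invented for the example.

```python
import hashlib
import zlib

# Invented sample data: 200 lines of hex text that does not compress very well.
lines = [hashlib.sha1(str(i).encode()).hexdigest() for i in range(200)]
old_version = "\n".join(lines).encode()
new_version = old_version + b"\none extra line"

# Storing the new version on its own (roughly what a loose object does).
standalone = zlib.compress(new_version)

# Storing the new version relative to the old one. The old version acts as
# a preset dictionary, so repeated content is encoded as back-references.
compressor = zlib.compressobj(zdict=old_version)
relative = compressor.compress(new_version) + compressor.flush()

# The new version can be rebuilt as long as the old version is available.
decompressor = zlib.decompressobj(zdict=old_version)
restored = decompressor.decompress(relative) + decompressor.flush()

print(len(new_version), len(standalone), len(relative))
```

The relative encoding should come out far smaller than the standalone one, because almost all of the new version already exists in the old one. Packs exploit the same kind of similarity between blobs.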

Try the above command on your repository and see how it affects the size of your .git directory.

Upvotes: 1
