Ant's

Reputation: 13811

What is the use of object directory in git?

I'm currently learning git, and I have a git repo in which I can see a .git directory.

In that directory I can see certain files and folders which I can understand.

But there is one directory named objects, and I can't work out what it is for.

For example my objects directory has:

04  4a  5f  7a  e9  f2  info  pack

For example, if I look at the contents of 04 I can see a file named 12697515217f658b245149a986aba32fa98f38. I can't read its contents; it seems to be encrypted.

Can anyone explain what the objects folder is for? Why a big hash-like name such as 12697515217f658b245149a986aba32fa98f38?

Is there any way I can decrypt the contents of 12697515217f658b245149a986aba32fa98f38 and really see how git does its magic?

Thanks in advance.

Upvotes: 5

Views: 3927

Answers (2)

Faizan Makandar

Reputation: 91

The objects directory is git's object database: it holds the contents of every file version, the commits in the repository, and more. The files are compressed with zlib (not encrypted), so they won't look like much if you open them directly. There are four kinds of objects in git:

  1. commit
  2. tree
  3. blob
  4. annotated tag

In order to understand these we need to know what hashing functions are. Hashing functions are functions that map input data of arbitrary size to fixed size output values.

For example:

"I love StackOverflow" --> Hashing Function
                                  |
              dd76fc997fe194a71fe545fb51ff622762a293ff

The main point is that a hashing function will always give the output dd76fc997fe194a71fe545fb51ff622762a293ff whenever the input is "I love StackOverflow".

Hence, hashing functions are deterministic. You can check the hash for "I love StackOverflow", or for any other content, by running the command below in your repository:

echo "I love StackOverflow" | git hash-object --stdin
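For the curious, git does not hash the raw text alone: for a blob it hashes a short header (the word blob, a space, the byte length, and a NUL byte) followed by the content. A minimal Python sketch of that computation is below. The function name git_blob_hash is made up for this illustration and is not part of git, and the sketch assumes the default SHA-1 object format.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 object ID git would assign to a blob with this content."""
    # Header format: b"blob <size-in-bytes>\0", then the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# echo appends a newline, so the bytes that reach git end in b"\n".
print(git_blob_hash(b"I love StackOverflow\n"))
```

If echo writes exactly the text plus a trailing newline, this should print the same ID as the git hash-object command. A repository created with the SHA-256 object format would produce different IDs.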

Coming to the question of whether you can decrypt the hash: a hash cannot be reversed, but git can give back the content stored under it. First you must store the content in the object database. For example,

echo "I love StackOverflow" | git hash-object --stdin -w

Adding -w makes git write the object into .git/objects as well as print its hash, and we can then retrieve the content with

git cat-file -p dd76fc997fe194a71fe545fb51ff622762a293ff
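What git cat-file does for a loose object can be approximated in a few lines of Python: the file under .git/objects is zlib-compressed, and once inflated it starts with a header such as "blob 21", then a NUL byte, then the raw content. The sketch below is an illustration only. The helper names write_loose_object and read_loose_object are hypothetical, it handles loose blobs only (not packfiles), and it assumes SHA-1 object IDs.

```python
import hashlib
import os
import tempfile
import zlib
from typing import Tuple


def write_loose_object(objects_dir: str, content: bytes) -> str:
    """Store content as a loose blob, roughly like `git hash-object -w`."""
    store = b"blob " + str(len(content)).encode("ascii") + b"\0" + content
    sha = hashlib.sha1(store).hexdigest()
    # The first two hex characters name the folder, the rest name the file.
    folder = os.path.join(objects_dir, sha[:2])
    os.makedirs(folder, exist_ok=True)
    with open(os.path.join(folder, sha[2:]), "wb") as fh:
        fh.write(zlib.compress(store))
    return sha


def read_loose_object(objects_dir: str, sha: str) -> Tuple[str, bytes]:
    """Return (object type, content), roughly like `git cat-file`."""
    with open(os.path.join(objects_dir, sha[:2], sha[2:]), "rb") as fh:
        raw = zlib.decompress(fh.read())
    header, _, body = raw.partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match content length")
    return obj_type, body


# Demonstrate on a throwaway directory rather than a real repository.
with tempfile.TemporaryDirectory() as objects_dir:
    sha = write_loose_object(objects_dir, b"I love StackOverflow\n")
    print(sha, read_loose_object(objects_dir, sha))
```

Pointing objects_dir at a real .git/objects directory lets read_loose_object inspect existing loose objects. Objects that git has already moved into packfiles will not be found this way.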

As for why the long hash codes exist at all: git has to keep track of every change made to the files in the repository. To do that, it names each stored object by the hash of its content, so even a one-character change produces a different hash and therefore a new object.

Each version of a file's content is stored in an object called a "blob". A blob does not record the file name (names are kept in tree objects); it holds only the content, identified by its hash.

Upvotes: 2

In .git/objects git stores its own internal warehouse of objects (blobs, trees, commits and tags), all indexed by SHAs.

Why a big hash-like name such as 12697515217f658b245149a986aba32fa98f38?

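The name is part of the object's SHA-1 hash: the first two hex characters (04) are used as the directory name and the remaining 38 as the file name. Spreading objects across subdirectories like this keeps any single directory from growing too large, which is faster and more comfortable for many file systems.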
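The mapping from a full object ID to its location on disk is simple enough to sketch in Python: the first two hex characters become the directory name and the remaining 38 become the file name. The helper loose_object_path is hypothetical, and it covers loose objects only; objects that git has moved into packfiles under pack are stored differently.

```python
from pathlib import PurePosixPath


def loose_object_path(sha: str, git_dir: str = ".git") -> str:
    """Map a 40-character SHA-1 object ID to its loose-object path."""
    if len(sha) != 40:
        raise ValueError("expected a 40-character hex object ID")
    # First two hex characters name the directory, the other 38 name the file.
    return str(PurePosixPath(git_dir) / "objects" / sha[:2] / sha[2:])


# The file from the question: directory 04 plus the 38-character file name.
print(loose_object_path("04" + "12697515217f658b245149a986aba32fa98f38"))
```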

Is there any way I can decrypt the contents of 12697515217f658b245149a986aba32fa98f38 and really see how git does its magic?

There is nothing to decrypt: the file is just compressed, not encrypted, as @knittl rightly pointed out in the comments.

If you are interested in what is stored in the .git directory, have a look at this article: http://gitready.com/advanced/2009/03/23/whats-inside-your-git-directory.html

Upvotes: 4
