Reputation: 25
Yesterday I started learning Git, and I was baffled to find two seemingly contradictory definitions of a repository.
1st: A repository is a directory which contains your project. A repository is made up of commits.
2nd: A repository is the .git folder inside your project.
Do the two statements actually convey the same thing? If so, how?
I've seen the .git hidden folder which is certainly not my project.
Upvotes: 0
Views: 116
Reputation: 489848
As Wim Coenen put it, both definitions focus on how things are organized on disk.
But, formally speaking, I have to agree with the second definition. The remaining area—the area where you do your work—is not part of the repository itself. It is merely next to the repository.
The reason for this is that the stuff inside the .git folder is Git's. You can look at it, and if you understand Git's internals (which change from one Git release to another, as Git evolves over time), you can even edit things directly here. But in general, you should leave this stuff to Git itself.
The files that are not in the .git folder are yours. You can do whatever you like with them. Git will fill in your work area, from a commit, when you ask it to.
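To see the two areas side by side, here is a minimal sketch. The scratch directory and the file name notes.txt are made up for illustration; the Git commands themselves are ordinary ones.

```shell
set -e
demo=$(mktemp -d)            # hypothetical scratch directory
cd "$demo"
git init -q project
cd project
echo "hello" > notes.txt     # a file in the work-tree: yours

git_dir=$(git rev-parse --git-dir)           # where the repository proper lives
work_tree=$(git rev-parse --show-toplevel)   # the top of your work-tree
echo "repository proper: $git_dir"
echo "work-tree:         $work_tree"
ls -A                        # lists both .git and notes.txt
```

Run from the top of the work-tree, the first command reports .git, while notes.txt sits next to it rather than inside it.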
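The short version, then, is that you work in your work-tree. This area is yours, to do whatever you like. You then tell Git, at various points in time: do something. That something can be, for example, filling in your work-tree from an existing commit, or making a new commit from the files you have updated.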
This distinction (between your work area, which is not part of the repository proper, and Git's area, which actually holds the repository) becomes even more important if you use the git worktree command, first added in Git 2.5. In particular, you can use git worktree add to create additional work-trees. Each such work-tree is not in the repository, and in fact, you can simply remove such a work-tree when you are done with it.
(Git calls your work area a working tree or work-tree. This is why the command that adds a new work-tree is git worktree add.)
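As a rough sketch of the added work-tree idea: the directory names main-tree and side-tree and the branch name side are invented for this example, and git worktree list needs a somewhat newer Git than 2.5.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q main-tree
cd main-tree
git config user.name "Example"
git config user.email "example@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "initial"

# Add a second work-tree, on a new branch, outside the first one.
git worktree add -b side ../side-tree
git worktree list

# In the added work-tree, .git is a small file pointing back at the
# real repository, not a repository of its own.
side_gitfile=$(cat ../side-tree/.git)
echo "$side_gitfile"

# Done with it: delete the directory, then let Git forget about it.
rm -rf ../side-tree
git worktree prune
git worktree list
```

The branch created for the extra work-tree is still there afterwards, because branches and commits live in the repository, not in any work-tree.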
The main theme with Git itself is that Git stores commits. Each commit in turn stores files. In fact, each commit holds a full snapshot of all files. Git's stored files use de-duplication, since most commits mostly hold the same versions of files as some other commit. They're also stored in a special, read-only, Git-only format. Only Git can actually read these files. That's why Git extracts the files to your work-tree.
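Here is a small sketch of the snapshot and de-duplication claims, with made-up file names a.txt and b.txt. Treat it as an illustration, not a tour of Git's storage format.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo "unchanged" > a.txt
echo "v1" > b.txt
git add a.txt b.txt
git commit -q -m "first"

echo "v2" > b.txt
git add b.txt
git commit -q -m "second"

# Each commit is a full snapshot: both listings show a.txt and b.txt.
git ls-tree HEAD~1
git ls-tree HEAD

# a.txt did not change, so both snapshots name the same stored object.
a_old=$(git rev-parse HEAD~1:a.txt)
a_new=$(git rev-parse HEAD:a.txt)
b_old=$(git rev-parse HEAD~1:b.txt)
b_new=$(git rev-parse HEAD:b.txt)
echo "a.txt: $a_old / $a_new"
echo "b.txt: $b_old / $b_new"

# Stored objects are read through Git, for example with cat-file.
git cat-file -p "$b_old"
```

The two IDs printed for a.txt match, while the two for b.txt differ, and the last command prints the old contents of b.txt even though the work-tree copy has moved on.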
The part that is particularly odd is that when Git makes new commits (which is how you have Git store the updated files, after you've updated them) it makes them from copies that aren't the copies in your work-tree! If you have ever used Mercurial, which is otherwise a lot like Git, this can be kind of baffling. In Mercurial, hg commit makes a new commit from the files in your work-tree. This is simple and clear. But git commit makes the new commit from files that are in Git's index, instead of the files in your work-tree. You must keep using git add to copy any files you have updated back into Git's index.
Hence, Git's index—which Git also calls the staging area—is what holds your proposed next commit. In Mercurial, which is easy to use, your work-tree holds your proposed next commit. In Git, the proposed next commit starts out matching the current commit. As you change files in your work-tree, you must copy the changed files back into Git's index, to change the proposed next commit.
(Git's method of making new commits gives you flexibility that is harder to achieve in Mercurial, at the cost of requiring a lot of git add commands.)
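A quick way to convince yourself that git commit uses the index rather than the work-tree is a sketch like this one, where the file name and contents are invented:

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "first"

echo "version 2" > file.txt
git add file.txt               # the index now holds "version 2"
echo "version 3" > file.txt    # the work-tree moves on; the index does not

git commit -q -m "second"      # commits what is in the index

committed=$(git show HEAD:file.txt)
in_work_tree=$(cat file.txt)
echo "committed:    $committed"
echo "in work-tree: $in_work_tree"
git status --short             # file.txt shows as modified, not staged
```

The new commit holds the copy that was added, and the later edit stays in the work-tree until another git add.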
Note: in modern Git, it is possible to separate Git's repository (the .git folder) from your work-tree, using git init --separate-git-dir. I don't know of anyone who uses this in ordinary everyday work, though.
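For the curious, a minimal sketch of that separation looks roughly like this. The names repo.git and work are hypothetical; any two paths would do.

```shell
set -e
demo=$(mktemp -d)
git init -q --separate-git-dir="$demo/repo.git" "$demo/work"
cd "$demo/work"

# Here .git is a one-line text file that points at the real repository.
gitfile=$(cat .git)
echo "$gitfile"

# Git follows that pointer; this reports the separate repository path.
git rev-parse --git-dir
```

The work-tree then contains only your files plus that small pointer file, which makes the "next to, not inside" relationship rather literal.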
Upvotes: 1
Reputation: 66783
Both definitions focus a bit too much on what a repository looks like on your local filesystem.
Conceptually, a repository is a version-controlled file tree. It contains snapshots (or "commits") of the same project at different points in time and on different development branches.
When a repository is cloned locally, everything is contained in one folder. The data needed to reconstruct all the different snapshots resides in the .git subfolder. The rest of the folder represents a certain snapshot of the project, plus any uncommitted changes that you are currently making to it. At any moment, you can decide to create a new snapshot by doing a "commit". Users can share snapshots by pushing/pulling them to/from remote repositories.
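A rough sketch of that sharing, using a bare repository on the local disk to stand in for a remote one. The names shared.git, alice and bob are invented, and cloning an empty repository prints a harmless warning.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q --bare shared.git      # stands in for a remote repository
git clone -q shared.git alice
git clone -q shared.git bob

cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "first snapshot"
branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called
alice_head=$(git rev-parse HEAD)
git push -q origin "$branch"

cd ../bob
git pull -q origin "$branch"
bob_head=$(git rev-parse HEAD)
echo "alice: $alice_head"
echo "bob:   $bob_head"
```

After the pull, the second clone has the same snapshot, and its work-tree has been filled in from it.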
The snapshots are linked together, so if you get one then you also recursively get all the other ones that it was based on. This allows you to examine the entire history of the project leading up to that state.
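The linking can be inspected directly. In this sketch (file name and commit messages are made up), each commit after the first records the ID of the commit it was based on:

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

for n in 1 2 3; do
  echo "state $n" > file.txt
  git add file.txt
  git commit -q -m "snapshot $n"
done

# The raw commit shows a "parent" line naming the previous snapshot.
git cat-file -p HEAD
parent=$(git rev-parse HEAD^)

# Following the parent links from HEAD reaches every earlier snapshot.
count=$(git rev-list --count HEAD)
echo "snapshots reachable from HEAD: $count"
git log --oneline
```

Starting from the newest snapshot alone, Git walks those parent links to reconstruct the whole history.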
Upvotes: 2