Reputation: 99408
The man page of git diff says:
NAME
git-diff - Show changes between commits, commit and working tree, etc
DESCRIPTION
Show changes between the working tree and the index or a tree, changes between the index and a tree, changes between two trees, changes between two blob objects, or changes between two files on disk.
Does "the working tree" mean the working directory?
What does "a tree" mean here? Is it the same as a commit object or a tree object? (Literally, I think it means a tree object. But I guess it may intend to mean a commit object, by comparing the "DESCRIPTION" part to the "NAME" part.)
How do you specify "a tree" as a command line argument to git diff?
If I may also ask, how do you specify "a blob object" as a command line argument to git diff?
Upvotes: 4
Views: 4824
Reputation: 488013
The word tree is rather overloaded in Git (well, and, in computing in general).
The work-tree or working tree (or other variations of this spelling) refers to the place in which you do your work. Here, files have their normal everyday form, and are readable and, OS willing, writable. (On a Unix system, if you chmod -w your files, you won't be able to write them. That's not Git's fault, though.)
A tree object, in Git, is an internal data structure that records a directory tree or sub-tree. It contains one entry per file or sub-directory (or, for submodules, a gitlink entry for that submodule). Each entry lists the file's executable-mode bit, as a sort of yes or no flag that's encoded weirdly,[1] plus the file's name and blob hash ID. For a sub-tree, the entry lists the directory's name and the subtree object hash ID. Git can then recursively work through the sub-tree object to find more files and yet more sub-trees as needed. Each file entry gives a hash ID for a Git internal blob object, which is a frozen (read-only) compressed copy of the file's data.
Every commit saves one (1) internal Git tree object hash ID. That tree object contains the snapshot that the commit contains—so a commit's snapshot is really one of these trees, which contains entries for files and subtrees. Since each commit has exactly one tree, Git can convert from a commit-specifier to a tree object:
$ git rev-parse master
3c31a203fbeedb4d746889dc77cbafc395fc6e92
$ git rev-parse master^{tree}
5c4b695f5d5606976f5b72e1a901ed17db30a359
In this case, the commit identified by master is that first big ugly hash, but the internal tree object that this commit is using to hold the files is the second one.
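To see what such a tree object actually holds, here is a minimal sketch that builds a throwaway repository and inspects the tree behind its only commit. The file names, contents, and the Example identity are made up for illustration; only the git subcommands themselves are real.

```shell
#!/bin/sh
# Sketch: create a scratch repository and look inside the tree of its commit.
# README, sub/run.sh and the committer identity are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

mkdir sub
printf 'hello\n' > README
printf '#!/bin/sh\n' > sub/run.sh
chmod +x sub/run.sh
git add README sub/run.sh
git -c user.name=Example -c user.email=example@example.com commit -q -m 'initial'

commit_id=$(git rev-parse HEAD)           # the commit object
tree_id=$(git rev-parse 'HEAD^{tree}')    # the one tree that commit points to

git cat-file -t "$commit_id"   # object type of the commit
git cat-file -t "$tree_id"     # object type of its top-level tree
git ls-tree "$tree_id"         # one line per entry: mode, type, hash, name
git ls-tree -r "$tree_id"      # recurse into sub-trees, listing only blobs
```

The non-recursive listing shows one blob entry for README and one tree entry for sub; the recursive listing follows the sub-tree to reach sub/run.sh.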
Hence, a work-tree contains real files with real data, and a Git tree object allows Git to find all the frozen files of a commit, provided you give the tree object that corresponds to some commit. The git diff command needs to compare two things. Those two things can either be two individual files (this is sort of a degenerate case) or two trees of files. When comparing two trees, whether they're tree objects or a work-tree full of files, git diff will match up the files in the two trees and then show how each matched pair differs.
This is still just an overview, because git diff can do more than just these things, but those are the basics.
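As for how to name a tree or a blob on the command line, here is a hedged sketch using a scratch repository. The helper commit_as_example, the file name file.txt, and its contents are hypothetical; the revision syntax (a commit name, commit^{tree}, and commit:path) is the standard Git notation.

```shell
#!/bin/sh
# Sketch: three ways to hand git diff its two sides.
# commit_as_example and file.txt are made up for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

commit_as_example() {
    git -c user.name=Example -c user.email=example@example.com commit -q -m "$1"
}

printf 'one\n' > file.txt
git add file.txt
commit_as_example 'first'

printf 'one\ntwo\n' > file.txt
git add file.txt
commit_as_example 'second'

# Two commits: Git resolves each one to its tree.
by_commit=$(git diff HEAD~1 HEAD)

# Two tree objects, named explicitly with the ^{tree} suffix.
by_tree=$(git diff 'HEAD~1^{tree}' 'HEAD^{tree}')

# Two blob objects, each named as <commit>:<path>.
by_blob=$(git diff HEAD~1:file.txt HEAD:file.txt)

printf '%s\n' "$by_commit"
```

Naming a commit where a tree is expected works because Git can always get from the commit to its single tree, which is why the first two forms describe the same comparison.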
There's one more very important wrinkle: git diff can inspect the index and treat it as a tree. The index holds copies of files taken from somewhere. That somewhere is, initially, whatever commit you checked out with git checkout. However, you can git add files from the work-tree to copy them into the index, replacing the version that was there from the commit. You can git add files from the work-tree that were never in the commit, and are thus new to the index. And you can git rm files, with or without --cached, to take files out of the index.
Since Git will build a new commit from whatever is in the index at the time you run git commit, comparing the index contents to something (a frozen tree from a commit, or the work-tree) is a very useful thing indeed.
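The three comparisons involving the index and the work-tree can be sketched like this. The file name and its three versions are invented for the example; git diff, git diff --cached, and git diff HEAD are the real invocations.

```shell
#!/bin/sh
# Sketch: one file in three states (commit, index, work-tree) and the
# three git diff forms that compare them. Contents are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

printf 'committed version\n' > file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.com commit -q -m 'v1'

printf 'staged version\n' > file.txt
git add file.txt                              # the index now holds this copy
printf 'unstaged work-tree version\n' > file.txt   # the work-tree moves on again

index_vs_worktree=$(git diff)           # index    -> work-tree
head_vs_index=$(git diff --cached)      # HEAD     -> index
head_vs_worktree=$(git diff HEAD)       # HEAD     -> work-tree

printf '%s\n' "$head_vs_index"
```

The --cached comparison is the one that previews what the next git commit would record, since it shows exactly how the index differs from the current commit.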
[1] The actual tree entries store (mode, path, hash) triples. The mode is a string: 100755 for an executable file, 100644 for a non-executable file, 40000 for a sub-tree, 120000 for a symbolic link, and 160000 for a gitlink. These were originally Linux's stat st_mode fields, and Git allowed 100664 for rw-rw-r--, for instance, but that turned out to be a mistake, so a normal tree only uses one of this limited subset. Git still supports 100664 since there may be some Git repositories that still have such entries, but unless you find a really old repository, you won't find any 100664s. The hash is a blob hash for files and symbolic links, a tree hash for sub-trees, and, for gitlink entries, the desired commit hash in the submodule.
Upvotes: 9
Reputation: 558
The current working directory is wherever your shell thinks it is. The working tree starts in the directory that contains the .git directory. If the docs said "working directory", you might think it moves when you cd; it doesn't.
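A small sketch of that difference, assuming a scratch repository with a made-up src/deep subdirectory: the shell's directory changes with cd, while the top of the working tree reported by git rev-parse --show-toplevel stays put.

```shell
#!/bin/sh
# Sketch: current working directory versus top of the working tree.
# The src/deep path is hypothetical.
set -e
repo=$(cd "$(mktemp -d)" && pwd -P)   # physical path, so comparisons are stable
cd "$repo"
git init -q .

mkdir -p src/deep
cd src/deep

cwd=$(pwd -P)                               # where the shell is now
top=$(git rev-parse --show-toplevel)        # where the working tree starts
prefix=$(git rev-parse --show-prefix)       # cwd relative to the top

echo "cwd:    $cwd"
echo "top:    $top"
echo "prefix: $prefix"
```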
As far as referencing a tree from the git repo, I don't see that terminology in the docs; the only tree I recall seeing is the working tree.
But to get the task you're asking about done, I usually use the hash from the git log line of the particular commit. If it's the current commit, then either HEAD or the name of your branch works. If it's the head of a different branch, naming that branch works. If it's tagged, the tag name works. There's also HEAD^1 for the first parent of the current commit, which is normally the prior commit.
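Here is a rough sketch of those spellings resolving to commits, in a scratch repository. The tag name v-example, the file, and the commit messages are invented; git rev-parse --verify simply prints the hash each name resolves to.

```shell
#!/bin/sh
# Sketch: several names for the same commit, plus its first parent.
# The tag v-example and file.txt are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

printf 'one\n' > file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.com commit -q -m 'first'
printf 'two\n' >> file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.com commit -q -m 'second'
git tag v-example

branch=$(git symbolic-ref --short HEAD)        # whatever the default branch is called

by_head=$(git rev-parse --verify HEAD)
by_branch=$(git rev-parse --verify "$branch")
by_tag=$(git rev-parse --verify v-example)
parent=$(git rev-parse --verify 'HEAD^1')      # first parent of HEAD

echo "HEAD, $branch and v-example: $by_head"
echo "HEAD^1:                      $parent"
```

Any of these names can be passed to git diff wherever it expects a commit or a tree.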
Upvotes: 1