Nitish Prajapati

Reputation: 129

Strange output of "git diff", "git diff HEAD" and "git diff --staged"?

There are 3 main git diff versions:

  • git diff - difference between WORKING DIRECTORY & STAGE
  • git diff --staged - difference between HEAD & STAGE
  • git diff HEAD - difference between HEAD & WORKING DIRECTORY

These are the definitions I found almost everywhere online, from various people.

I performed certain commits on 3 files in the following order:

ce6f5bb (HEAD -> master) 6th commit, file1
c1c67da 5th commit, file3    
ea51776 4th commit, file1 file2    
001675b 3rd commit, file1 file2    
ec04f53 2nd commit, file2    
21cb6c1 1st commit, file1  

a. file2 modified in working directory
b. Nothing STAGED for commit
c. file1 & file3 not modified


My Queries are:

1. git diff

    git diff
    diff --git a/file2.txt b/file2.txt
    index 21106bf..c755a1e 100644
    --- a/file2.txt
    +++ b/file2.txt
    @@ -1,3 +1,4 @@
     123
     345
     678
    +90.    

Even though the STAGING AREA WAS EMPTY, why was a diff shown?


2. git diff HEAD

    git diff HEAD
    diff --git a/file2.txt b/file2.txt
    index 21106bf..c755a1e 100644
    --- a/file2.txt
    +++ b/file2.txt
    @@ -1,3 +1,4 @@
     123
     345
     678
    +90.    

If the LAST COMMIT (HEAD) was related to file1, then why is a diff of file2 shown?
HEAD doesn't contain anything related to file2 at all.



NOW AFTER STAGING file2 :

3. git diff

It does not show anything!
(I assume it only shows a diff if a file is staged and further changes, beyond the staged version, are made in the Working Directory too.)
But if that is the case, then why was a diff shown in 1?


4. git diff --staged

    git diff --staged
    diff --git a/file2.txt b/file2.txt
    index 21106bf..c755a1e 100644
    --- a/file2.txt
    +++ b/file2.txt
    @@ -1,3 +1,4 @@
     123
     345
     678
    +90.    

Again, if HEAD is pointing to file1, then why is a diff of file2 shown?



I CREATED THIS IMAGE BELOW (Note: ANOTHER SCENARIO. Not same as above):

For git diff HEAD, my guess is that for every TRACKED FILE, HEAD keeps traveling backward until it finds the LATEST VERSION OF THAT FILE that was committed, and compares it against the one in the Working Directory.

If we consider a new scenario as below, does git diff HEAD behave the way I have assumed?

[image: diagram of the new scenario, not reproduced here]

Upvotes: 4

Views: 954

Answers (1)

torek

Reputation: 488619

You're making one fundamental mistake, and then propagating this mistake into each of your various commands.

The mistake is that you are thinking of a commit as a change. A commit is not a set of changes. A commit holds a snapshot of files. Moreover, the staging area is never actually empty;[1] initially, it just matches the current commit.

Files file1.txt, file2.txt, and file3.txt exist in:

  • your work-tree, as plain files;
  • the index / staging-area, as files in Git's special commit format, ready to be committed; and
  • each commit.

Each copy of each file can match some other copy of the same file (or any other file), or can be different.

The name HEAD selects one particular commit.[2] At the start of your various tests, the name HEAD selected commit ce6f5bb. So there are three files named file1.txt available to you and Git at this point, besides those in earlier commits:

  • ce6f5bb:file1.txt, aka HEAD:file1.txt: this copy of file1.txt is frozen into a commit and cannot be changed.
  • :file1.txt: this copy of file1.txt is in the index / staging area. You can replace it with a new copy at any time.
  • file1.txt: this is just an ordinary file. It's not actually in Git at all. It is a regular file, in your work-tree.

There are also three copies of file2.txt and three copies of file3.txt.
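To see the three copies side by side, here is a minimal sketch in a throwaway repository. The temporary directory, the placeholder identity, and the v1/v2/v3 contents are made up for the demonstration and are not from your repository:

```shell
#!/bin/sh
# Sketch: one file name, three independent copies (commit, index, work-tree).
set -e
repo=$(mktemp -d)                        # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git config user.email "you@example.com"  # placeholder identity for the demo
git config user.name "Demo"

printf 'v1\n' > file1.txt
git add file1.txt
git commit -q -m "1st commit, file1"     # HEAD:file1.txt is now frozen as v1

printf 'v2\n' > file1.txt
git add file1.txt                        # index copy replaced with v2
printf 'v3\n' > file1.txt                # work-tree copy is now v3

head_copy=$(git show HEAD:file1.txt)     # copy stored in the commit
index_copy=$(git show :file1.txt)        # copy in the index / staging area
tree_copy=$(cat file1.txt)               # ordinary file in the work-tree

echo "HEAD: $head_copy, index: $index_copy, work-tree: $tree_copy"
```

All three copies differ here, which is exactly the situation in which git diff, git diff --staged and git diff HEAD each report something different.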

Running git diff with no arguments compares all three files in the index / staging area to all three files in your work-tree. Only those that are different get mentioned in the output.

Running git diff --staged or git diff --cached compares all three files in HEAD to all three files in the staging area. Only those that are different get mentioned in the output.

Running git diff HEAD compares all three files in HEAD to all three files in your work-tree. Only those that are different get mentioned in the output.
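You can check which pair of copies each form compares with a small experiment. This is only a sketch in a scratch repository (the directory, identity, and file contents are assumptions for the demo); it mirrors your test by modifying file2.txt first without staging and then with staging:

```shell
#!/bin/sh
# Sketch: which copies each git diff form compares, before and after git add.
set -e
repo=$(mktemp -d)                        # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git config user.email "you@example.com"  # placeholder identity for the demo
git config user.name "Demo"

printf 'abc\n' > file1.txt
printf '123\n345\n678\n' > file2.txt
git add file1.txt file2.txt
git commit -q -m "snapshot of file1 and file2"

# Modify file2 in the work-tree only; the index still matches HEAD.
printf '90.\n' >> file2.txt
before_plain=$(git diff --name-only)            # index vs work-tree
before_staged=$(git diff --staged --name-only)  # HEAD vs index
before_head=$(git diff --name-only HEAD)        # HEAD vs work-tree

# Stage file2: the index copy now matches the work-tree copy.
git add file2.txt
after_plain=$(git diff --name-only)
after_staged=$(git diff --staged --name-only)
after_head=$(git diff --name-only HEAD)

echo "before git add: diff=[$before_plain] --staged=[$before_staged] HEAD=[$before_head]"
echo "after  git add: diff=[$after_plain] --staged=[$after_staged] HEAD=[$after_head]"
```

Before the git add, plain git diff and git diff HEAD both list file2.txt and --staged lists nothing; afterwards plain git diff is empty while --staged and git diff HEAD list file2.txt. file1.txt never appears because all of its copies match.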

Note that when you use git log -p or git show to view a commit, Git does a git diff of the parent commit's snapshot (its files) vs that commit's snapshot. Only those files that are different are mentioned in the diff you see. So it looks like the commit stores changes, but actually, it just stores a snapshot.
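A quick way to convince yourself that a commit is a full snapshot: list the files it contains and compare that with the files that differ from its parent. The sketch below uses a scratch repository with invented contents; the second commit only touches file1.txt, yet file2.txt is still part of its snapshot:

```shell
#!/bin/sh
# Sketch: a commit that "only touched file1" still contains file2.
set -e
repo=$(mktemp -d)                        # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git config user.email "you@example.com"  # placeholder identity for the demo
git config user.name "Demo"

printf 'a\n' > file1.txt
printf 'b\n' > file2.txt
git add file1.txt file2.txt
git commit -q -m "1st commit, file1 file2"

printf 'a2\n' > file1.txt
git add file1.txt
git commit -q -m "2nd commit, file1"

# Every file stored in the HEAD snapshot, one name per line.
snapshot=$(git ls-tree -r --name-only HEAD)
# Files that differ between the parent's snapshot and HEAD's snapshot;
# this is the comparison git show and git log -p present as "the change".
changed=$(git diff --name-only HEAD~1 HEAD)

echo "in snapshot:"
echo "$snapshot"
echo "differs from parent: $changed"
```

So git diff HEAD never needs to walk backward through history looking for file2.txt: the copy is right there in the HEAD commit.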

Note, too, that git status runs two git diffs. The first compares HEAD vs staging area, i.e., does a git diff --staged, and mentions only the names of files, without showing diffs. These are the files listed under "Changes to be committed". The second compares index vs work-tree, i.e., does a git diff, and again mentions only the names of files. These are the files listed under "Changes not staged for commit".
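The correspondence between git status and the two diffs can be sketched like this, again in a scratch repository with invented contents. One file is modified and staged, another is modified but not staged, and a third is left alone:

```shell
#!/bin/sh
# Sketch: the two name-only diffs that line up with git status.
set -e
repo=$(mktemp -d)                        # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git config user.email "you@example.com"  # placeholder identity for the demo
git config user.name "Demo"

printf 'a\n' > file1.txt
printf 'b\n' > file2.txt
printf 'c\n' > file3.txt
git add file1.txt file2.txt file3.txt
git commit -q -m "initial snapshot"

printf 'x\n' >> file1.txt
git add file1.txt                        # file1: staged change
printf 'y\n' >> file2.txt                # file2: unstaged change

to_commit=$(git diff --staged --name-only)   # HEAD vs index
not_staged=$(git diff --name-only)           # index vs work-tree
# Short machine-readable status: first column is HEAD-vs-index,
# second column is index-vs-work-tree.
status=$(git status --porcelain)

echo "staged: $to_commit"
echo "not staged: $not_staged"
echo "$status"
```

file3.txt shows up nowhere, because its three copies all match.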


[1] The staging area can be totally empty: it is in a fresh repository with no files in it yet and none yet git add-ed. You can also git rm every file, which will cause the staging area to be empty. But normally, it's full of copies of the files from the HEAD commit, until you use git add to replace those files with ones from the work-tree.
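If you want to look at what the staging area holds at each stage, git ls-files lists the index entries. A rough sketch, using a scratch repository and a made-up file:

```shell
#!/bin/sh
# Sketch: when the index is empty and when it is not.
set -e
repo=$(mktemp -d)                        # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git config user.email "you@example.com"  # placeholder identity for the demo
git config user.name "Demo"

fresh=$(git ls-files)                    # fresh repository: nothing in the index

printf 'a\n' > file1.txt
git add file1.txt
after_add=$(git ls-files)                # index now holds file1.txt

git commit -q -m "1st commit, file1"
after_commit=$(git ls-files)             # committing does not empty the index

git rm -q file1.txt
after_rm=$(git ls-files)                 # removing every file empties it again

echo "fresh=[$fresh] add=[$after_add] commit=[$after_commit] rm=[$after_rm]"
```

The third result is the important one: after a commit the index is still full, it simply matches HEAD, which is why git status reports nothing staged.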

[2] You can ask Git two questions about the special name HEAD:

git rev-parse HEAD

asks Git: "What hash ID does HEAD represent, i.e., what is the current commit?" That's the question git diff asks. Or, either of:

git symbolic-ref HEAD
git rev-parse --symbolic-full-name HEAD

asks Git: "What branch name does HEAD represent, i.e., what branch would git status say I'm on?" That question gets asked by git commit, for instance, when it goes to update the branch name.
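Both questions can be tried in a scratch repository. This sketch assumes HEAD is attached to a branch (not detached); the branch name itself depends on your Git configuration, so the script does not hard-code it:

```shell
#!/bin/sh
# Sketch: the two questions you can ask about HEAD.
set -e
repo=$(mktemp -d)                        # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git config user.email "you@example.com"  # placeholder identity for the demo
git config user.name "Demo"

printf 'a\n' > file1.txt
git add file1.txt
git commit -q -m "1st commit, file1"

hash=$(git rev-parse HEAD)                         # which commit is HEAD?
branch=$(git symbolic-ref HEAD)                    # which branch is HEAD? (refs/heads/...)
full=$(git rev-parse --symbolic-full-name HEAD)    # same answer, asked another way
via_branch=$(git rev-parse "$branch")              # the branch resolves to the same commit

echo "HEAD is commit $hash"
echo "HEAD is branch $branch"
```

On a detached HEAD, git symbolic-ref HEAD fails instead of printing a branch name, which is why the sketch stays on a branch.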

Upvotes: 3
