ravron

Reputation: 11221

git diff-files output changes after git status

I have a script, update.py, that downloads new versions of files tracked in my git repository:

$ python update.py
Doing work...
Done
$ git status
On branch my-branch
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   foo.txt
    modified:   bar.txt
    modified:   baz.txt

no changes added to commit (use "git add" and/or "git commit -a")

Sometimes, the files that are downloaded are identical to the files already in HEAD, so after the download the working directory is clean:

$ python update.py
Doing work...
Done
$ git status
On branch my-branch
nothing to commit, working directory clean

However, I've found that git diff-files appears to get confused when the files are replaced, even though their contents are identical:

$ python update.py
Doing work...
Done
$ git diff-files
:100644 100644 ffa91f655007c56f209cf15fee13c55991a76e18 0000000000000000000000000000000000000000 M  foo.txt
:100644 100644 dc05558729c3c94a088aa63da3bbd8f1213b8cf3 0000000000000000000000000000000000000000 M  bar.txt
:100644 100644 002cc3f53dc64b89b1b91adbb6fe61035ba9e832 0000000000000000000000000000000000000000 M  baz.txt
$ git status
On branch my-branch
nothing to commit, working directory clean
$ git diff-files
$

In the snippet above:

  1. I run update.py, which replaces the foo.txt, bar.txt, and baz.txt files with identical copies downloaded from elsewhere.
  2. git diff-files incorrectly reports that those three files have been edited in-place in the work tree, according to the raw output format described on the git diff man page.
  3. git status correctly reports that nothing has changed.
  4. git diff-files, run after git status, now also reports that nothing has changed.

After running update.py, git diff-files continues to report these spurious changes until I run git status, after which it behaves correctly again.

What is going on here? Why is git diff-files reporting changes when there are none?


In case you're curious why this is causing me trouble, here's some more context:

I have another script, update_and_commit_if_needed.py, that does the following:

  1. Run update.py.
  2. If git diff-files returns zero, the working tree is clean, and update.py didn't change anything. Exit.
  3. Otherwise, the working tree is dirty. Commit the changes.

I was seeing a weird failure in update_and_commit_if_needed.py: I'd get to step three, but then git commit would complain that there was nothing to commit, working directory clean. In tracking down that bug, I discovered this odd behavior of git diff-files.

I am using git version 2.5.0 on OS X 10.11.4 (15E65).


EDIT 1: I've found an easy way to reproduce this behavior:

$ git diff-files
$ git status
On branch my-branch
nothing to commit, working directory clean
$ cp foo.txt ~
$ mv ~/foo.txt .
$ git diff-files
:100755 100755 20084b5d6da359748f62c259c24f2b9cc2359780 0000000000000000000000000000000000000000 M  foo.txt
$ git status
On branch my-branch
nothing to commit, working directory clean
$ git diff-files
$

EDIT 2: As suggested in a comment, I've tried inverting core.trustctime and core.ignoreStat from their defaults. This does not appear to change git's behavior in this case.

Upvotes: 11

Views: 1305

Answers (1)

ravron

Reputation: 11221

git diff-files, like git diff-index, does not actually check the contents of files in the working tree. Instead, it compares each file's stat information with what is recorded in the index. In fact, the diff-index man page notes:

As with other commands of this type, git diff-index does not actually look at the contents of the file at all. So maybe kernel/sched.c hasn’t actually changed, and it’s just that you touched it. In either case, it’s a note that you need to git update-index it to make the index be in sync.
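To illustrate the point about stat information, here is a rough Python sketch of what the cp/mv reproduction from the question does to a file. The helper name replace_with_identical_copy is hypothetical, and it assumes the copy lands on the same filesystem as the original. The bytes are unchanged, but stat fields such as the inode number and modification time are not, and the index caches fields of that kind.

```python
import os
import shutil


def replace_with_identical_copy(path):
    """Replace `path` with a byte-identical copy (hypothetical helper).

    Roughly mimics `cp foo.txt ~ && mv ~/foo.txt .` from the question,
    but keeps the temporary copy next to the original file.
    Returns the os.stat() results taken before and after the swap.
    """
    before = os.stat(path)
    tmp = path + ".copy"
    shutil.copyfile(path, tmp)  # new file: same bytes, new inode, new mtime
    os.replace(tmp, path)       # rename the copy over the original
    after = os.stat(path)
    return before, after
```

Calling it on a tracked file and comparing before.st_ino with after.st_ino should show a different inode even though the size and contents match, which is the kind of mismatch that makes the cached stat entry look out of date.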

As the note suggests, the index's stat entry can be updated by running git update-index --refresh before diff-files. The man page for update-index elaborates:

--refresh does not calculate a new sha1 file or bring the index up-to-date for mode/content changes. But what it does do is to "re-match" the stat information of a file with the index, so that you can refresh the index for a file that hasn’t been changed but where the stat entry is out of date.

For example, you’d want to do this after doing a git read-tree, to link up the stat index details with the proper files.

Running update-index --refresh before diff-files erases the symptoms I described, solving the issue.
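For a script like update_and_commit_if_needed.py, the check could look something like the following. This is only a minimal sketch, not my actual script: the function name working_tree_is_dirty and the injectable run parameter are made up for illustration. It relies on git update-index -q --refresh (where -q keeps the refresh from erroring out when files really are modified) and on git diff-files --quiet, which exits with 1 when there are differences and 0 when there are none.

```python
import subprocess


def working_tree_is_dirty(repo_dir=".", run=subprocess.run):
    """Return True if tracked files differ from the index (hypothetical helper).

    `run` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without a real repository.
    """
    # Re-match the cached stat data with the files on disk. -q makes
    # update-index continue instead of erroring out when a file really
    # is modified, so the exit status is deliberately ignored here.
    run(["git", "update-index", "-q", "--refresh"], cwd=repo_dir, check=False)

    # --quiet suppresses output and sets the exit status:
    # 0 = no differences, 1 = differences.
    result = run(["git", "diff-files", "--quiet"], cwd=repo_dir, check=False)
    if result.returncode not in (0, 1):
        raise RuntimeError("git diff-files failed with status %d" % result.returncode)
    return result.returncode == 1
```

With the refresh in place, step 2 of the script only proceeds to the commit when working_tree_is_dirty() returns True, so replaced-but-identical files no longer trigger an empty commit attempt.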

Upvotes: 14
