Reputation: 33900
In fact I did not rename it. I deleted hash.c with Linux rm and then copied a newer version of my hash table implementation, named hashdic.c, with Linux cp from another directory. The deleted file and the new file are very similar, but not the same, because I worked on hashdic.c in another dir for a few hours.
Then I typed git rm hash.c (although it was already deleted from the file system, to delete it from the repository as well), and then typed git add hashdic.c. Then git commit -am "update to hash table". And magic! Git says:
renamed: hash.h -> hashdic.h
But, Holmes, how? How does git know I in fact renamed the file if technically I just deleted it and added a new one under a DIFFERENT name?
Whole process:
~/project/hash.c to ~/other/project/hashdic.c
rm hash.c
cp ~/other/project/hashdic.c ~/project/hashdic.c
git rm hash.c
git commit -am descr
Upvotes: 3
Views: 409
Reputation: 488103
Try this:
$ git diff --name-status -M HEAD^ HEAD
You should see that between the two commits, the file was renamed and has a "similarity index" of (say) 95:
R095 hash.c hashdic.c
(I typed this in based on your posting. One line calls both files .h, others call it .c; I went with .c here. Anyway, it's not cut-and-pasted, so there might be some minor glitches, and I made up the similarity index value. But the output should be similar enough to recognize, and I'm counting on the similarity index being below 100%. It's clearly at least 50%, since that is the default threshold.)
This shows that between the previous and current commits, the file was renamed and modified a bit.
Once you've done that, try this:
$ git diff --name-status -M100% HEAD^ HEAD
This time, you should see that hash.c was deleted and hashdic.c was added:
D hash.c
A hashdic.c
This shows that the change between the previous and current commit has no renames, only a deleted file and an added one.
Which is it? It's both: it's a floor wax and a dessert topping!
The fact is, git computes the change between commits (or commit and index or work directory, or any such pairing) dynamically, each time you ask for it, whether you run an explicit git diff, or you run git status (or git commit runs it for you). You can specify whether rename detection is allowed at all (--no-renames [1]) and if so, at what similarity threshold (-M).
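To see this for yourself, here is a minimal sketch that rebuilds a similar situation in a throwaway repository and asks for the same commit-to-commit comparison under three settings. The file contents, the make_source helper, and the user identity are made up for illustration; only the git options come from the discussion above.

```shell
#!/bin/sh
set -e

# Hypothetical helper: print a small C-like file with 20 distinct lines.
make_source() {
    i=1
    while [ "$i" -le 20 ]; do
        echo "int value_$i = $i;"
        i=$((i + 1))
    done
}

# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# First commit: hash.c only.
make_source > hash.c
git add hash.c
git commit -q -m "add hash table"

# Second commit: hash.c deleted, a slightly different hashdic.c added.
git rm -q hash.c
make_source > hashdic.c
echo "int extra = 0;" >> hashdic.c
git add hashdic.c
git commit -q -m "update to hash table"

# One pair of commits, three ways of asking for the difference.
default_threshold=$(git diff --name-status -M HEAD^ HEAD)
exact_only=$(git diff --name-status -M100% HEAD^ HEAD)
no_renames=$(git diff --name-status --no-renames HEAD^ HEAD)

printf '%s\n' "-M:" "$default_threshold" "-M100%:" "$exact_only" "--no-renames:" "$no_renames"
```

With the default threshold the pair should come back as a single R line; with -M100% or --no-renames the same two commits should come back as a D line and an A line. Nothing in the repository changes between the three calls, only the question being asked.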
You can also ask for copy detection (-C and --find-copies-harder). There are some limits on how many "tree names" to apply this to, as it can get very expensive, computationally speaking, to compare every file in one commit against every file in another. By default, git limits you to rename detection, which is a bit easier since git only does this for "file names that were in the start commit but not in the destination, vs file names that were in the destination commit but not in the start".
(In this case, that's hash.c and hashdic.c respectively, unless you deleted and/or added additional paths. So git only has to diff these two files against each other, not against any additional files, to get a single similarity index and compare that to the -M setting.)
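Copy detection can be tried the same way. The sketch below is only an illustration: the file names and contents are invented, and it assumes the copy's source file is left unmodified in the second commit, which is the case where --find-copies-harder matters.

```shell
#!/bin/sh
set -e

# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# First commit: one file (name and contents are made up).
printf 'alpha\nbeta\ngamma\ndelta\n' > original.c
git add original.c
git commit -q -m "add original"

# Second commit: an exact copy under a new name; original.c is left untouched.
cp original.c duplicate.c
git add duplicate.c
git commit -q -m "add duplicate"

# Without copy detection, the new path is reported as a plain addition.
plain=$(git diff --name-status HEAD^ HEAD)

# Also consider unmodified files in the start commit as possible copy sources.
harder=$(git diff --name-status -C --find-copies-harder HEAD^ HEAD)

printf '%s\n' "plain:" "$plain" "-C --find-copies-harder:" "$harder"
```

The first call should list duplicate.c as added, while the second should pair it with original.c as a copy. The second call is the expensive kind of comparison described above, because every file in the start commit becomes a candidate source.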
[1] Most of these control knobs are only available in git diff: git status hardwires rename detection to "on" and 50%, for instance. The number of file names put into the rename detect queue is controlled by a git config setting, diff.renameLimit. Other git commands, such as git blame, run git's internal diff engine with user-settable controls, but not all of them have the same meaning as in git diff. For instance, git blame looks at only one file, rather than entire directories, so its -C and -M mean something entirely different: they look for lines moved or copied within or between files.
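As a small illustration of the config setting mentioned in the note, the sketch below sets diff.renameLimit in a scratch repository and reads it back. The value 2000 is arbitrary, chosen only for the example.

```shell
#!/bin/sh
set -e

# Scratch repository, so the setting stays local to it.
repo=$(mktemp -d)
cd "$repo"
git init -q

# 2000 is an arbitrary illustrative value, not a recommendation.
git config diff.renameLimit 2000
rename_limit=$(git config --get diff.renameLimit)
echo "diff.renameLimit = $rename_limit"
```

For a one-off comparison, git diff also accepts -l<num> to set the limit on the command line instead of in the config.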
Upvotes: 3