Reputation: 1762
I am having a lot of trouble working with my Git history after a bulk rename applied to a large project (slightly under 10,000 files). I have changed the project layout by moving files from Project/src/.... to Project/src/main/java/.... I have also modified some of the moved files in the same commit.
Let's take a look at one such file:
$ git logc PINSSUserPasswordUtil/src/main/java/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java
* commit 6e7a960f99b0e6164d2713a4cbca2107034d8bbd
Author: moffats <moffats@ec78347f-1a2b-0410-964c-a7254f1fcdc6>
Date: Thu Apr 23 21:24:30 2015 +0000
merge gradle branch to the trunk
OK, it looks like we need to use --follow to tell Git to follow renames:
$ git logc --follow PINSSUserPasswordUtil/src/main/java/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java
...
* commit 6e7a960f99b0e6164d2713a4cbca2107034d8bbd
| Author: moffats <moffats@ec78347f-1a2b-0410-964c-a7254f1fcdc6>
| Date: Thu Apr 23 21:24:30 2015 +0000
|
| merge gradle branch to the trunk
|
...
* commit ce0c98d4b78e2f006ead16a030b3c5f0d7ec3ac0
Author: perches <perches@ec78347f-1a2b-0410-964c-a7254f1fcdc6>
Date: Thu Mar 22 21:29:41 2012 +0000
Updates for JSF 2.0 upgrade
There we go. Now let's compare the two versions:
$ git diff -M -l0 ce0c98d4b78e2f006ead16a030b3c5f0d7ec3ac0..HEAD PINSSUserPasswordUtil/src/main/java/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java
diff --git a/PINSSUserPasswordUtil/src/main/java/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java b/PINSSUserPasswordUtil/src/main/java/
new file mode 100644
index 0000000..acc4d40
--- /dev/null
+++ b/PINSSUserPasswordUtil/src/main/java/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java
@@ -0,0 +1,186 @@
+package ca.gc.agr.pinss.userPasswordUtil;
+
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.sql.Connection;
+import java.sql.ResultSet;
+import java.sql.SQLException;
+import java.sql.Statement;
+import java.util.Map.Entry;
+import java.util.Properties;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
... etc ...
Git is clearly not detecting the rename. It is not finding the old location of the file, and it is telling me that the entire file content was added.
Note that it's comparing the existing file to /dev/null, rather than the old file path, which should be:
PINSSUserPasswordUtil/src/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java
Ok, let's examine what the commit where it was renamed looks like:
$ git show --stat=180 6e7a960f99b0e6164d2713a4cbca2107034d8bbd | grep UserPasswordReseter
PINSSUserPasswordUtil/src/{ => main/java}/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java | 10 +-
This looks good, so this is where I am stuck. The only way I can get the diff I need is by explicitly telling Git what the old file path used to be:
$ git diff ce0c98d4b78e2f006ead16a030b3c5f0d7ec3ac0:PINSSUserPasswordUtil/src/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java HEAD:PINSSUserPasswordUtil/src/main/java/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java
--- a/ce0c98d4b78e2f006ead16a030b3c5f0d7ec3ac0:PINSSUserPasswordUtil/src/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java
+++ b/HEAD:PINSSUserPasswordUtil/src/main/java/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java
@@ -29,18 +29,18 @@ public class UserPasswordReseter {
private static final Log log = LogFactory.getLog(UserPasswordReseter.class);
private static final String DATASOURCE_BEAN_NAME = "dataSource";
- private static final String USER_PROPERTIES = "configuration/users.properties";
+ private static final String USER_PROPERTIES = "/users.properties";
And now I get what I want.
So why did my previous 'git diff -M -l0' command not work? I think this problem also causes such tools as EGit in Eclipse and 'git instaweb' not to work anymore, which means that I just lost the ability to access pre-rename history with powerful GUI tools.
I am not sure how I can fix it at this point. Any suggestions would be appreciated.
git version 1.9.1
EDIT: As pointed out in the comments, this command:
git diff -M -l0 ce0c98d4b78e2f006ead16a030b3c5f0d7ec3ac0..HEAD PINSSUserPasswordUtil/src/main/java/ca/gc/agr/pinss/userPasswordUtil/UserPasswordReseter.java
didn't work because we need to run the diff on the whole repository for rename detection to work: with only the new path given, Git never looks at the old path. A command like this works correctly:
$ git diff -M -l0 ce0c98d4b78e2f006ead16a030b3c5f0d7ec3ac0 HEAD
Of course, now I need to look for the file I am interested in inside the big diff output for the whole commit, but it does save me the trouble of having to type the old and the new paths. Still far from ideal, but better than before.
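A possible way to avoid searching the whole-repository diff, sketched here in a throwaway repository: list both the old and the new path after `--`, so that both sides of the rename survive the path filter. The repository, the file name `Util.java`, and the identity settings are made up for illustration; only the two `git diff` invocations at the end matter.

```shell
#!/bin/sh
# Sketch in a scratch repository; paths and file contents are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

# Old layout: one file under src/.
mkdir -p src/pkg
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 > src/pkg/Util.java
git add .
git commit -q -m "old layout"
old=$(git rev-parse HEAD)

# New layout: move the file and modify it in the same commit.
mkdir -p src/main/java/pkg
git mv src/pkg/Util.java src/main/java/pkg/Util.java
echo "line 11" >> src/main/java/pkg/Util.java
git add .
git commit -q -m "new layout"

# Only the new path given: the old side is filtered out, so the file looks added.
new_only=$(git diff -M --name-status "$old" HEAD -- src/main/java/pkg/Util.java)

# Both paths given: the deletion and the addition can be paired as a rename.
both=$(git diff -M --name-status "$old" HEAD -- src/pkg/Util.java src/main/java/pkg/Util.java)

echo "new path only: $new_only"
echo "both paths:    $both"
```

This still requires knowing the old path, but it keeps the output limited to the one file and reports it as a rename with its changes rather than as a new file.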
Upvotes: 1
Views: 247
Reputation: 489638
Rename detection is computationally expensive (see next paragraph), so git limits it to whatever you configure. If you don't configure a particular value, git uses a built-in default that has increased over the various releases (it was 100, then 200, and is now 400).
In particular, when comparing any two revisions (or more precisely, two trees), path-names that appear in the "old" tree and not the "new" tree provide source candidate files for renames, and path-names that appear in the "new" tree but not the "old" tree provide destination candidates for renames. To detect an actual rename, git must compare every possible source against every possible destination.
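To get a feel for the scale, here is a back-of-the-envelope calculation. The figure of 10,000 on each side is an assumed upper bound taken from the "slightly under 10,000 files" in the question, not a measured number.

```shell
#!/bin/sh
# Rough count of source/destination pairs; both counts are assumed upper bounds.
sources=10000        # paths present only in the old tree (assumed)
destinations=10000   # paths present only in the new tree (assumed)
pairs=$((sources * destinations))
echo "candidate pairs to compare: $pairs"
```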
As far as I know, git does not (currently) have a "renamed directory, tail part of file names are the same" heuristic to reduce the size of the list. If it did, that would help a whole lot here. Without it, you can try setting the rename limit very high (to as many path names as needed), using the -l option to git diff or the diff.renameLimit configuration variable. Setting it to an explicit 0 means "unlimited" (which is actually 32767 internally, hence not quite unlimited).
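As a rough sketch of how one might check how many path names are involved and then lift the limit, the script below builds a tiny scratch repository (three made-up files, moved as a directory), counts the deleted and added paths with rename detection turned off, and then reruns the diff with `-l0`. The repository contents and identity settings are assumptions for the demonstration; on a real project you would run only the `git diff` and `git config` lines against your own two commits.

```shell
#!/bin/sh
# Sketch in a scratch repository; file names and contents are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

# Old layout: three files under src/pkg.
mkdir -p src/pkg
for i in 1 2 3; do
    printf 'file %s\nbody\n' "$i" > "src/pkg/F$i.java"
done
git add .
git commit -q -m "old layout"
old=$(git rev-parse HEAD)

# New layout: move the whole directory.
mkdir -p src/main/java
git mv src/pkg src/main/java/pkg
git commit -q -m "new layout"

# With rename detection off, deleted paths are the rename sources
# and added paths are the rename destinations.
deleted=$(git diff --no-renames --diff-filter=D --name-only "$old" HEAD | wc -l | tr -d ' ')
added=$(git diff --no-renames --diff-filter=A --name-only "$old" HEAD | wc -l | tr -d ' ')
echo "sources: $deleted, destinations: $added"

# Lift the limit for a single command and count the detected renames.
renamed=$(git diff -M -l0 --diff-filter=R --name-only "$old" HEAD | wc -l | tr -d ' ')
echo "renames detected: $renamed"

# Or lift it for this repository.
git config diff.renameLimit 0
```

If the two counts are well above the default limit, that is a hint that the limit, rather than file similarity, is what is getting in the way.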
Upvotes: 2