Tom Ellis

Reputation: 9434

How can I make git diff produce clearer patches?

If I have this file

def main():
    foo
    bar
    baz
    quux
    corge

and I change it to

def other():
    foo
    bar
    baz

def main():
    other()
    quux
    corge

I really want to see the diff

+def other():
+    foo
+    bar
+    baz
+
 def main():
+    other()
-    foo
-    bar
-    baz
     quux
     corge

but git diff gives me

-def main():
+def other():
     foo
     bar
     baz
+
+def main():
+    other()
     quux
     corge

(with every diff algorithm it offers: patience, minimal, histogram and myers). Is there some way to persuade it to generate semantically clearer diffs?

Upvotes: 0

Views: 272

Answers (1)

torek

Reputation: 489908

What you need is a difference engine that understands the semantics of the language involved. These are rare. Although it has been closed as off topic, see the question "syntax aware diff tools?" for some pointers.

Typical diffs split their input at line or word boundaries and then look for a minimum edit distance: the shortest set of operations that changes the A version into the B version. The result does not have to make any sense semantically; it just has to be short. Git's built-in algorithms search for "short" (not always shortest, since the default myers algorithm uses heuristics to produce a result faster), not for "semantically sensible". The built-in alternative algorithms are:

  • minimal: same as myers but avoids the shortcuts to produce truly minimal edit distances.
  • patience: matches only on unique lines when building boxes for myers-style diff. For programming languages with repeated brace or then/else/endif lines that produce meaningless matches, this can give better results.
  • histogram: instead of discarding repeated lines in each box, adds weighting to the matches based on the frequency of the occurrence of the lines. The more-frequent lines get lower weighting. In theory, this should produce the best results (given the basic issues with non-"syntax-aware" diff).

Obviously, you've tried these and none are as good as a syntax-aware diff.
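To see why every algorithm lands on the same output, it may help to count edits for the file in the question. The sketch below is not Git's implementation; it uses a plain longest-common-subsequence table (the helper names `lcs_length` and `edit_size` are made up for illustration) to measure the smallest line-based edit script, and compares it with the size of the diff the question asks for, which keeps only `def main():`, `quux` and `corge` as context.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two line lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def edit_size(a, b):
    """Number of added plus removed lines in a shortest line-based diff."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


old = ["def main():", "    foo", "    bar", "    baz", "    quux", "    corge"]
new = ["def other():", "    foo", "    bar", "    baz", "",
       "def main():", "    other()", "    quux", "    corge"]

shortest = edit_size(old, new)

# The diff the question prefers keeps only 3 lines unchanged
# ("def main():", "    quux", "    corge"), so it needs more edits.
preferred = len(old) + len(new) - 2 * 3

print(shortest, preferred)  # 5 9
```

The shortest script touches 5 lines, while the semantically clearer one touches 9, so any algorithm that optimises for a short result will avoid the clearer diff here.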

Assuming you have obtained or written a better diff, you can then plug it into Git as an external diff utility. See "How do I view 'git diff' output with a visual diff program?" and "Why doesn't `git diff` invoke external diff tool?"
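As a rough sketch of what such a plug-in could look like: Git runs an external diff program with seven arguments (path, old-file, old-hex, old-mode, new-file, new-hex, new-mode). The script below is only a stand-in. It uses Python's `difflib`, which is not syntax-aware, where a real syntax-aware engine would go, and `render_diff` is a hypothetical helper name. It handles only the seven-argument case.

```python
import difflib
import sys


def render_diff(path, old_file, new_file):
    """Return a unified diff of the two files as one string.

    Placeholder logic: a syntax-aware engine would replace difflib here.
    """
    with open(old_file, encoding="utf-8", errors="replace") as f:
        old_lines = f.readlines()
    with open(new_file, encoding="utf-8", errors="replace") as f:
        new_lines = f.readlines()
    return "".join(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="a/" + path, tofile="b/" + path
        )
    )


# Only act when called the way Git calls an external diff:
# path old-file old-hex old-mode new-file new-hex new-mode
if __name__ == "__main__" and len(sys.argv) == 8:
    path, old_file, _, _, new_file, _, _ = sys.argv[1:]
    sys.stdout.write(render_diff(path, old_file, new_file))
```

Assuming the script is saved somewhere executable (the location is up to you), it can be tried for a single run by setting the GIT_EXTERNAL_DIFF environment variable to its path before running git diff, or made permanent through the diff.external configuration setting.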

Upvotes: 1
