HarMala

Reputation: 29

git diff - output showing changes "incorrectly"

Let's say I get the diff output of comparing 2 files:

example
example
example
example
example
example
example
example

and

example
example#
example
example
example
example#
example#
example

So basically, the only change I made to the original file was adding # marks to some of the lines. For these two files, the diff output would be:

...
example
+example#
example
example
-example
-example
-example
+example#
+example#
example
...

So the diff command treats the # mark I added to the second line as a completely new line in the file, instead of a change to the existing line. Is there any way to make the diff output the changes like this:

...
example
-example
+example#
example
example
-example
-example
+example#
+example#
example
...

This would make my life easier. Thanks!

Upvotes: 1

Views: 1381

Answers (2)

Sergio Lema

Reputation: 1629

The example you show consists entirely of identical lines, so git's diff algorithm cannot distinguish one line from another. If the lines were different, it would show you which lines changed (as additions or removals). To go further, you can use git diff --word-diff (https://git-scm.com/docs/git-diff#git-diff---word-diffltmodegt) to show the differences per word rather than per line; adding --word-diff-regex=. makes every character count as a word, which gives a per-character diff.
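To give a rough idea of what an intra-line diff looks like, here is a minimal sketch that marks character-level changes in the [-removed-]{+added+} style that git's word-diff output uses. It relies on Python's difflib, not on git's own code, and the function name inline_diff is made up for this illustration.

```python
from difflib import SequenceMatcher


def inline_diff(old: str, new: str) -> str:
    """Mark character-level changes between two lines.

    Hypothetical helper: imitates the [-...-]{+...+} markers of git's
    word-diff output, but the comparison is done by difflib, not by git.
    """
    parts = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(old[i1:i2])
        if tag in ("delete", "replace"):
            parts.append("[-" + old[i1:i2] + "-]")
        if tag in ("insert", "replace"):
            parts.append("{+" + new[j1:j2] + "+}")
    return "".join(parts)


print(inline_diff("example", "example#"))  # example{+#+}
```

With this kind of output the change reads as "a # was appended to this line" rather than as a removed line plus an added line.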

Upvotes: 1

Mark Adelsberger

Reputation: 45819

You can try specifying different algorithms and see whether, in a given case, one of them gives you a result you like better. See the git diff docs (https://git-scm.com/docs/git-diff); there's a --diff-algorithm option where you can pick patience, minimal, histogram, or myers.

Is any of them going to do what you want in this case? According to my tests, no; but then, I assume this may be an exaggerated example case, so maybe one of them would help in your real scenario. I'm not aware of a good "practical" explanation of when each is best or how their output differs; but now that you have the names of the algorithms, I suppose you can decide if it's worth trying to research all that.
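If you want to compare the algorithms side by side on your real files, a small script along these lines might save some typing. It is only a sketch: the helper names diff_command and run_all are invented here, the option is spelled --diff-algorithm in the git docs, and --no-index is used so the two files do not need to be inside a repository.

```python
import subprocess

# The four names accepted by git diff --diff-algorithm.
ALGORITHMS = ("myers", "minimal", "patience", "histogram")


def diff_command(algorithm: str, old_path: str, new_path: str) -> list[str]:
    """Build the git command line for one algorithm (hypothetical helper)."""
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown diff algorithm: {algorithm}")
    return [
        "git", "diff", "--no-index",
        f"--diff-algorithm={algorithm}",
        "--", old_path, new_path,
    ]


def run_all(old_path: str, new_path: str) -> dict[str, str]:
    """Run git once per algorithm and collect each diff as text.

    Requires git on the PATH. git diff exits with status 1 when the
    files differ, so the return code is deliberately not checked.
    """
    results = {}
    for algorithm in ALGORITHMS:
        done = subprocess.run(
            diff_command(algorithm, old_path, new_path),
            capture_output=True,
            text=True,
        )
        results[algorithm] = done.stdout
    return results
```

Calling run_all("old.txt", "new.txt") (placeholder file names) returns one diff per algorithm, so you can see at a glance whether any of them differ for your input.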

I'd say there's more to generating a diff than seems obvious. Often there are multiple patches that will get you from point A to point B, and it can be open to interpretation which is "better". In some cases it's possible for special-purpose diff tools to use awareness of language structure to be a little smarter; but what you're showing is a highly repetitive file with little to indicate structure, so I don't even think that kind of thinking would necessarily help here.
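To make the "multiple patches" point concrete, here is a small sketch that takes the two hunks from the question (the one git produced and the one the asker would prefer) and checks that both describe exactly the same transformation. The helper names apply_script and changed_lines are made up for this illustration; each entry is a diff line whose first character is the usual marker (space for context, - for removed, + for added).

```python
# Hunk as git printed it in the question.
GIT_HUNK = [
    " example", "+example#", " example", " example",
    "-example", "-example", "-example",
    "+example#", "+example#", " example",
]

# Hunk the asker would have preferred.
WANTED_HUNK = [
    " example", "-example", "+example#", " example", " example",
    "-example", "-example", "+example#", "+example#", " example",
]


def apply_script(script):
    """Return (old_lines, new_lines) described by a list of diff lines."""
    old = [line[1:] for line in script if line[0] in (" ", "-")]
    new = [line[1:] for line in script if line[0] in (" ", "+")]
    return old, new


def changed_lines(script):
    """Count the added plus removed lines in a script."""
    return sum(1 for line in script if line[0] in ("+", "-"))


# Both hunks turn the same old text into the same new text ...
print(apply_script(GIT_HUNK) == apply_script(WANTED_HUNK))  # True
# ... and both need the same number of changed lines.
print(changed_lines(GIT_HUNK), changed_lines(WANTED_HUNK))  # 6 6
```

Since both scripts are equally short and equally correct, a line-based algorithm has no reason to prefer one over the other when every line is identical.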

Upvotes: 0
