Reputation: 3184
I'm using Git to version prose and have been trying git diff --word-diff to see changes within lines. I want to use the resulting output in a script. But the default way that --word-diff identifies a word seems flawed, so I've been experimenting with --word-diff-regex= options.
Here are the two main flaws I'm trying to deal with:
Added whitespace seems to be ignored. But whitespace can be quite important if trying to use the results programmatically.
For example, take this header from a Markdown (.md) file:
# Test file
Now, let's add some text to the end of it:
# Test file in Markdown
If I run git diff --word-diff on this, I get:
# Test file {+in Markdown+}
But the space before the word "in" has not been included as part of the diff.
Empty lines are completely ignored.
Here's a standard git diff for the content of a file where I've removed a line and also added a couple of new lines -- one empty, the other with the text "Here's a new line."
This is a test file to see how word diff responds in certain situations.
-
I'll try removing lines and adding them to see what happens.
Here's another line so we can see what happens with line removals and additions. I want to see how `git diff --word-diff` handles it all!
+
+Here's a new line.
But here's git diff --word-diff for the same content:
This is a test file to see how word diff responds in certain situations.
I'll try removing lines and adding them to see what happens.
Here's another line so we can see what happens with line removals and additions. I want to see how `git diff --word-diff` handles it all!
{+Here's a new line.+}
The removed and added empty lines are completely ignored.
Putting the two examples above together, here's what I'd like to see:
# Test file{+ in Markdown+}
This is a test file to see how word diff responds in certain situations.
{--}
I'll try removing lines and adding them to see what happens.
Here's another line so we can see what happens with line removals and additions. I want to see how `git diff --word-diff` handles it all!
{++}
{+Here's a new line.+}
Here's what I've tried so far:
git diff --word-diff-regex='.' seems too granular for when new words share characters with existing words.
git diff --word-diff-regex='[^ ]+|[ ]' seems to solve the first problem but, to be honest, I'm not actually sure why.
git diff --word-diff-regex='[^ ]+|[ ]|^$' -- I was hoping the ^$ on the end would help capture empty lines -- but it doesn't and, worse, it then seems to ignore the change that follows.
git diff --word-diff-regex='[^ ]+|[ ]|.{0}' creates the same problem as the one before.
I'd be grateful if anyone could shed any light on how to do this, or at least share some knowledge on what's going on under the hood with --word-diff-regex.
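A rough way to see what the second regex is doing: the git documentation says every non-overlapping match of the word regex is a word, and anything between matches is treated as whitespace and ignored. The Python sketch below only approximates that rule. The helper name split_words is made up for illustration, Python's re module stands in for git's regex engine, and dropping empty matches is an assumption rather than documented behaviour.

```python
import re

def split_words(line, word_regex=r"\S+"):
    """Approximate how a word regex carves one line into words.

    Assumption: each non-overlapping, non-empty match is a word and any
    text between matches is dropped, as the git docs describe. Python's
    `re` is only a stand-in for git's own regex engine.
    """
    return [m.group(0) for m in re.finditer(word_regex, line) if m.group(0)]

old = "# Test file"
new = "# Test file in Markdown"

# Default-like behaviour: spaces fall between matches, so they are ignored.
default_added = split_words(new)[len(split_words(old)):]

# With '[^ ]+|[ ]' each space is a word of its own, so it can appear as added.
custom = r"[^ ]+|[ ]"
regex_added = split_words(new, custom)[len(split_words(old, custom)):]

print(default_added)  # ['in', 'Markdown']
print(regex_added)    # [' ', 'in', ' ', 'Markdown']
```

Under that reading, the space before "in" shows up in the second case simply because the regex now matches it instead of leaving it in the ignored gap between words.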
Upvotes: 6
Views: 726
Reputation: 124257
The main thing standing between you and the output you want is this, from https://git-scm.com/docs/diff-options:
A match that contains a newline is silently truncated(!) at the newline.
This means a word match can never include a newline, so word diffs will always ignore changes that consist only of line breaks, such as added or removed empty lines. I don't think you'll get the results you want short of writing a custom diff generator.
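To give an idea of what such a custom generator might look like, here is a minimal sketch in Python built on the standard difflib module. It is not how git works internally. The function names (tokenize, mark, word_diff) and the tokenizing rule are assumptions chosen for illustration: newlines and runs of other whitespace are kept as tokens of their own so they can be reported, and the markers imitate git's {+added+} and [-removed-] style.

```python
import difflib
import re

# Assumed tokenizing rule: a newline, a run of other whitespace, or a word.
TOKEN = re.compile(r"\n|[^\S\n]+|\S+")

def tokenize(text):
    """Split text into tokens without discarding whitespace or newlines."""
    return TOKEN.findall(text)

def mark(tokens, opener, closer):
    """Wrap changed tokens in markers, closing the marker before each newline."""
    pieces, current = [], []
    for tok in tokens:
        if tok == "\n":
            pieces.append(opener + "".join(current) + closer + "\n")
            current = []
        else:
            current.append(tok)
    if current:
        pieces.append(opener + "".join(current) + closer)
    return pieces

def word_diff(old, new):
    """Return a word-level diff of two strings with git-like markers."""
    a, b = tokenize(old), tokenize(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append("".join(a[i1:i2]))
            continue
        if tag in ("delete", "replace"):
            out.extend(mark(a[i1:i2], "[-", "-]"))
        if tag in ("insert", "replace"):
            out.extend(mark(b[j1:j2], "{+", "+}"))
    return "".join(out)

print(word_diff("# Test file\n", "# Test file in Markdown\n"), end="")
# prints: # Test file{+ in Markdown+}
```

An added empty line comes out as {++} on a line of its own. When one of two adjacent newlines is removed, which of the two gets reported is up to difflib's matching, so the marker may land at the end of the preceding line instead.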
Upvotes: 2