John F. Miller

Reputation: 27217

Using an alternate diff algorithm in Git

Because git is designed for source code, its default diff algorithm treats a line as the minimum indivisible unit.

I am trying to edit some markdown files that are word wrapped at column 80. Adding a sentence can cause the rest of the paragraph to be marked as changed.

Is there a way to have Git use a diff algorithm more suited to text, one that treats words or sentences as indivisible units rather than lines?

Upvotes: 23

Views: 4193

Answers (3)

Casebash

Reputation: 118792

Here is an example of customising the word diff (from this question). By default, --word-diff treats a word as a string of non-whitespace characters. The following command instead treats a word as one of the following:

  1. A string of alphanumeric characters and underscores
  2. A single non-whitespace character

The command:

git diff --color-words --word-diff-regex='[A-Za-z0-9_]+|[^[:space:]]'
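To see what the regex changes, here is a minimal sketch that compares the default word splitting with the regex above. The two sample files and the temporary directory are made up for illustration, and --no-index is used only so the comparison works outside a repository.

```shell
#!/bin/sh
# Two one-line sample files (hypothetical content) that differ only in the
# final punctuation mark.
tmp=$(mktemp -d)
printf 'Hello world.\n' > "$tmp/old.txt"
printf 'Hello world!\n' > "$tmp/new.txt"

# --no-index compares two plain files; git exits with status 1 when they
# differ, hence the "|| true".
default_out=$(git diff --no-index --no-color --word-diff=plain \
    "$tmp/old.txt" "$tmp/new.txt" || true)
regex_out=$(git diff --no-index --no-color --word-diff=plain \
    --word-diff-regex='[A-Za-z0-9_]+|[^[:space:]]' \
    "$tmp/old.txt" "$tmp/new.txt" || true)

# The last line of each output is the word-level diff of the sentence.
printf '%s\n' "$default_out" | tail -n 1   # Hello [-world.-]{+world!+}
printf '%s\n' "$regex_out" | tail -n 1     # Hello world[-.-]{+!+}
rm -rf "$tmp"
```

With the default splitting the whole token "world." counts as changed, while the regex isolates the punctuation mark as its own word.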

Upvotes: 7

manojlds

Reputation: 301147

Maybe you are looking for --word-diff:

--word-diff[=<mode>]

Show a word diff, using the <mode> to delimit changed words. By default, words are delimited by whitespace; see --word-diff-regex below. The <mode> defaults to plain, and must be one of:

color

Highlight changed words using only colors. Implies --color.

plain

Show words as [-removed-] and {+added+}. Makes no attempts to escape the delimiters if they appear in the input, so the output may be ambiguous.

porcelain

Use a special line-based format intended for script consumption. Added/removed/unchanged runs are printed in the usual unified diff format, starting with a +, - or space character at the beginning of the line and extending to the end of the line. Newlines in the input are represented by a tilde ~ on a line of its own.

none

Disable word diff again.

Note that despite the name of the first mode, color is used to highlight the changed parts in all modes if enabled.
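A quick way to compare the modes is to run them on two scratch files. This is only a sketch: the file names and sample text are invented, and --no-index is used so that no repository is needed.

```shell
#!/bin/sh
# Two versions of a short wrapped paragraph (hypothetical sample text).
tmp=$(mktemp -d)
printf 'The quick brown fox\njumps over the lazy dog.\n' > "$tmp/old.txt"
printf 'The quick red fox\njumps over the lazy dog.\n' > "$tmp/new.txt"

# git exits with status 1 when the files differ, hence the "|| true".
plain=$(git diff --no-index --no-color --word-diff=plain \
    "$tmp/old.txt" "$tmp/new.txt" || true)
porcelain=$(git diff --no-index --no-color --word-diff=porcelain \
    "$tmp/old.txt" "$tmp/new.txt" || true)

# plain marks the change inline as [-brown-]{+red+}; porcelain puts each
# removed, added or unchanged run on its own line and ends every input
# line with a lone "~".
printf '%s\n\n' "$plain"
printf '%s\n' "$porcelain"
rm -rf "$tmp"
```

The porcelain form is the easier one to parse from a script, since the first character of every line says what kind of run it is.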

http://git-scm.com/docs/git-diff

Upvotes: 10

Matthew Ratzloff

Reputation: 4623

You might try git diff --word-diff instead.

$ git diff --word-diff
diff --git a/test.txt b/test.txt
index 54585bb..a8cd97e 100644
--- a/test.txt
+++ b/test.txt
@@ -1,7 +1,7 @@
Because git is designed for source code, its diff algorithms {+are bibbity +}
{+bobbity boo+} treat a line as the minimum indivisible unit. I am trying to edit 
some markdown files that are word wrapped at column 80. Adding a sentence can 
cause the rest of the paragraph to be marked as changed.

Is there a way to have Git use a diff algorithm more suited to text? One that 
treats words or sentences as indivisible units rather then lines?
\ No newline at end of file
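If you want a custom word regex applied to Markdown files without typing it every time, Git can read it from a diff driver configured through diff.<driver>.wordRegex and .gitattributes. The sketch below assumes a driver name "prose" and a file notes.md, both made up for illustration, and runs in a throwaway repository. You still need to pass --word-diff (or --color-words); the driver only supplies the regex.

```shell
#!/bin/sh
# Throwaway repository; "prose" and notes.md are hypothetical names.
tmp=$(mktemp -d)
git -C "$tmp" init -q

# Define a diff driver named "prose" whose word regex splits punctuation
# from words, then attach it to all Markdown files.
git -C "$tmp" config diff.prose.wordRegex '[A-Za-z0-9_]+|[^[:space:]]'
printf '*.md diff=prose\n' > "$tmp/.gitattributes"

# Stage one version of the file, then edit the working copy.
printf 'Hello world.\n' > "$tmp/notes.md"
git -C "$tmp" add notes.md
printf 'Hello world!\n' > "$tmp/notes.md"

# No --word-diff-regex here: the regex comes from the "prose" driver.
driver_out=$(git -C "$tmp" diff --no-color --word-diff=plain)
printf '%s\n' "$driver_out"
rm -rf "$tmp"
```

In a real project you would commit .gitattributes so everyone gets the same attribute, while the diff.prose.wordRegex setting lives in each user's Git config.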

Upvotes: 23
