Ignore any blank space or line break in git-diff

Kamafeather · Sep 23, 2014 · Viewed 11.5k times

I have the same HTML file rendered in two different ways and want to compare the two versions with git diff, ignoring every space, tab, line break, carriage return, or anything else that is not strictly the source code of my files.

I'm currently trying this:

git diff --no-index --color --ignore-all-space <file1> <file2>

but when some HTML tags are collapsed onto a single line (instead of one per line and indented), git diff detects it as a difference (while for me it is not).

<html><head><title>TITLE</title><meta ......

is different from

<html>
    <head>
        <title>TITLE</title>
        <meta ......

Which option am I missing to make git diff treat these two as the same?

Answer

Landys · Sep 23, 2014

git diff can compare files line by line or word by word, and it also lets you define what counts as a word. Here you can define every non-space character as a word of its own. The comparison then ignores all whitespace, including spaces, tabs, line breaks and carriage returns, which is what you need.

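To achieve this, use the --word-diff-regex option and set it to [^[:space:]], i.e. --word-diff-regex=[^[:space:]]. Quote the pattern if your shell treats square brackets as glob characters. Refer to the git diff documentation for details.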

git diff --no-index --word-diff-regex='[^[:space:]]' <file1> <file2>
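
As a rough illustration of what this regex means, the sketch below deletes every whitespace character from two sample files with tr and compares what is left. It is not part of git, and the helper name strip_ws and the file names are hypothetical. It only gives a yes/no answer instead of a diff, but it shows why collapsed and indented markup count as equal once only non-space characters are compared.

```shell
#!/bin/sh
# Hypothetical helper: print a file with every whitespace character
# (spaces, tabs, newlines, carriage returns) removed.
strip_ws() {
    tr -d '[:space:]' < "$1"
}

# Sample files mirroring the question: one collapsed, one indented.
workdir=$(mktemp -d)
printf '<html><head><title>TITLE</title><meta>\n' > "$workdir/collapsed.html"
printf '<html>\n    <head>\n        <title>TITLE</title>\n        <meta>\n' > "$workdir/indented.html"

# Compare the two files using only their non-space characters.
if [ "$(strip_ws "$workdir/collapsed.html")" = "$(strip_ws "$workdir/indented.html")" ]; then
    same=yes
else
    same=no
fi
echo "identical apart from whitespace: $same"
```

One consequence of comparing this way is that whitespace inside text content is ignored too, so MY TITLE and MYTITLE would compare as equal.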

Here's an example. I created two files, with a.html as follows:

<html><head><title>TITLE</title><meta>

and b.html as follows:

<html>
    <head>
        <title>TI==TLE</title>
        <meta>

Running

git diff --no-index --word-diff-regex='[^[:space:]]' a.html b.html

highlights the difference between TITLE and TI{+==+}TLE in plain mode, as shown below. You can also pass --word-diff=<mode> to display the result in a different mode. The mode can be color, plain, porcelain or none; plain is the default.

diff --git a/d.html b/a.html
index df38a78..306ed3e 100644
--- a/d.html
+++ b/a.html
@@ -1 +1,4 @@
<html>
    <head>
            <title>TI{+==+}TLE</title>
                    <meta>
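
To try the example end to end, a script along these lines should work. It is a sketch that assumes git is installed and mktemp is available; the scratch directory and the variable names are illustrative choices, not something from the answer. The --no-color flag is added only so that the captured output contains no escape sequences.

```shell
#!/bin/sh
# Sketch: reproduce the a.html / b.html comparison in a scratch directory.
# Assumes git is installed; --no-index lets it run outside a repository.
command -v git >/dev/null 2>&1 || { echo "git not found" >&2; exit 0; }

workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '<html><head><title>TITLE</title><meta>\n' > a.html
printf '<html>\n    <head>\n        <title>TI==TLE</title>\n        <meta>\n' > b.html

# With --no-index, git diff exits with status 1 when the files differ,
# so the status is ignored here. The regex is quoted so the shell does
# not try to expand it as a glob pattern.
out=$(git diff --no-index --no-color --word-diff=plain \
      --word-diff-regex='[^[:space:]]' a.html b.html || true)

printf '%s\n' "$out"
```

For interactive use in a terminal, --word-diff=color may be easier to read than the {+...+} markers of plain mode.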