Git messed up my files, showing Chinese characters in some places

laggingreflex · Jul 7, 2013

Disclaimer: by "Git", I mean I messed up.

Earlier, I wanted git-gui to show me the diff for files that it thinks are binary.

So I made some changes to my .\.gitattributes

*.ini       text
*.inc       text

But it didn't work. Then I made some changes to my .\.git\info\attributes

*.ini       text
*.inc       text
*.inc crlf diff
*.ini crlf diff

and it worked.

But now when I go back to previous commits it messes up...

[Image: Chinese characters]

This is how it should look:

[Image: English characters]

It doesn't happen in all the files. EDIT: It happens only in files that have any special characters in them.

Q: Is the issue with the commits themselves or just some setting?
Q: Can I recover?

Answer

bobince · Jul 8, 2013

Your ini files are saved in UTF-16LE, the encoding that Windows misleadingly describes as ‘Unicode’.

Git's default diffing tools don't work on UTF-16, because it's not an ASCII-compatible encoding. This is why git detected the files as binary originally.

LF/CRLF newline conversion is seeing each 0x0A byte as being a newline, and replacing it with 0x0D-0x0A. But, in a UTF-16LE file, a newline is actually signalled by 0x0A-0x00, and replacing that with 0x0D-0x0A-0x00 means that you've got an odd number of bytes, so the alignment of each two-byte code unit in the next line is out of sync. Consequently every other line gets mangled.
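
The misalignment can be reproduced outside Git. The following is a minimal Python sketch, not Git's actual conversion code: `naive_lf_to_crlf` is a hypothetical helper that does a byte-level replacement, and the three-line sample content is made up for illustration.

```python
def naive_lf_to_crlf(data: bytes) -> bytes:
    """Byte-level LF -> CRLF conversion that knows nothing about UTF-16.

    Hypothetical stand-in for an encoding-unaware newline filter.
    """
    return data.replace(b"\x0a", b"\x0d\x0a")


# Made-up sample: three short lines, encoded as UTF-16LE without a BOM.
original = "a=1\nb=2\nc=3\n".encode("utf-16-le")

# Each LF (0x0A 0x00) becomes 0x0D 0x0A 0x00, i.e. one extra byte per newline.
converted = naive_lf_to_crlf(original)

# Decode the result as UTF-16LE again; errors="replace" copes with the
# dangling byte left over when the total length ends up odd.
decoded = converted.decode("utf-16-le", errors="replace")

for line in decoded.split("\n"):
    print(repr(line))
```

In this sketch the second line no longer decodes as `b=2`: its bytes are paired off one position late, so each ASCII byte lands in the high half of a code unit and is read as a CJK-range character. The next inserted byte shifts the alignment back, which is why the third line still starts with `c=3`.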

Your options are:

  1. Revert the attribute change and let Git handle the files as binary (losing the benefit of diffs).

  2. Save the files in an ASCII-compatible encoding. It looks like your content doesn't actually have any non-ASCII characters in, so hopefully that's not a problem? Normally you would want to save all your files as UTF-8 - this is ASCII-compatible but also allows all Unicode characters to be used. But that depends on whether Rainmeter supports reading INI files encoded like that (probably not).

  3. Configure git to use a different diff tool, though this will make it more complicated for others to work with your repo.
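
For option 2, the re-encoding could be scripted. This is only a sketch under stated assumptions: the files start with a byte-order mark (Python's `utf-16` codec uses it to pick the byte order and falls back to the platform's native order when it is absent), the working copy holds intact files rather than already-mangled ones, and the consuming application can read UTF-8. The function names and the `skin.ini` path are hypothetical.

```python
from pathlib import Path


def utf16_bytes_to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16 bytes as UTF-8 (no BOM), leaving line endings as they are."""
    # The "utf-16" codec consumes a leading BOM and uses it to choose the byte order.
    text = data.decode("utf-16")
    return text.encode("utf-8")


def convert_file_in_place(path: Path) -> None:
    """Hypothetical helper: rewrite one file as UTF-8.

    Works on raw bytes so that Python performs no newline translation of its own.
    """
    path.write_bytes(utf16_bytes_to_utf8(path.read_bytes()))


if __name__ == "__main__":
    # Example path only; substitute the real .ini / .inc files.
    target = Path("skin.ini")
    if target.exists():
        convert_file_in_place(target)
```

Working on bytes rather than opening the file in text mode is deliberate: it keeps the existing CRLF or LF line endings untouched and leaves newline handling to Git.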