Unicode (utf-8) with git-bash

Hannes picture Hannes · May 18, 2012 · Viewed 61.9k times · Source

I'm having some trouble getting unicode to work for git-bash (on windows 7). I have tried many things without success. Although, I'm not quite sure what is responsible to for this so i might be working in the wrong direction.

It really seems this should be possible as the encoding for cmd.exe can be changed to unicode with 'chcp 65001'.

Here are some things I've tried (besides the obvious of looking through the configuration options in the GUI).

  1. Setting environment variables in '.bashrc'. I guess it makes sense this doesn't work since i think it's a linux thing. The 'locale' command does not exist.

    export LC_ALL=en_US.UTF-8
    export LANG=en_US.UTF-8
    export LANGUAGE=en_US.UTF-8
    
  2. Starting out in cmd.exe, changing the encoding to unicode with 'chcp 65001' and then starting up git-bash. This causes me to get a permission denied when trying to cat my unicode test file. However, catting a file without unicode works just fine. As demonstrated, dropping back out to cmd.exe i can still "cat" the file. Using my default encoding (437) i can cat the file in bash (no permission denied but the output is fudged).

    S:\>chcp 65001
    Active code page: 65001
    S:\>"C:\Program Files (x86)\Git\bin\sh.exe" --login -i
    zarac@TOWELIE /z
    cat /s/unicode.txt
    cat: write error: Permission denied
    zarac@TOWELIE /z
    cat /s/nounicode.txt
    abc
    zarac@TOWELIE /z
    L /s/unicode.txt
    -rw-r--r--    1 zarac    Administ        7 May 18 10:30 /s/unicode.txt
    zarac@TOWELIE /z
    whoami
    towelie\zarac
    zarac@TOWELIE /z
    exit
    Z:\>type S:\unicode.txt
    abc£
    
  3. Using the /U flag when starting the shell (makes sense that it doesn't work because it's not quite what it's for if-i-understand-correctly, but it has to do with unicode so i tried it).

    C:\Windows\SysWOW64\cmd.exe /U /C "C:\Program Files (x86)\Git\bin\sh.exe" --login -i
    
  4. As I prefer to use Console2, I've tried adding a dword value named CodePage with the value 65001 (decimal) to the windows registry under [HKEY_CURRENT_USER\Console] as well as [HKEY_CURRENT_USER\Console\Git Bash]. This seems to have the same effect as setting 'chcp 65001' accept that it's "automatic". (http://stackoverflow.com/questions/379240/is-there-a-windows-command-shell-that-will-display-unicode-characters)

  5. JPSoft's TCC/LE

  6. PowerCMD

  7. stackoverflow

  8. duckduckgo

  9. ixquick / google

So, method 2 seems viable if that permission issue can be fixed. However, I'm open to pretty much any solution although i prefer if i can use Console2 (due mostly to it's nifty tab feature). Perhaps one solution would be to setup an SSH server and then use Putty/Kitty to connect to it, but that's just wrong! ; )

PS. Is there any official documentation for git-bash?

Answer

nkatsar picture nkatsar · Apr 18, 2016

I faced the same issue in MSYS Git 2.8.0 and as it turned out it just needed changing the configuration.

$ git --version

git version 2.8.0.windows.1

The default configuration of Git Bash console in my system did not show Greek filenames.

$cd ~

$ls

AppData/
'Application Data'@
Contacts/
Cookies@
Desktop/
Documents/
Downloads/
Favorites/
Links/
'Local Settings'@
NTUSER.DAT
.
.
.
''$'\316\244\316\261'' '$'\316\255\316\263\316\263\317\201\316\261\317\206\316\254'' '$'\316\274\316\277\317\205'@

The last line should display "Τα έγγραφά μου", the greek translation of "My Documents". In order to fix it I followed the below steps:

  1. Check your existing locale configuration

    $locale
    
    LANG=en
    LC_CTYPE="C"
    LC_NUMERIC="C"
    LC_TIME="C"
    LC_COLLATE="C"
    LC_MONETARY="C"
    LC_MESSAGES="C"
    LC_ALL=
    

    As shown above, in my case it was not UTF-8

  2. Change the locale to a UTF-8 encoding. Click the icon on the left side of MINGW title bar, select "Options" and in the "Text" category choose "UTF-8" Character set. You should also choose a unicode font, such as the default "Lucida Console". My configuration looks as following: MinGW locale configuration

  3. Change the language for the current window (no need to do this on future windows, as they will be created with the settings of step 2)

     $ LANG='C.UTF-8'
    
  4. The ls command should now display properly

    AppData/
    'Application Data'@
    Contacts/
    Cookies@
    Desktop/
    Documents/
    Downloads/
    Favorites/
    Links/
    'Local Settings'@
    NTUSER.DAT
    .
    .
    .
    'Τα έγγραφά μου'@