How to set locale in the current terminal's session?

Timur Fayzrakhmanov · May 13, 2017

I'm trying to change the encoding in the current urxvt session by changing the LANG variable. However, the change doesn't seem to take effect immediately. Here is what I do:

Available locales:

$ locale -a
C
en_US.utf8
POSIX
ru_RU.koi8r
ru_RU.utf8

Before setting new locale:

$ echo "а" | od -t x1
0000000 d0 b0 0a # good! UTF-8
#       | a ||NL|

After:

$ export LANG=ru_RU.KOI8-R
$ echo "а" | od -t x1
0000000 d0 b0 0a # hm..expect 'c1 0a'

If I start a new urxvt instance by running $ urxvt & then I finally get what I want:

$ echo "а" | od -t x1
0000000 c1 0a

Why doesn't LANG change the behavior in the first place?

Answer

Thomas Dickey · May 13, 2017

There are two factors:

  • you may be using a shell with a built-in echo (and have not informed the shell that you are changing the locale)
  • LANG is not the first environment variable checked. According to locale(7), LC_ALL and LC_CTYPE are checked first:
       If the second argument to setlocale(3) is an empty string, "", for
       the default locale, it is determined using the following steps:

       1.     If there is a non-null environment variable LC_ALL, the value
              of LC_ALL is used.

       2.     If an environment variable with the same name as one of the
              categories above exists and is non-null, its value is used for
              that category.

       3.     If there is a non-null environment variable LANG, the value of
              LANG is used.
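
The three steps quoted above can be sketched as a small shell function. This is only an illustration of the lookup order, not how setlocale(3) is implemented: the name effective_locale is hypothetical, and the final fallback to "C" when nothing is set is an assumption about the implementation's default.

```shell
# Hypothetical helper: print the value that would be chosen for one
# locale category (for example LC_CTYPE), following the order quoted
# from locale(7): LC_ALL first, then the category variable, then LANG.
effective_locale() {
    category=$1
    # Read the variable whose name is stored in $category.
    # Only pass trusted category names here, because eval is used.
    eval "category_value=\${$category:-}"

    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"            # step 1: LC_ALL wins
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"    # step 2: the category itself
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"              # step 3: LANG is the last resort
    else
        printf 'C\n'                       # assumption: default locale is "C"
    fi
}

# Example call: which setting decides the character encoding right now?
effective_locale LC_CTYPE
```

With LC_ALL set, as in the session discussed here, the function reports the LC_ALL value no matter what LANG contains.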

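For the latter, look at the output from the locale command, which lists all of the environment variables that would be used. Values shown in double quotes are not set explicitly; they are implied by another variable (here, LC_ALL):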

$ export LANG=ru_RU.KOI8-R
$ locale
LANG=ru_RU.KOI8-R
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

Changing only LANG leaves the other variables as they were (here they follow LC_ALL), whereas changing LC_ALL generally changes all of them:

$ export LC_ALL=ru_RU.KOI8-R
$ locale
LANG=ru_RU.KOI8-R
LANGUAGE=
LC_CTYPE="ru_RU.KOI8-R"
LC_NUMERIC="ru_RU.KOI8-R"
LC_TIME="ru_RU.KOI8-R"
LC_COLLATE="ru_RU.KOI8-R"
LC_MONETARY="ru_RU.KOI8-R"
LC_MESSAGES="ru_RU.KOI8-R"
LC_PAPER="ru_RU.KOI8-R"
LC_NAME="ru_RU.KOI8-R"
LC_ADDRESS="ru_RU.KOI8-R"
LC_TELEPHONE="ru_RU.KOI8-R"
LC_MEASUREMENT="ru_RU.KOI8-R"
LC_IDENTIFICATION="ru_RU.KOI8-R"
LC_ALL=ru_RU.KOI8-R
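
If the goal is for LANG alone to decide the locale, one possible approach is to clear the variables that take precedence over it. The sketch below assumes a POSIX shell; use_lang is a hypothetical helper name, and the list of category variables is the one shown in the locale output above.

```shell
# Hypothetical helper: make LANG the only locale setting in this shell.
use_lang() {
    # LC_ALL and the per-category variables are checked before LANG,
    # so remove them from the environment first.
    unset LC_ALL LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY \
          LC_MESSAGES LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE \
          LC_MEASUREMENT LC_IDENTIFICATION
    # Export LANG so that programs started from this shell inherit it.
    export LANG="$1"
}
```

After calling use_lang ru_RU.KOI8-R, running locale should show every category following LANG, provided that locale is installed (see locale -a).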