PHP/Gettext Problems

Alix Axel · Aug 3, 2010

I remember running some tests a few months ago with gettext, and the following code worked perfectly:

putenv('LANG=l33t');
putenv('LANGUAGE=l33t');
putenv('LC_MESSAGES=l33t');

if (defined('LC_MESSAGES')) // available if PHP was compiled with libintl
{
    setlocale(LC_MESSAGES, 'l33t');
}

else
{
    setlocale(LC_ALL, 'l33t');
}

bindtextdomain('default', './locale'); // ./locale/l33t/LC_MESSAGES/default.mo
bind_textdomain_codeset('default', 'UTF-8');
textdomain('default');

echo _('Hello World!'); // h3110 w0r1d!

This worked perfectly (under Windows XP and CentOS, if I remember correctly), which was good because I could use arbitrary "locales" without having to worry about whether they were installed on the system. However, it doesn't seem to work anymore, and I wonder why...


Red Hat + PHP 5.2.11:

I'm able to switch back and forth between various locales, and the translations show up correctly as long as the setlocale() call doesn't return false (that is, as long as the locale is available/installed on the system).

This is not perfect (it would be great if I could just point gettext to any arbitrary translation directory without having to test for the existence of the locale), but it's acceptable. I'll run some more tests later on.

Windows 7 + PHP 5.3.1 (XAMPP):

setlocale() always returns false (even when using LC_ALL instead of LC_MESSAGES) unless I use a valid Windows locale such as eng, deu or ptg. In that case the locale seems to be set correctly, but the translations still don't show up. I can't test right now because I have hundreds of tabs open, but I think the very first call to the script yields the correct translation (restarting Apache doesn't help).

I'm not sure if this is related to PHP Bug #49349. I'll test this in a couple of hours.


Is there any way to use the gettext extension (not pure PHP implementations like php-gettext or the Zend Translate Adapter) reliably across different operating systems (possibly with custom locales like l33t)?

Also, is it absolutely necessary to use setlocale(LC_ALL, ...)? I would prefer leaving the TIME, NUMERIC and (especially) MONETARY locale settings untouched, defaulting to the POSIX locale.


I had an idea... Would it be possible to call setlocale() with a very common locale (like C, POSIX or en_US) and specify the language via the domain? Something like this:

/lang/C/LC_MESSAGES/domain.pt.mo
/lang/C/LC_MESSAGES/domain.de.mo
/lang/C/LC_MESSAGES/domain.en.mo
/lang/C/LC_MESSAGES/domain2.pt.mo
/lang/C/LC_MESSAGES/domain2.de.mo
/lang/C/LC_MESSAGES/domain2.en.mo

Would this work on *nix and Windows platforms without problems?

Answer

mario · Aug 21, 2010

Gettext isn't overly practical for webapps.

  • For example, it doesn't honor Accept-Language style preferences by itself.
  • It typically incurs some caching issues on shared webhosts (mod_php SAPI).
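
On the first point: since the extension doesn't look at the Accept-Language header, the application has to choose the language itself before it touches gettext. Below is a minimal sketch of that choice, assuming PHP 7.1 or later. The function name pick_language() and the list of supported languages are made up for illustration; it only matches on the primary language subtag and is not a full implementation of HTTP content negotiation.

```php
<?php
// Hypothetical helper: pick the best supported language from an
// Accept-Language header value such as "de-DE,de;q=0.9,en;q=0.8".
function pick_language(string $header, array $supported, string $default = 'en'): string
{
    $candidates = [];
    foreach (explode(',', $header) as $position => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue; // skip empty entries and the wildcard
        }
        $q = 1.0; // quality defaults to 1 when no q= parameter is given
        foreach (array_slice($pieces, 1) as $param) {
            $param = trim($param);
            if (stripos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        // Keep only the primary subtag: "de-DE" becomes "de".
        $primary = explode('-', $tag)[0];
        $candidates[] = [$primary, $q, $position];
    }

    // Highest quality first; ties keep the order used in the header.
    usort($candidates, function ($a, $b) {
        return ($b[1] <=> $a[1]) ?: ($a[2] <=> $b[2]);
    });

    foreach ($candidates as [$lang, $q]) {
        if ($q > 0 && in_array($lang, $supported, true)) {
            return $lang;
        }
    }
    return $default;
}
```

It could be called as pick_language($_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '', ['en', 'de', 'pt']), and the result then decides which locale or translation directory gets used.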

So I sometimes wish that PHP module didn't exist, and that the convenient _() function name shortcut was available to userland implementations.
(I had my own gettext.php, which worked more reliably.)

Your options:

  1. Anyway, according to a few bug reports the Windows port of gettext had some flaws with UTF-8. Maybe your version is affected again. So try bind_textdomain_codeset('default', 'ISO-8859-1'); for starters. Also, it seems to prefer the environment variables on Windows IIRC, so putenv("LC_ALL=fr_FR"); might work better than setlocale(). Especially workable if you dl(gettext.dll) later on.

    Also try including a charset right there: LANG=en_GB.ISO-8859-1. (Since your source text is English anyway, the charset isn't very relevant here, but it's probably a common case where gettext trips over itself.) Oh, and on occasion it's UTF8 rather than UTF-8; also try ASCII.

  2. Alternatively circumvent gettext. Your domain idea is close, but I'd simply use a pre-defined ./locale/ subdir for languages:

    ./lang/en/locale/C/LC_MESSAGES/domain.mo
    

    Then just invoke bindtextdomain("default", "./lang/{$APP_LANG}/locale") without giving gettext room to interpret much. It will always look up /C/, but the correct locale directory has been injected already. But try to have a symlink from the $LANG to /C/ in there anyway.

  3. Bite in the gnu. Give up on gettext. "PhpWiki" had a custom awk conversion script. It transforms .po files into .php array scripts (yeah very oldschool), and just utilizes a __() function instead. Close. And more reliable.
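
To give a rough idea of what option 3 looks like on the PHP side, here is a minimal sketch. It is not PhpWiki's actual code: the $GLOBALS['CATALOG'] array stands in for whatever a .po-to-PHP conversion script would generate, and the inline catalog is an assumption made so the example is self-contained (in practice it would be one included file per language). It assumes PHP 7 or later.

```php
<?php
// Hypothetical catalog, shaped like the output of a .po converter:
// each msgid maps to its translated msgstr.
$GLOBALS['CATALOG'] = [
    'Hello World!' => 'h3110 w0r1d!',
];

// Userland stand-in for _(): return the translation if the catalog has a
// non-empty entry for the message, otherwise fall back to the original.
function __(string $msgid): string
{
    $catalog = $GLOBALS['CATALOG'] ?? [];
    if (isset($catalog[$msgid]) && $catalog[$msgid] !== '') {
        return $catalog[$msgid];
    }
    return $msgid;
}

echo __('Hello World!'), "\n"; // h3110 w0r1d!
echo __('Untranslated'), "\n"; // Untranslated
```

Because this is plain PHP, it needs neither setlocale() nor an installed system locale, so arbitrary language names like l33t work the same on every platform.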