How to decode Unicode escape sequences like "\u00ed" to proper UTF-8 encoded characters?

Docstero picture Docstero · May 29, 2010 · Viewed 138.3k times · Source

Is there a function in PHP that can decode Unicode escape sequences like "\u00ed" to "í" and all other similar occurrences?

I found similar question here but is doesn't seem to work.

Answer

Gumbo picture Gumbo · May 29, 2010

Try this:

$str = preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function ($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}, $str);

In case it's UTF-16 based C/C++/Java/Json-style:

$str = preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function ($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UTF-16BE');
}, $str);