Unicode characters from charcode in JavaScript for charcodes > 0xFFFF

javascript unicode astral-plane

leemes · Mar 27, 2011 · Viewed 14k times

I need to get a string / char from a Unicode charcode and then put it into a DOM TextNode that gets added to an HTML page using client-side JavaScript.

Currently, I am doing:

String.fromCharCode(parseInt(charcode, 16));

where charcode is a hex string containing the charcode, e.g. "1D400". The Unicode character that should be returned is 𝐀, but 퐀 is returned instead! Characters in the 16-bit range (0000 ... FFFF) are returned as expected.

Any explanation and / or proposals for correction?

Thanks in advance!

Answer

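String.fromCharCode works on 16-bit UTF-16 code units, so a single argument can only produce code points in the BMP (i.e. up to U+FFFF). Larger values are truncated to their low 16 bits, which is why 0x1D400 comes out as U+D400 (퐀). To handle higher code points, this function from Mozilla Developer Network may be used to return the surrogate pair representation: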

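function fixedFromCharCode (codePt) {
    if (codePt > 0xFFFF) {
        // Astral code point: encode it as a UTF-16 surrogate pair.
        codePt -= 0x10000;
        // High surrogate carries the top 10 bits, low surrogate the bottom 10.
        return String.fromCharCode(0xD800 + (codePt >> 10), 0xDC00 + (codePt & 0x3FF));
    } else {
        // BMP code point: a single code unit is enough.
        return String.fromCharCode(codePt);
    }
}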
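
To make the arithmetic concrete for the "1D400" case from the question, here is a minimal sketch that starts from the hex string, as the question does. The helper name charFromHex and its range check are illustrative assumptions, not part of the MDN function. In engines that support ES2015, the built-in String.fromCodePoint should give the same result, so the sketch compares against it only when it is available.

```javascript
// Hypothetical helper (name and range check are assumptions for illustration).
// Takes a hex string such as "1D400" and returns the matching string.
function charFromHex(hex) {
    var codePt = parseInt(hex, 16);
    if (isNaN(codePt) || codePt < 0 || codePt > 0x10FFFF) {
        throw new RangeError("Not a valid Unicode code point: " + hex);
    }
    if (codePt <= 0xFFFF) {
        // BMP: one UTF-16 code unit.
        return String.fromCharCode(codePt);
    }
    // Astral plane: split the 20-bit offset into two 10-bit halves.
    var offset = codePt - 0x10000;          // 0x1D400 -> 0x0D400
    var high = 0xD800 + (offset >> 10);     // 0x0D400 -> 0xD835
    var low = 0xDC00 + (offset & 0x3FF);    // 0x0D400 -> 0xDC00
    return String.fromCharCode(high, low);
}

var a = charFromHex("1D400");

// Why the original call went wrong: the argument is truncated to 16 bits,
// so 0x1D400 and 0xD400 produce the same single code unit.
var truncated = String.fromCharCode(0x1D400) === String.fromCharCode(0xD400);

// Cross-check against the ES2015 built-in, if the engine provides it.
var matchesBuiltin = typeof String.fromCodePoint === "function"
    ? String.fromCodePoint(0x1D400) === a
    : null;
```

Note that a.length is 2 here, because the string holds two code units for one character. The result can be passed straight to document.createTextNode(a) to build the TextNode the question asks about.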
