Unicode (hexadecimal) character literals in MySQL

ChrisV picture ChrisV · Nov 23, 2010 · Viewed 7.4k times · Source

Is there a way to specify Unicode character literals in MySQL?

I want to replace a Unicode character with an Ascii character, something like the following:

Update MyTbl Set MyFld = Replace(MyFld, "ẏ", "y")

But I'm using even more obscure characters which are not available in most fonts, so I want to be able to use Unicode character literals, something like

Update MyTbl Set MyFld = Replace(MyFld, "\u1e8f", "y")

This SQL statement is being invoked from a PHP script - the first form is not only unreadable, but it doesn't actually work!

Answer

Pacerier picture Pacerier · Jan 26, 2015

You can specify hexadecimal literals (or even binary literals) using 0x, x'', or X'':

select  0xC2A2;
select x'C2A2';
select X'C2A2';

But be aware that the return type is a binary string, so each and every byte is considered a character. You can verify this with char_length:

select char_length(0xC2A2)

2

If you want UTF-8 strings instead, you need to use convert:

select convert(0xC2A2 using utf8mb4)

And we can see that C2 A2 is considered 1 character in UTF-8:

select char_length(convert(0xC2A2 using utf8mb4))

1


Also, you don't have to worry about invalid bytes because convert will remove them automatically:

select char_length(convert(0xC1A2 using utf8mb4))

0

As can be seen, the output is 0 because C1 A2 is an invalid UTF-8 byte sequence.