I was using CHAR(code_point USING ucs2) to convert a Unicode code point to a UTF-8 character, but it gives me unexpected results for code points above 0x00FF. It returns the character Ā (code point 0x0100) for every code point from 0x0100 to 0x01FF, the character Ȁ (code point 0x0200) for every code point from 0x0200 to 0x02FF, and so on.
So if I execute this query:
SET NAMES utf8;
SELECT CHAR(0x0100 USING ucs2),CHAR(0x0101 USING ucs2),CHAR(0x0200 USING ucs2),CHAR(0x0201 USING ucs2);
it gives me the result:
| Ā | Ā | Ȁ | Ȁ |
whereas the expected result is:
| Ā | ā | Ȁ | ȁ |
Please help me understand the problem, or suggest another way of doing this.
Thanks in advance.
I got it working by building the bytes with CHAR() first and then converting them to ucs2:
CONVERT(CHAR(code_point) USING ucs2);
I have to mix these characters with utf8 strings, so I further convert the result to utf8:
CONVERT(CONVERT(CHAR(code_point) USING ucs2) USING utf8);
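For anyone who wants to check this on their own server, here is a small sketch that applies the nested CONVERT to the four code points from the question. The column aliases are only illustrative, and I am assuming the connection character set is utf8 as in the question. Wrapping each value in HEX() shows the resulting bytes, so the check does not depend on how the client renders the characters.

```sql
SET NAMES utf8;

-- CHAR(code_point) yields the raw bytes of the code point (e.g. 0x0101 -> bytes 01 01),
-- the inner CONVERT reads those bytes as ucs2,
-- and the outer CONVERT re-encodes the character as utf8.
SELECT
  HEX(CONVERT(CONVERT(CHAR(0x0100) USING ucs2) USING utf8)) AS u0100,
  HEX(CONVERT(CONVERT(CHAR(0x0101) USING ucs2) USING utf8)) AS u0101,
  HEX(CONVERT(CONVERT(CHAR(0x0200) USING ucs2) USING utf8)) AS u0200,
  HEX(CONVERT(CONVERT(CHAR(0x0201) USING ucs2) USING utf8)) AS u0201;
```

If the conversion behaves as intended, the four values should be C480, C481, C880 and C881, which are the UTF-8 encodings of U+0100, U+0101, U+0200 and U+0201. I have only used this for code points above 0x00FF, so treat smaller values as untested.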