How to write a Unicode cross symbol in Java?

Dog · May 17, 2013 · Viewed 47.1k times

I'm trying to write this Unicode cross symbol (𐀵) in Java:

class A {
    public static void main(String[] args) {
        System.out.println("\u2300");
        System.out.println("\u10035");
    }
}

I can write the o with a line through it (⌀) just fine, but the cross symbol doesn't show up; instead the output ends with the digit 5:

# javac A.java && java A
⌀
ဃ5

Why?

Answer

Jon Skeet · May 17, 2013

You're looking for U+10035, which is outside the Basic Multilingual Plane. That means you can't use \u to specify the value, as that only deals with U+0000 to U+FFFF - there are always exactly four hex digits after \u. So currently you've got U+1003 ("MYANMAR LETTER GHA") followed by '5'.
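To see this for yourself, you can dump the UTF-16 code units of the string. This is only a small sketch; the class name EscapeDemo and the helper describe are made up for illustration:

```java
public class EscapeDemo {
    // Hypothetical helper: lists each UTF-16 code unit of s as a U+XXXX label.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The compiler reads this as the escape for U+1003 followed by a literal '5'.
        String s = "\u10035";
        System.out.println(s.length());  // prints 2
        System.out.println(describe(s)); // prints U+1003 U+0035
    }
}
```

The second code unit, U+0035, is just the ASCII digit '5', which is why a 5 ends up in the output.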

Unfortunately Java doesn't provide a string literal form which makes characters outside the BMP simple to express. The only way of including it in a literal (but still in ASCII) is to use the UTF-16 surrogate pair form:

String cross = "\ud800\udc35";

Alternatively, you could use the 32-bit code point form as an int:

String cross = new String(new int[] { 0x10035 }, 0, 1);

(These two strings are equal.)
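If you want to convince yourself of that, something along these lines should do. It's a sketch rather than anything definitive: the class and method names are invented for the example, and the Character.toChars variant is a third spelling that isn't mentioned above:

```java
public class SupplementaryDemo {
    // The surrogate pair form from the answer.
    static String fromSurrogates() {
        return "\ud800\udc35";
    }

    // The int code point form from the answer.
    static String fromCodePoint() {
        return new String(new int[] { 0x10035 }, 0, 1);
    }

    // Extra variant (not in the answer): let Character build the surrogate pair.
    static String fromToChars() {
        return new String(Character.toChars(0x10035));
    }

    public static void main(String[] args) {
        String a = fromSurrogates();
        String b = fromCodePoint();
        String c = fromToChars();

        System.out.println(a.equals(b) && b.equals(c));            // prints true
        System.out.println(a.length());                            // prints 2 (UTF-16 code units)
        System.out.println(a.codePointCount(0, a.length()));       // prints 1 (one code point)
        System.out.println(Integer.toHexString(a.codePointAt(0))); // prints 10035
    }
}
```

Note that length() counts UTF-16 code units, so it reports 2 even though the string holds a single code point.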

Having said all that, your console would still need to support that character - you'll need to try it to find out whether or not it does.
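One thing you can try is writing the string through a PrintStream that encodes as UTF-8. This is a sketch under assumptions: it assumes your terminal interprets its input as UTF-8 and has a font with a glyph for the character, neither of which Java can guarantee. The names Utf8Console, printUtf8 and utf8Hex are made up for the example:

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class Utf8Console {
    // Hypothetical helper: writes s to out, encoding it as UTF-8.
    static void printUtf8(OutputStream out, String s) {
        try {
            PrintStream ps = new PrintStream(out, true, "UTF-8");
            ps.print(s);
            ps.flush();
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: shows the UTF-8 bytes of s as hex, to check what gets sent.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String cross = "\ud800\udc35";
        System.out.println(utf8Hex(cross)); // prints F09080B5
        printUtf8(System.out, cross + "\n");
    }
}
```

If the hex line is right but the second line shows a box or a question mark, the bytes are being sent correctly and the problem is on the terminal or font side.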