What are "connecting characters" in Java identifiers?

LuckyLuke picture LuckyLuke · Aug 2, 2012 · Viewed 66.6k times · Source

I am reading for SCJP and I have a question regarding this line:

Identifiers must start with a letter, a currency character ($), or a connecting character such as the underscore ( _ ). Identifiers cannot start with a number!

It states that a valid identifier name can start with a connecting character such as underscore. I thought underscores were the only valid option? What other connecting characters are there?

Answer

Peter Lawrey picture Peter Lawrey · Aug 2, 2012

Here is a list of connecting characters. These are characters used to connect words.

http://www.fileformat.info/info/unicode/category/Pc/list.htm

U+005F _ LOW LINE
U+203F ‿ UNDERTIE
U+2040 ⁀ CHARACTER TIE
U+2054 ⁔ INVERTED UNDERTIE
U+FE33 ︳ PRESENTATION FORM FOR VERTICAL LOW LINE
U+FE34 ︴ PRESENTATION FORM FOR VERTICAL WAVY LOW LINE
U+FE4D ﹍ DASHED LOW LINE
U+FE4E ﹎ CENTRELINE LOW LINE
U+FE4F ﹏ WAVY LOW LINE
U+FF3F _ FULLWIDTH LOW LINE

This compiles on Java 7.

int _, ‿, ⁀, ⁔, ︳, ︴, ﹍, ﹎, ﹏, _;

An example. In this case tp is the name of a column and the value for a given row.

Column<Double> ︴tp︴ = table.getColumn("tp", double.class);

double tp = row.getDouble(︴tp︴);

The following

for (int i = Character.MIN_CODE_POINT; i <= Character.MAX_CODE_POINT; i++)
    if (Character.isJavaIdentifierStart(i) && !Character.isAlphabetic(i))
        System.out.print((char) i + " ");
}

prints

$ _ ¢ £ ¤ ¥ ؋ ৲ ৳ ৻ ૱ ௹ ฿ ៛ ‿ ⁀ ⁔ ₠ ₡ ₢ ₣ ₤ ₥ ₦ ₧ ₨ ₩ ₪ ₫ € ₭ ₮ ₯ ₰ ₱ ₲ ₳ ₴ ₵ ₶ ₷ ₸ ₹ ꠸ ﷼ ︳ ︴ ﹍ ﹎ ﹏ ﹩ $ _ ¢ £ ¥ ₩