Java. Ignore accents when comparing strings

framara · Mar 3, 2010 · Viewed 24.4k times

The problem is simple. Is there any function in Java that compares two Strings and returns true while ignoring accented characters?

For example:

String x = "Joao";
String y = "João";

should be reported as equal.

Thanks

Answer

DaveJohnston · Mar 3, 2010

I think you should be using the Collator class. It allows you to set a strength and locale and it will compare characters appropriately.

From the Java 1.6 API:

You can set a Collator's strength property to determine the level of difference considered significant in comparisons. Four strengths are provided: PRIMARY, SECONDARY, TERTIARY, and IDENTICAL. The exact assignment of strengths to language features is locale dependant. For example, in Czech, "e" and "f" are considered primary differences, while "e" and "ě" are secondary differences, "e" and "E" are tertiary differences and "e" and "e" are identical.
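
If you really do need an accent-insensitive equality check, a minimal sketch along these lines should work. It assumes Locale.ENGLISH, where accents are secondary differences, so a PRIMARY-strength Collator ignores them. The class and method names are made up for illustration, and the accented string is written with a Unicode escape only to avoid source-encoding problems. Note that PRIMARY strength also ignores case, since case is a tertiary difference.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical helper class, not part of the JDK.
class AccentInsensitiveCompare {

    // Returns true when the two strings differ at most by accents or case
    // under the collation rules of the given locale.
    static boolean equalsIgnoringAccents(String a, String b, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        // PRIMARY: only base-letter differences count as significant.
        collator.setStrength(Collator.PRIMARY);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        String x = "Joao";
        String y = "Jo\u00e3o"; // "João"

        System.out.println(x.equals(y));                                   // false
        System.out.println(equalsIgnoringAccents(x, y, Locale.ENGLISH));   // true
        System.out.println(equalsIgnoringAccents(x, "John", Locale.ENGLISH)); // false
    }
}
```

Because the assignment of strengths is locale dependent, the result may differ in a locale that treats a given accented letter as a distinct base letter.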

I think the important point here (which people are trying to make) is that "Joao" and "João" should never be considered equal, but when you are sorting you don't want them compared by their raw character values, because then you would get something like Joao, John, João, which is not the order a reader expects. Using the Collator class handles this correctly.
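
To illustrate the sorting point, here is a small sketch that sorts the same three names twice: once with the natural String ordering (which compares char values) and once with a Collator at its default strength. Locale.ENGLISH and the class and method names are assumptions chosen for the example.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class, not part of the JDK.
class CollatorSortSketch {

    // Sorts a copy using String.compareTo, i.e. by char value.
    static List<String> sortByCharValue(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }

    // Sorts a copy using the locale's collation rules.
    static List<String> sortWithCollator(List<String> names, Locale locale) {
        List<String> copy = new ArrayList<>(names);
        // Collator implements Comparator<Object>, so it can be passed directly.
        Collections.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("John", "Jo\u00e3o", "Joao");

        System.out.println(sortByCharValue(names));                  // [Joao, John, João]
        System.out.println(sortWithCollator(names, Locale.ENGLISH)); // [Joao, João, John]
    }
}
```

At the default strength the Collator still treats "Joao" and "João" as different strings; it only places them next to each other in the result.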