Which UTF-8 collation is the best?

armin etemadi · Apr 24, 2010 · Viewed 27k times

I want a UTF-8 collation that supports:

  • English
  • Persian
  • Arabic
  • French
  • Japanese
  • Chinese

Does UTF8_GENERAL_CI support all of these languages?

Answer

knittl · Apr 24, 2010

Yes, it does. UTF-8 is an encoding of the Unicode character set, which covers pretty much every language in the world.
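To make the encoding point concrete, here is a minimal Python sketch that encodes a sample word from each of the listed languages to UTF-8 and decodes it back. The sample words and the helper name utf8_round_trip are my own illustrations, not part of the question, and this only shows the encoding itself, not any particular database's behaviour.

```python
# Illustrative sample words (assumed, not from the question).
samples = {
    "English": "hello",
    "Persian": "سلام",
    "Arabic": "مرحبا",
    "French": "caf\u00e9",  # "café" with a precomposed e-acute
    "Japanese": "こんにちは",
    "Chinese": "你好",
}


def utf8_round_trip(text: str) -> bool:
    """Hypothetical helper: encode to UTF-8 and decode back; True if nothing was lost."""
    return text.encode("utf-8").decode("utf-8") == text


for language, word in samples.items():
    encoded = word.encode("utf-8")
    print(f"{language}: {len(word)} characters, {len(encoded)} bytes, "
          f"round trip ok: {utf8_round_trip(word)}")
```

Non-ASCII characters take more than one byte each, but every sample survives the round trip unchanged.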

I think the only difference is in how your results are sorted: letters with accents, umlauts, etc. might come in a different order in other languages. Comparing a to ä might also behave differently under another collation.
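As a rough illustration of how two comparison rules can disagree about a and ä, here is a Python sketch. It is not how any database implements its collations: strip_accents and the two comparison functions are hypothetical helpers that simply contrast a byte-for-byte rule with an accent-insensitive one.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: decompose characters (NFD) and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def equal_binary(left: str, right: str) -> bool:
    """Compare code point by code point, so 'a' and 'ä' differ."""
    return left == right


def equal_accent_insensitive(left: str, right: str) -> bool:
    """Compare after removing accents, so 'a' and 'ä' match."""
    return strip_accents(left) == strip_accents(right)


words = ["zebra", "\u00e4pfel", "apple"]  # "\u00e4" is ä

# Plain code point order puts ä after z.
by_code_point = sorted(words)
# Accent-insensitive order sorts ä together with a.
by_base_letter = sorted(words, key=strip_accents)

print(equal_binary("a", "\u00e4"), equal_accent_insensitive("a", "\u00e4"))
print(by_code_point)
print(by_base_letter)
```

The same three words come back in a different order depending on the rule, which is the kind of difference a collation choice makes.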

The _ci suffix means that sorting and comparison are case-insensitive.
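What case-insensitive sorting and comparison mean in practice can be sketched in Python like this. The names ci_key and ci_equal are made up for the example, and str.casefold is only an approximation of what a _ci collation does, not its actual algorithm.

```python
def ci_key(text: str) -> str:
    """Hypothetical sort key: fold case so 'Bob' and 'bob' compare as equal."""
    return text.casefold()


def ci_equal(left: str, right: str) -> bool:
    """Case-insensitive equality check built on the same folding."""
    return ci_key(left) == ci_key(right)


names = ["bob", "Alice", "alice", "Bob"]

# Case-sensitive: all uppercase letters sort before lowercase ones.
case_sensitive = sorted(names)
# Case-insensitive: names group by letter regardless of case.
case_insensitive = sorted(names, key=ci_key)

print(ci_equal("Hello", "hELLO"))
print(case_sensitive)
print(case_insensitive)
```

With a case-insensitive rule, "Alice" and "alice" are treated as the same value for both ordering and equality.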

http://www.collation-charts.org/ might be of interest to you.