I have this code. It sorts French and Russian words correctly. I used Locale.US and the result seems right. Does this solution work correctly for every language, for example Chinese, Korean, or Japanese? If not, what is a better solution?
import java.text.Collator;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

public class CollationTest {
    public static void main(final String[] args) {
        // One US-locale collator orders every word in the set.
        final Collator collator = Collator.getInstance(Locale.US);
        final SortedSet<String> set = new TreeSet<String>(collator);
        set.add("abîmer");
        set.add("abîmé");
        set.add("aberrer");
        set.add("abhorrer");
        set.add("aberrance");
        set.add("abécédaire");
        set.add("abducteur");
        set.add("abdomen");
        set.add("государственно-монополистический");
        set.add("гостить");
        set.add("гостевой");
        set.add("гостеприимный");
        set.add("госпожа");
        set.add("госплан");
        set.add("господи");
        set.add("господа");
        for (final String s : set) {
            System.out.println(s);
        }
    }
}
Update: Sorry, I don't need a single set to hold words from all languages in order. I mean that the set contains words from one language at a time, and it should sort correctly whichever language that is.
import java.text.Collator;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

public class CollationTest {
    public static void main(final String[] args) {
        final Collator collator = Collator.getInstance(Locale.US);
        final SortedSet<String> set = new TreeSet<String>(collator);
        // Sorting in French.
        set.clear();
        set.add("abîmer");
        set.add("abîmé");
        set.add("aberrer");
        set.add("abhorrer");
        set.add("aberrance");
        set.add("abécédaire");
        set.add("abducteur");
        set.add("abdomen");
        for (final String s : set) {
            System.out.println(s);
        }
        // Sorting in Russian.
        set.clear();
        set.add("государственно-монополистический");
        set.add("гостить");
        set.add("гостевой");
        set.add("гостеприимный");
        set.add("госпожа");
        set.add("госплан");
        set.add("господи");
        set.add("господа");
        for (final String s : set) {
            System.out.println(s);
        }
    }
}
You cannot, because every language has its own alphabetical order. For example, the letter с that appears in your Russian words does not sort the same way in Russian as it does in Turkish.
You should always use a Collator created for the locale of the text you are sorting. I suggest using the Collections API:
//
// Define a collator for German language
//
Collator collator = Collator.getInstance(Locale.GERMAN);
//
// Sort the list using Collator
//
Collections.sort(words, collator);
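To make the snippet above concrete, here is a minimal, self-contained sketch. The class name LocaleSortSketch, the helper sortedFor, and the sample word lists are assumptions made for illustration; they are not part of the original answer. It contrasts the natural String order with a collator chosen for the language of the words.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class LocaleSortSketch {

    // Hypothetical helper (not from the original answer): returns a copy of
    // the words sorted by the collation rules of the given locale.
    static List<String> sortedFor(Locale locale, Collection<String> words) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(words);
        Collections.sort(sorted, collator);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> german = Arrays.asList("Zebra", "Äpfel", "Ofen", "Apfel");

        // Natural String order compares UTF-16 code units,
        // so "Äpfel" ends up after "Zebra".
        List<String> natural = new ArrayList<>(german);
        Collections.sort(natural);
        System.out.println(natural);

        // The German collator treats Ä as a variant of A,
        // so "Äpfel" is placed next to "Apfel".
        System.out.println(sortedFor(Locale.GERMAN, german));

        // The same helper with a collator requested for Russian.
        List<String> russian = Arrays.asList("гостить", "госплан", "господа", "гостевой");
        System.out.println(sortedFor(Locale.forLanguageTag("ru"), russian));
    }
}
```

The point of the design is that the locale is a parameter: the caller picks the collator that matches the language of the data instead of reusing a single Locale.US collator for everything.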
For further information, check the documentation, which states:
This program shows what can happen when you sort the same list of words with two different collators:
Collator fr_FRCollator = Collator.getInstance(new Locale("fr","FR"));
Collator en_USCollator = Collator.getInstance(new Locale("en","US"));
The method for sorting, called sortStrings, can be used with any Collator. Notice that the sortStrings method invokes the compare method:
public static void sortStrings(Collator collator,
                               String[] words) {
    String tmp;
    for (int i = 0; i < words.length; i++) {
        for (int j = i + 1; j < words.length; j++) {
            if (collator.compare(words[i], words[j]) > 0) {
                tmp = words[i];
                words[i] = words[j];
                words[j] = tmp;
            }
        }
    }
}
The English Collator sorts the words as follows:
peach péché pêche sin
According to the collation rules of the French language, the preceding list is in the wrong order. In French péché should follow pêche in a sorted list. The French Collator sorts the array of words correctly, as follows:
peach pêche péché sin
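The comparison described above can be tried with a short sketch like the following. It uses Arrays.sort with the collator as the comparator instead of the hand-written sortStrings loop; the class name FrenchVersusEnglishSketch and the helper sortedCopy are assumptions introduced here, not part of the quoted text. The relative order of the two accented words depends on the collation data shipped with the JDK in use, so no output is claimed for them.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class FrenchVersusEnglishSketch {

    // Hypothetical helper: sorts a copy of the words with the collator
    // obtained for the given locale, leaving the input array untouched.
    static String[] sortedCopy(Locale locale, String[] words) {
        String[] copy = words.clone();
        // Collator implements Comparator, so Arrays.sort can use it directly.
        Arrays.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"sin", "pêche", "peach", "péché"};

        // Print both orderings side by side to compare them.
        System.out.println("en_US: " + String.join(" ", sortedCopy(Locale.US, words)));
        System.out.println("fr_FR: " + String.join(" ", sortedCopy(Locale.FRANCE, words)));
    }
}
```

In both locales "peach" comes first and "sin" last, because those words differ from the others in their base letters; only the placement of "pêche" and "péché", which differ in accents alone, is decided by the locale-specific rules.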