Cleaner way to check if a string is an ISO country or ISO language in Java

mat_boy · Apr 10, 2013 · Viewed 13.7k times

Suppose you have a two-character String that should represent either an ISO 3166 country code or an ISO 639 language code.

The Locale class has two static methods, getISOLanguages and getISOCountries, which return String arrays of all the two-letter ISO language codes and ISO country codes, respectively.

To check whether a specific String is a valid ISO language or ISO country, I have to look for a matching String in those arrays. OK, I can do that by searching the array (e.g. with Arrays.binarySearch, which needs a sorted array, or with Apache Commons ArrayUtils.contains, which does a linear scan).
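For reference, here is a rough sketch of the array-search approach described above. The class name ArraySearchIsoCheck and the helper sortedCopy are made up for illustration. Since the Javadoc does not appear to promise that the arrays come back sorted, the sketch sorts a copy once before using Arrays.binarySearch.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper illustrating the array-search approach.
final class ArraySearchIsoCheck {
    // Sort private copies once; binarySearch is only defined on sorted input.
    private static final String[] SORTED_LANGUAGES = sortedCopy(Locale.getISOLanguages());
    private static final String[] SORTED_COUNTRIES = sortedCopy(Locale.getISOCountries());

    private ArraySearchIsoCheck() {}

    private static String[] sortedCopy(String[] codes) {
        String[] copy = codes.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static boolean isValidISOLanguage(String s) {
        // A non-negative result means the key was found.
        return s != null && Arrays.binarySearch(SORTED_LANGUAGES, s) >= 0;
    }

    public static boolean isValidISOCountry(String s) {
        return s != null && Arrays.binarySearch(SORTED_COUNTRIES, s) >= 0;
    }
}
```

This works, but it is exactly the kind of boilerplate I would like to avoid writing by hand.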

The question is: is there any utility (e.g. in Guava or Apache Commons) that provides a cleaner way, such as a method that returns a boolean indicating whether a String is a valid ISO 639 language or ISO 3166 country?

For instance:

public static boolean isValidISOLanguage(String s)
public static boolean isValidISOCountry(String s)

Answer

Jon Skeet · Apr 10, 2013

I wouldn't bother using either a binary search or any third-party libraries - a HashSet is fine for this:

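import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public final class IsoUtil {
    // Two-letter ISO 639 language codes known to Locale (lowercase, e.g. "en").
    private static final Set<String> ISO_LANGUAGES = new HashSet<String>
        (Arrays.asList(Locale.getISOLanguages()));
    // Two-letter ISO 3166 country codes known to Locale (uppercase, e.g. "US").
    private static final Set<String> ISO_COUNTRIES = new HashSet<String>
        (Arrays.asList(Locale.getISOCountries()));

    // Utility class: not meant to be instantiated.
    private IsoUtil() {}

    // Lookups are case-sensitive; a null argument simply returns false.
    public static boolean isValidISOLanguage(String s) {
        return ISO_LANGUAGES.contains(s);
    }

    public static boolean isValidISOCountry(String s) {
        return ISO_COUNTRIES.contains(s);
    }
}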
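
One thing to be aware of: Locale returns language codes in lowercase and country codes in uppercase, and Set.contains is case-sensitive, so "EN" or "us" would not match. If input may arrive in either case, a variant along these lines could normalize it first. This is only a sketch, and the class name CaseInsensitiveIsoUtil is made up for illustration:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical variant that normalizes case before the lookup.
final class CaseInsensitiveIsoUtil {
    private static final Set<String> ISO_LANGUAGES =
        new HashSet<String>(Arrays.asList(Locale.getISOLanguages()));
    private static final Set<String> ISO_COUNTRIES =
        new HashSet<String>(Arrays.asList(Locale.getISOCountries()));

    private CaseInsensitiveIsoUtil() {}

    public static boolean isValidISOLanguage(String s) {
        // Locale.ROOT keeps the conversion independent of the default locale.
        return s != null && ISO_LANGUAGES.contains(s.toLowerCase(Locale.ROOT));
    }

    public static boolean isValidISOCountry(String s) {
        return s != null && ISO_COUNTRIES.contains(s.toUpperCase(Locale.ROOT));
    }
}
```

Whether you want this leniency depends on how strictly the incoming data is supposed to follow the standard.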

You could check for the string length first, but I'm not sure I'd bother - at least not unless you want to protect yourself against performance attacks where you're given enormous strings which would take a long time to hash.
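
If you did want that guard, it might look something like the following. This is a sketch rather than a recommendation: the class name LengthCheckedIsoUtil is hypothetical, and the length of 2 relies on Locale documenting both lists as two-letter codes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical variant that rejects strings of the wrong length before hashing.
final class LengthCheckedIsoUtil {
    private static final Set<String> ISO_LANGUAGES =
        new HashSet<String>(Arrays.asList(Locale.getISOLanguages()));
    private static final Set<String> ISO_COUNTRIES =
        new HashSet<String>(Arrays.asList(Locale.getISOCountries()));

    private LengthCheckedIsoUtil() {}

    public static boolean isValidISOLanguage(String s) {
        // length() is O(1), so an enormous string is rejected without being hashed.
        return s != null && s.length() == 2 && ISO_LANGUAGES.contains(s);
    }

    public static boolean isValidISOCountry(String s) {
        return s != null && s.length() == 2 && ISO_COUNTRIES.contains(s);
    }
}
```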

EDIT: If you do want to use a third-party library, ICU4J is the most likely contender - but it may well have a more up-to-date list than the one Locale supports, so you would probably want to move to using ICU4J everywhere.