Through the REST API of an application, I receive language codes of the following form: ll-Xxxx.
Some examples:
az-Arab Azerbaijani in the Arabic script
az-Cyrl Azerbaijani in the Cyrillic script
az-Latn Azerbaijani in the Latin script
sr-Cyrl Serbian in the Cyrillic script
sr-Latn Serbian in the Latin script
uz-Cyrl Uzbek in the Cyrillic script
uz-Latn Uzbek in the Latin script
zh-Hans Chinese in the simplified script
zh-Hant Chinese in the traditional script
From what I found online:
[ISO 639-1] is the first part of the ISO 639 series of international standards for language codes. Part 1 covers the registration of two-letter codes.
and
ISO 639-3 is an international standard for language codes. In defining some of its language codes, some are defined as macrolanguages [...]
Now I need to write a piece of code to verify that I receive a valid language code.
But since what I receive looks like a mix of 639-1 (two-letter language codes) and 639-3 (macrolanguages), which standard am I supposed to stick with? Do these codes belong to some sort of mixed (perhaps common) standard?
According to RFC 5646 (page 4), a language tag can be written in the following form: [language]-[script].
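
A minimal sketch of such a check in Python, assuming every code I receive is exactly a language subtag followed by a script subtag. RFC 5646 allows two or three letters for the language subtag and requires four letters for the script subtag. The function names here are hypothetical, and this only tests the shape of the tag: deciding whether "az" or "Arab" are actually registered would additionally require a lookup against the IANA Language Subtag Registry, which is not shown.

```python
import re

# Shape check only: 2-3 ASCII letters (language subtag), a hyphen,
# then 4 ASCII letters (script subtag). This does NOT confirm that
# either subtag is registered.
_LANG_SCRIPT = re.compile(r"[A-Za-z]{2,3}-[A-Za-z]{4}")


def is_well_formed_lang_script(tag: str) -> bool:
    """Return True if tag has the form language-script (e.g. 'az-Arab')."""
    return _LANG_SCRIPT.fullmatch(tag) is not None


def canonical_case(tag: str) -> str:
    """Apply the conventional casing: lowercase language, title-case script.

    Assumes the tag already passed is_well_formed_lang_script.
    """
    language, script = tag.split("-")
    return f"{language.lower()}-{script.title()}"


if __name__ == "__main__":
    for candidate in ["az-Arab", "zh-Hans", "ZH-hans", "az", "az-Ara", "azer-Latn"]:
        print(candidate, is_well_formed_lang_script(candidate))
```

Tags are case-insensitive under RFC 5646, so the pattern accepts any casing and `canonical_case` can be used to normalise a tag before comparing or storing it.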