Some fonts on Google Web Fonts support multiple "character sets". The thing is, if the web font I use only serves the "latin" glyphs, users who translate the page to a language whose glyphs aren't supported will clearly notice the messed up text.
I'd like my web fonts to support the most popular languages in the world aside from English, for example, Spanish, German, French, etc.
For this purpose, I'd like to know exactly which languages the "latin" and "latin-extended" character sets each cater to.
I expect the answer to look like:
Latin Character Set & Supported Languages:
- ..........
- ..........
- ..........
Latin-Extended Character Set & Supported Languages:
- ..........
- ..........
- ..........
I couldn't find this info in Google Web Fonts documentation, or by Googling.
Latin, aka Unicode Latin-1 Supplement (U+0080 to U+00FF), is meant primarily to support Western European languages (French, German and Spanish, as you mentioned, plus Portuguese, Italian, Irish, Icelandic, the languages of the Scandinavian countries and, unintentionally, other languages mentioned in the list below). English is supported by standard ASCII. ASCII (the first 128 characters, 95 of them printable: U+0020 to U+007E) was placed as the very first block in Unicode, named Basic Latin. This block is considered part of "Latin" and is usually supported even in non-Latin fonts so that the font name displays correctly on Latin-based systems.
Latin Extended on Google Fonts in practice means the Latin Extended-A block (U+0100 to U+017F), which (combined with "Latin") should support all common Latin-script texts. Most languages using this block also use characters from "Latin", so "Latin-Extended" fonts usually contain a superset of the "Latin" characters, but this is not guaranteed.
Unicode also has the Latin Extended-B block, which some national alphabets need for the characters Ə, Ș, Ț (though these are often replaced with Ä from Latin-1 Supplement and Ş, Ţ from Extended-A) and for Vietnamese Ơ, Ư (but Vietnamese has its own category on Google Fonts).
African languages written in Latin script are supported by the Unicode Latin Extended-B and Latin Extended Additional blocks, but these are mostly not covered by Google's Latin Extended category. There are even more exotic C, D and E extensions (252 characters total), but I haven't seen them in real life, so I guess Google doesn't count them in its Latin Extended category either.
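To see which of these blocks a given piece of text actually needs, a small script can help. This is only a rough sketch: the block ranges are the standard Unicode ones, but the function names `block_of` and `blocks_used` are made up for illustration, and it says nothing about what Google's categories really contain.

```python
# Inclusive code point ranges of the Latin blocks discussed above.
LATIN_BLOCKS = [
    ("Basic Latin", 0x0000, 0x007F),
    ("Latin-1 Supplement", 0x0080, 0x00FF),
    ("Latin Extended-A", 0x0100, 0x017F),
    ("Latin Extended-B", 0x0180, 0x024F),
    ("Latin Extended Additional", 0x1E00, 0x1EFF),
]


def block_of(char):
    """Return the name of the Latin block containing char, or None."""
    code = ord(char)
    for name, start, end in LATIN_BLOCKS:
        if start <= code <= end:
            return name
    return None


def blocks_used(text):
    """Map each block name to the sorted distinct characters of text in it."""
    result = {}
    for char in sorted(set(text)):
        name = block_of(char)
        if name is not None:
            result.setdefault(name, []).append(char)
    return result


if __name__ == "__main__":
    # Print the blocks a Czech sample sentence draws from.
    for name, chars in blocks_used("Příliš žluťoučký kůň").items():
        print(name, "".join(chars))
```

If the result mentions only Basic Latin and Latin-1 Supplement, the "latin" subset should be enough for that text; anything in Latin Extended-A or beyond means you need to look at "latin-ext" or a font with wider coverage.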
From my observation, Google places a font in the Latin Extended category if it contains some, but not necessarily all, characters from the Latin Extended-A block. Web fonts need to be small so they don't slow page loading (the woff/woff2 formats are preferred). The more characters a font contains, the bigger it is (fonts covering the whole BMP can grow above 10 MB). The author often describes the purpose of the font, so only the author can explain the logic behind its character support. For example, the Lato Google font supports only the Polish characters from the Latin Extended-A block (the author is a Pole), yet it is in Google's "Latin Extended" category. To find out whether a font supports a specific language, try to display the characters from the list below.
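If you can get the set of code points a font covers (for example from a font inspection tool of your choice), the same check can be done without eyeballing rendered text. The sketch below is an assumption-laden illustration: `missing_chars` is a made-up helper, and the `covered` set is invented sample data (ASCII, Latin-1 Supplement and the Polish letters), not the real character map of Lato or any other font.

```python
# Accented letters of two sample alphabets (lowercase and uppercase).
POLISH = "ąćęłńóśźżĄĆĘŁŃÓŚŹŻ"
CZECH = "áčďéěíňóřšťúůýžÁČĎÉĚÍŇÓŘŠŤÚŮÝŽ"


def missing_chars(required, covered):
    """Return the characters of `required` whose code points are not in `covered`."""
    return [ch for ch in required if ord(ch) not in covered]


# Hypothetical coverage: printable ASCII, the printable part of
# Latin-1 Supplement, and only the Polish letters from Latin Extended-A.
covered = (
    set(range(0x20, 0x7F))
    | set(range(0xA0, 0x100))
    | {ord(ch) for ch in POLISH}
)

if __name__ == "__main__":
    print("Polish missing:", "".join(missing_chars(POLISH, covered)))
    print("Czech missing:", "".join(missing_chars(CZECH, covered)))
```

With this sample coverage, Polish comes back complete while Czech is missing its Latin Extended-A letters, which is the kind of partial "Latin Extended" support described above.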
From the list of latin-written alphabets below inspected on Omniglot and other sources, I do not count:
Please comment if something important is missing or if some minority language is used in electronic communication.
ASCII (Basic Latin, often supported even in non-latin fonts)
Classical Latin, Afrikaans, Asturian, Corsu, Dutch, Greenlandic, Gaelic, Haitian (Creole), Malay, Shona, Sicilian, Swahili.
English is also supported, with the addition of the handy '¢' (American) and '£' (British) from Latin-1 Supplement, although other currency symbols (like '€') were added much later (the euro sign in Unicode 2.1 in 1998, in the block starting at U+20A0).
Latin
Latin Extended
Latin Extended, African (mostly not supported in Latin-Extended fonts). Fonts with full support for the Africa alphabet include Ubuntu, Fira Sans, EB Garamond, Tinos, News Cycle, Didact Gothic, M Plus, Sawarabi, Cousine, Caudex, Judson, Andika (and of course Noto, see below).
Alternatively, the font may support the Combining Diacritical Marks block: U+0300 to U+036F. For example, Ř can be typed either as U+0158 (a precomposed character) or as R + U+030C. A program supporting Unicode should display both identically and treat both as the same single character, but if the program or font doesn't support that repertoire, the combining diacritical mark might end up a bit misplaced (like the too-low Ɛ̈ here on my system); see this very detailed Unicode Q&A on this topic.
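The two spellings of Ř can be compared with Python's standard `unicodedata` module. This is just a small illustration of Unicode normalization (NFC composes, NFD decomposes); the helper name `same_text` is my own, and how any particular font or browser renders the decomposed form is a separate matter.

```python
import unicodedata

precomposed = "\u0158"   # Ř as a single precomposed code point
combined = "R\u030C"     # R followed by U+030C COMBINING CARON


def same_text(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(precomposed == combined)           # raw comparison of code points
    print(same_text(precomposed, combined))  # comparison after normalization
    # Ɛ with a diaeresis has no precomposed form, so NFC keeps two code points
    # and the font has to position the combining mark itself.
    open_e = unicodedata.normalize("NFC", "\u0190\u0308")
    print(len(open_e))
```

Normalizing text to NFC before serving it means a font only needs the precomposed glyphs wherever they exist; sequences without a precomposed form still depend on the font's handling of combining marks.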
You might want to customize some fonts (if their licence allows it) with the Font Squirrel service, or use them as a fallback. Here are some free fonts with wide coverage to start with: