Reputation: 1585
Given an ISO 639-2/T language code of scope individual, how can I programmatically find the matching macrolanguage code, if such a match exists?
For example, how do I go from "nob" (Norwegian Bokmål, scope individual) to "nor" (Norwegian, scope macrolanguage)?
In general, a single country can have several individual languages that do not belong to the same macrolanguage, so grouping by country alone will give false positives.
java.util.Locale knows about ISO 639 three-letter language codes and recognizes both codes in the example above, but it has no concept of scope or macrolanguage.
A heuristic without false positives would also be helpful in my case.
Upvotes: 4
Views: 299
Reputation: 121
You could build your own map from individual language codes to their macrolanguage codes.
Here's a selection I made some time ago:
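// Key: ISO 639-3 individual language code; value: its macrolanguage code (partial selection).
// Requires java.util.HashMap and java.util.Map.
public static final Map<String, String> macroLanguages = new HashMap<>();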
static {
macroLanguages.put("aao", "ara"); //https://iso639-3.sil.org/code/ara
macroLanguages.put("abh", "ara");
macroLanguages.put("abv", "ara");
macroLanguages.put("acm", "ara");
macroLanguages.put("acq", "ara");
macroLanguages.put("acw", "ara");
macroLanguages.put("acx", "ara");
macroLanguages.put("acy", "ara");
macroLanguages.put("adf", "ara");
macroLanguages.put("aeb", "ara");
macroLanguages.put("aec", "ara");
macroLanguages.put("afb", "ara");
macroLanguages.put("ajp", "ara");
macroLanguages.put("apc", "ara");
macroLanguages.put("apd", "ara");
macroLanguages.put("arb", "ara");
macroLanguages.put("arq", "ara");
macroLanguages.put("ars", "ara");
macroLanguages.put("ary", "ara");
macroLanguages.put("arz", "ara");
macroLanguages.put("auz", "ara");
macroLanguages.put("avl", "ara");
macroLanguages.put("ayh", "ara");
macroLanguages.put("ayl", "ara");
macroLanguages.put("ayn", "ara");
macroLanguages.put("ayp", "ara");
macroLanguages.put("bbz", "ara");
macroLanguages.put("pga", "ara");
macroLanguages.put("shu", "ara");
macroLanguages.put("ssh", "ara");
macroLanguages.put("ekk", "est"); //https://iso639-3.sil.org/code/est
macroLanguages.put("vro", "est");
macroLanguages.put("bos", "hbs"); //https://iso639-3.sil.org/code/hbs
macroLanguages.put("hrv", "hbs");
macroLanguages.put("srp", "hbs");
macroLanguages.put("cnr", "hbs");
macroLanguages.put("ltg", "lav"); //https://iso639-3.sil.org/code/lav
macroLanguages.put("lvs", "lav");
macroLanguages.put("nno", "nor"); //https://iso639-3.sil.org/code/nor
macroLanguages.put("nob", "nor");
macroLanguages.put("aae", "sqi"); //https://iso639-3.sil.org/code/sqi
macroLanguages.put("aat", "sqi");
macroLanguages.put("aln", "sqi");
macroLanguages.put("als", "sqi");
macroLanguages.put("ydd", "yid"); //https://iso639-3.sil.org/code/yid
macroLanguages.put("yih", "yid");
macroLanguages.put("ccx", "zha"); //https://iso639-3.sil.org/code/zha
macroLanguages.put("ccy", "zha");
macroLanguages.put("zch", "zha");
macroLanguages.put("zeh", "zha");
macroLanguages.put("zgb", "zha");
macroLanguages.put("zgm", "zha");
macroLanguages.put("zgn", "zha");
macroLanguages.put("zhd", "zha");
macroLanguages.put("zhn", "zha");
macroLanguages.put("zlj", "zha");
macroLanguages.put("zln", "zha");
macroLanguages.put("zlq", "zha");
macroLanguages.put("zqe", "zha");
macroLanguages.put("zyb", "zha");
macroLanguages.put("zyg", "zha");
macroLanguages.put("zyj", "zha");
macroLanguages.put("zyn", "zha");
macroLanguages.put("zzj", "zha");
macroLanguages.put("cdo", "zho"); //https://iso639-3.sil.org/code/zho
macroLanguages.put("cjy", "zho");
macroLanguages.put("cmn", "zho");
macroLanguages.put("cpx", "zho");
macroLanguages.put("czh", "zho");
macroLanguages.put("czo", "zho");
macroLanguages.put("gan", "zho");
macroLanguages.put("hak", "zho");
macroLanguages.put("hsn", "zho");
macroLanguages.put("lzh", "zho");
macroLanguages.put("mnp", "zho");
macroLanguages.put("nan", "zho");
macroLanguages.put("wuu", "zho");
macroLanguages.put("yue", "zho");
macroLanguages.put("cnp", "zho");
macroLanguages.put("csp", "zho");
macroLanguages.put("pes", "fas"); //https://iso639-3.sil.org/code/fas
macroLanguages.put("prs", "fas");
}
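Since the map above is only a hand-picked selection, one way to cover every macrolanguage is to build the same map from a tab-separated mapping table instead of typing it in. The sketch below is a rough illustration, not a tested library: the class name `MacroLanguageLookup` and the method `macroLanguageOf` are hypothetical, and the assumed column layout (macrolanguage code, individual code, status, with a header row starting with `M_Id` and `R` marking retired entries) is my understanding of the macrolanguage mappings file published on the ISO 639-3 site, so check it against the file you actually download.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: maps an individual language code to its macrolanguage code.
final class MacroLanguageLookup {
    private final Map<String, String> individualToMacro = new HashMap<>();

    // Each line is assumed to be: macrolanguage code <TAB> individual code <TAB> status.
    // The header row ("M_Id ...") and rows with status "R" (retired) are skipped.
    MacroLanguageLookup(List<String> lines) {
        for (String line : lines) {
            String[] cols = line.split("\t");
            if (cols.length < 2 || cols[0].trim().equals("M_Id")) {
                continue; // header or malformed row
            }
            if (cols.length >= 3 && cols[2].trim().equals("R")) {
                continue; // assumed marker for retired individual codes
            }
            individualToMacro.put(cols[1].trim(), cols[0].trim());
        }
    }

    // Returns the macrolanguage code, or Optional.empty() when the code is not in the table.
    Optional<String> macroLanguageOf(String individualCode) {
        String key = individualCode.trim().toLowerCase(Locale.ROOT);
        return Optional.ofNullable(individualToMacro.get(key));
    }
}
```

With the table loaded via something like `Files.readAllLines(path)`, `macroLanguageOf("nob")` would give `nor`, while a code that is absent from the table (a macrolanguage itself, or a language outside any macrolanguage) gives an empty `Optional`, so the lookup produces no false positives beyond what the table contains.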
Upvotes: 2