One subcomponent worth noting is the ''Unicode extension component'', which lives within the extension component. The Unicode extension component has the basic form <code>"-u(-[a-z0-9]{2,8})+"</code>, with precise details in [https://tools.ietf.org/html/rfc6067 RFC 6067]. The Unicode extension component permits specifying additional details about sort order, numbering system, calendar system, and others.
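As a rough illustration of the basic form above, the following Python sketch locates a Unicode extension component in a tag. The function name <code>find_unicode_extension</code> is hypothetical, and the pattern is only the simplified form quoted above, not the full RFC 6067 grammar (for example, it does not treat private-use sections specially).

```python
import re

# Simplified pattern from the text above; the full grammar is in RFC 6067.
UNICODE_EXTENSION = re.compile(r"-u(-[a-z0-9]{2,8})+", re.IGNORECASE)


def find_unicode_extension(tag):
    """Return the Unicode extension of a language tag, or None if absent.

    Hypothetical helper for illustration only; not SpiderMonkey code.
    """
    match = UNICODE_EXTENSION.search(tag)
    return match.group(0) if match else None


print(find_unicode_extension("de-DE-u-co-phonebk-nu-latn"))
print(find_unicode_extension("en-US"))
```

The first call yields the extension carrying a collation and a numbering system request; the second tag has no Unicode extension.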
SpiderMonkey verifies the structural validity of language tags and brings them into a canonical form, but generally doesn't interpret the components of a language tag. Instead, SpiderMonkey passes the tag to ICU, and ICU interprets the components.
One exception is for ''old-style language tags'': a small set of tags for languages that, according to BCP 47, don't have a default script, but that are commonly used without a script code because older standards ([https://tools.ietf.org/html/rfc1766 RFC 1766] and [https://tools.ietf.org/html/rfc3066 RFC 3066]) didn't recognize script codes. The implementation maps such language tags to their modern equivalents so that they can be found in the lists of available locales provided by ICU.
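The mapping described above can be pictured as a small lookup table. The sketch below is only an illustration: the entries and the name <code>map_old_style_tag</code> are assumptions chosen as plausible examples, not a copy of the table the implementation actually uses.

```python
# Illustrative entries only (an assumption); the real table lives in the
# SpiderMonkey source and may differ.
OLD_STYLE_TAG_MAPPINGS = {
    "zh-CN": "zh-Hans-CN",
    "zh-TW": "zh-Hant-TW",
    "pa-PK": "pa-Arab-PK",
}


def map_old_style_tag(tag):
    """Return the modern equivalent of an old-style tag, or the tag itself."""
    return OLD_STYLE_TAG_MAPPINGS.get(tag, tag)


print(map_old_style_tag("zh-TW"))
print(map_old_style_tag("en-US"))
```

Tags that are not in the table pass through unchanged, so the lookup can be applied to every requested tag before it is compared with the available locales.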
Another exception is language fallback, as required by ECMA-402: when a requested language isn't supported, the language tag's internal structure implicitly encodes a list of fallbacks. For example, the tag <code>en-US</code> suggests a fallback to <code>en</code>.
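A minimal sketch of this fallback, written in the spirit of the <code>BestAvailableLocale</code> abstract operation in ECMA-402: subtags are dropped from the right until a supported tag is found. The function name and the sets of available locales are assumptions for illustration, and this is not the SpiderMonkey implementation itself.

```python
def best_available_locale(available, requested):
    """Return the longest truncation of `requested` found in `available`.

    Returns None when no truncation is available.
    """
    candidate = requested
    while True:
        if candidate in available:
            return candidate
        pos = candidate.rfind("-")
        if pos == -1:
            return None
        # If the subtag before the cut is a singleton (such as "u" or "x"),
        # drop it too, so that no dangling singleton is left behind.
        if pos >= 2 and candidate[pos - 2] == "-":
            pos -= 2
        candidate = candidate[:pos]


print(best_available_locale({"en", "de"}, "en-US"))
print(best_available_locale({"en", "de"}, "fr-FR"))
```

With <code>en</code> and <code>de</code> available, a request for <code>en-US</code> falls back to <code>en</code>, while <code>fr-FR</code> finds no match.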
The last exception is that, again as required by ECMA-402, SpiderMonkey removes the Unicode extension component from the language tag during processing. The key-value pairs in the Unicode extension are compared separately against the feature set supported by the language found, and ICU uses the language tag without the Unicode extension after the feature set is determined.
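The separation described above can be sketched as follows. This is a simplified illustration under stated assumptions: the name <code>split_unicode_extension</code> is hypothetical, two-character subtags are treated as keys and longer subtags as their values, and attributes that may precede the first key are ignored.

```python
import re

# Simplified pattern for the Unicode extension; see RFC 6067 for the grammar.
UNICODE_EXTENSION = re.compile(r"-u(-[a-z0-9]{2,8})+", re.IGNORECASE)


def split_unicode_extension(tag):
    """Split a tag into (base tag, dict of Unicode extension keywords)."""
    match = UNICODE_EXTENSION.search(tag)
    if match is None:
        return tag, {}
    # Remove the extension; anything after it (e.g. private use) is kept.
    base = tag[:match.start()] + tag[match.end():]
    keywords = {}
    key = None
    # Skip the leading empty string and the "u" singleton.
    for subtag in match.group(0).lower().split("-")[2:]:
        if len(subtag) == 2:
            key = subtag
            keywords[key] = ""
        elif key is not None:
            current = keywords[key]
            keywords[key] = subtag if not current else current + "-" + subtag
    return base, keywords


base, keywords = split_unicode_extension("de-DE-u-co-phonebk-nu-latn")
print(base, keywords)
```

The base tag is what would be matched against the available locales, and each key-value pair would then be checked against what that locale supports.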
=== Currency codes ===