User:Waldo/Internationalization API: Difference between revisions

Lots more on language tags


...talk about language tags and their structure and what's encoded in them, collators, date formats, and how all the stuff is implemented using what ICU primitives...copiously link to BCP47...
=== Language tags ===
Every operation is performed in terms of locales, specified as [http://tools.ietf.org/html/bcp47#section-2.1 language tags]: <code>en-US</code>, <code>nan-Hant-TW</code>, <code>und</code>, and so on.  A language tag consists of a language and, optionally, a script, a region, and any variants that exist within these; optional extension components and a private-use component may follow at the end.  Each component is alphanumeric and case-insensitive.  The components are joined by hyphens; individual components can be distinguished by their length and internal syntax (prefix, letters versus digits, etc.).  For precise details of language tag structure, see [http://tools.ietf.org/html/bcp47#section-2.1 BCP 47].
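As a rough illustration of how components can be told apart by length and syntax alone, here is a minimal sketch.  The function name <code>splitLanguageTag</code> and the shape of its result are hypothetical, not SpiderMonkey or ICU code.  It assumes a well-formed tag and deliberately ignores extended-language subtags and grandfathered tags, so it is not a validator.

```javascript
// Hypothetical helper: classifies the subtags of a well-formed language tag.
// Simplifications: no validation, no extlang subtags, no grandfathered tags.
function splitLanguageTag(tag) {
  // Language tags are case-insensitive, so compare in lowercase.
  const subtags = tag.toLowerCase().split("-");
  const result = {
    language: subtags[0],
    script: undefined,
    region: undefined,
    variants: [],
    extensions: [],
    privateUse: undefined,
  };
  let i = 1;

  // Script: exactly four letters.
  if (i < subtags.length && /^[a-z]{4}$/.test(subtags[i])) {
    result.script = subtags[i++];
  }
  // Region: two letters or three digits.
  if (i < subtags.length && /^([a-z]{2}|[0-9]{3})$/.test(subtags[i])) {
    result.region = subtags[i++];
  }
  // Variants: 5-8 alphanumerics, or a digit followed by three alphanumerics.
  while (i < subtags.length && /^([a-z0-9]{5,8}|[0-9][a-z0-9]{3})$/.test(subtags[i])) {
    result.variants.push(subtags[i++]);
  }
  // Extensions: a one-character prefix other than "x", then longer subtags.
  while (i < subtags.length && /^[a-wyz0-9]$/.test(subtags[i])) {
    const parts = [subtags[i++]];
    while (i < subtags.length && subtags[i].length > 1) {
      parts.push(subtags[i++]);
    }
    result.extensions.push(parts.join("-"));
  }
  // Private use: "x" and everything after it.
  if (i < subtags.length && subtags[i] === "x") {
    result.privateUse = subtags.slice(i).join("-");
  }
  return result;
}
```

For example, <code>splitLanguageTag("nan-Hant-TW")</code> yields the language <code>nan</code>, the script <code>hant</code>, and the region <code>tw</code>, all lowercased because the sketch normalizes case before classifying.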
SpiderMonkey mostly ignores the language, script, region, and variant components of a language tag.  It will pass these components to ICU in language tags provided by the user, but it generally doesn't examine them, or do much of interest with them.  The one exception is for ''old-style language tags''.  '''XXX add details about the old-style mapping code in Intl.js, and why ICU doesn't perform that mapping itself'''
SpiderMonkey ''does'', however, sometimes have to (very briefly) care about the extension component of a language tag.  The extension component may include ''Unicode extensions'' that specify things like the particular collation (sorting) algorithm to use (phone-book name sorting versus dictionary order, numeric versus lexical ordering for numbers <nowiki>[</nowiki>1 12 100 or 1 100 12<nowiki>]</nowiki>), the numbering system to use when formatting a number, and so on.  Some ECMA-402 algorithms require that locales be considered with any Unicode extension component removed, so SpiderMonkey must sometimes strip that component from a provided language tag before continuing.
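The removal step can be sketched as follows.  This is an illustrative sketch only: the name <code>stripUnicodeExtension</code> is hypothetical and is not the routine SpiderMonkey uses.  It assumes a well-formed tag, treats <code>u</code> as the Unicode extension prefix, and leaves anything inside a private-use component (after <code>x</code>) untouched.

```javascript
// Hypothetical helper: removes the Unicode ("u") extension from a
// well-formed language tag, leaving every other component in place.
function stripUnicodeExtension(tag) {
  const subtags = tag.split("-");
  const kept = [];
  let i = 0;
  while (i < subtags.length) {
    const subtag = subtags[i].toLowerCase();
    if (subtag === "x") {
      // Private use runs to the end of the tag; a "u" inside it is not
      // a Unicode extension, so copy the remainder unchanged.
      kept.push(...subtags.slice(i));
      break;
    }
    if (subtag === "u" && i > 0) {
      // Skip the prefix and every following subtag longer than one
      // character; a one-character subtag starts the next component.
      i++;
      while (i < subtags.length && subtags[i].length > 1) {
        i++;
      }
      continue;
    }
    kept.push(subtags[i]);
    i++;
  }
  return kept.join("-");
}
```

Under these assumptions, <code>stripUnicodeExtension("de-DE-u-co-phonebk")</code> returns <code>de-DE</code>, while a tag without a Unicode extension, such as <code>en-US</code>, comes back unchanged.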
=== ... ===
...


== Internationalization in SpiderMonkey ==