User:Waldo/Internationalization API
Introduction
ECMAScript has long had rudimentary localization support. ES5 defines toLocaleString methods (found on various objects like Array.prototype, Number.prototype, and Date.prototype); toLocaleLowerCase and toLocaleUpperCase on String.prototype; and toLocaleDateString and toLocaleTimeString on Date.prototype. Each method acts only with respect to the user's current locale, and none of them provides any control over output formatting. The spec algorithms are also woefully under-defined, so behavior is not predictable across implementations. As a practical matter, localization support in ES5 is useless.
The ECMAScript Internationalization API (ECMA-402) significantly extends these capabilities to provide genuinely useful localization support in ECMAScript. Output may be customized by choosing which components to include and how each component is formatted. The locale used for a formatting operation is customizable, and output formatting is determined in accordance with that locale. The API additionally provides locale-sensitive sorting of data according to the type of that data (for example, sorting names in phone book order versus dictionary order), optionally considering or ignoring capitalization, accents, and so on.
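As a rough illustration of these capabilities, the sketch below formats a number and a date in explicitly chosen locales and compares strings under different collation rules. The sample values and variable names are invented for illustration, and the output shown in comments assumes an implementation with full locale data.

```javascript
// Number formatting: the locale and the requested style are both explicit.
const amount = 1234567.891;
const usdText = new Intl.NumberFormat("en-US", {
  style: "currency",
  currency: "USD",
}).format(amount); // "$1,234,567.89"
const eurText = new Intl.NumberFormat("de-DE", {
  style: "currency",
  currency: "EUR",
}).format(amount); // uses "." for grouping and "," for decimals

// Date formatting: only the requested components appear in the output.
const date = new Date(Date.UTC(2012, 11, 20, 3, 0, 0));
const dateText = new Intl.DateTimeFormat("en-US", {
  year: "numeric",
  month: "long",
  day: "numeric",
  timeZone: "UTC",
}).format(date); // "December 20, 2012"

// Collation: German dictionary order treats "Ä" like "A", while German
// phone book order (requested with the "co" extension key) treats it like "AE".
const dictionary = new Intl.Collator("de");
const phoneBook = new Intl.Collator("de-u-co-phonebk");
const dictionaryResult = dictionary.compare("Ärger", "Af"); // positive: "Af" sorts first
const phoneBookResult = phoneBook.compare("Ärger", "Af"); // negative: "Ärger" sorts first

// Collation options: "base" sensitivity ignores accents and capitalization.
const ignoreAccentsAndCase = new Intl.Collator("en", { sensitivity: "base" });
const sameBaseLetter = ignoreAccentsAndCase.compare("a", "Á") === 0; // true
```

The compare property of a collator is already bound to that collator, so it can be passed directly to Array.prototype.sort.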
The Internationalization API introduces one new global property: Intl. This property is an object whose properties correspond to the various sub-APIs: collation (sorting), number formatting, and date/time formatting. (More capabilities will be added in future Internationalization API updates.) The localization APIs from ES5 have been reformulated to use the localization capabilities provided by the Internationalization API. Generally, however, it's preferable to use the Internationalization API directly: doing so is more efficient, because it permits caching of the structures needed to perform each operation.
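The difference between the reformulated ES5 methods and direct use of Intl can be sketched as follows. The prices array and variable names are hypothetical; the point is only that the second form constructs its formatter once and reuses it.

```javascript
// The three sub-APIs are exposed as constructors on the Intl object.
const subApis = [Intl.Collator, Intl.NumberFormat, Intl.DateTimeFormat];
const allConstructors = subApis.every((ctor) => typeof ctor === "function");

const prices = [0.5, 12, 1999.99];
const options = { style: "currency", currency: "USD" };

// Reformulated ES5 method: the locale and options are passed on every call,
// so the implementation may have to set up formatting data each time.
const viaToLocaleString = prices.map((p) => p.toLocaleString("en-US", options));

// Direct use: construct the formatter once, then reuse it for every value.
const formatter = new Intl.NumberFormat("en-US", options);
const viaFormatter = prices.map((p) => formatter.format(p));

// Both approaches produce ["$0.50", "$12.00", "$1,999.99"].
```

Whether the first form is measurably slower depends on the implementation; the second form simply makes the reuse explicit.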
Internationalization in SpiderMonkey
SpiderMonkey includes significant support for the Internationalization API. The fundamental primitives used to implement the API are provided by an in-tree imported copy of ICU in intl/icu. This is an optional component of a SpiderMonkey build; support may be turned on using the --enable-intl-api configuration option. The Internationalization API is enabled by default in Firefox builds. Features and capabilities of the API itself are implemented in both C++ and in self-hosted JavaScript that accesses ICU functionality through various intrinsic functions in the self-hosting global.
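Because the API is an optional build component, script that may run in an engine built without it can check for Intl before relying on it. The following is a minimal sketch; formatNumber is a hypothetical helper, not part of SpiderMonkey or of the API.

```javascript
// Hypothetical helper: use Intl.NumberFormat when the engine provides it,
// otherwise fall back to plain, locale-independent string conversion.
function formatNumber(value, locale) {
  const hasIntl =
    typeof Intl === "object" &&
    Intl !== null &&
    typeof Intl.NumberFormat === "function";
  if (hasIntl) {
    return new Intl.NumberFormat(locale).format(value);
  }
  return String(value);
}

const formatted = formatNumber(1234.5, "en-US"); // "1,234.5" when Intl is available
```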
Code organization
Integration
The Intl object is integrated into the global object through code in js/src/builtin/Intl.cpp and js/src/builtin/Intl.h. js_InitIntlClass performs this operation when it's called during global object bootstrapping, in concert with various other initialization methods in the same file and in js/src/vm/GlobalObject.cpp. These files are primarily integration code whose job is to add Intl to the global object.
Self-hosted code
The majority of the self-hosted code implementing Internationalization is in js/src/builtin/Intl.js. This file defines the functions exposed on the various Intl.* constructor functions and the various Intl.*.prototype objects.
Internationalization in various cases requires keeping around large data tables: to record the set of supported currency codes, to record mappings between language tags (hyphenated strings describing locales and various options), and so on. This data lives in js/src/builtin/IntlData.js and is generated by js/src/builtin/make_intl_data.py. This script downloads the original (large) plaintext databases, parses them, and extracts the data used by Internationalization in the format the self-hosted code expects.
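For readers unfamiliar with language tags, the sketch below shows how they appear from script. It illustrates only the public API and says nothing about the contents of IntlData.js; Intl.getCanonicalLocales is a later addition to ECMA-402 and is assumed to be available here.

```javascript
// Language tags are case-insensitive on input and are canonicalized on use.
const canonicalTag = Intl.getCanonicalLocales("EN-us")[0]; // "en-US"

// Subtags are separated by hyphens; an underscore makes the tag invalid.
let invalidTagError = null;
try {
  Intl.getCanonicalLocales("en_US");
} catch (e) {
  invalidTagError = e; // a RangeError
}

// Options can be embedded in the tag itself through the "u" extension:
// "co-phonebk" requests phone book collation for German.
const resolved = new Intl.Collator("de-u-co-phonebk").resolvedOptions();
const requestedCollation = resolved.collation; // "phonebk" when the locale data supports it
```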
Key concepts
...