User:Waldo/Internationalization API: Difference between revisions

Some more organization work
(Talk more about ICU, split key concepts into separate concepts/operations sections)
(Some more organization work)
Line 32: Line 32:


ECMA-402 in its first iteration exposes various locale-sensitive operations.
...talk about collators, date formats, and the remaining stuff...copiously link to BCP47...


=== Collation ===
Line 50: Line 52:


ICU's source code is large and sprawling, which is hardly surprising for a project more than 15 years old.  {{source|intl/icu/source/common/unicode/}} is probably the most interesting directory from SpiderMonkey's point of view, as it contains the public headers and interfaces that SpiderMonkey uses.  Each header documents the behavior of its functions, enums, and so on at length.  The documentation isn't always perfectly clear, but it is usually enough to use the functionality without reading the implementation.
ICU provides both C and C++ APIs, but only the C API is considered stable.  Given that some people reasonably want to use SpiderMonkey with a system ICU, this means we're generally limited to only the stable C API.  (In one case we have to use the C++ API to access functionality; see known issues below.)  Unfortunately, this also means we have to hand-roll our own smart pointer for managing ICU resources.
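A hand-rolled smart pointer for a C-style open/close API can be quite small.  The following is a minimal sketch only, not SpiderMonkey's actual class: <code>ScopedResource</code>, <code>FakeHandle</code>, <code>fakeOpen</code>, and <code>fakeClose</code> are hypothetical names, with the fake functions standing in for an ICU open/close pair such as <code>ucol_open</code>/<code>ucol_close</code> so that the example compiles without ICU.

```cpp
#include <cassert>

// Hypothetical stand-in for an ICU handle such as UCollator; not a real ICU type.
struct FakeHandle {
    int id;
};

// Counts handles that have been opened but not yet closed.
static int gOpenHandles = 0;

// Hypothetical stand-ins for an ICU open/close pair.
FakeHandle* fakeOpen(int id) {
    ++gOpenHandles;
    return new FakeHandle{id};
}

void fakeClose(FakeHandle* handle) {
    --gOpenHandles;
    delete handle;
}

// Owns a C-style resource and calls its close function when the scope ends.
template <typename T>
class ScopedResource {
  public:
    using CloseFn = void (*)(T*);

    ScopedResource(T* ptr, CloseFn close) : ptr_(ptr), close_(close) {
        assert(close_ != nullptr);
    }

    ~ScopedResource() {
        if (ptr_)
            close_(ptr_);
    }

    // Non-copyable: exactly one owner may close the resource.
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

    T* get() const { return ptr_; }

    // Give up ownership, for example when the handle moves into a longer-lived object.
    T* forget() {
        T* released = ptr_;
        ptr_ = nullptr;
        return released;
    }

  private:
    T* ptr_;
    CloseFn close_;
};

// Opens a handle in an inner scope and reports how many handles remain open afterwards.
int openHandlesAfterScopeExit() {
    {
        ScopedResource<FakeHandle> handle(fakeOpen(1), fakeClose);
        (void) handle.get();
    }
    return gOpenHandles;
}
```

With a wrapper like this, every early return on an error path closes the resource automatically, and <code>forget()</code> hands the raw pointer over when the caller wants to keep it.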
Most ICU functions report errors through an error code outparam.  Such functions also check the existing value in that outparam before proceeding, and do nothing if it already indicates a failure.  Thus a sequence of ICU calls can be made without intermediate error-checking, and a single <code>U_FAILURE(status)</code> check at the end suffices to handle any error that occurred along the way.  For example:
// Each call does nothing if status already holds a failure from an earlier call.
ucol_setAttribute(coll, UCOL_STRENGTH, uStrength, &status);
ucol_setAttribute(coll, UCOL_CASE_LEVEL, uCaseLevel, &status);
ucol_setAttribute(coll, UCOL_ALTERNATE_HANDLING, uAlternate, &status);
ucol_setAttribute(coll, UCOL_NUMERIC_COLLATION, uNumeric, &status);
ucol_setAttribute(coll, UCOL_NORMALIZATION_MODE, uNormalization, &status);
ucol_setAttribute(coll, UCOL_CASE_FIRST, uCaseFirst, &status);
// One check covers all six calls above.
if (U_FAILURE(status)) {
    ucol_close(coll);
    JS_ReportErrorNumber(cx, js_GetErrorMessage, NULL, JSMSG_INTERNAL_INTL_ERROR);
    return NULL;
}
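
The convention can be modelled without ICU to see why one check at the end is enough.  The following is a stand-alone sketch, not ICU code: <code>FakeErrorCode</code>, <code>fakeFailure</code>, <code>fakeSetAttribute</code>, <code>FakeSettings</code>, and <code>configure</code> are hypothetical names, and the rule that negative values are rejected is an assumption made purely to trigger an error.

```cpp
// Hypothetical error codes mirroring the shape of ICU's UErrorCode.
enum FakeErrorCode { FAKE_ZERO_ERROR = 0, FAKE_ILLEGAL_ARGUMENT_ERROR = 1 };

// Plays the role of U_FAILURE: any code above "zero error" is a failure.
inline bool fakeFailure(FakeErrorCode status) { return status > FAKE_ZERO_ERROR; }

struct FakeSettings {
    int strength = 0;
    int caseLevel = 0;
    int numeric = 0;
};

// Mimics an ICU-style setter: it does nothing if an earlier call already failed.
void fakeSetAttribute(int* slot, int value, FakeErrorCode* status) {
    if (fakeFailure(*status))
        return;  // Preserve the first error and skip the work.
    if (value < 0) {
        *status = FAKE_ILLEGAL_ARGUMENT_ERROR;  // Assumed failure condition.
        return;
    }
    *slot = value;
}

// Chains several calls and checks the status once at the end.
bool configure(FakeSettings* settings, int strength, int caseLevel, int numeric) {
    FakeErrorCode status = FAKE_ZERO_ERROR;  // Must start out as "no error".
    fakeSetAttribute(&settings->strength, strength, &status);
    fakeSetAttribute(&settings->caseLevel, caseLevel, &status);
    fakeSetAttribute(&settings->numeric, numeric, &status);
    return !fakeFailure(status);
}
```

If the second call fails, the third call leaves its slot untouched and the first error code survives to the final check.  The same pattern is why the status variable has to be initialized to the "no error" value before the first call.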


==== Integration ====
Line 73: Line 91:
Tests live in {{source|js/src/tests/test402}}, an unmodified import of the ECMA-402 test suite.  '''XXX Explain how the tests are run, how they're skipped in no-<code>Intl</code> builds, how to update them, how to contribute to them, how we disable/annotate any tests we don't pass'''


== Structures ==
=== Implementation ===
...talk about collators, date formats, and how all the stuff is implemented using what ICU primitives...copiously link to BCP47...


=== Known bugs and issues ===
...talk about how all the stuff is implemented using what ICU primitives...
 
=== Known issues ===


ECMA-402 says that the supported numbering systems for a locale are (unsurprisingly) locale-dependent.  ICU exposes the default numbering system for a locale via a C++ API, but otherwise it pretends any numbering system can be used by any locale.  SpiderMonkey's implementation therefore reports the locale's default numbering system as supported, along with a handful of common decimal numbering systems.  See <code>getNumberingSystems</code> in {{source|js/src/builtin/Intl.cpp}}.  If ICU ever provides more comprehensive information here, we should probably use it.
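
The "default plus a fixed list" approach can be sketched as follows.  This is an illustration only, not the code in <code>Intl.cpp</code>: the function name <code>supportedNumberingSystems</code> is hypothetical, the default is passed in as a plain string rather than queried from ICU, and the four identifiers listed are an arbitrary sample of decimal numbering systems, not the list SpiderMonkey uses.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical sketch: the locale's default numbering system comes first,
// followed by a fixed list of common decimal systems, without duplicates.
std::vector<std::string> supportedNumberingSystems(const std::string& defaultSystem) {
    // Illustrative sample only; not the list SpiderMonkey actually uses.
    static const char* const kCommonDecimal[] = {"arab", "deva", "latn", "thai"};

    std::vector<std::string> result{defaultSystem};
    for (const char* name : kCommonDecimal) {
        // Skip the entry if it is already present as the default.
        if (std::find(result.begin(), result.end(), name) == result.end())
            result.push_back(name);
    }
    return result;
}
```

Putting the default first keeps it easy to find, and the duplicate check matters whenever the default is itself one of the common systems.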
Line 84: Line 103:
=== Other random details ===


ICU provides both C and C++ APIs, but only the C API is considered stable. Given that some people reasonably want to use SpiderMonkey with a system ICU, this means we're generally limited to only the stable C API. Unfortunately, this also means we have to hand-roll our own smart pointer for managing ICU resources.
....anything?...
 
Most of the ICU methods indicate errors through an error code outparam. Also, such APIs check the existing value in that outparam before proceeding. Thus a sequence of ICU calls can occur without error-checking right up until the end, where a single <code>U_FAILURE(status)</code> will suffice to handle all errors that might occur.