This document discusses the design and implementation of Mozilla's font subsystem. The particular focus is on Unicode and internationalization.
Mozilla has chosen Unicode as the internal character encoding. This was decided in part because HTML is based on Unicode: although HTML documents exist in a variety of character encodings, numeric character references are defined in terms of ISO 10646, whose character repertoire is kept in step with Unicode. Other reasons for choosing Unicode are that its 16-bit form is mostly fixed width and that it can represent most of the world's characters.
The problem to be discussed here, then, is how to draw Unicode text on a number of devices, particularly screens and printers. These devices are accessed from computers running a number of different operating systems, e.g. Windows, MacOS, Unix. The details of the proposed solution to this problem are highly system dependent, and will be discussed here too.
Most fonts only offer glyphs for a subset of Unicode. Although there are fonts that contain a large subset of Unicode (e.g. Lucida Sans Unicode, Bitstream Cyberbit), these fonts do not always provide the stylistic properties that authors and users prefer. Hence, these fonts are often referred to as "last resort" fonts, to be used only when other, more desirable fonts are unavailable or do not contain the required glyphs.
A Unicode string may contain characters from a number of different parts of the world, or from a number of fields such as mathematics. It may be necessary to use a number of fonts to draw a particular Unicode string, switching from one font to another as we proceed. We will call this process "font switching".
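As a minimal sketch of font switching (not Mozilla's actual implementation), the Python below splits a string into runs that can each be drawn with a single font. The font names and the COVERAGE table are hypothetical; a real implementation would query each font's character map instead.

```python
from itertools import groupby

# Hypothetical coverage table: font name -> set of code points it has glyphs for.
COVERAGE = {
    "LatinFont": set(range(0x20, 0x7F)),
    "GreekFont": set(range(0x370, 0x400)),
}

def font_for(ch):
    """Return the first font in COVERAGE with a glyph for ch, or None."""
    for name, code_points in COVERAGE.items():
        if ord(ch) in code_points:
            return name
    return None

def split_into_runs(text):
    """Group consecutive characters that share a font into (font, substring) runs."""
    return [(font, "".join(chars)) for font, chars in groupby(text, key=font_for)]

runs = split_into_runs("abc αβγ")
# runs is [("LatinFont", "abc "), ("GreekFont", "αβγ")]
```

Each run can then be handed to the drawing code with one font selected, so the font changes only at run boundaries.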
CSS defines a property called font-family that contains an ordered list of fonts. These fonts are supposed to be tried in order, checking both that the font itself is available and that it contains the glyphs needed to draw the current text. Mozilla will have to implement these font lists in order to support CSS.
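The lookup order described above might be sketched as follows. This is an illustration only: the INSTALLED table and its glyph sets are made-up stand-ins for whatever the platform's font API reports, and select_font is a hypothetical helper name.

```python
# Hypothetical installed fonts: name -> set of code points with glyphs.
# The glyph sets are illustrative, not the real coverage of these fonts.
INSTALLED = {
    "Times New Roman": set(range(0x20, 0x7F)),
    "MS Mincho": set(range(0x20, 0x7F)) | set(range(0x3040, 0x30A0)),
}

def select_font(font_family, ch, installed=INSTALLED):
    """Walk the font-family list in order and return the first font that is
    both installed and has a glyph for ch; return None if no font qualifies."""
    for name in font_family:
        glyphs = installed.get(name)
        if glyphs is None:
            continue  # font is not available on this system
        if ord(ch) in glyphs:
            return name
    return None  # the caller would fall back to a "last resort" font

family = ["Garamond", "Times New Roman", "MS Mincho"]
latin_choice = select_font(family, "A")          # "Times New Roman"
kana_choice = select_font(family, "\u3042")      # "MS Mincho"
missing_choice = select_font(family, "\u20ac")   # None
```

Note that the first font in the list is skipped because it is unavailable, and the second is skipped for the kana character because it lacks the glyph; both checks are needed.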
Prior to the advent of CSS, HTML documents were rendered using fonts that depended in part on the document character encoding (charset). Since both authors and users of such "old-style" documents have become accustomed to this behavior, Mozilla should adhere to this as much as possible. When an HTML document is not accompanied by CSS font rules, we should use a specially tailored font list where the first font is based on the document's charset.
This means favoring whatever font the user has chosen for Japanese, when the document is in a Japanese charset such as Shift_JIS (and there are no font specifications such as CSS or HTML's FONT FACE). The old browser stored font choices in the preferences file, and the new Mozilla could use this as is, or migrate the user's old values to whatever new preference file format we come up with.
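One possible way to build such a tailored font list is sketched below. The charset table, the preference table, and the font names are all hypothetical; the real values would come from the user's preferences file.

```python
# Hypothetical mapping from document charset to a language group.
CHARSET_TO_LANG_GROUP = {
    "shift_jis": "ja",
    "euc-jp": "ja",
    "iso-8859-1": "western",
}

# Hypothetical user preferences: language group -> font chosen by the user.
USER_FONT_PREFS = {
    "ja": "UserJapaneseFont",
    "western": "UserWesternFont",
}

def build_font_list(charset, author_fonts=None):
    """Return the font list for a document.

    If the author specified fonts (CSS or FONT FACE), use them unchanged.
    Otherwise put the user's font for the document's charset first,
    followed by the user's remaining fonts as fallbacks."""
    if author_fonts:
        return list(author_fonts)
    group = CHARSET_TO_LANG_GROUP.get(charset.lower())
    preferred = USER_FONT_PREFS.get(group)
    rest = [f for f in USER_FONT_PREFS.values() if f != preferred]
    return ([preferred] if preferred else []) + rest

japanese_list = build_font_list("Shift_JIS")
# japanese_list is ["UserJapaneseFont", "UserWesternFont"]
```

The resulting list can be consumed by the same font switching code that handles CSS font-family lists, so old-style documents need no separate drawing path.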
Since CSS itself does not have the concept of assigning particular fonts to particular charsets, we are left with the dilemma of whether to base the new font preferences dialog on CSS's font-family lists or the old charset-based selection (or a combination of these). However, regardless of the eventual choice of UI, the GFX implementation will certainly need to support font switching, and so that is what this document will focus on initially.
Another problem is Unicode's Han unification. Unicode uses one set of characters for Chinese, Japanese, Korean and other Han languages. How do we know which font to use if the document is in Unicode? One way is to use HTML's LANG attribute. If the attribute for a particular span of text says "ja", then we can use a Japanese font for that span.
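A rough sketch of the LANG-based choice follows. The font names and the HAN_FONTS table are hypothetical, and the sketch deliberately simplifies by looking only at the primary language subtag, so it does not distinguish, for example, Traditional from Simplified Chinese.

```python
# Hypothetical fonts per language; real values would come from user preferences.
HAN_FONTS = {"ja": "JapaneseFont", "zh": "ChineseFont", "ko": "KoreanFont"}

def font_for_han(lang, default="LastResortFont"):
    """Pick a font for unified Han text from the LANG value of the enclosing span.

    Only the primary subtag is used ("zh-TW" is treated as "zh").
    If LANG is missing or unrecognized, return the default font."""
    if not lang:
        return default
    primary = lang.lower().split("-")[0]
    return HAN_FONTS.get(primary, default)

japanese_span_font = font_for_han("ja")   # "JapaneseFont"
untagged_span_font = font_for_han(None)   # "LastResortFont"
```

When no LANG attribute is present, some other hint (such as the document charset) would be needed; the default here is only a placeholder for that decision.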
The original document also had "Proposed solution" and "To do" sections, but since they are out of date, they were not migrated here. Look in the original document for them.
The page "Font selection/Default fonts" has the list of default fonts (work in progress).
Originally by Erik van der Poel