Labs/Ubiquity/Ubiquity 0.5 Command Localization Tutorial

From MozillaWiki

Introduction

Ubiquity 0.5 adds the ability to localize the commands bundled with Ubiquity and lays the foundation for localizing community commands in the future. Localizing Ubiquity commands means translating verb names, description and help strings, and the message and interface strings in the commands' preview and execute code.

For information on the localization of the Ubiquity Parser—i.e., teaching Ubiquity the grammar of your language—please read this tutorial. In general, localizing the Parser is a prerequisite to the localization of individual commands.

Gettext and the po format

Ubiquity command localization follows the po (portable object) format of the GNU gettext system. po is a de facto standard in the world of localization, particularly in the UNIX world, and is supported by a variety of different tools and editors.

Like most types of localization, Ubiquity command localization works essentially by replacing strings. The original commands are written without regard for other languages, as long as they follow certain guidelines to keep the command localizable. The commands are written with strings in a set source language and, through the process of localization, Ubiquity will go through and replace those source language strings with the target language equivalents. In the case of Ubiquity's built-in commands, the source language is always English.

Here is one translation entry from a Ubiquity command's po file:

 msgctxt "twitter.execute"
 msgid "direct message sent"
 msgstr "ダイレクトメッセージを送信しました。"

Every* Ubiquity translation entry consists of these three parts:

  1. msgctxt, the message context. This is a structured string which tells you which command's string this is, and which aspect of the command's behavior it is related to. Here, this translation entry is from the execute code of the twitter command.
  2. msgid, the message id. This is the original text in the source language, so you know exactly what content needs to be translated. This must exactly match the localized string in the command code.
  3. msgstr, the message string. This is the localized string in the target language, here Japanese. As expected, the msgstr above says "direct message was sent" in Japanese.

* almost every - see "shared keys" below.
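
As an illustration only, the three parts of an entry can be pulled apart with a few lines of Python. This is a minimal sketch, not Ubiquity's own parser: parse_entry is a hypothetical helper, and it assumes single-line values with no escape sequences.

```python
import re

# Matches one keyword line such as: msgid "direct message sent"
LINE_RE = re.compile(r'^\s*(msgctxt|msgid|msgstr)\s+"(.*)"\s*$')

def parse_entry(text):
    """Return a dict holding the msgctxt, msgid and msgstr of one entry.

    Hypothetical helper: handles single-line values only.
    """
    entry = {"msgctxt": None, "msgid": None, "msgstr": None}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            keyword, value = match.groups()
            entry[keyword] = value
    return entry

sample = '''
msgctxt "twitter.execute"
msgid "direct message sent"
msgstr "ダイレクトメッセージを送信しました。"
'''

entry = parse_entry(sample)
# The context is "<command name>.<aspect>", so split on the first dot.
command, aspect = entry["msgctxt"].split(".", 1)
print(command, aspect)  # twitter execute
```

An entry without a msgctxt line (a shared key) would simply leave entry["msgctxt"] as None in this sketch.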

Ubiquity's localization files for built-in commands are all stored in a central directory and are organized by command feed. The localization of a command feed takes the feed's name with a .po extension and is placed in the directory ubiquity/localization/XX/, where XX is your language's language code. For example, the localization of the firefox.js command feed must be called firefox.po, so the Danish localization file for firefox.js would be stored as ubiquity/localization/da/firefox.po. (For more information, see the "Testing your localizations" section below.)
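
The naming rule above can be sketched as a small function. This is only an illustration of the convention described here; po_path_for_feed is a hypothetical name and is not part of Ubiquity.

```python
import posixpath

def po_path_for_feed(feed_filename, language_code):
    """Map a command feed file name to its po file path (hypothetical helper).

    Follows the convention described above:
    ubiquity/localization/XX/<feed name>.po
    """
    stem, _extension = posixpath.splitext(feed_filename)
    return posixpath.join("ubiquity", "localization", language_code, stem + ".po")

print(po_path_for_feed("firefox.js", "da"))  # ubiquity/localization/da/firefox.po
```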

po files may be translated in a text editor or in specialized localization software. Some free tools include poEdit and Translate Toolkit. All Ubiquity po files are in UTF-8 encoding. Traditional gettext also uses a binary format called mo (machine object), but Ubiquity only uses po files.

Localization templates

po localizations are often started based on a po template or pot file. These templates are simply po files with all blank msgstr's.
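
To make the relationship between a po file and its template concrete, here is a minimal sketch that turns an existing po file into a template by blanking every msgstr. It is not the tool Ubiquity ships; blank_msgstrs is a hypothetical helper that assumes each keyword starts its own line.

```python
def blank_msgstrs(po_text):
    """Return po_text with every msgstr emptied (hypothetical helper)."""
    output = []
    in_msgstr = False
    for line in po_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("msgstr"):
            output.append('msgstr ""')
            in_msgstr = True
        elif in_msgstr and stripped.startswith('"'):
            # Continuation line of a multi-line translation: drop it.
            continue
        else:
            in_msgstr = False
            output.append(line)
    return "\n".join(output)

sample = r'''msgctxt "digg.description"
msgid "If not yet submitted, submits the page to Digg.\n"
"Otherwise, it takes you to the story's Digg page."
msgstr "このページを Digg にたれこみます。\n"
"または該当する Digg ページを開きます。"'''

template = blank_msgstrs(sample)
print(template)
```

Comments, msgctxt lines and msgid lines (including msgid continuation lines) pass through unchanged.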

Ubiquity includes a handy localization template tool to create these initial templates for you. Just go to the Ubiquity command list, and click on the "get localization template" link next to a command feed. If no such link is showing up, it means that that feed was not bundled with Ubiquity and thus does not currently support localization.

You can also get many of the pot files pre-generated from the Ubiquity hg repository. If you already run Ubiquity from source, you'll find the templates in ubiquity/localization/templates. Otherwise, you can get them from our hg server.

When you start working on a localization, make sure to add your contact and credit information to the header. Each Ubiquity po file begins with a header that looks like this:

 # social.po
 #
 # Localizers:
 # Masahiko Imanaka <test@yahoo.co.jp>

This is from the Japanese po file for social.js, social.po. You'll see a line called "Localizers:". Add a new line (or replace the dummy line) below that and add your name, followed by your email address in <angle brackets>.

Please note that these automatically generated localization templates are not perfect. For technical reasons, there are often a handful of localizable strings which do not get automatically added into the template. When you notice these, you can simply duplicate one of the msgctxt-msgid-msgstr entries and fill in the appropriate details. This will often require that you take a look at the original command feed's source.

Examples and special cases

Two types of strings

There are, broadly speaking, two types of translation entries in Ubiquity command localization: properties and inline strings.

Properties

Properties are metadata about the command, stored in individual properties of the command object. Localizable properties include names, help, and description. As each command has only one of each of these properties, there is only one msgctxt of each kind. For example, a command like twitter would never have two different twitter.names localization entries. Even though it is logically somewhat redundant, these property entries still need both their msgctxt and their msgid to function properly.

Note in particular that the names translation may be a plurality of names, not just one. In this case, use the pipe (|) character to delimit the names:

 #. twitter command:
 #. use | to separate multiple name values:
 msgctxt "twitter.names"
 msgid "twitter|tweet|share using twitter"
 msgstr "呟く|呟いて|呟け|つぶやく|つぶやいて|つぶやけ|twitter|tweet"
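
As a small illustration of the pipe convention, the following sketch splits a names translation into individual names. split_names is a hypothetical helper; ignoring surrounding whitespace and empty segments is an assumption of this sketch, not documented Ubiquity behavior.

```python
def split_names(names_value):
    """Split a pipe-delimited names value into a list (hypothetical helper)."""
    return [name.strip() for name in names_value.split("|") if name.strip()]

names = split_names("呟く|呟いて|呟け|つぶやく|つぶやいて|つぶやけ|twitter|tweet")
print(len(names))  # 8
```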

Inline strings

Inline strings are those strings which are used in a command's preview or execute methods. As such, they always have a msgctxt of "command name.preview" or "command name.execute". As there may be many different localizable strings in each of these methods, a command can have multiple translation entries with the same msgctxt, but each of them will have a unique msgid.
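
Since each entry is identified by its msgctxt and msgid together, a quick consistency check is to look for pairs that occur more than once. The sketch below assumes the entries have already been parsed into dictionaries; duplicate_keys is a hypothetical helper, not part of Ubiquity.

```python
from collections import Counter

def duplicate_keys(entries):
    """Return the (msgctxt, msgid) pairs that appear more than once."""
    counts = Counter((entry.get("msgctxt"), entry["msgid"]) for entry in entries)
    return [key for key, count in counts.items() if count > 1]

entries = [
    {"msgctxt": "twitter.preview", "msgid": "sending"},
    {"msgctxt": "twitter.execute", "msgid": "sending"},   # same msgid, other context: fine
    {"msgctxt": "twitter.execute", "msgid": "direct message sent"},
    {"msgctxt": "twitter.execute", "msgid": "direct message sent"},  # duplicate
]

print(duplicate_keys(entries))  # [('twitter.execute', 'direct message sent')]
```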

Multiline strings

Often the strings to localize—and their localizations—will be multiple lines long. In this case, the po format offers a special syntax to deal with such line breaks:

 msgctxt "digg.description"
 msgid "If not yet submitted, submits the page to Digg.\n"
 "Otherwise, it takes you to the story's Digg page."
 msgstr "このページを Digg にたれこみます。\n"
 "または該当する Digg ページを開きます。"

The convention here is that if a line begins with a quote ("), it is the continuation of the line before it. In this case, newline characters (encoded \n) are not automatically inserted, so a \n must be inserted at the end of the line to mark that there is a newline there. Note that the msgid must match what is in the source code exactly, so it is best to leave the msgid's as they are in the templates.
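
The continuation rule can be sketched as follows: take the quoted fragment from each line, decode its escapes, and concatenate the results with nothing in between. This is an illustrative sketch rather than Ubiquity's parser; join_po_string and unescape are hypothetical helpers, and only a few common escapes are decoded.

```python
import re

# Escape sequences decoded by this sketch.
ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}

def unescape(fragment):
    """Decode backslash escapes inside one quoted fragment."""
    return re.sub(r"\\(.)", lambda m: ESCAPES.get(m.group(1), m.group(1)), fragment)

def join_po_string(lines):
    """Join a keyword line and its continuation lines into one string."""
    parts = []
    for line in lines:
        start = line.index('"')    # first quote on the line
        end = line.rindex('"')     # last quote on the line
        parts.append(unescape(line[start + 1:end]))
    return "".join(parts)          # no newline is added between fragments

sample = r'''
msgid "If not yet submitted, submits the page to Digg.\n"
"Otherwise, it takes you to the story's Digg page."
'''

msgid = join_po_string(sample.strip().splitlines())
print(msgid)
```

The only line break in the result comes from the explicit \n at the end of the first fragment.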

There is a known bug in Ubiquity 0.5 preventing these multi-line keys from being properly dealt with.

Shared keys

In some situations, a command author will write shared code which is executed as part of both a command's preview and its execute, or which is even shared between commands.

In this special case, as it would be redundant to write the exact same translation entry twice for both preview and execute contexts, you can optionally make do without the msgctxt:

 # no message context
 msgid "original message"
 msgstr "translation"

By not specifying a context, this same translation entry can be shared across any instance of the string "original message" in any preview or execute string in any command in that feed. You cannot, however, use shared keys to share translations between command feeds.
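
One way to picture shared keys is as a lookup that first tries the context-specific entry and then falls back to the entry without a context. The sketch below is an assumption about how such a lookup could work, not a description of Ubiquity's internals; translate is a hypothetical function, and returning the source string when nothing matches is likewise an assumption.

```python
def translate(catalog, msgctxt, msgid):
    """Look up msgid for one feed, preferring the context-specific entry.

    catalog maps (msgctxt, msgid) to msgstr; shared keys use None as msgctxt.
    """
    for key in ((msgctxt, msgid), (None, msgid)):
        if catalog.get(key):
            return catalog[key]
    return msgid  # assumed fallback: show the untranslated source string

# One catalog per command feed, since shared keys do not cross feeds.
catalog = {
    ("twitter.execute", "direct message sent"): "ダイレクトメッセージを送信しました。",
    (None, "original message"): "translation",
}

print(translate(catalog, "twitter.preview", "original message"))  # translation
print(translate(catalog, "twitter.execute", "original message"))  # translation
```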

Note, however, that in some languages you will want to localize the two contexts' strings separately. For example, suppose the source language does not mark tense or aspect but your target language does. A status message may then be identical in the source language whether the action is about to be done (in the preview) or has been completed (in the execute), while your translation needs a different form for each (future/past) context. In that case, keep separate entries, each with its own msgctxt.

Localizing formatting templates

Localizable strings will often contain variable references and other code in {curly braces}, following the JavaScript Templates format. For example, you may see a string like this:

 msgid "${number} results found."

In these cases, the ${number} is going to be replaced by some data, so you will want to leave it alone in your localization, for example (French):

 msgstr "Il y a ${number} résultats."

JavaScript Templates can also include some basic control structures, most importantly letting you handle pluralization directly in the localized string. Suppose you want your localized string to display slightly differently depending on whether the number value is singular or plural. The JavaScript Templates format allows for a simple {if} statement to handle these possibilities:

 msgstr "Il y a ${number} résultat{if number > 1}s{/if}."

This string will produce "Il y a 3 résultats." when plural and "Il y a 1 résultat." when singular as we would like. You can also write an {else} condition, using the pattern

 {if ...} ... {else} ... {/if}

Alternatively, suppose the source language includes such special syntax for pluralization, but your language does not have plural marking. You can simply remove the {if}... control structure from the localized string.

 msgid "${number} result{if number > 1}s{/if} found."
 msgstr "找到${number}個結果"
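
To preview how such strings come out for different values, the following sketch mimics a very small subset of the template syntax shown above. It is not the JavaScript Templates engine that Ubiquity uses: render is a hypothetical function that only understands ${name} substitution and a single, non-nested {if name OP integer}...{else}...{/if} block.

```python
import operator
import re

OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq,
       ">=": operator.ge, "<=": operator.le, "!=": operator.ne}

# {if name OP integer} text [{else} text] {/if}, non-nested only.
IF_RE = re.compile(
    r"\{if\s+(\w+)\s*(>=|<=|==|!=|>|<)\s*(-?\d+)\}(.*?)(?:\{else\}(.*?))?\{/if\}",
    re.DOTALL,
)
VAR_RE = re.compile(r"\$\{(\w+)\}")

def render(template, values):
    """Render the supported subset of the template syntax."""
    def pick_branch(match):
        name, op, number, if_text, else_text = match.groups()
        if OPS[op](values[name], int(number)):
            return if_text
        return else_text or ""
    text = IF_RE.sub(pick_branch, template)
    return VAR_RE.sub(lambda m: str(values[m.group(1)]), text)

french = "Il y a ${number} résultat{if number > 1}s{/if}."
print(render(french, {"number": 3}))  # Il y a 3 résultats.
print(render(french, {"number": 1}))  # Il y a 1 résultat.
print(render("找到${number}個結果", {"number": 3}))  # 找到3個結果
```

A quick check like this can catch an unbalanced {if} or a misspelled variable name before you test the po file in Ubiquity itself.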

Testing your localizations

Manual testing is an important step in preparing your localizations. Testing requires you to place your po files in a specific language directory in your Ubiquity source folder. Where this folder is depends on how you obtained Ubiquity:

  • for Ubiquity installs via xpi (addons.mozilla.org):
    • If you installed Ubiquity via a packaged xpi, such as from addons.mozilla.org or from the "Add-ons" menu item in Firefox, your Ubiquity folder is in your Firefox profile. This support tutorial will tell you where to find your Firefox profile. Within your profile, drill down to extensions/ubiquity@labs.mozilla.com/. This is your Ubiquity source folder.
  • for Ubiquity installs from hg:
    • If you pulled the Ubiquity source from our hg repository and installed it using the manage.py utility, you will find the Ubiquity source folder, ubiquity, right at the root level of the repository.

Once you've found your Ubiquity source directory, find your language's directory at localization/XX/, where XX is your language code. If the directory doesn't exist, you can create it. Place your po files there.

Finally, go to the Ubiquity settings page and make sure it's set to use your language (if you haven't already).* Restart Firefox to test out your localizations.

* Currently there's no way to test localizations for languages that do not yet have parsers. Read this tutorial to learn more about how to write parser language settings for your language.

Contributing your localizations

Currently, the latest versions of Ubiquity built-in command localizations are kept in Ubiquity's main hg repository. You can submit new localizations by posting them on this trac ticket.

You can also ask questions and seek help on Ubiquity localization on the Ubiquity i18n Google Group.
