L10n:Managing Assets
Each L10n team benefits from using and maintaining key L10n assets in their work. These assets include (but are not limited to) style guides, glossaries, termbases, and translation memories (TM). Each one promotes standardization and lets team members reuse work already done on other projects.
If you're visiting this page, we expect you're interested in learning how to create and maintain your own team's L10n assets. We'll discuss some details about each of these asset types and why they're useful. We'll also provide examples of each of these to help you create your own. Finally, we'll discuss some ideas on maintaining these assets and making them available to all of the members of your team.
Localization assets
Style guides
Style guides establish the appropriate spelling, punctuation, grammar, and tone that your L10n team is to follow when translating source content. They can also include guidance on approaching brand names, idioms, formality, cultural references, and the right way to express currency and date/time for your region. Style guides are also a useful way to introduce new community members with limited translation experience to meaning-based translation approaches.
Having an established style guide for your community will help you to standardize the elements mentioned above across a variety of projects and contributors. If you're wondering how to improve the quality of translation and writing within your projects, having a community style guide will help immensely.
Glossaries
Glossaries can contain officially approved translations for terms which are common, specific, and unique. They may also offer definitions for commonly misunderstood source terms. A glossary is most commonly used as an external resource. By external, we mean that it is a separate resource from the tool (TM tool, text editor, etc.) a translator uses to translate content. As an external resource, glossaries are created and maintained outside of the translator’s working environment. Glossaries typically fall into two categories: monolingual and bilingual. Monolingual glossaries provide definitions for select terms; bilingual glossaries provide translations of terms between two languages.
Glossaries can be maintained and made publicly available via wiki pages, team web sites, PDFs, CSV (comma-separated values) files, or spreadsheet documents. Some projects are working on simple, open, XML-based file formats as well (like GlossML). However, these have not become popular enough to be supported in the vast majority of translation tools, so maintenance work still has to take place externally.
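As a rough illustration of the CSV option, here is a minimal Python sketch that loads a bilingual glossary and looks up a term. The column names (`source_term`, `target_term`, `definition`) and the sample Spanish entries are assumptions made up for this example, not a format any tool requires.

```python
import csv
import io

# Hypothetical glossary content. The column names and the sample
# entries are assumptions for this sketch only.
GLOSSARY_CSV = """source_term,target_term,definition
bookmark,marcador,A saved link to a web page
add-on,complemento,Software that extends the browser
"""


def load_glossary(csv_text):
    """Return a dict mapping lowercased source terms to their CSV rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["source_term"].lower(): row for row in reader}


glossary = load_glossary(GLOSSARY_CSV)
print(glossary["bookmark"]["target_term"])
```

In practice the text would come from a shared file rather than a string, but the lookup logic stays the same.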
Termbases
Termbases can contain officially approved translations for terms which are common, specific, and unique. They may also offer definitions for commonly misunderstood source terms. A termbase is most commonly used within a TM tool, making it part of the translator’s working environment. Because the termbase is an internal resource, the TM tool can auto-suggest term translations from it as the translator works.
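To make the auto-suggest idea concrete, the following Python sketch scans a source segment for known terms and returns their approved translations. It is only an approximation of what a TM tool might do: the termbase is reduced to a plain dictionary, and the entries and the `suggest_terms` helper are hypothetical.

```python
import re

# Hypothetical termbase reduced to a plain dict for illustration.
TERMBASE = {
    "bookmark": "marcador",
    "private browsing": "navegación privada",
}


def suggest_terms(segment, termbase):
    """Return (source term, translation) pairs for terms found in the segment."""
    hits = []
    for term, translation in termbase.items():
        # Whole-word, case-insensitive match. Inflected forms such as
        # plurals are not matched by this simple pattern.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, segment, re.IGNORECASE):
            hits.append((term, translation))
    return hits


print(suggest_terms("Open a Private Browsing window", TERMBASE))
```

Real tools usually add stemming or fuzzy matching so that forms like "bookmarks" are also recognized; this sketch deliberately leaves that out.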
Termbases are also very robust. A single term entry for a source term can contain any of the following attributes:
- a term definition,
- a translation,
- a contextual example of the term in use,
- and a specific domain in which the given definition and translation apply (allowing for multiple entries of the same term according to its various definitions).
Since termbases are XML-based, you can add further metadata to each entry, such as:
- identifying who added the term,
- when the term was last modified,
- or assigning it an ID number.
Termbases can be created and modified both inside and outside a TM tool. The standard file format for termbases is TBX (TermBase eXchange).
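The attributes listed above can be sketched as a single term entry built with Python's standard XML library. The element names used here (`termEntry`, `langSet`, `tig`, `term`, `descrip`) follow the older TBX-Basic layout as commonly described; treat them as an assumption and check them against the TBX version your tool expects. The `build_term_entry` helper and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_term_entry(entry_id, domain, definition, source, target, context):
    """Build one term entry; source and target are (language, term) tuples."""
    entry = ET.Element("termEntry", {"id": entry_id})
    ET.SubElement(entry, "descrip", {"type": "subjectField"}).text = domain
    ET.SubElement(entry, "descrip", {"type": "definition"}).text = definition
    for index, (lang, term) in enumerate((source, target)):
        lang_set = ET.SubElement(entry, "langSet", {XML_LANG: lang})
        tig = ET.SubElement(lang_set, "tig")
        ET.SubElement(tig, "term").text = term
        if index == 0:
            # Attach the contextual example to the source term only.
            ET.SubElement(tig, "descrip", {"type": "context"}).text = context
    return entry


entry = build_term_entry(
    "c1", "browser UI", "A saved link to a web page",
    ("en", "bookmark"), ("es", "marcador"),
    "Add a bookmark for this page.",
)
entry_xml = ET.tostring(entry, encoding="unicode")
print(entry_xml)
```

A complete TBX file also needs a root element and header around the entries, which this sketch omits.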
Translation memories
Translation memories (TM) record and save your translated content alongside its corresponding source content. This allows you to re-use your translations in future releases of a project or to leverage already translated strings when authors make last-minute changes to source content.
The standard file format for translation memories is TMX (Translation Memory eXchange). The most widely supported release is version 1.4b; a 2.0 draft was circulated but never finalized.
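To show how source and target strings sit side by side in a TMX file, here is a small Python sketch that reads translation units from a TMX string. The sample content, the header values, and the `read_units` helper are made up for illustration; the sample declares version 1.4 as an assumption.

```python
import xml.etree.ElementTree as ET

# The parser exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, minimal TMX content for illustration.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence"
          creationtool="example" creationtoolversion="1.0"
          adminlang="en" o-tmf="none"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save file</seg></tuv>
      <tuv xml:lang="es"><seg>Guardar archivo</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_units(tmx_text):
    """Return one dict per translation unit, mapping language to segment text."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        units.append(
            {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.findall("tuv")}
        )
    return units


units = read_units(SAMPLE_TMX)
print(units)
```

Segments that contain inline formatting tags would need extra handling, since `findtext` only returns the text before the first child element.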
Creating and maintaining your assets
Style guides
When it comes to creating style guides, it can be difficult to know what to include and what to leave out. A general rule is to be as detailed as possible. Here are some suggestions for drafting your community's style guide:
- Determine your style guide’s target audience(s). This will help you to establish tone, spelling, case, character style (use of italics, bold, etc.) and grammar rules to meet your target audience's needs.
- Identify some unique cultural elements in the source culture and determine creative approaches to expressing them in your language.
- Discuss strategies for translating idioms and brand names, and for deciding when a term or phrase should remain untranslated.
- Discuss the importance of preserving the source content's meaning in your language. Identify strategies and examples of successful meaning-based translation.
- If you’re unsure what to include, look at what others have done for their style guides (see resource links below).
- Draft your style guide in an area where you can receive community feedback.
Some communities (the Catalan community, for example) use wikis to draft and publish their style guides. Since wikis are fairly common these days and easy to edit, many people in the community can add their own style guidelines. Author your community's style guide in whatever format is easiest for you to edit should you need to modify your existing guidelines.
Examples:
- TED style guide for translators
- MDN style guide
- Mozilla Catalan L10n team's style guide
- Mozilla Italia L10n team's style guide
- Mozilla Basque L10n team's style guide
- Google style guide for translators
- Microsoft language style guides
- European Commission's language style guides
- Translation style guide for The World Bank
Glossaries
Most communities refer to Transvision or other web-based glossaries for their glossary needs. This is a simple approach, as you are not required to create or maintain the glossary yourself. However, it is also a good idea to create your own community glossary, simply because we at Mozilla often use unique terms that only fit within our particular domain. In addition, creating your own glossary gives you the power to add, remove, and edit any monolingual or bilingual entries.
Here are some helpful tips on creating and maintaining your own community glossary:
- Create and maintain yours in an accessible format for quick use, like a wiki or shared spreadsheet.
- Determine if you need a monolingual or bilingual glossary (or maybe both).
- Take a look at what other communities have done.
- Ask on the new-locales discussion group what terms other communities have had to define and translate in their own glossaries.
- Pay attention to the world-ready mailing list, as many unique marketing and brand name terms will be discussed there.
Examples:
- Mozilla Catalan L10n team's glossary
- Transvision multilingual glossary
- GlossML file format standard for glossaries
Termbases
Termbases can be very useful if your community uses a TM tool to localize your projects. The best part is that with the TBX format it's easy to exchange your termbases among team members, helping you to maintain standardization.
You can create and maintain termbases in two ways:
- In a text editor.
- Within your TM tool.
Those hardcore hackers will be most interested in manually creating their termbase using a text editor. Since TBX is XML-based, anyone who is familiar with XML will be able to take this approach to creating and maintaining their termbases. This document will provide you with the common syntax and structures necessary to create your termbase in a text editor.
For those communities without XML experts, your TM tool should include a utility for creating and editing a termbase in TBX. Refer to your TM tool's user manual to learn more.
When creating your termbase, take into account the same considerations as if you were simply creating a glossary, with these few exceptions:
- Determine what metadata is important to include in your entries.
- Should you add contextual examples in your entries, be creative!
Examples:
Translation memories
As with termbases, TMs are always used within TM tools and can be created and maintained within those tools. See your TM tool's user manual for details on using its TM utility.
There are three other noteworthy points when it comes to managing TMs:
- Only export your TMs once the editing phase is complete. Exporting TMs before then can result in corrupted, low-quality target strings being re-used in future versions of your project.
- Export a new TM with each project iteration and store it in a central version control system. This will ensure that, should the worst happen and you lose all of your previously translated strings to string corruption or some other disaster, you have back-ups from previous versions.
- TMs can be exported as either TMX or PO files. TMX is the standard exchange format, contains the most metadata for you to track who added or modified entries and when, and is the most commonly used format. PO is a very reliable format used with GNU Gettext tools, but is not as common across TM tools.
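For a sense of what a PO export contains, this Python sketch writes source and target pairs as `msgid`/`msgstr` entries. It is a simplified illustration, not a replacement for your TM tool's exporter: the `po_escape` and `to_po` helpers are hypothetical, and the header entry that real PO files start with is left out.

```python
def po_escape(text):
    """Escape backslashes, double quotes, and newlines for a PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_po(pairs):
    """Render (source, target) pairs as PO entries separated by blank lines."""
    blocks = []
    for source, target in pairs:
        blocks.append(
            f'msgid "{po_escape(source)}"\nmsgstr "{po_escape(target)}"'
        )
    return "\n\n".join(blocks) + "\n"


po_text = to_po([("Save file", "Guardar archivo"), ("Close", "Cerrar")])
print(po_text)
```

Note that this output carries only the strings themselves, which is why TMX is the better choice when you need to keep track of who changed an entry and when.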
Examples: