Support/l10nPriorityPRD
Overview
Localizers of SUMO lack a clear overview of the localization work that is needed and what is most important. We're currently doing a good job of explaining how to translate articles, but not which articles are most important. The purpose of the l10n priority milestone is:
- To provide a clear overview of the l10n work on SUMO
- To make the l10n work on SUMO feel less daunting by making it obvious where to start
- To answer the question: "which article is the most important to translate?"
- To define a baseline for what we consider a healthy status of a locale
Locale Health Baseline
In order to determine how well a locale is doing with its localization work on SUMO, we need a health baseline. This baseline, or threshold, also needs to make sense from both the users' and localizers' points of view and act as a motivator for everyone to work towards.
The metric that will be used to determine this threshold is: articles accounting for at least 50% of the total page hits on KB articles should be available in the native language.
This is best explained with an example:
- The en-US KB support article "Keyboard Shortcuts" has 13,000 page hits per week
- All en-US KB support articles have 530,000 page hits per week
- This means that the "Keyboard Shortcuts" article represents 2.45% of the total KB page hits for the en-US locale. If this article is translated into German, the German locale gets +2.45% added to its overall localization status.
- If the total % of all translated articles is >= 50%, that locale has reached its threshold and is considered to be doing well. It's not important which articles are translated, as long as the sum of all page hits is >= 50%, but obviously more visited articles have a higher impact, so we will recommend articles to translate based on traffic of the en-US locale.
- If the sum is smaller than 50%, the locale is not yet doing well, and may need assistance.
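The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration, not a proposed implementation: the function names and the "Other articles (combined)" entry are assumptions made here, and only the 13,000 and 530,000 figures come from the example.

```python
# Hypothetical weekly en-US page hits. Only the "Keyboard Shortcuts" figure
# and the 530,000 total come from the example; the second entry is a
# placeholder that stands in for every other KB article.
EN_US_HITS = {
    "Keyboard Shortcuts": 13_000,
    "Other articles (combined)": 517_000,
}
THRESHOLD = 0.5  # 50% of the total KB page hits


def article_share(title, hits=EN_US_HITS):
    """Fraction of all en-US KB page hits that one article receives."""
    return hits[title] / sum(hits.values())


def locale_coverage(translated_titles, hits=EN_US_HITS):
    """Sum of the shares of every article the locale has translated."""
    total = sum(hits.values())
    return sum(hits[t] for t in translated_titles if t in hits) / total


def is_healthy(translated_titles, hits=EN_US_HITS, threshold=THRESHOLD):
    """True when the translated articles cover at least the threshold."""
    return locale_coverage(translated_titles, hits) >= threshold
```

With these numbers, article_share("Keyboard Shortcuts") is roughly 0.0245, and a locale that has translated only that article is still far below the 0.5 threshold.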
Page hits were chosen as the metric because they are straightforward, easy to understand, and show how many people are affected by a translation.
However, this means that the order of the prioritized list of articles presented on the Localization Dashboard will not be the same as the list of articles featured on the start page, called the Weekly common issues (also known as "Most Popular Support Articles"). That list is based on a scoring system that takes things like search patterns and poll votes into account to estimate what kinds of problems people are having with Firefox, based on which articles they search for and vote on. This score is not as intuitive for people to understand and would potentially confuse more than it motivates. Since the most popular support articles are featured on the front page of SUMO, they will get high page hits too, so in practice the difference between these two article lists should not be a real problem.
Mockup
The page consists of three main areas:
- Localization Priority -- A high-level overview of the l10n work that defines the baseline of what we call a healthy l10n status. Locales that complete the items in this list are considered to be performing well
- What's the most important article to translate next? -- A simple actionable item that makes it easy for a localizer to see what translation work has the highest impact in terms of the number of users that benefit from the work
- Article lists -- The actual lists of prioritized articles that need translation. Initially, we will maintain and present at least two lists: navigation pages, and the full Knowledge Base of support articles
Progress bars
- The header is a clickable link to an anchor further down the page, where the actual list is shown
- The progress bar is dynamically generated based on the status of an l10n priority list
Article lists
- Dynamically generated and insertable into a standard wiki page
- Status visually emphasized using colors
- Actionable links, making contributing straightforward
Requirements
- The status of an article can be one of the following:
- Not translated -- no translation of the article exists in the current locale
- Translated -- the article is already translated
- Needs review -- a translation has been made, but it has not yet been approved (the translated article exists in the staging area only)
- Articles should be stored in a list (tracker db?), containing the following info:
- Article name/link
- Priority order
- Score -- This is a way to define a more dynamic threshold than just "the top xx articles." In our case this would correspond to the page hits % the article has, so the score for the article in the example above would be 0.0245 (and our threshold would be 0.5). For other installations, the score might be based on something else, or might not be used at all.
- Ideally, this Localization Dashboard page would be a standard wiki page, but the generated parts of the page would be included somehow, e.g. using something like {l10nProgressBar(list=kb, scoreLimit=0.5)}, where 0.5 means 50%, which is our threshold. The progress bar would then show the progress towards that limit (and anything >= 0.5 would be a fully green progress bar).
- This would allow other locales to easily translate the dashboard without interfering with the generated content. And obviously, for locales that are fine with using this dashboard in English, that would just work automatically using our already working l10n fallback mechanism. So, these people would see the descriptions, etc. in English, but the generated data would be based on their native locale.
- If we allow generated content to be inserted using a syntax like {l10nTable(list, start, limit, scoreLimit, filter)}, that would give us a lot of flexibility in how we want to present the work.
- For example, if we wanted to provide a separate list of the first 10 articles waiting for review, we could use {l10nTable(list=kb, start=1, limit=10, filter=needsReview)}
- Another example, if we wanted to list the highest priority articles that need to be translated to reach our goal of 50% page hits, we would use: {l10nTable(list=kb, start=1, scoreLimit=0.5)}
- If we wanted to list all Navigation articles (start pages, etc), we would use: {l10nTable(list=navigation)}
- This way, we have the flexibility to change the way we present the information. The system should also be flexible enough to work well for other localizable TikiWiki installations. This would even allow locale leaders to customize their view based on what works best for them.
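To make the proposed syntax more concrete, here is a rough Python sketch of how such a macro call could be split into a name and its parameters. The syntax itself is only a suggestion in this document, and the regular expression and the function names parse_macro and coerce are assumptions made for illustration; this is not how TikiWiki parses plugins.

```python
import re

# Matches the two macro names suggested in this document, followed by a
# parenthesised, comma-separated list of key=value pairs.
MACRO_RE = re.compile(r"\{(l10nTable|l10nProgressBar)\(([^)]*)\)\}")


def coerce(value):
    """Turn numeric strings into int or float; leave other values as text."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_macro(text):
    """Return (name, params) for the first macro found in text, or None."""
    match = MACRO_RE.search(text)
    if match is None:
        return None
    name, raw_args = match.groups()
    params = {}
    for part in raw_args.split(","):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        params[key.strip()] = coerce(value.strip())
    return name, params
```

For instance, parsing {l10nTable(list=kb, start=1, limit=10, filter=needsReview)} with this sketch yields the name l10nTable and a dictionary with list, start, limit and filter entries, where start and limit are integers.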
- Because the Weekly Common Issues list is calculated using a combination of TikiWiki data and web analytics software (in our case Omniture), we need the ability for an external system to "tell" TikiWiki the priority of the KB. Marc Laporte mentioned a tracker db; seems like a fine solution to me. So, a tracker would contain a list of articles in a specific order.
- We need to be able to put data in this tracker using e.g. a bash script or similar.
- Alternatively, we need a way for TikiWiki to import data from e.g. a CSV file to a tracker.
- In our case we would need three separate priority lists, so three trackers:
- One tracker for the support articles -- what we call the Knowledge Base (KB). This is the tracker that we would need to allow an external system (bash script or similar) to write to.
- One tracker for the "navigation pages" (start pages, and other pages that are essential to the navigation)
- One tracker for the "contributor documentation" -- pages that aren't meant to be used by end users, but by contributors
- A function that can take one of these trackers as a parameter, the locale as another parameter, and return a formatted table (as shown in the mockup above) with the status of these articles.
- Another function with the same signature that returns the next untranslated article. If all articles are translated, nothing would be returned. If only a staging copy exists (the "Needs review" state), that article counts as untranslated, so it should be returned.
- Another function with the same signature that returns a summary (total number of articles, and number of translated articles).
- A function that can draw a progress bar based on the summary in the function above.
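The four functions described above could look roughly like the following Python sketch. Everything here is illustrative: the names TrackerItem, l10n_table, next_untranslated, summary and progress_bar are assumptions, a tracker is modelled as a plain list in priority order, and the locale is represented by a dictionary mapping article titles to their status in that locale. The summary also carries the summed score of translated articles, which goes beyond the two counts listed above but is assumed to be needed for a scoreLimit-based progress bar.

```python
from dataclasses import dataclass

NOT_TRANSLATED = "Not translated"
NEEDS_REVIEW = "Needs review"
TRANSLATED = "Translated"


@dataclass
class TrackerItem:
    title: str
    score: float = 0.0  # share of page hits, e.g. 0.0245


def l10n_table(tracker, statuses):
    """Pair each article, in priority order, with its status in one locale."""
    return [(item.title, statuses.get(item.title, NOT_TRANSLATED))
            for item in tracker]


def next_untranslated(tracker, statuses):
    """First article without an approved translation, or None if all are done.

    A "Needs review" article counts as untranslated here, as required above.
    """
    for title, status in l10n_table(tracker, statuses):
        if status != TRANSLATED:
            return title
    return None


def summary(tracker, statuses):
    """Total articles, translated articles, and the translated score sum."""
    translated = [i for i in tracker if statuses.get(i.title) == TRANSLATED]
    return {
        "total": len(tracker),
        "translated": len(translated),
        "score": sum(i.score for i in translated),  # assumed extra field
    }


def progress_bar(info, score_limit=0.5, width=20):
    """Text progress bar; completely filled once score reaches score_limit."""
    fraction = min(info["score"] / score_limit, 1.0)
    filled = int(fraction * width)
    return "[" + "#" * filled + "-" * (width - filled) + "] " + f"{fraction:.0%}"
```

A real implementation would render the table and the bar as wiki markup or HTML with status colors; the text output here only shows how the summary feeds the progress bar.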