Support/l10nPriorityPRD
Overview
Localizers of SUMO lack a clear overview of the localization work that is needed and what is most important. We're currently doing a good job of explaining how to translate articles, but not which articles are most important. The purpose of the l10n priority milestone is:
- To provide a clear overview of the l10n work on SUMO
- To make the l10n work on SUMO feel less daunting by making it obvious where to start
- To answer the question: "which article is the most important to translate?"
- To establish a baseline for what we consider a healthy status for a locale
Locale Health Baseline
In order to determine how well a locale is doing with its localization work on SUMO, we need a health baseline. This baseline, or threshold, also needs to make sense from both the users' and localizers' points of view and act as a motivator for everyone to work towards.
The metric that will be used to determine this threshold is: 50% of the total page hits on KB articles should be in the native language
This is best explained with an example:
- The en-US KB support article "Keyboard Shortcuts" has 13,000 page hits per week
- All en-US KB support articles have 530,000 page hits per week
- This means that the "Keyboard Shortcuts" article represents 2.45% of the total KB page hits for the en-US locale. If this article is translated into German, the German locale gets +2.45% added to its overall localization status.
- If the total % of all translated articles is >= 50%, that locale has reached its threshold and is considered to be doing well. It's not important which articles are translated, as long as the sum of all page hits is >= 50%, but obviously more visited articles have a higher impact, so we will recommend articles to translate based on traffic of the en-US locale.
- If the sum is smaller than 50%, the locale is not yet doing well, and may need assistance.
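The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the calculation described above: the function names and the "Everything else" placeholder article are assumptions made for the sketch, not part of SUMO or TikiWiki.

```python
from typing import Dict, Set

# Baseline from this document: 50% of en-US KB page hits covered by translations.
HEALTH_THRESHOLD = 0.5


def article_shares(en_us_hits: Dict[str, int]) -> Dict[str, float]:
    """Return each article's share of the total en-US KB page hits."""
    total = sum(en_us_hits.values())
    if total == 0:
        return {title: 0.0 for title in en_us_hits}
    return {title: hits / total for title, hits in en_us_hits.items()}


def locale_coverage(en_us_hits: Dict[str, int], translated: Set[str]) -> float:
    """Sum the en-US page-hit shares of the articles a locale has translated."""
    shares = article_shares(en_us_hits)
    return sum(share for title, share in shares.items() if title in translated)


def is_healthy(en_us_hits: Dict[str, int], translated: Set[str]) -> bool:
    """A locale is doing well once its coverage reaches the threshold."""
    return locale_coverage(en_us_hits, translated) >= HEALTH_THRESHOLD


if __name__ == "__main__":
    # 13,000 of 530,000 weekly hits; the second entry is a hypothetical
    # placeholder standing in for all other KB articles.
    hits = {"Keyboard Shortcuts": 13_000, "Everything else": 517_000}
    share = article_shares(hits)["Keyboard Shortcuts"]
    print(f"Keyboard Shortcuts share: {share:.2%}")
    print("Healthy with only that article:", is_healthy(hits, {"Keyboard Shortcuts"}))
```

Translating "Keyboard Shortcuts" alone adds about 2.45% to a locale's coverage, so the locale in this sketch stays below the 50% threshold until more high-traffic articles are translated.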
Most Visited vs Most Popular Articles
Page hits were chosen as the metric because they are straightforward, easy to understand, and show how many people are affected by a translation.
However, this means that the order of the prioritized list of articles presented on the Localization Dashboard will not be the same as the list of articles featured on the start page, called the Weekly Common Issues (also known as "Most Popular Support Articles"). That list is based on a scoring system that takes things like search patterns and poll votes into account to estimate what kinds of problems people are having with Firefox, based on which articles they search for and vote on. This score is not as intuitive to understand and could confuse more than it motivates. Since the most popular support articles are featured on the front page of SUMO, they will get high page hits too, so in practice the difference between these two article lists should not be a real problem.
Mockup
The page consists of three main areas:
- Localization Priority -- A high-level overview of the l10n work that defines the baseline of what we call a healthy l10n status. Locales that complete the items in this list are considered to be performing well
- What's the most important article to translate next? -- A simple actionable item that makes it easy for a localizer to see what translation work has the highest impact in terms of the number of users that benefit from the work
- Article lists -- The actual lists of prioritized articles that need translation. Initially, we will maintain and present at least two lists: navigation pages, and the full Knowledge Base of support articles
Progress bars
- Header is a clickable link to an anchor further down the page, where the actual list appears
- Progress bar dynamically generated based on the status of an l10n priority list
Article lists
- Dynamically generated and insertable into a standard wiki page
- Status visually emphasized using colors
- Actionable links, making contributing straightforward
We might also want to show the impact a translation has in terms of page hits, e.g.:
Requirements
- The status of an article can be one of the following:
- Not translated -- no translation of the article exists in the current locale
- Translated -- the article is already translated
- Needs review -- a translation has been made, but it has not yet been approved (the translated article exists in the staging area only)
- Articles should be stored in a list (tracker db?), containing the following info:
- Article name/link
- Priority order
- Score -- This is a way to define a more dynamic threshold than just "the top xx articles." In our case this would correspond to the page hits % the article has, so the score for the article in the example above would be 0.0245 (and our threshold would be 0.5). For other installations, the score might be based on something else, or might not be used at all.
- Ideally, this Localization Dashboard page would be a standard wiki page, but the generated parts of the page would be included somehow, e.g. using something like {l10nProgressBar(list=kb, scoreLimit=0.5)}, where 0.5 means 50%, which is our threshold. The progress bar would then show the progress towards that limit (and anything >= 0.5 would be a fully green progress bar).
- This would allow other locales to easily translate the dashboard without disturbing the generated content. And obviously, for locales that are fine with using this dashboard in English, that would just work automatically using our already working l10n fallback mechanism. So, these people would see the descriptions etc. in English, but the generated data would be based on their native locale.
- If we allow generated content to be inserted using a syntax like {l10nTable(list, start, limit, scoreLimit, filter)}, that would give us a lot of flexibility in how we want to present the work.
- For example, if we wanted to provide a separate list of the 10 first articles waiting for review, we could use {l10nTable(list=kb, start=1, limit=10, filter=needsReview)}
- Another example, if we wanted to list the highest priority articles that need to be translated to reach our goal of 50% page hits, we would use: {l10nTable(list=kb, start=1, scoreLimit=0.5)}
- If we wanted to list all Navigation articles (start pages, etc), we would use: {l10nTable(list=navigation)}
- This way, we have the flexibility to change the way we present the information. The system should also be flexible enough to work well for other localizable TikiWiki installations. This would even allow locale leaders to customize their view based on what works best for them.
- Because the Weekly Common Issues list is calculated using a combination of TikiWiki data and web analytics software (in our case Omniture), we need the ability for an external system to "tell" TikiWiki the priority of the KB. Marc Laporte mentioned a tracker db; seems like a fine solution to me. So, a tracker would contain a list of articles in a specific order.
- We need to be able to put data in this tracker using e.g. a bash script or similar.
- Alternatively, we need a way for TikiWiki to import data from e.g. a CSV file to a tracker.
- In our case we would need three separate priority lists, so three trackers:
- One tracker for the support articles -- what we call the Knowledge Base (KB). This is the tracker that we would need to allow an external system (bash script or similar) to write to.
- One tracker for the "navigation pages" (start pages, and other pages that are essential to the navigation)
- One tracker for the "contributor documentation" -- pages that are meant for contributors rather than end users
- A function that can take one of these trackers as a parameter, the locale as another parameter, and return a formatted table (as shown in the mockup above) with the status of these articles.
- Another function with the same signature that returns the next untranslated article. If all articles are translated, nothing would be returned. If only a staging copy exists (the "Needs review" state), that article counts as untranslated, so it should be returned.
- Another function with the same signature that returns a summary (total number of articles, and number of translated articles).
- A function that can draw a progress bar based on the summary in the function above.
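The four functions listed at the end of these requirements could look roughly like the following Python sketch. It is not TikiWiki code: the `PriorityEntry`, `Status`, and `Summary` types and all function names are hypothetical, and the tracker and locale parameters are replaced by an in-memory list of entries plus a dictionary of per-article statuses for one locale. The summary counts only "Translated" articles as translated, which is an assumption consistent with treating "Needs review" as untranslated.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    NOT_TRANSLATED = "Not translated"
    NEEDS_REVIEW = "Needs review"
    TRANSLATED = "Translated"


@dataclass(frozen=True)
class PriorityEntry:
    name: str     # article name/link
    order: int    # priority order, 1 = most important
    score: float  # e.g. share of en-US page hits


@dataclass(frozen=True)
class Summary:
    total: int
    translated: int


def status_of(entry: PriorityEntry, statuses: Dict[str, Status]) -> Status:
    # Articles with no recorded status are treated as not translated.
    return statuses.get(entry.name, Status.NOT_TRANSLATED)


def next_untranslated(entries: List[PriorityEntry],
                      statuses: Dict[str, Status]) -> Optional[PriorityEntry]:
    """Return the highest-priority article that is not fully translated.

    "Needs review" counts as untranslated; None means everything is done.
    """
    for entry in sorted(entries, key=lambda e: e.order):
        if status_of(entry, statuses) is not Status.TRANSLATED:
            return entry
    return None


def summarize(entries: List[PriorityEntry],
              statuses: Dict[str, Status]) -> Summary:
    """Return the total number of articles and how many are translated."""
    done = sum(1 for e in entries if status_of(e, statuses) is Status.TRANSLATED)
    return Summary(total=len(entries), translated=done)


def progress_bar(summary: Summary, width: int = 20) -> str:
    """Draw a plain-text progress bar from a summary."""
    filled = width * summary.translated // summary.total if summary.total else 0
    bar = "#" * filled + "-" * (width - filled)
    return f"[{bar}] {summary.translated}/{summary.total}"


def format_table(entries: List[PriorityEntry],
                 statuses: Dict[str, Status]) -> str:
    """Return one text row per article: order, name, status, score."""
    rows = [
        f"{e.order}. {e.name} | {status_of(e, statuses).value} | {e.score:.2%}"
        for e in sorted(entries, key=lambda e: e.order)
    ]
    return "\n".join(rows)
```

Keeping the status lookup separate from the priority list mirrors the requirement that the same tracker be reused for every locale; only the status dictionary would differ per locale.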
Implementation
- Mechanism to import article priority data into the TikiWiki db from an external (http/https?) location (bug 481304)
- Location should be a configurable option/pref, since we want this to be portable to other SUMO installations (not just Firefox Support)
- cww+zzxc can advise; already have scripts that parse data and create the prioritized article list -- just need to get them into Tiki.
- Decide how to store this data, e.g. Tracker tables, or new db tables (bug 481303)
- Need at least three article lists:
- The entire KB, prioritized on page hits
- The "navigation" articles -- start pages, Ask a Question
- The How to Contribute articles
- Most Popular Support articles, rated by popularity (not just page hits)
- Lists must contain info about:
- Article name or URL
- Score (e.g. page hits %, or whatever metrics we use to prioritize the list)
- Priority order (possibly implied in the Score field above)
- TikiWiki plugin to generate article lists in wiki docs (bug 481305 and bug 481306)
- Flexible parameter-based syntax just like the SHOWFOR plugin, e.g. {l10nTable(list=kb, start=1, limit=10, filter=needsReview)}
- Potential parameters:
- list: the actual article list (in the TikiWiki db) to get the article info from
- start: (optional) a start index in the list. Should start from the beginning of the list by default.
- limit: (optional) limits the number of articles to list
- scoreLimit: (optional) limits the number of articles to list based on the priority system we use. For example, in our case, we might want to show a list of the articles that are needed in order to reach 50% page hits. We can't just say limit=15 because we don't know if 15 articles are really enough or if it's actually 16, or even 20. Instead, we use scoreLimit=0.5 (assuming the scores of all articles sum to 1.0)
- filter: it should be possible to filter the list on status. For example, we might want to show a list of only the articles that are translated and waiting for approval/review. Or, we might want to filter out already translated articles.
- style: the mockup above shows two possible styles: a table format with or without the Page Hits % as a column. A third possible style would be a simple HTML list with just the article name, which would be a perfect match for our "Most Popular Support Articles" list on the front page.
- TikiWiki plugin/syntax to generate progress bar with progress text/label (bug 481305 and bug 481307)
- Similar to the plugin above, this progress bar could be limited by the number of articles, or a scoreLimit. So, 100% in the progress bar could be the result of x number of articles in a list translated, or y total score achieved.
- The text label next to the progress bar should be flexible, as we haven't yet figured out which way is optimal for our "50% of total page hits localized" baseline. Some possible formatting options:
- "x/y articles translated" where y is calculated as the number of articles required to reach 50%
- "xx% complete (x articles translated)"
- "x articles remaining"
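To make the scoreLimit idea and the label options concrete, here is a minimal Python sketch. All names are hypothetical. It assumes the article list is already in priority order, that scores sum to 1.0, and that the percentage in the second label is based on article counts rather than on score; the document leaves that last choice open.

```python
from typing import Dict, List, Set, Tuple

# (article name, score); the list is assumed to be in priority order.
Article = Tuple[str, float]


def articles_to_reach(articles: List[Article], score_limit: float) -> List[Article]:
    """Return the shortest prefix of the list whose scores sum to score_limit.

    If the limit is never reached, the whole list is returned.
    """
    selected: List[Article] = []
    running = 0.0
    for name, score in articles:
        if running >= score_limit:
            break
        selected.append((name, score))
        running += score
    return selected


def progress_labels(articles: List[Article],
                    translated: Set[str],
                    score_limit: float = 0.5) -> Dict[str, str]:
    """Build the three candidate label texts for the progress bar."""
    required = articles_to_reach(articles, score_limit)
    y = len(required)                                   # articles needed
    x = sum(1 for name, _ in required if name in translated)
    percent = 100 * x // y if y else 100                # count-based (assumption)
    return {
        "fraction": f"{x}/{y} articles translated",
        "percent": f"{percent}% complete ({x} articles translated)",
        "remaining": f"{y - x} articles remaining",
    }
```

This shows why limit=15 cannot stand in for scoreLimit=0.5: the number of articles needed, y, falls out of the scores and changes whenever the page hit distribution changes.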