Support:Search Requirements

Revision as of 04:29, 5 December 2007 by Cilias (talk | contribs) (spelling)

THIS DOCUMENT IS A DRAFT OF OUR SEARCH ENGINE REQUIREMENTS FOR SUMO. IT IS NOT YET FINAL.


  1. Doesn't kill the server
  2. Handles spelling errors
  3. Category/tag based searches (only articles in a particular category and/or with the specified tags should be matched [1])
    • Does it depend on the user (e.g. show Staging copies to contributors?)
      • Yes, but the important thing is that the search engine accepts searches based on category/tags. Then we can use different search queries depending on user groups. (djst)
  4. Should only look at the content, title, and tags of the article and not other features of the page. Right now, searching for "Bookmarks" shows all articles because "Bookmarks" appears in the tag cloud.
  5. "Notice" new or changed content within 24 hours.
  6. Do not return multiple results for the same article:
    • Different capitalization [2]
    • Different request parameters
  7. Some formatting issues [3]
  8. Handle localization
    • How?
      • The locale should be detected automatically, and it should be possible to override it.
      • When a search is performed, only the selected/detected locale should be searched. However, many locales will have incomplete translations, so the results should also list content that has not been localized yet, using the same locale fallback mechanism as defined in [4].
      • In summary, a search should return all results for the current locale + any remaining articles in the fallback locales, but it should never list the same article twice, even if it exists for two locales.
  9. Be able to weight articles
    • Based on their tags
    • Based on their poll results
    • Based on their page hit count
  10. Handle tiki formatting correctly
    • Ignore tiki markup when matching search terms (a search for "code" should not return every page that uses the code tag)
    • Don't display wiki source in search results
  11. Show statistics on the article
    • Show popularity and poll results in search results
  12. "More like this"?
    • I personally don't see the benefit (djst)
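
The locale fallback rule in requirement 8 (all results for the current locale, plus any remaining articles from the fallback locales, never the same article twice) could be sketched roughly as below. This is only an illustration under stated assumptions, not a description of how SUMO or any particular search engine does it: the `Hit` record, the `merge_by_locale` function, and the idea that every translation shares a locale-independent `article_id` are all hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One raw search hit (hypothetical shape, for illustration only)."""
    article_id: str  # assumed identifier shared by all translations of an article
    locale: str      # locale of this particular copy, e.g. "sv-SE"
    title: str


def merge_by_locale(hits: Iterable[Hit], locale_chain: List[str]) -> List[Hit]:
    """Keep one hit per article, preferring locales earlier in locale_chain.

    locale_chain lists the current locale first, followed by its fallbacks.
    Hits in locales outside the chain are dropped.
    """
    rank = {locale: position for position, locale in enumerate(locale_chain)}
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.locale not in rank:
            continue  # not the current locale and not a fallback
        current = best.get(hit.article_id)
        if current is None or rank[hit.locale] < rank[current.locale]:
            best[hit.article_id] = hit  # a better-ranked copy of the same article
    # Current-locale results first, then fallback locales in chain order.
    # sorted() is stable, so ties keep the order in which articles first appeared.
    return sorted(best.values(), key=lambda hit: rank[hit.locale])


if __name__ == "__main__":
    raw_hits = [
        Hit("bookmarks", "en-US", "Bookmarks"),
        Hit("bookmarks", "sv-SE", "Bookmarks (sv-SE)"),
        Hit("cookies", "en-US", "Cookies"),
    ]
    for hit in merge_by_locale(raw_hits, ["sv-SE", "en-US"]):
        print(hit.locale, hit.title)
```

Merging after the query keeps the search engine itself simple, at the cost of fetching hits for every locale in the chain. Relevance ranking within a locale is deliberately left out of this sketch.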