Suggesting Categories for Pages

From MozillaWiki

Goal

Provide a web service that takes a URL and returns a list of categories for that page. These categories would be used to provide suggested topic names when the user pins a site.

For example, a pancake recipe on the Food Network website would return topics such as "Food", "Recipe", and "Pancakes".
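The request/response shape described above could look roughly like the following. This is only a sketch: the `KNOWN_SITES` table, the function names, the JSON field names, and the example URL are all assumptions made for illustration, and the lookup table plus URL-path heuristic stands in for whatever real classifier the service would use.

```python
import json
from urllib.parse import urlsplit

# Hypothetical stand-in for a real classifier: a hand-written table that
# maps a host name to broad categories.
KNOWN_SITES = {"www.foodnetwork.com": ["Food", "Recipe"]}


def suggest_categories(url):
    """Return a list of suggested category names for the page at `url`."""
    parts = urlsplit(url)
    categories = list(KNOWN_SITES.get(parts.netloc.lower(), []))
    # Naive placeholder heuristic: treat the last path segment as a topic.
    last_segment = parts.path.rstrip("/").rsplit("/", 1)[-1]
    if last_segment:
        topic = last_segment.capitalize()
        if topic not in categories:
            categories.append(topic)
    return categories


def handle_request(url):
    """Build the JSON body the service might send back for one URL."""
    return json.dumps({"url": url, "categories": suggest_categories(url)})


if __name__ == "__main__":
    # The URL below is made up for the example.
    print(handle_request("http://www.foodnetwork.com/recipes/pancakes"))
```

The Pin dialog would then only need to read the `categories` list from the response and offer its entries as topic names.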

Ideally, these categories should also be cached or stored globally, so that we can later use them for other experiments, such as finding similar pages or suggesting pages.
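One way to picture a global store is a cache keyed by a normalized URL, so that every user pinning the same page shares one result. The sketch below is an assumption-laden illustration: `CategoryCache`, its methods, and the normalization rules are hypothetical, and an in-memory dict stands in for whatever shared storage the service would really use.

```python
from urllib.parse import urlsplit, urlunsplit


class CategoryCache:
    """Hypothetical shared cache of categories, keyed by normalized URL."""

    def __init__(self):
        # In-memory dict as a stand-in for a shared, persistent store.
        self._store = {}

    @staticmethod
    def normalize(url):
        """Lowercase scheme and host, drop the fragment and trailing slash."""
        parts = urlsplit(url)
        path = parts.path.rstrip("/") or "/"
        return urlunsplit(
            (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
        )

    def get_or_compute(self, url, compute):
        """Return cached categories, calling `compute(url)` only on a miss."""
        key = self.normalize(url)
        if key not in self._store:
            self._store[key] = compute(url)
        return self._store[key]
```

Because the stored entries are keyed by page rather than by user, later experiments such as finding similar pages could read from the same store without recomputing categories.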

Performance

Can we provide suggestions in near real time? Can we have these categories ready as soon as the user hits the Pin button and the Pin dialog appears? Do we have to?

Quality

Can we present the user with categories that make sense? Can we create a category suggestion service that is accurate?

Infrastructure

What would the server-side part of this look like? Can we scale it up as the number of users and pages in Pancake grows?