Firefox/Projects/Places DB Creation Scripts

From MozillaWiki
Revision as of 17:55, 25 February 2009 by Ddahl (talk | contribs) (→‎Design)

Overview

Sprint lead: ddahl
Sprinters: adw

Description
Create Python or Perl scripts that generate Places DBs with various characteristics, such as "many visits within the same domain", "visits across many domains", "many tags", "many bookmarks", etc.
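As a rough illustration of how one script could produce both the "many visits within the same domain" and the "visits across many domains" profiles, here is a minimal Python sketch using the standard sqlite3 module. The two tables are a simplified stand-in for the Places schema (the real tables have more columns), and make_db, its parameters, and the example.com-style URLs are hypothetical names chosen for this sketch.

```python
import random
import sqlite3

# Simplified stand-in for the Places schema (assumption: the real
# moz_places / moz_historyvisits tables carry more columns than this).
SCHEMA = """
CREATE TABLE moz_places (
    id INTEGER PRIMARY KEY,
    url TEXT UNIQUE,
    title TEXT,
    visit_count INTEGER DEFAULT 0
);
CREATE TABLE moz_historyvisits (
    id INTEGER PRIMARY KEY,
    place_id INTEGER REFERENCES moz_places(id),
    visit_date INTEGER
);
"""


def make_db(path, n_domains, pages_per_domain, visits_per_page, seed=0):
    """Create a sample DB; the three counts select the profile."""
    rng = random.Random(seed)      # seeded so every run is reproducible
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    base = 1235584500000000        # arbitrary timestamp in microseconds
    for d in range(n_domains):
        for p in range(pages_per_domain):
            url = "http://site%d.example/page%d" % (d, p)
            cur = conn.execute(
                "INSERT INTO moz_places (url, title, visit_count) "
                "VALUES (?, ?, ?)",
                (url, "Page %d" % p, visits_per_page),
            )
            place_id = cur.lastrowid
            # Spread the visits over a random window before `base`.
            conn.executemany(
                "INSERT INTO moz_historyvisits (place_id, visit_date) "
                "VALUES (?, ?)",
                [(place_id, base - rng.randrange(10 ** 12))
                 for _ in range(visits_per_page)],
            )
    conn.commit()
    return conn
```

Under these assumptions, make_db("same_domain.sqlite", 1, 50, 200) would give many visits within one domain, while make_db("many_domains.sqlite", 5000, 2, 1) would spread visits across many domains.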

Goals / Use Cases

The sample data set should be very large (according to Beltzner and Shaver). To see what the average data set looks like at Mozilla, we should collect stats from others using Dietrich's extension.

The chief goal is to automate the generation of these sample SQLite databases so that a continuous test can run against Places. We want to set reliable benchmarks and see which code changes slow down or speed up queries in Places.
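To make the benchmarking idea concrete, here is a small sketch of a timing helper that a continuous test might call against a generated database. It is only one possible approach: time_query and its repeats parameter are hypothetical, and it measures plain SQL through Python's sqlite3 module rather than through the Places code itself.

```python
import sqlite3
import statistics
import time


def time_query(conn, sql, params=(), repeats=5):
    """Run one query several times; return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Fetch every row so the full cost of the query is measured.
        conn.execute(sql, params).fetchall()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

Taking the median rather than the mean keeps a single slow run (for example, a cold cache) from dominating the result. Comparing the value returned for the same query before and after a code change gives a simple regression signal.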

Non Goals

tbd

Design

We should try using the Django ORM: reverse-engineer the Places database schema into Django models, so that creating rows is easy and we can concentrate on collecting URL data.
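Django ships an inspectdb management command that generates model code from an existing database. As a dependency-free illustration of the same reverse-engineering step (this is not Django itself), the sketch below reads table and column definitions from a SQLite file using sqlite_master and PRAGMA table_info. The describe_tables name is hypothetical.

```python
import sqlite3


def describe_tables(conn):
    """Map each table name to a list of (column name, declared type)."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )]
    schema = {}
    for table in tables:
        # PRAGMA arguments cannot be bound as parameters, so the table
        # name is quoted by hand (embedded double quotes are doubled).
        quoted = '"%s"' % table.replace('"', '""')
        rows = conn.execute("PRAGMA table_info(%s)" % quoted).fetchall()
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        schema[table] = [(row[1], row[2]) for row in rows]
    return schema
```

The returned mapping holds the information a model generator needs: one model per table, one field per column.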

Data collection:

Beltzner envisions a huge dataset made up of perhaps 10k unique URLs in bookmarks, a similarly sized data set in history, and so on.

We need to brainstorm a method for getting this raw data. A spider or bot is one option; there are many Python libraries for this.
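If the spider route is chosen, the core step is pulling unique URLs out of fetched pages. The following is a minimal sketch of that step only, using Python's standard html.parser and urllib.parse modules; LinkCollector and extract_links are hypothetical names, and fetching pages, politeness delays, and robots.txt handling are deliberately left out.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect unique absolute http(s) links from <a href> attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []       # unique links in order of first appearance
        self._seen = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links and drop any #fragment part.
        url, _fragment = urldefrag(urljoin(self.base_url, href))
        if url.startswith(("http://", "https://")) and url not in self._seen:
            self._seen.add(url)
            self.links.append(url)


def extract_links(html, base_url):
    """Return the unique http(s) links found in an HTML string."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

Feeding each fetched page through extract_links and stopping once the combined set reaches the target size (for example, the 10k URLs mentioned above) would yield the raw URL list for bookmarks and history.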

Bugs

tbd