Firefox/Projects/Places DB Creation Scripts
Overview
Sprint lead: ddahl
Sprinters: adw
Description: Create Python/Perl scripts to generate Places DBs with various characteristics such as "many visits within the same domain", "visits across many domains", "many tags", "many bookmarks", etc.
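As a rough illustration of what such a generator might look like, here is a minimal Python sketch. It assumes a heavily simplified stand-in for the Places schema (only a few columns of moz_places and moz_historyvisits); the function name generate_places_db, its parameters, and the example.com URLs are hypothetical and not part of any existing script.

```python
import random
import sqlite3

# Simplified stand-in schema (assumption): the real Places tables
# have more columns and indexes than shown here.
SCHEMA = """
CREATE TABLE moz_places (
    id INTEGER PRIMARY KEY,
    url TEXT NOT NULL,
    title TEXT,
    visit_count INTEGER DEFAULT 0
);
CREATE TABLE moz_historyvisits (
    id INTEGER PRIMARY KEY,
    place_id INTEGER NOT NULL REFERENCES moz_places(id),
    visit_date INTEGER NOT NULL
);
"""


def generate_places_db(path, num_domains, pages_per_domain,
                       visits_per_page, seed=0):
    """Create a sample DB whose shape is controlled by the arguments.

    Few domains with many pages approximates "many visits within the
    same domain"; many domains with few pages approximates "visits
    across many domains".
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    base_time = 1_200_000_000_000_000  # arbitrary timestamp (assumption)
    for d in range(num_domains):
        for p in range(pages_per_domain):
            url = f"http://site{d}.example.com/page{p}"
            cur = conn.execute(
                "INSERT INTO moz_places (url, title, visit_count) "
                "VALUES (?, ?, ?)",
                (url, f"Page {p} on site {d}", visits_per_page),
            )
            place_id = cur.lastrowid
            conn.executemany(
                "INSERT INTO moz_historyvisits (place_id, visit_date) "
                "VALUES (?, ?)",
                [(place_id, base_time + rng.randrange(10**12))
                 for _ in range(visits_per_page)],
            )
    conn.commit()
    return conn
```

For example, generate_places_db("one_domain.sqlite", 1, 1000, 20) would produce a single-domain profile, while generate_places_db("many_domains.sqlite", 1000, 1, 20) would spread the same number of visits across many domains.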
Goals / Use Cases
The sample data set should be very large (according to Beltzner and Shaver). We should use Dietrich's extension to collect stats from others and see what the average data set looks like at Mozilla.
The chief goal is to be able to automate the generation of these sample sqlite databases for a continuous test to run on Places. We want to be able to reliably set some benchmarks and see what code changes either slow down or speed up queries in Places.
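One way the benchmarking side could be sketched in Python: run the same query several times against a generated database and keep the best wall-clock time, so that numbers from before and after a code change can be compared. The helper name time_query and its parameters are hypothetical, and the query to run is left to the caller.

```python
import sqlite3
import time


def time_query(conn, sql, params=(), repeat=5):
    """Return the fastest wall-clock time (seconds) over `repeat` runs."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        # fetchall() forces SQLite to produce every row, not just the first
        conn.execute(sql, params).fetchall()
        timings.append(time.perf_counter() - start)
    # The minimum is less sensitive to background noise than the mean
    return min(timings)
```

Taking the minimum is only one possible choice; a continuous test might prefer to record the median or the full distribution.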
Non Goals
tbd
Design
We should try to use the Django ORM to reverse-engineer the Places database schema into Django models, so that creating rows is easy and we can concentrate on URL data collection.
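Django ships an inspectdb management command intended for this kind of reverse-engineering. As a dependency-free sketch of the underlying introspection step (not Django itself), the following Python reads table and column definitions straight from an SQLite file; the function name describe_schema is hypothetical.

```python
import sqlite3


def describe_schema(conn):
    """Map each table name to a list of (column_name, declared_type)."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk).
        # Table names cannot be bound as parameters; they come from
        # sqlite_master here, not from user input.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema
```

The resulting mapping is roughly the information a model generator needs in order to emit one model class per table.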
Data collection:
Beltzner envisions a huge dataset made up of perhaps 10k unique URLs in bookmarks and a similar data set in history, etc.
We need to brainstorm a method for getting this raw data. Spider/bot? There are many Python libraries for this.
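If a spider is the chosen route, its core step is pulling links out of fetched pages. Here is a minimal sketch using only the Python standard library; it deliberately leaves out fetching, politeness (robots.txt, rate limiting), and crawl scheduling, and the names LinkCollector and extract_links are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect unique absolute http(s) links from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL
                absolute = urljoin(self.base_url, value)
                is_web = urlparse(absolute).scheme in ("http", "https")
                if is_web and absolute not in self.links:
                    self.links.append(absolute)


def extract_links(html, base_url):
    """Return the page's links in document order, without duplicates."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

The URLs collected this way could then be fed to a generator script as the raw bookmark and history data.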
Bugs
tbd