Services/Shavar

3,998 bytes added, 22:03, 6 May 2015
== Shavar, the mighty! ==

Contrary to popular belief, Shavar is not the name of a mini-boss in the latest World of Warcraft expansion.

It is Mozilla's service that speaks the [https://developers.google.com/safe-browsing/developers_guide_v2 Safe Browsing] (henceforth '''SB''') wire protocol for dynamic updates of simple data sets. Originally designed for phishing protection, the protocol was co-opted so that the [[Security/Tracking_protection|tracking protection]] project could publish larger data sets without incurring heavy bandwidth usage for mobile clients.

This page is intended for client-side developers and others who need to interact with the service at a programmatic level.

==== List names and types ====

===== Names =====

In SB, a list's name must have the following structure:

<organizational identifier>-<list name>-<list type>

;organizational identifier
: a very short string identifying the organization publishing the data. This should be "moz" in every foreseeable use case.
;list name
: a string just long enough and descriptive enough to prevent collision with any other project's or group's data. "tracking" is currently in use for the [[Security/Tracking_protection|tracking protection]] data. "list" or "data" are very bad ideas.
;list type
: one of two types at the time of writing: '''shavar''' and '''sha256'''. More on the types below.
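The naming structure above can be sketched in a few lines of Python. This is an illustration only, not code from the shavar service: the names <code>ListName</code>, <code>parse_list_name</code> and <code>format_list_name</code> are hypothetical, and splitting the list type off from the right (so that a list name containing hyphens would still parse) is an assumption, since this page does not say whether hyphens are allowed inside a list name.

```python
from typing import NamedTuple

# The two list types named on this page.
KNOWN_TYPES = ("shavar", "sha256")


class ListName(NamedTuple):
    """Hypothetical container for the three parts of a list name."""
    org: str
    name: str
    list_type: str


def parse_list_name(full_name: str) -> ListName:
    """Split '<org>-<list name>-<list type>' into its three parts."""
    # The organizational identifier is everything before the first hyphen.
    org, _, rest = full_name.partition("-")
    # Assumption: the list type is everything after the last hyphen.
    name, _, list_type = rest.rpartition("-")
    if not org or not name or list_type not in KNOWN_TYPES:
        raise ValueError(f"not a valid list name: {full_name!r}")
    return ListName(org, name, list_type)


def format_list_name(name: str, list_type: str = "shavar", org: str = "moz") -> str:
    """Build a full list name, defaulting to the 'moz' prefix and shavar type."""
    if list_type not in KNOWN_TYPES:
        raise ValueError(f"unknown list type: {list_type!r}")
    return f"{org}-{name}-{list_type}"
```

For example, <code>format_list_name("tracking")</code> yields <code>moz-tracking-shavar</code>, and parsing that string gives back the three parts.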

===== List types =====

The two types of lists currently supported are named '''shavar''' and '''sha256'''. Both publish hashes of the actual data set, but in slightly different ways. shavar lists use the hash-prefix style of publication described in the [https://developers.google.com/safe-browsing/developers_guide_v2 Safe Browsing] protocol specification, publishing only the first 4 bytes of each hash, while sha256 lists publish the entire hash (all 32 bytes). As a result, a client that matches an entry in a shavar list has to "phone home" to the service to fetch the entire hash.
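The difference between the two list types can be illustrated with a minimal Python sketch. It assumes the input string is already in the canonical form the SB specification requires (canonicalization is not shown), and the function names are hypothetical rather than part of the shavar code base.

```python
import hashlib

PREFIX_LEN = 4   # bytes published per entry in a shavar list
FULL_LEN = 32    # bytes published per entry in a sha256 list


def full_hash(expression: str) -> bytes:
    """Return the full SHA-256 digest, as a sha256 list would publish it."""
    # Assumption: `expression` has already been canonicalized per the SB spec.
    return hashlib.sha256(expression.encode("utf-8")).digest()


def shavar_prefix(expression: str) -> bytes:
    """Return the 4-byte hash prefix, as a shavar list would publish it."""
    return full_hash(expression)[:PREFIX_LEN]


def needs_full_hash_lookup(expression: str, published_prefixes: set) -> bool:
    """True when a prefix match means the client must ask the service
    for the full hash before it can confirm the match."""
    return shavar_prefix(expression) in published_prefixes
```

A prefix match is only a candidate: several different inputs can share the same 4-byte prefix, which is why the client fetches the full hash from the service before acting on a shavar match.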

==== How to Publish a new data set via the Safe Browsing protocol at Mozilla ====

The shavar service requires that the data to be published be accessible via a git repository. These are the basics of setting up a new repository.

===== The repository =====

1. Create a new GitHub repository for your list. Best practice would be to leave it completely empty. Make note of the SSH URL for the repository.
2. Grab a copy of shavar for the script used to populate an empty repository.
 git clone https://github.com/mozilla-services/shavar
 cd shavar
3. Create a virtual environment so we don't modify things on your machine permanently.
 virtualenv .
 . bin/activate
4. Download all the necessary dependencies.
 python setup.py develop
5. Run the script that will create the skeleton of a new list's repository. Chances are very good that you have no need to deviate from the defaults.
 python scripts/mknewlist <name of the new list> \
     [shavar or sha256 (shavar by default)] \
     [organizational identifier prefix ("moz" by default)] \
     [-d path for the local working copy of the repository (data/<list name>)]
6. You can now stop using the virtual environment.
 deactivate
7. Populate the repository and push it to the master copy.
 cd data/<list name>
 git remote add origin <SSH URL for the repository from step 1>
 git add <list name>.txt
 git commit -m 'Initial data commit'
 git push origin master

By default, the input file is named <list name>.txt and is expected to contain one URL per line. Populate this file as desired. If another filename is preferred, update publish.ini.

8. If you chose to create the list's local repository somewhere outside of the shavar directory tree, you can now delete the entire shavar repository.

9. Open a new bug in Bugzilla under Mozilla Services -> Operations requesting that the new list repository be added to the publishing schedule. Make certain to include the list name and the URL for the new list's repository.
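Before committing the input file in step 7, it may be worth checking that it really does contain one entry per line. The following is a minimal sketch and not part of shavar: <code>check_list_lines</code> is a hypothetical helper, and treating blank lines, embedded whitespace and duplicates as problems is an assumption about what a clean list looks like, not a documented requirement of the service.

```python
from typing import Iterable, List


def check_list_lines(lines: Iterable[str]) -> List[str]:
    """Return human-readable problems found in a one-entry-per-line list."""
    problems = []
    seen = set()
    for number, raw in enumerate(lines, start=1):
        entry = raw.rstrip("\n")
        if not entry.strip():
            # Assumption: blank lines are unintended.
            problems.append(f"line {number}: empty")
        elif entry != entry.strip() or len(entry.split()) > 1:
            # Leading, trailing or embedded whitespace suggests a malformed
            # entry or more than one entry on the line.
            problems.append(f"line {number}: contains whitespace")
        elif entry in seen:
            problems.append(f"line {number}: duplicate entry")
        else:
            seen.add(entry)
    return problems
```

Because an open file is iterable line by line, it can be passed straight in, for example <code>check_list_lines(open("tracking.txt"))</code> for a list named "tracking". An empty result means none of the assumed problems were found.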