Collusion/Proposals

From MozillaWiki

Collusion Proposals

Strawman Schemas

Notes for Data Modelling (drawn from notes from Collusion Meetup)

I think we will need to model both nodes and the connections (links) between them, so that we can query how the graph changes over time.

Node:

   url:
   name:
   icon:
   group_as: (key of another node; e.g., YouTube may be group_as: 'google.com')
   group_because: (why it is grouped that way, e.g., manually, from a list of known sites, via whois...)
   location: (GPS coordinates, if known)
   

Link:

   accessed: Node
   referred_from: Node
   timestamp:
   visited: 
   datatypes: (from header)
   type: (actually result)
   link_type: (cookie, other)
   requestType: (ajax, img src, script src, iframe, ...)
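
As a rough illustration only, the Node and Link fields listed above could be sketched as Python dataclasses. This is one possible shape and nothing here is settled: storing node references as string keys, the snake_case name request_type (standing in for requestType), the default values, and the helper links_between are all assumptions made for the example. Fields whose type or meaning is not clear from the notes (visited, datatypes, type) are left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Node:
    url: str                              # also used as the node's key (assumption)
    name: str
    icon: Optional[str] = None
    group_as: Optional[str] = None        # key of another Node, e.g. 'google.com'
    group_because: Optional[str] = None   # e.g. 'manual', 'known-list', 'whois'
    location: Optional[Tuple[float, float]] = None  # (lat, lon), if known


@dataclass
class Link:
    accessed: str          # key of the Node that was requested
    referred_from: str     # key of the Node the request came from
    timestamp: datetime
    link_type: str = "other"     # 'cookie' or 'other'
    request_type: str = "other"  # 'ajax', 'img src', 'script src', 'iframe', ...


def links_between(links: List[Link], start: datetime, end: datetime) -> List[Link]:
    """Hypothetical helper: return links with start <= timestamp < end."""
    return [link for link in links if start <= link.timestamp < end]
```

Because every Link carries its own timestamp, a "changes over time" query reduces to filtering links by time window and comparing the resulting sets of (referred_from, accessed) pairs.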

Strawman API

What kinds of things do we expect people to query?

What formats do we want to respond with (JSON only?)

What protections need to be built in? (How to prevent poisoning the well?)

How do we sanitize the data to prevent capturing personally identifying information?
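
One possible starting point, offered as a sketch and not a decision: reduce every URL to its scheme and host before it is stored, so that paths, query strings, fragments, ports, and embedded credentials never reach the server. The function name sanitize_url is hypothetical, and whether host-level data is coarse enough to avoid identifying people is still an open question.

```python
from urllib.parse import urlsplit


def sanitize_url(url: str) -> str:
    """Keep only scheme and host; drop path, query, fragment, port, credentials."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is lowercased and excludes user:pass and port
    return f"{parts.scheme}://{host}"
```

For example, sanitize_url("https://user:pw@Example.com:8080/inbox?id=42#top") yields "https://example.com".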

Infrastructure

How much can be built on top of Mozilla metrics infrastructure?

Is it worth using a graph database?