| = Collusion Proposals =
| | #REDIRECT [[Collusion/Proposals]] |
| | |
| == Strawman Schemas ==
| |
| | |
| Notes for data modelling, drawn from the [https://etherpad.mozilla.org/collusionmeetup Collusion Meetup] notes.
| |
| | |
| I think we will need to model both nodes and connections (links) so that we can query how the graph changes over time.
| |
| | |
| Node:
| |
| url
| |
| name
| |
| icon
| |
| group_as: key of another node (e.g., YouTube may have group_as: 'google.com')
| |
| group_because: (why it is grouped that way, e.g., manually, from a list of known sites, via whois...)
| |
| location: (gps coordinates, if known)
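The Node fields above could be sketched as a small Python record. This is only an illustration under assumptions: the field types, the use of the url as the node key, and the helper `resolve_group` are hypothetical and not part of the notes.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    """One site in the graph; field names follow the Node notes."""
    url: str
    name: str
    icon: Optional[str] = None
    group_as: Optional[str] = None        # key of another Node (assumed: its url)
    group_because: Optional[str] = None   # e.g. "manual", "known-list", "whois"
    location: Optional[Tuple[float, float]] = None  # (latitude, longitude), if known


def resolve_group(nodes: Dict[str, Node], key: str) -> str:
    """Follow group_as references until reaching a node that has no group.

    Hypothetical helper: stops at unknown keys and guards against cycles.
    """
    seen = set()
    while key in nodes and nodes[key].group_as and key not in seen:
        seen.add(key)
        key = nodes[key].group_as
    return key


# Sample data, invented for illustration.
nodes = {
    "google.com": Node(url="google.com", name="Google"),
    "youtube.com": Node(url="youtube.com", name="YouTube",
                        group_as="google.com", group_because="manual"),
}
```

Calling `resolve_group(nodes, "youtube.com")` walks the `group_as` chain and ends at `"google.com"`; a key that is not in `nodes` is returned unchanged.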
| |
|
| |
| Link:
| |
| accessed: Node
| |
| referred_from: Node
| |
| timestamp:
| |
| visited:
| |
| datatypes: (from header)
| |
| type: (actually result)
| |
| link_type: (cookie, other)
| |
| requestType: (ajax, img src, script src, iframe, ...)
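To show why storing links with timestamps helps with querying changes over time, here is a minimal sketch. It models only a subset of the Link fields (visited, datatypes, and type are left out because their meaning is not settled in the notes), spells requestType as `request_type`, and the function `new_connections` is a hypothetical query, not an agreed API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Set, Tuple


@dataclass(frozen=True)
class Link:
    """One observed request between two nodes (subset of the Link notes)."""
    accessed: str        # key of the Node that was requested
    referred_from: str   # key of the Node the request came from
    timestamp: datetime
    link_type: str = "other"       # "cookie" or "other"
    request_type: str = "script"   # "ajax", "img", "script", "iframe", ...


def new_connections(links: Iterable[Link],
                    cutoff: datetime) -> Set[Tuple[str, str]]:
    """Return (referred_from, accessed) pairs first seen at or after cutoff."""
    links = list(links)
    before = {(l.referred_from, l.accessed) for l in links if l.timestamp < cutoff}
    after = {(l.referred_from, l.accessed) for l in links if l.timestamp >= cutoff}
    return after - before


# Sample data, invented for illustration.
links = [
    Link("ads.example", "news.example", datetime(2012, 5, 1)),
    Link("ads.example", "news.example", datetime(2012, 6, 2)),
    Link("tracker.example", "news.example", datetime(2012, 6, 3)),
]
```

With this sample data, `new_connections(links, datetime(2012, 6, 1))` reports only the `news.example` to `tracker.example` pair, since the `ads.example` connection already existed before the cutoff.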
| |
| | |
| == Strawman API ==
| |
| | |
| What kinds of things do we expect people to query?
| |
| | |
| What formats do we want to respond with (JSON only?)
| |
| | |
| What protections need to be built in? (How to prevent poisoning the well?)
| |
| | |
| How do we sanitize the data to prevent capturing personally identifying information?
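One possible starting point for the sanitization question is to keep only the host of each URL before it is stored, since credentials, paths, query strings, and fragments are where personal data tends to appear. The sketch below uses Python's standard `urllib.parse`; the function name `sanitize_url` and the host-only policy are assumptions for discussion, not a decided approach.

```python
from urllib.parse import urlsplit


def sanitize_url(raw: str) -> str:
    """Reduce a URL to its lowercase host name.

    Drops user info, port, path, query string, and fragment.
    Returns an empty string when no host can be parsed (for example,
    when the input has no scheme).
    """
    host = urlsplit(raw).hostname
    return host or ""
```

For example, `sanitize_url("https://user:pw@Example.com:8080/profile/jane?session=abc#top")` keeps only `example.com`. Host names alone can still be identifying for small or personal sites, so this would be a first filter rather than a complete answer.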
| |
| | |
| == Infrastructure ==
| |
| | |
| How much can be built on top of Mozilla metrics infrastructure?
| |
| | |
| Is it worth using a graph database?
| |