Privacy/Reviews/Firefox Home


Document Overview

Feature/Product: Firefox Home
Projected Feature Freeze Date: Cancelled
Product Champions: -
Privacy Champions: Sid Stamm
Security Contact: Michael Coates
Document State: [RESOLVED] Obsolete -- project with this design dropped


Timeline:

Architectural Overview: 27-April-2011 (crypto proxy); TBD (home server)
Recommendation Meeting: cancelled
Wrap-up Meeting: cancelled

Architecture

In this section, the product's architecture is described. Any individual components or actors are identified, their "knowledge" or what data they store is identified, and data flow between components and external entities is described.

The main objective of this feature/product is: (describe the goals of the feature/product here)

Design Documents: Link to any design or architectural documents here.

Feature Pages:

Components

Describe any major components in the system and how they interact. Also include any third-party APIs (those Mozilla does not control) and what type of data is sent or received via those APIs.

HomeDigraph.png

Crypto Proxy

This component connects to your sync account as a sync client and acts as a proxy that decrypts your data. See: Home/Features/crypto/proxy

The tables below simply summarize the data encountered by this component.

Stored Data:

What | Where
usernames + sync auth tokens (for accessing users' data) | server's db?


Communication with Sync Client (Firefox)

Direction | Message | Data | Notes
In: | createAccount() | username | Called by sync client when users elect to enable web access
Out: | createAccount() return | access token | token for obtaining user's key for tab/bookmark/history collections sent to sync client (given to home)


Communication with Sync Server

Direction | Message | Data | Notes
In: | sync() return | encrypted tabs/bookmarks/history | Called to get access to user's sync data
Out: | sync() call | access token + username | Called to obtain access to encrypted data (which will be decrypted and sent to Home Server)


Communication with Home Server

Direction | Message | Data | Notes
In: | sync() call | username + access token | Called by home to obtain user's sync data
Out: | sync() return | decrypted data | User's unencrypted sync data
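
The three message tables above can be read as one request path: createAccount() hands out a token, and sync() checks that token before fetching and decrypting. The following is only a toy sketch of that path in Python. The class name, the in-memory token store, and the `fetch_encrypted` and `decrypt` callables are all assumptions made for illustration; the real key handling and storage are not specified on this page.

```python
import hmac
import secrets


class CryptoProxy:
    """Toy model of the proxy's two calls; not the real protocol."""

    def __init__(self, fetch_encrypted, decrypt):
        self._tokens = {}  # username -> access token (hypothetical store)
        self._fetch_encrypted = fetch_encrypted  # stand-in for the Sync Server
        self._decrypt = decrypt  # stand-in for real decryption

    def create_account(self, username):
        """Issue an access token for a user who enabled web access."""
        token = secrets.token_urlsafe(32)
        self._tokens[username] = token
        return token

    def sync(self, username, token):
        """Return decrypted sync data if the username/token pair is valid."""
        expected = self._tokens.get(username)
        # Compare in constant time so a wrong token leaks no timing hints.
        if expected is None or not hmac.compare_digest(
            expected.encode("utf-8"), token.encode("utf-8")
        ):
            raise PermissionError("unknown user or bad token")
        return self._decrypt(self._fetch_encrypted(username))
```

In this sketch the caller supplies both stand-ins, for example `CryptoProxy(fetch_from_sync_server, decrypt_with_user_key)`, where both function names are again hypothetical.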

Home Web Servers

We will have stateless web servers that run the Home web application. These are standard web servers running Apache or NGINX to serve the Home web application.

These servers will likely be load balanced by Zeus.

These servers are meant to be stateless, so no user data will be stored on them. However, they might hold sensitive configuration settings, such as web service keys or tokens needed to connect to third-party services. These are not user-specific; they belong to the Home application.

(These external services have not been identified yet, but think about services like bit.ly.)

The tables below simply summarize the data encountered by this component.

Stored Data:

None, except probably configuration data.

Communication with MemCache Server

Direction | Message | Data | Notes
In: | Get | Web Session | The Web Session object.
Out: | Put | Web Session | The Web Session object.

Communication with Home Database Servers

Direction Message Data Notes
Select Get the user's (summarized) sync data
Out: Insert/Update User's Web App Settings/Prefs -

Home Database Servers

User data will be sharded over a number of database servers. We will use a simple hashing mechanism so that we can determine where a user's data lives based on, for example, their username.
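
As an illustration of the hashing idea, here is a minimal Python sketch. The shard names and the choice of SHA-1 with a modulo are assumptions; the page does not say which hash or how many servers will be used.

```python
import hashlib

# Hypothetical shard list; the real host names are not given in this document.
SHARDS = ["homedb0", "homedb1", "homedb2", "homedb3"]


def shard_for(username, shards=SHARDS):
    """Map a username to a database server with a stable hash."""
    # hashlib gives the same digest in every process, unlike the built-in
    # hash(), which is randomized per interpreter run for strings.
    digest = hashlib.sha1(username.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Note that a plain modulo like this remaps most users whenever the number of shards changes, so adding servers later would require a data migration or a different scheme.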

Each database will contain a plaintext version of the user's sync data. Initially that means bookmarks, history and tabs. The data will be normalized and indexed more thoroughly than it currently is on the Sync Servers, so that it is easier to query.

All data for all users will be stored in a single database. This means that all records have a unique username or userid field to connect them to a specific user. Queries will have to be properly constructed to follow this.
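
A minimal sketch of what "properly constructed" could mean in practice: every query filters on the username and binds it as a parameter. The example uses Python's built-in sqlite3 module purely as a stand-in for MySQL, and the table and column names are hypothetical.

```python
import sqlite3


def bookmarks_for(conn, username):
    """Return only the rows that belong to the given user."""
    # The username is bound as a parameter, never formatted into the SQL.
    cur = conn.execute(
        "SELECT url, title FROM bookmarks WHERE username = ? ORDER BY url",
        (username,),
    )
    return cur.fetchall()


# Small in-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookmarks (username TEXT, url TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO bookmarks VALUES (?, ?, ?)",
    [
        ("alice", "https://example.org/a", "A"),
        ("bob", "https://example.org/b", "B"),
    ],
)
```

Routing all data access through helpers like this, rather than writing ad hoc queries in the front-end code, would make it harder to forget the per-user filter.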

(We could probably also switch to one database per user, which would give a more logical separation between users' data. However, that does not rule out bugs in the front-end code that expose other users' data, of course.)

One thing we will probably do is run some queries offline. For example we can periodically 'calculate' a list of your top sites and store that in a database table too.
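
For example, an offline "top sites" job might look roughly like the Python sketch below. Counting visits per host is an assumption; the page does not define how top sites would actually be ranked.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_sites(history_urls, limit=5):
    """Count visits per host and return the most visited hosts."""
    counts = Counter()
    for url in history_urls:
        host = urlsplit(url).hostname
        if host:  # skip entries without a host, e.g. about: pages
            counts[host] += 1
    return [host for host, _ in counts.most_common(limit)]
```

The result would then be written to its own table so the web app can read it without scanning the full history.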

The tables below simply summarize the data encountered by this component.

Stored Data:

What | Where
User's Bookmarks (Sync Data) | MySQL Database
User's History (Sync Data) | MySQL Database
User's Tabs (Sync Data) | MySQL Database
User specific settings/prefs for Firefox Home | MySQL Database
User access token for the Crypto Proxy | MySQL Database

Communication with other components or services

The database servers will periodically run a job to schedule a Sync operation for those users that are active users of Firefox Home. These jobs are submitted to a RabbitMQ server and picked up by the 'Syncer' component. These tasks only contain the username to be synced.
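
The scheduling job can be sketched as follows. Python's `queue.Queue` stands in for the RabbitMQ queue, and the JSON task format is an assumption; the only property taken from the design is that a task carries nothing but the username.

```python
import json
import queue


def schedule_sync_tasks(active_usernames, task_queue):
    """Enqueue one sync task per active user; return the number queued."""
    count = 0
    for username in active_usernames:
        # The task body carries only the username: no token, no sync data.
        task_queue.put(json.dumps({"username": username}))
        count += 1
    return count
```

Keeping tokens out of the task body means that anyone who can read the queue learns which users are active, but cannot use a task to fetch their data.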

(Idea: Many users try out a new service and then forget about it. We could proactively delete users' data when they do not use Firefox Home for a certain period of time. Note that in the first couple of releases of Home there will not be any user-generated data, just a copy of your existing Sync Data. So this is less scary than it sounds.)

Home Memcache Servers

The memcache servers are used to cache frequently used data to make the web app as responsive as possible. Initially just Web Application session objects are stored in memcache. These sessions are Python objects that contain user specific cached data.

(Not sure what will actually be in there. Possibly fragments of JSON or HTML, or lists of things that we generate from your bookmarks & history.)

Stored Data:

What | Where
Web Application Session | MemCache Server

Home Syncer

The 'Syncer' is a component that implements a sync client. It listens to a RabbitMQ queue to grab sync tasks and runs sync sessions.

The tasks that it gets from RabbitMQ contain just the username. This means that the Syncer will have to access the Home Database Servers to obtain the access token for the Crypto Proxy.

It will then run a sync session for the specific user against the Crypto Proxy and store the synced data (bookmarks, tabs, history) in the Home Database Servers.
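
Put together, handling one task might look like the Python sketch below. The dictionaries stand in for the Home Database Servers, `proxy_sync` stands in for the Crypto Proxy's sync() call, and the JSON task format is an assumption; none of these names come from the actual design.

```python
import json


def run_sync_task(task_body, tokens, proxy_sync, store):
    """Process one task: look up the token, sync via the proxy, store the result."""
    username = json.loads(task_body)["username"]
    # The task holds only the username, so the access token comes from the
    # Home database (modelled here as a dict).
    token = tokens[username]
    data = proxy_sync(username, token)  # decrypted bookmarks, history, tabs
    store[username] = data
    return data


def fake_proxy_sync(username, token):
    # Hypothetical stand-in for the Crypto Proxy's sync() call.
    if token != "tok-alice":
        raise PermissionError("bad token")
    return {"bookmarks": ["https://example.org/"], "history": [], "tabs": []}
```

The Syncer itself keeps nothing between tasks, which matches the "Stored Data: None" entry below.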

Stored Data:

None

Communication with Home Database Servers

Direction Message Data Notes
Select Get User's Proxy Access Token (Access Token)
Insert/Update Update the Sync Data (Bookmarks, History, Tabs)

Communication with Crypto Proxy

Direction | Message | Data | Notes
In: | sync() return | unencrypted tabs/bookmarks/history | Called to get access to user's sync data
Out: | sync() call | access token + username | Called to obtain access to sync data

User Data Risk Minimization

In this section, the privacy champion will identify areas of user data risk and recommendations for minimizing the risk.

Alignment with Privacy Operating Principles

In this section, the privacy champion will identify how the feature lines up with Mozilla's privacy operating principles.

See Also: Privacy/Roadmap_2011#Operating_Principles:

Principle: Transparency / No Surprises: (How the feature addresses this)

Recommendations: (what can be improved)


Principle: Real Choice:

Recommendations:


Principle: Sensible Defaults:

Recommendations:


Principle: Limited Data:

Recommendations:


Follow-up Tasks and tracking

What | Who | Bug | Details
[DONE] Initial Overview Discussion | Stuart, rnewman, Stefan, Sid, Alex, secteam, infrasec | | Meeting: 26-April-2011
[ON TRACK] Finish documenting system, produce recommendations | Sid, Home Team, Privacy | | In progress
[NEW] Discuss privacy recommendations | Home team + Privacy | | Meeting time TBD