Identity/AttachedServices/Architecture


Overview

Profile In The Cloud (PICL) is a mechanism for synchronizing browser state between a user's various devices. The user attaches a given local profile to a remote account by "logging into their browser", which then uploads and downloads data as necessary to bring the local profile into harmony with the server-held data. Possible PICL services include: bookmarks/history/tabs/passwords backup/syncing, social API preferences, sharing providers, WebRTC bridge provider, file-storage service, etc.

Architectural Overview

There are roughly five areas of concern in the PICL system:

  • 1: Signup/Signin: How does the user attach a new device to their account? This area involves passwords, usernames, email addresses, recovery options, revocation, and device management.
  • 2: Conversion: How do we extract (and inject) data from the various native data sources (PlacesDB for bookmarks and history, the Password Manager, etc.)? This data should be converted into a neutral format so that the synchronization code doesn't need to know the details of each source. This code must also merge conflicting data when necessary.
  • 3: Synchronization: the neutral data must be encrypted, signed, batched, and delivered to/from a storage server. This process must tolerate dropped messages, interrupted connections, overload conditions, and arbitrarily-long periods of server unreachability.
  • 4: Storage Server Authorization: The browser code must prove to the Storage Server that it has a right to read/write the encrypted records.
  • 5: Storage Server Format: The storage server must store large quantities of data reliably, and provide fast access.

Architecture Map

This document describes our current plans for these five areas. The https://id.etherpad.mozilla.org/picl-backend etherpad page contains links to other design documents.

Data Security

We are exploring various models for data security: https://wiki.mozilla.org/Identity/CryptoIdeas/03-ID-Attached-Data

The user will have a single "PICL Password", which they must type into the browser during the sign-in process. The user's browser proves (to the Key Server) that it knows this password. From this, it obtains data-encryption keys and a signed certificate that authorizes Storage Server reads and writes. No server learns this password directly: the closest any of them comes is the Key Server, which receives a stretched "verifier" (which only enables a brute-force dictionary attack).

User data is stored in one of two categories. Anything put in the "Class-A" category can be recovered as long as the user can still access their email (i.e. get a Persona assertion for it from their IdP), and consequently is also technically retrievable by the operators of that IdP and the Key Server (or someone who compromises either). Data put in the "Class-B" category requires the PICL password to retrieve: it cannot be recovered when the password is forgotten, but (if the password is well-chosen) cannot be retrieved by the IdP or any other server-side attackers.

We do not yet know which data will be assigned to which category by default, but it is likely that saved passwords will go into Class-B, and many other datatypes will default to Class-A. There will be an option to put all data into Class-B.

Sign-Up / Sign-In

Attaching a profile to an account is called "Sign In To The Browser". The UI for this is still under discussion, but will involve the user typing an email address and a password into the browser's own chrome UI (for new-account creation, signing into an existing account, and password reset). This password will be stretched on the client side (using techniques from Identity/CryptoIdeas/01-PBKDF-scrypt) and used to generate an "SRP password" and a wrapping key (using techniques from Identity/CryptoIdeas/02-Recoverable-Keywrapping).
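The client-side stretching step can be illustrated with a minimal sketch. It is not the scheme from the CryptoIdeas pages: it substitutes plain PBKDF2-HMAC-SHA256 for the PBKDF+scrypt combination, and the function names, iteration count, and HKDF "info" labels are all assumptions made for illustration.

```python
import hashlib
import hmac


def hkdf_sha256(key_material: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256 and an all-zero salt: extract, then expand."""
    prk = hmac.new(b"\x00" * 32, key_material, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def stretch_password(email: str, password: str, iterations: int = 20_000) -> bytes:
    # Assumption: the email address serves as the salt, so the same password
    # on two accounts stretches to different values.
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), email.encode("utf-8"), iterations, dklen=32
    )


def derive_login_secrets(email: str, password: str):
    """Return (srp_password, wrapping_key), two independent 32-byte values."""
    stretched = stretch_password(email, password)
    # Hypothetical labels; distinct "info" strings keep the two outputs independent.
    srp_password = hkdf_sha256(stretched, b"picl-sketch/srp-password")
    wrapping_key = hkdf_sha256(stretched, b"picl-sketch/wrapping-key")
    return srp_password, wrapping_key
```

The point of the two-output derivation is that the value sent into the SRP exchange and the key used for wrapping cannot be computed from each other, only from the stretched password.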

The SRP Password is then used in a protocol (under development) to speak with the Key Server. SRP is an interactive "zero-knowledge" protocol which gives the participants exactly one chance to show that they agree on a password. The outcome of SRP is a random session key: if the password was correct, both sides will wind up with the same key (otherwise their keys will be different). This session key is used to protect and authenticate some additional messages, which are used to retrieve the class-A and class-B master data-encryption keys, and a "certificate renewal token". This token allows the browser to obtain a signed certificate for a special "PICL Account" identifier (e.g. GUID@picl.persona.org). These certificates will be used for Persona/BrowserID authentication to the storage servers (described below).

The class-B master key is encrypted by a derivative of the stretched user password. The master keys are then used to derive per-datatype encryption keys. We use different keys for each datatype so that in the future, we can share e.g. bookmarks with a third party (by telling them the decryption key) without also sharing e.g. stored-passwords.
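As a rough illustration of per-datatype keys, the sketch below derives one key per datatype from a master key using the HKDF-Expand step of RFC 5869. The choice of HKDF, the label prefix, and the function names are assumptions, not part of the PICL design.

```python
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand (RFC 5869) with SHA-256; prk must already be a uniformly random key."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def datatype_key(master_key: bytes, datatype: str) -> bytes:
    # Hypothetical label format: the datatype name is the only varying input.
    return hkdf_expand(master_key, b"picl-sketch/datatype/" + datatype.encode("utf-8"))
```

Because each derived key is a one-way function of the master key, handing `datatype_key(master, "bookmarks")` to a third party reveals neither the master key nor the key for `"passwords"`.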

The KeyServer/PiCL-IdP is a small server which holds a few values for each user: email, SRP verifier, and kA/wrapped(kB). This server also keeps track of which devices have been attached to the account (to help the user with device management and revocation).

If the user forgets their password, they can reset the account (and establish a new password) by providing a Persona assertion for their account's email address. The class-B data is deleted, but the class-A data is retained.
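The per-user record and the reset rule can be sketched as follows. The field names, the `reset_account` helper, and the idea that the client supplies a freshly wrapped kB during reset are assumptions. Verifying the Persona assertion is out of scope here, so the function takes the already-verified email address as input.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class AccountRecord:
    email: str
    srp_verifier: bytes
    kA: bytes             # class-A master key, recoverable by the server
    wrapped_kB: bytes     # class-B master key, wrapped under a password-derived key
    devices: List[str] = field(default_factory=list)


def reset_account(record: AccountRecord, verified_email: str,
                  new_srp_verifier: bytes, new_wrapped_kB: bytes) -> AccountRecord:
    """Establish a new password after the user proves control of the email address.

    kA is kept, so class-A data stays readable. The old wrapped kB is replaced,
    so anything encrypted under the old kB is unrecoverable and should be deleted.
    """
    if verified_email != record.email:
        raise PermissionError("assertion is not for this account's email address")
    return replace(record, srp_verifier=new_srp_verifier, wrapped_kB=new_wrapped_kB)
```

This sketch leaves the device list untouched; whether a reset should also detach devices is not specified above.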

Conversion / Data Adapters

Synchronization

We've developed the Delta-Sync protocol for getting full sets of encrypted key-value records from browser to server and back again.

However, as of 04-Jun-2013 we are investigating a scheme named "Queue-Sync" for uploading batched change records to the server and merging downstream records back into the local datastore. When compared to Delta-Sync, we expect Queue-Sync to:

  • avoid expensive full-dataset hashes to compute revision identifiers (but also give up some full-dataset integrity guarantees)
  • handle "re-sync" more naturally (which occurs at initial connection, and later when either server or browser falls behind)
  • avoid keeping a full shadow copy on the browser
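Since Queue-Sync is still under investigation, the following is only a sketch of the general idea of queued, batched change records, not the actual protocol. The class, the `send_batch` callback, and the record shape are hypothetical, and encryption and conflict merging are omitted.

```python
from collections import deque
from itertools import islice
from typing import Callable, Deque, Dict, List, Optional, Tuple

# A change record: (key, new value), with None meaning "delete this key".
Change = Tuple[str, Optional[str]]


class OutboundQueue:
    """Holds local changes until the server has accepted them."""

    def __init__(self, send_batch: Callable[[List[Change]], bool], batch_size: int = 2):
        self._pending: Deque[Change] = deque()
        self._send_batch = send_batch
        self._batch_size = batch_size

    def __len__(self) -> int:
        return len(self._pending)

    def record(self, key: str, value: Optional[str]) -> None:
        self._pending.append((key, value))

    def flush(self) -> int:
        """Upload pending changes in order; stop at the first failure. Returns the count sent."""
        sent = 0
        while self._pending:
            batch = list(islice(self._pending, self._batch_size))
            try:
                accepted = self._send_batch(batch)
            except ConnectionError:
                accepted = False
            if not accepted:
                break  # keep the batch queued and retry on a later flush
            for _ in batch:
                self._pending.popleft()
            sent += len(batch)
        return sent


def apply_downstream(store: Dict[str, str], changes: List[Change]) -> None:
    """Apply change records from the server to the local datastore, in order."""
    for key, value in changes:
        if value is None:
            store.pop(key, None)
        else:
            store[key] = value
```

Only the not-yet-acknowledged changes are kept on the browser, rather than a full shadow copy, and a failed upload leaves the queue intact for the next attempt.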

Storage Server Authorization

The browser will speak Queue-Sync to the Storage Server. A Persona (BrowserID) assertion for the "PICL Account Identifier" (e.g. GUID@picl.persona.org) is what allows the browser to read and write its encrypted Queue-Sync records.

This assertion must be verified with the usual public-key signature checks and .well-known lookup process. For performance, the Storage Server will only verify it once, then exchange it for a token that is easier to validate (either a nonce that maps to the validated account identifier and expiration time, or an encrypted/HMACed copy of the session data). Subsequent requests will be authorized by the token.
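The HMACed-token option might look like the sketch below. It assumes the assertion has already been verified and only shows issuing and checking the cheaper token. The token layout, field names, and function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def issue_token(account_id: str, secret: bytes, ttl_seconds: int = 3600,
                now: Optional[float] = None) -> str:
    """Call only after the Persona assertion for account_id has been fully verified."""
    issued_at = time.time() if now is None else now
    payload = json.dumps({"acct": account_id, "exp": int(issued_at + ttl_seconds)}).encode("utf-8")
    tag = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode("ascii") + "."
            + base64.urlsafe_b64encode(tag).decode("ascii"))


def check_token(token: str, secret: bytes, now: Optional[float] = None) -> Optional[str]:
    """Return the account identifier if the token is authentic and unexpired, else None."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    data = json.loads(payload)
    current = time.time() if now is None else now
    if data["exp"] <= current:
        return None
    return data["acct"]
```

Checking such a token costs one HMAC computation and no network lookups, which is the performance motivation given above. The nonce-based alternative would instead store the account identifier and expiration time server-side, keyed by a random value.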

An initial draft of the storage-server protocol is here.

Storage Server Format

(OLD) Authentication

Authentication to PICL Services is done via Persona. This means that a browser needs to be natively logged into Persona, so that it can generate the Persona assertions it needs to connect to individual services without user intervention every time the browser reconnects to an existing service. Specifically, if a PICL service runs at https://bookmarks.example.com, the browser gets an assertion for that audience, without prompting the user every time it needs one.

The flow for logging into the browser is more user-agent centric than the typical Persona sign-in-to-web flow. Redirecting to an IdP is too jarring. Thus, even if we allow different IdPs, the login UI must be consistent and feel like it's part of the browser.

These requirements (and the next Data Security section) call for a design where the browser locally captures the user's email and password, then engages in a protocol with the IdP – persona.org or otherwise – to obtain a certificate. One way to implement this immediately is to embed the invisible persona.org communication IFRAME and call into its internal API, which we recently augmented to include login() and accountExists() calls to support this implementation path.

[Figure: Iframe-login-embedding.png]