Identity/CryptoIdeas/03-ID-Attached-Data: Difference between revisions

Revision as of 03:33, 6 February 2013

ID-Attached Data

  • Brian Warner, Chris Karlof, 05-Feb-2013

Summary: a design to extend the ideas in BrowserID Key Wrapping and Identity/CryptoIdeas/02-Recoverable-Keywrapping to include three classes of data (recoverable-by-assertion, recoverable-by-password, paired-device-key-only). A new "Key Server" is introduced, which holds (wrapped) encryption keys for the user, so Storage Servers only hold ciphertext encrypted with a full-strength per-user per-service key. Clients who wish to share data with storage servers can either reveal their per-service key, or store plaintext. Revocation is discussed.

Data Protection Classes

We describe three classes of data, providing a spectrum of confidentiality and recoverability. All data is encrypted with a full-strength key, but this key is made available (or not) to different people depending upon which class the data is in:

  • class-A "available": this key is available to anyone who can produce a valid BrowserID assertion. That includes the end user, their IdP, and anyone who can spoof whatever the IdP uses to identify the user (typically an email challenge, so this includes network attackers who can redirect or snoop on SMTP traffic). Any user who can still convince their IdP to let them log in will be able to recover their class-A data, without remembering any other secrets or having any devices remaining.
  • class-B "brute-forceable": this key is wrapped with a (derivative of a) user-memorized "master password". A limited set of parties (anyone who can read the class-A data) will have enough information to attempt a brute-force attack on the password, but storage servers and the rest of the world will not. The user must remember their master password and be able to log into their IdP to get this data back.
  • class-C "confidential": this key is created by the user's first client, and transferred to their other clients with a pairing protocol (PAKE), facilitated by a central server but not vulnerable to it. No external parties will be able to read this data. The user must have at least one functional paired device (or a manual key backup) to recover this data.
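The recovery requirements of the three classes can be summarized as a small lookup table. This is only a sketch: the names `DataClass`, `Needs`, and `recoverable` are hypothetical, and treating class-C recovery as independent of IdP login is a reading of the bullet above, not something the design states outright.

```python
from dataclasses import dataclass
from enum import Enum


class DataClass(Enum):
    A = "available"
    B = "brute-forceable"
    C = "confidential"


@dataclass(frozen=True)
class Needs:
    idp_login: bool        # user can still produce a BrowserID assertion
    master_password: bool  # user remembers the master password
    paired_device: bool    # a paired device (or manual key backup) survives


# Assumption: this table paraphrases the three bullets above.
NEEDS = {
    DataClass.A: Needs(idp_login=True, master_password=False, paired_device=False),
    DataClass.B: Needs(idp_login=True, master_password=True, paired_device=False),
    DataClass.C: Needs(idp_login=False, master_password=False, paired_device=True),
}


def recoverable(cls, *, idp_login, master_password, paired_device):
    """Return True if the user still holds everything this class requires."""
    need = NEEDS[cls]
    return ((idp_login or not need.idp_login)
            and (master_password or not need.master_password)
            and (paired_device or not need.paired_device))
```

For example, a user who has forgotten their master password but can still log into their IdP keeps class-A data and loses class-B data.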

The idea is that users can choose which browser data goes into each class. Sensible defaults would probably put Password Manager data into class-B and bookmarks into class-A, but users should have the option of putting everything into class-C if they like (to behave like current Firefox Sync).

User Options

If users never want to use class-B data, they should not be required to come up with a master password. If they never use class-C data, they should not be required to pair new devices with existing ones.

The system should provide some way to revoke access from stolen devices. This revocation may not be immediate.

The user should be able to grant access to subsets of encrypted data to any service they please, without also granting access to all their encrypted data.

Big Picture

Web content from various domains, as well as internal services (addons with synthetic "resource://" domains), will get an API (maybe "navigator.id.data"?) with which they can obtain keys and tokens, encrypt plaintext, and decrypt ciphertext. Web content can only obtain keys/tokens when the user has signed into that domain with BrowserID (to prevent unauthorized linkability). The API provides separate keys/tokens for the three classes of data (kA, kB, kC).

[Figure: Big Picture]

When the API is used for the first time, the browser needs to create an account or connect to an existing account. It prompts the user to provide an email address and obtains a BrowserID certificate for that address. It then presents an assertion to the Key Server to see if the account already exists, creating it if necessary. The Key Server will create a random "kA" and return it to the browser (this message is only protected by TLS, as we have no other shared secrets to work with). Subsequent accesses will use the same process: submit an assertion, get back "kA".
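The get-or-create behaviour described above might look like the following on the Key Server side. This is a minimal in-memory sketch: `KeyServer`, `get_kA`, and the `verify_assertion` callback are hypothetical names, and real BrowserID assertion verification is left entirely to that callback.

```python
import secrets


class KeyServer:
    def __init__(self, verify_assertion):
        # Hypothetical callback: returns the verified email address,
        # or raises if the assertion is invalid or expired.
        self._verify = verify_assertion
        self._kA = {}  # email -> 32-byte random kA

    def get_kA(self, assertion):
        """Create the account on first use, then always return the same kA."""
        email = self._verify(assertion)
        if email not in self._kA:
            self._kA[email] = secrets.token_bytes(32)
        return self._kA[email]
```

The returned key would travel back to the browser over TLS only, as noted above, since no other shared secret exists at this point.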

[Figure: Setup With Assertion]

If class-B keys are desired, the browser will ask the user for a master password. This password will be stretched using the PBKDF2/scrypt/PBKDF2 scheme defined in Identity/CryptoIdeas/01-PBKDF-scrypt to obtain a "master key". This key is then used to derive some additional keys, which are used in a "key retrieval" step (probably using SRP) to safely obtain a wrapped copy of "kB", which is then unwrapped with a different derivative of the master key. The key retrieval step can use shared data to prevent eavesdroppers (even those who break TLS) from learning anything about the password or kB.
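A rough sketch of the client-side key handling follows. Everything specific here is an assumption made for illustration: the iteration counts and scrypt parameters are placeholders (the real ones belong to Identity/CryptoIdeas/01-PBKDF-scrypt), the way the three stages are chained is simplified, the subkey labels are invented, and XOR stands in for whatever key-wrapping scheme is eventually chosen. The SRP exchange itself is not shown.

```python
import hashlib
import hmac
import secrets


def stretch(password: str, email: str) -> bytes:
    """PBKDF2 -> scrypt -> PBKDF2 chain with placeholder parameters."""
    salt = email.encode("utf-8")
    k1 = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 1000)
    k2 = hashlib.scrypt(k1, salt=salt, n=2**10, r=8, p=1, dklen=32)
    return hashlib.pbkdf2_hmac("sha256", k2, salt, 1000)


def subkey(master_key: bytes, purpose: str) -> bytes:
    """Derive an independent 32-byte key for one purpose (labels are hypothetical)."""
    return hmac.new(master_key, purpose.encode("utf-8"), hashlib.sha256).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


mk = stretch("correct horse battery", "user@example.com")
retrieval_key = subkey(mk, "key-retrieval")  # would feed the SRP-style step
unwrap_key = subkey(mk, "kB-unwrap")         # never sent to the server

kB = secrets.token_bytes(32)
wrapped_kB = xor(kB, unwrap_key)             # what the Key Server would hold
recovered_kB = xor(wrapped_kB, unwrap_key)   # what the browser ends up with
```

Deriving the retrieval key and the unwrapping key separately means that whatever the server learns during retrieval does not directly reveal the key that unwraps kB.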

[Figure: Setup With Password]

For each class of data, the API provides both a raw encryption key "kA[domain]", and a "tokenA[domain]", both of which are scoped to the user and the domain which served the code that invoked the API. The recommended way to manage the ciphertext is as follows:

  • deliver a BrowserID Assertion and the token to the storage server
  • the storage server records a database row with the assertion's email address, the token, and a slot where ciphertext will be stored
  • discard the assertion. The API retains kA/kB/kC and thus the ability to regenerate the token and encryption keys.
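One way the API could produce the per-domain key and token from a class key is keyed hashing with distinct labels. The design does not specify a derivation function, so the use of HMAC-SHA256 and the label format below are assumptions; `domain_key` and `domain_token` are hypothetical names.

```python
import hashlib
import hmac


def _derive(class_key: bytes, purpose: str, domain: str) -> bytes:
    # Assumed label format; any unambiguous encoding would do.
    label = f"{purpose}:{domain}".encode("utf-8")
    return hmac.new(class_key, label, hashlib.sha256).digest()


def domain_key(class_key: bytes, domain: str) -> bytes:
    """Encryption key such as kA[domain]; never leaves the browser."""
    return _derive(class_key, "key", domain)


def domain_token(class_key: bytes, domain: str) -> bytes:
    """Access token such as tokenA[domain]; handed to the storage server."""
    return _derive(class_key, "token", domain)
```

Because both values are deterministic functions of the class key and the domain, the browser can regenerate them on request and application code has no need to store them.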

[Figure: Provisioning]

Later, when the browser code wants to fetch or modify the data, it submits the retained token to the storage server to prove its right to access that data, and encrypts or decrypts the data with the key. The assertion (which has long-since expired) is not needed to access the storage-server data; the token is sufficient.
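On the storage-server side, provisioning and later token-only access could be sketched as below. The class and method names are hypothetical, storage is an in-memory list, and assertion verification is delegated to an assumed `verify_assertion` callback.

```python
import hmac
from dataclasses import dataclass


@dataclass
class Row:
    email: str
    token: bytes
    ciphertext: bytes = b""


class StorageServer:
    def __init__(self, verify_assertion):
        self._verify = verify_assertion  # hypothetical: returns verified email
        self._rows = []

    def provision(self, assertion, token):
        """Setup: the assertion is checked once, then only the token is kept."""
        email = self._verify(assertion)
        self._rows.append(Row(email=email, token=token))

    def _row_for(self, token):
        for row in self._rows:
            # Constant-time comparison avoids leaking token bytes via timing.
            if hmac.compare_digest(row.token, token):
                return row
        raise PermissionError("unknown token")

    def put(self, token, ciphertext):
        self._row_for(token).ciphertext = ciphertext

    def get(self, token):
        return self._row_for(token).ciphertext
```

The server only ever sees ciphertext and the token; it never receives the encryption key unless the client chooses to reveal it.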

[Figure: Use]

Storage Servers distinguish between data stored/encrypted under different classes: they should not confuse class-A data with class-B data. If it makes sense to allow multiple classes in a single service (perhaps the service holds bookmark data, and the user can choose which class they want), the server API will need to be able to list or discover what data is available. It's probably best to require a class-A token to make this query, to protect user privacy.

Storage Servers should reset their tokens, or delete user data entirely, upon receipt of a valid BrowserID Assertion with a matching email address. This allows less-verified parties to delete user data, but also allows users to regain control of their data (or delete it) even after they forget their password or lose their paired device keys.
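The assertion-driven reset could be sketched as a single function over the server's rows. The function name, the `rows` dictionary keyed by email, and the `keep_ciphertext` flag are all hypothetical; the design leaves open whether a reset deletes the data or only replaces the token.

```python
def reset_with_assertion(rows, verify_assertion, assertion, new_token,
                         keep_ciphertext=False):
    """Replace the token for the asserted email; optionally drop the data.

    rows maps email -> {"token": bytes, "ciphertext": bytes}.
    verify_assertion is an assumed callback returning the verified email.
    """
    email = verify_assertion(assertion)
    row = rows.setdefault(email, {"token": new_token, "ciphertext": b""})
    row["token"] = new_token
    if not keep_ciphertext:
        row["ciphertext"] = b""  # old data may be unreadable to the user anyway
    return row
```

After a reset the old token no longer matches, so anyone holding only the old token loses access even if the ciphertext is kept.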

No Key Server Certificates

The Key Server does not issue certificates. Storage Servers rely on tokens (derived from browser keys which can be held for long periods of time) for most access, and accept IdP-derived BrowserID certificates for setup, recovery, and teardown. This keeps the Key Server small (no public keys to publish and rotate, fewer asymmetric crypto operations, less confusion about expiration times which wouldn't match the IdP's choices). It also affects revocation.

IdP certificates have expiration times under the control of the IdP, and are likely to expire well before a background-data-synchronization tool would want them to (requiring users to re-sign-in every few weeks would be a drag). So IdP certs aren't a good choice for storage-server access control. Using Key Server certs instead would allow us to have much longer-lived certificates, but it sounds like "forever until revoked" is the desired lifetime for this component. So using tokens instead of certificates is simpler and achieves the same goals.

Revocation

When an activated device is lost or stolen, users will want to revoke its access, and are likely to express this by changing their master password (if any), and/or by going to a Key Server control panel of some sort and hitting a "revoke devices" button. This will be implemented by changing kA/kB/kC, leaving the user to update their remaining devices with the new credentials. The Key Server will require a BrowserID assertion when performing this step.

Browsers are expected to discard MK (the password-derived master key) immediately after obtaining kB, to ensure that nothing is left in the browser that could let an attacker brute-force the master password without online help. But they will retain kA/kB/kC for a while, so that periodic data sync can continue to occur in the background without user intervention.

Browsers should check in with the Key Server every once in a while, and if they learn that kA has changed, they should immediately delete kA/kB/kC. Application code which uses the key-access API should not retain these tokens and keys, but instead allow the API to regenerate them upon request. This check should probably be fail-open (if the check cannot be made, retain the keys).
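A fail-open check-in might be sketched as follows. How the browser learns that kA has changed is not specified here; this sketch assumes the Key Server can report a fingerprint (a hash) of the current kA, and the names `KeyCache`, `check_in`, and `fingerprint` are hypothetical.

```python
import hashlib
import hmac


def fingerprint(key: bytes) -> bytes:
    return hashlib.sha256(b"kA-fingerprint" + key).digest()


class KeyCache:
    def __init__(self, kA, kB=None, kC=None):
        self.keys = {"kA": kA, "kB": kB, "kC": kC}
        self._kA_fingerprint = fingerprint(kA)

    def check_in(self, fetch_current_fingerprint):
        """Return True if the cached keys were wiped, False if they were kept."""
        try:
            current = fetch_current_fingerprint()
        except OSError:
            # Fail-open: the server is unreachable, so keep syncing with
            # the keys we have. ConnectionError and TimeoutError land here.
            return False
        if hmac.compare_digest(current, self._kA_fingerprint):
            return False
        self.keys.clear()  # kA changed: drop kA/kB/kC immediately
        return True
```

Fail-open keeps background sync working through network outages, at the cost that a stolen device kept offline never learns it has been revoked.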

Uncooperative browsers (stolen devices which are not given the opportunity to erase their memory) will still have these keys, and it'd be nice to revoke them. When the revoking client tells the Key Server to change kA/kB, it should also speak to all known Storage Servers (using either an assertion or the old token), to replace the access tokens with new ones, and to re-encrypt the data with the new key. There are fault conditions in which the client will not be able to use the old token (e.g. if a storage server is offline while the transition occurs, or the client crashes during the process), in which case it should use an assertion to replace the tokens and then delete the old encrypted data.
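The per-server rotation step, including the assertion fallback, could be sketched like this. The `server` object and its `get`, `replace`, and `reset` methods are hypothetical, as are the `reencrypt` and `make_assertion` callbacks; the sketch only shows the control flow, not the encryption.

```python
def rotate_on_server(server, old_token, new_token, reencrypt, make_assertion):
    """Move one storage server to the new token and key.

    Returns "rotated" if the data was re-encrypted under the new key, or
    "reset" if the old token was unusable and the old data was discarded.
    """
    try:
        old_ciphertext = server.get(old_token)
        server.replace(old_token, new_token, reencrypt(old_ciphertext))
        return "rotated"
    except PermissionError:
        # Old token no longer accepted (for example, an earlier attempt
        # crashed midway): fall back to an assertion, replace the token,
        # and let the server drop the old encrypted data.
        server.reset(make_assertion(), new_token)
        return "reset"
```

A client would run this once for every known Storage Server after telling the Key Server to change kA/kB.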

Using server-data-replacement for revocation isn't the most pleasant scheme. One alternative would have the Key Server issue certs which expire after some limited time, but which can be renewed by a token retained in the browser. When a client tells the Key Server to revoke other devices, this token is deleted, and the old device will eventually lose the ability to create certs that will be accepted by Storage Servers. However, uncooperative devices will still know valid kA/kB (so they could collude with a storage server or IdP to obtain ciphertext), which can only be handled with re-encryption. It would also require the Key Server to publish a public key (just like primary IdPs would), which would be a centralized attack target, and would require Storage Servers to do frequent public-key crypto operations (instead of cheap token checks). However, the similarities between a cert-issuing Key Server and a primary IdP (or the Persona secondary/fallback IdP itself) are worth exploring.

Pairing

TBD. Basic idea is to use Sync's JPAKE scheme, or the more modern SPAKE2/PAKE2 protocol, with the Key Server providing the rendezvous point, but require an email address and BrowserID assertion. Since the email uniquely identifies the channel, we can use a shorter code (perhaps 4 characters). In fact, since the assertion protects the channel against arbitrary attackers, we can probably use a short PIN (three or four digits): the only attackers are the user's IdP and the Key Server itself.

The Key Server also assists with device management: it can tell the user which devices have reported completing the pairing process, and which are in-progress. It can also avoid the situation where two devices both think they are the first to use class-C data and thus both responsible for creating kC (creating incompatible keys).
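The device-management role could be sketched as a small registry on the Key Server in which the first device to claim kC creation for an account wins and all later devices are told to pair instead. The names `PairingRegistry`, `claim_kC_creation`, and `report_paired` are hypothetical, state is kept in memory, and the callers are assumed to have already presented a valid assertion for `email`.

```python
class PairingRegistry:
    def __init__(self):
        self._creator = {}  # email -> device that may create kC
        self._status = {}   # email -> {device_id: "in-progress" | "paired"}

    def claim_kC_creation(self, email, device_id):
        """Return True only for the first device to claim; others must pair."""
        winner = self._creator.setdefault(email, device_id)
        self._status.setdefault(email, {}).setdefault(device_id, "in-progress")
        return winner == device_id

    def report_paired(self, email, device_id):
        """Record that a device now holds kC."""
        self._status.setdefault(email, {})[device_id] = "paired"

    def devices(self, email):
        """Snapshot of device states for a control panel."""
        return dict(self._status.get(email, {}))
```

The registry never sees kC itself; it only records which device is allowed to create it and which devices report having received it.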