- Brian Warner, 10-Sep-2012
For both categories, web content is given a way to encrypt user data for later retrieval by the same user on the same domain. Each (email, domain) tuple gets an independent encryption key. If web content stores only the ciphertext, and does not record the plaintext, then the user's data will remain confidential against all other users and against web content from other domains.
The user's access to this data is managed by an "account", which lives on an Account Server. We expect that this account server will also provide signed Persona certificates for BrowserID-based logins. This server will probably look a lot like the current persona.org server. Each account will have some kind of identifier (a username or master email address), a configured password, a set of "recovery email addresses" for use when the password is forgotten, and other stored data (described below).
The idea is that user agents with knowledge of the account ID and the password can get full control of the account: they get signed certificates, access to all wrapped keys, and can change the password. Users who forget their password but who can still receive email at the recovery address will be able to reset the password and regain control over most of the account (excluding the "Secure" wrapped key, described below).
"Recoverable" vs "Secure" Data
This proposal defines two categories of encrypted data:
- "Recoverable" data can be decrypted by anyone who controls the account. Using the password-reset process retains access to Recoverable data.
- "Secure" data can only be decrypted by someone who knows the account password. Using the password-reset process destroys access to the Secure data.
This allows users to decide between availability and confidentiality of their data, so they can put different data into different categories. For example, a Sync-like application could put bookmarks into the "Recoverable" category, but saved passwords and credit-card numbers into the "Secure" category.
The Recoverable key is stored in the clear on the account server (but only revealed to clients who demonstrate control over the account). The Secure key is wrapped by a derivation of the password, and only the wrapped form is stored on the account server, so the actual password is required to unwrap it. The raw password never leaves the client, nor does any non-stretched derivative.
The big-picture design, using SRP to protect the password-derived access keys, looks like this:
The password is sent through a key-stretching KDF (like the PBKDF/scrypt scheme described in 01-PBKDF-scrypt) and then HKDF to derive both the wrapping keys (PWK, MAC) and the server credentials (SRPpw). A setup step (which must be protected) delivers the SRP verifier to the server, after which all requests are protected by an SRP session. The two long-term user keys, SUK and RUK (for Secure and Recoverable data respectively), are randomly generated during setup. The SUK is wrapped by PWK/MAC into "WSUK", and then both WSUK and RUK are delivered to the server.
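The derivation and wrapping steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposal's actual construction: PBKDF2 alone stands in for the PBKDF/scrypt stretch, the HKDF `info` label and the iteration count are made up, and the XOR "cipher" is a toy stand-in for whatever wrapping cipher the real design would use. The function names are hypothetical, and the SRP verifier computation is omitted.

```python
import hashlib
import hmac
import os


def hkdf_sha256(ikm, info, length, salt=b""):
    """HKDF (RFC 5869) over HMAC-SHA256: extract, then expand."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_keys(account_id, password):
    """Stretch the password, then split HKDF output into PWK, MAC key, SRPpw."""
    # Assumption: PBKDF2 only, salted by the account ID, with an arbitrary
    # iteration count chosen so the example runs quickly.
    stretched = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), account_id.encode(), 20_000, dklen=32
    )
    okm = hkdf_sha256(stretched, b"example/keywrap-v1", 96)  # label is made up
    return okm[:32], okm[32:64], okm[64:]


def wrap_suk(suk, pwk, mac_key):
    """Encrypt the SUK under PWK and append an HMAC tag, giving the WSUK."""
    ct = bytes(a ^ b for a, b in zip(suk, pwk))  # toy cipher, illustration only
    return ct + hmac.new(mac_key, ct, hashlib.sha256).digest()


def unwrap_wsuk(wsuk, pwk, mac_key):
    """Check the HMAC tag, then decrypt the WSUK back into the SUK."""
    ct, tag = wsuk[:32], wsuk[32:]
    expected = hmac.new(mac_key, ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("WSUK integrity check failed")
    return bytes(a ^ b for a, b in zip(ct, pwk))


# Setup: both user keys are random; only the WSUK and RUK go to the server.
suk, ruk = os.urandom(32), os.urandom(32)
pwk, mac_key, srp_pw = derive_keys("user@example.com", "correct horse")
wsuk = wrap_suk(suk, pwk, mac_key)
```

A client that later fetches the WSUK inside a session would re-run `derive_keys` with the same account ID and password and call `unwrap_wsuk` to recover the SUK.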
Later, inside an SRP-protected session (which proves control over the account and protects both request and response from attackers), clients can request the WSUK, RUK, or a signed BrowserID certificate for any of their verified email addresses. The password-derived PWK/MAC is used to unwrap the WSUK to get the SUK.
When the user changes their password, the client needs to re-wrap the SUK and then replace both the SRP verifier and the WSUK. The RUK remains unchanged. All encrypted data remains available.
When the user exercises the password-recovery-by-email feature, the client will generate a new SUK, and replace both the SRP verifier and the WSUK. (In practice, the client will propose a new SRPv/WSUK to the server, and the server will stash the proposed values in a "staging" database until the challenge email is answered.) The Secure data is lost. The RUK remains the same, and the Recoverable data remains available.
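The difference between a password change and a password reset can be sketched with a small in-memory model. Everything here is hypothetical scaffolding: the `Account` record, the function names, the single PBKDF2 call standing in for the stretch-plus-HKDF step, the toy XOR cipher, and the hash used as a placeholder where a real SRP verifier would go.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Optional, Tuple


def derive(account_id, password):
    # Stand-in for stretch + HKDF: returns (PWK, MAC key, SRPpw).
    okm = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), account_id.encode(), 10_000, dklen=96
    )
    return okm[:32], okm[32:64], okm[64:]


def wrap(suk, pwk, mac_key):
    ct = bytes(a ^ b for a, b in zip(suk, pwk))  # toy cipher, illustration only
    return ct + hmac.new(mac_key, ct, hashlib.sha256).digest()


def unwrap(wsuk, pwk, mac_key):
    ct, tag = wsuk[:32], wsuk[32:]
    if not hmac.compare_digest(tag, hmac.new(mac_key, ct, hashlib.sha256).digest()):
        raise ValueError("wrong password or corrupted WSUK")
    return bytes(a ^ b for a, b in zip(ct, pwk))


def credentials(account_id, password, suk):
    """Build the (verifier, WSUK) pair the client sends to the server."""
    pwk, mac_key, srp_pw = derive(account_id, password)
    verifier = hashlib.sha256(srp_pw).digest()  # placeholder, NOT a real SRP verifier
    return verifier, wrap(suk, pwk, mac_key)


@dataclass
class Account:  # hypothetical server-side record
    verifier: bytes
    wsuk: bytes
    ruk: bytes
    staged: Optional[Tuple[bytes, bytes]] = None  # proposed (verifier, WSUK)


def change_password(acct, account_id, old_pw, new_pw):
    pwk, mac_key, _ = derive(account_id, old_pw)
    suk = unwrap(acct.wsuk, pwk, mac_key)  # the old password recovers the SUK
    acct.verifier, acct.wsuk = credentials(account_id, new_pw, suk)  # RUK untouched


def propose_reset(acct, account_id, new_pw):
    new_suk = os.urandom(32)  # the old SUK cannot be unwrapped without the old password
    acct.staged = credentials(account_id, new_pw, new_suk)
    return new_suk


def confirm_reset(acct):
    """Called once the challenge email has been answered."""
    acct.verifier, acct.wsuk = acct.staged
    acct.staged = None


aid = "user@example.com"
suk, ruk = os.urandom(32), os.urandom(32)
acct = Account(*credentials(aid, "old password", suk), ruk)

change_password(acct, aid, "old password", "new password")
suk_after_change = unwrap(acct.wsuk, *derive(aid, "new password")[:2])

new_suk = propose_reset(acct, aid, "reset password")
confirm_reset(acct)
suk_after_reset = unwrap(acct.wsuk, *derive(aid, "reset password")[:2])
```

In this model the SUK survives `change_password` but is replaced by `confirm_reset`, while the RUK is never modified by either path.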
If we're willing to rely on SSL (i.e. pinned certificates), then SRP is unnecessary, and we can simplify the protocol slightly:
In this case, the "S1" token is already salted (by the AccountID) and stretched (by the KDF), so there's no value in adding more of either. We store H(S1) instead of S1 so that a DB compromise does not yield directly exploitable access tokens (the attacker must first brute-force the password, then re-derive S1). An eavesdropper would trivially learn S1, but we're relying on pinned SSL to prevent that.
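A minimal sketch of the server-side check in this non-SRP variant follows. It assumes S1 is simply the stretched password salted by the AccountID (the full derivation is not reproduced here), uses SHA-256 for H, and invents the `AccountServer` class and its method names for illustration.

```python
import hashlib
import hmac


def derive_s1(account_id, password):
    # Assumption: S1 is the stretched password, salted by the AccountID.
    # PBKDF2 and the iteration count are illustrative choices.
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode(), account_id.encode(), 20_000, dklen=32
    )


class AccountServer:  # hypothetical in-memory server
    def __init__(self):
        self._db = {}  # account_id -> H(S1); S1 itself is never stored

    def register(self, account_id, s1):
        self._db[account_id] = hashlib.sha256(s1).digest()

    def authenticate(self, account_id, s1):
        stored = self._db.get(account_id)
        if stored is None:
            return False
        # Hash the presented token and compare in constant time.
        return hmac.compare_digest(stored, hashlib.sha256(s1).digest())


server = AccountServer()
server.register("user@example.com", derive_s1("user@example.com", "hunter2"))
ok = server.authenticate("user@example.com", derive_s1("user@example.com", "hunter2"))
bad = server.authenticate("user@example.com", derive_s1("user@example.com", "hunter3"))
```

Someone who steals `_db` holds only H(S1) values, which the server will not accept as tokens; they still serve as an oracle for offline password guessing, each guess costing one full stretch.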
Access Reliance Sets, Attack Costs
The user (who remembers their AccountID and password) can read all the encrypted data, both Secure and Recoverable, from all domains. They use a BrowserID assertion to access the web site, then decrypt the data with a derived domain key.
The user who forgets their password, but retains control over their recovery email address, can still read the Recoverable data from all domains. They exercise the recovery process, establish a new password, retrieve the RUK, get an assertion, then decrypt the web site's stored Recoverable ciphertext.
The Account Server knows the bare RUK (otherwise it could not help users recover it with merely an email message), therefore anyone who compromises or extracts its secrets will be able to decrypt any Recoverable data. To get the ciphertext, the attacker would either have to steal it from the web site holding that ciphertext, or steal the Account Server's private BrowserID signing key and use it to forge an assertion (then ask the web site nicely like a normal user).
The Account Server (or someone who compromises it) can mount a brute-force attack to deduce the user's password (and thus access the Secure data), using one of the following as an oracle:
- the SRP verifier (or H(S1) in the non-SRP variant)
- the HMAC used as an integrity check on the encrypted WSUK
- a combination of the WSUK and an SDK (or a plaintext/ciphertext pair from any web site the user has logged into)
(We might omit the HMAC integrity check on the WSUK, to avoid providing this oracle, in the hope that SRP verifiers cost too much to create and that a plaintext/ciphertext pair is too hard to obtain. If we did this, corruption in the account server would not be detected until the user tried to decrypt data and failed.)
The cost of this brute-force attack is equal to the cost of a single guess (i.e. the key-stretching work factor) times the number of candidate passwords that must be searched, which grows exponentially with the entropy of the password distribution (roughly 2^H guesses for H bits of entropy).
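As a rough worked example of that product, the snippet below estimates the time to exhaust a password space. It assumes passwords are drawn uniformly from a space of 2^H candidates and that the attacker tests one guess at a time; the function name and the sample work factors are illustrative, not figures from this proposal.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def brute_force_years(entropy_bits, seconds_per_guess):
    """Worst-case time to try every password in a space of 2**entropy_bits."""
    guesses = 2 ** entropy_bits
    return guesses * seconds_per_guess / SECONDS_PER_YEAR


# Same 30-bit password, with and without one second of stretching per guess.
stretched = brute_force_years(30, 1.0)
unstretched = brute_force_years(30, 1e-6)
```

With these inputs the stretched case comes to about 34 years of sequential work and the unstretched case to under 20 minutes; an attacker succeeds after half the space on average, and parallel hardware divides the wall-clock time accordingly.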
No other brute-force attacks are feasible. Since the SUK and RUK are randomly-generated, derivatives (like the SDK and RDK for a particular domain) cannot be used to brute-force the cross-domain SUK/RUK: attackers would have to test 2^256 potential keys. For the same reason, a plaintext/ciphertext pair (available to any web site the user logs into) is insufficient to brute-force anything. From the point-of-view of a web site (which does not know the Account Server's secrets), the derived keys are full-strength 256-bit random values.
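One way the per-domain keys could be derived is sketched below. The derivation function is an assumption (HKDF-SHA256 with the category, email and domain encoded into the `info` field); `derive_domain_key` and its parameters are hypothetical names chosen for this example.

```python
import hashlib
import hmac
import json
import os


def hkdf_sha256(ikm, info, length, salt=b""):
    """HKDF (RFC 5869) over HMAC-SHA256: extract, then expand."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_domain_key(user_key, category, email, domain):
    """Derive a 256-bit key bound to one (email, domain) pair."""
    # JSON gives an unambiguous encoding of the three fields (an assumption;
    # any injective encoding would serve the same purpose).
    info = json.dumps([category, email, domain]).encode()
    return hkdf_sha256(user_key, info, 32)


suk, ruk = os.urandom(32), os.urandom(32)
sdk = derive_domain_key(suk, "secure", "user@example.com", "example.org")
rdk = derive_domain_key(ruk, "recoverable", "user@example.com", "example.org")
other_sdk = derive_domain_key(suk, "secure", "user@example.com", "example.net")
```

Because the input key is a uniformly random 256-bit value, a web site holding `sdk` learns nothing useful about `suk` or about `other_sdk`, which matches the argument above that derived keys give no brute-force foothold.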
The Account Server connection doesn't help the attacker either: an eavesdropper sees only the zero-knowledge-granting SRP conversation, or is blocked by a pinned SSL connection. (The SRP verifier must be protected in transit during setup, of course, and this is the weakest part of the system).
As a result, users who are willing to manage a strong password will get arbitrarily strong security for the encrypted data they put into the Secure category. The security of their Recoverable data will be limited by the security of the Account Server, and by the path traversed by email sent to their configured recovery address (any attacker who can read the recovery email, or cause the email to be delivered to a more-accessible mailbox, will be able to take over the account, but that won't help them learn the SUK or the user's password).
- Francois and I talked a bit about data migration: if a site moves to a new domain name, is it possible to bring the user's data along? I think it'd require an explicit authorization from the user on the old site, naming the new site, which doesn't sound very nice. Maybe some sort of .well-known on the old domain, to authorize new domains that should be allowed to get the same key? Tricky stuff. -warner 27-Sep-2012
- I'd really like to have an extension point that makes it easy for an addon to provide pairing-based no-password management of strong keys, like how Sync does it. The password-based scheme can be as strong as the password you're willing to manage, but we know most users won't use good passwords. A pairing-based scheme gives unconditional security despite user behavior, but doesn't offer password-based recovery or new-machine setup. We should enable addons to experiment with different approaches here. Specifically I'm thinking that an addon should be able to supply the "C" value in lieu of the password-based KDF. -warner 27-Sep-2012