Services/Sync/KeyRetrieval

From MozillaWiki

Revision as of 03:19, 27 September 2011

THIS PAGE IS A WORKING DRAFT
The page may be difficult to navigate, and some information on its subject might be incomplete and/or evolving rapidly.
If you have any questions or ideas, please add them as a new topic on the discussion page.

Goal

To allow a user to securely retrieve their sync key using only the username and password for their Mozilla Services account.

To stop showing scary hex character strings in the Sync UI.

Overview

Currently the sync key is never stored on Mozilla servers in any form; it only exists locally on each device connected to the sync account, plus in any backups explicitly made by users.

This provides some additional security for the user, but it comes at a cost. If the user accidentally deletes or loses their sync key, they permanently lose access to their synced data. Moreover, setting up a new device means transferring the sync key from an existing device by either:

  • using J-PAKE to establish an encrypted channel, or
  • manually copying the sync key character-by-character

Since both of these methods involve the user dealing directly with randomly-generated hex strings, they can be confusing or intimidating for many users.

If the user *opts in* to the key retrieval service, then their sync key will be encrypted using their account password and stored on Mozilla servers. Barring our deliberate snooping or cracking of the user's password, this means that the sync key cannot be read by Mozilla.

When setting up a new device, the sync key can be retrieved and decrypted with the user's account username and password, making for a much simpler UI at the cost of slightly decreased security.

If the user forgets or resets their password, the stored sync key becomes unreadable and must be uploaded again from a connected device. This is a feature: even if an attacker compromises the user's email and resets the password to gain control of the account, the attacker will not gain access to the existing sync data.

Since this scheme reduces the security of all the user's sync data to the security of their account password, it will be a completely opt-in service and will be disabled by default.

The encrypted sync key represents a particularly high-value target for an attacker, because:

  • it potentially allows access to *all* of the user's sync data, and
  • it will be encrypted using a relatively low-entropy key (the user's account password)

We therefore entrust its storage to a separate service from the main sync-storage service, so that it can be run from a high-security server.

Details

Server API

Since the server component is intended to run from a high-security restricted-access environment, it should be as simple and light-weight as possible. Hence, we provide the smallest and simplest API that could possibly work: you get a single blob of plain text data, keyed by your username, limited to 1 KB.

   GET https://retrieval-server-url/username
   => 404 Not Found
   PUT https://retrieval-server-url/username
   Content-Length: 11
   hello world
   => 201 Created
   GET https://retrieval-server-url/username
   => 200 OK
      Content-Length: 11
      Content-Type: text/plain
      hello world
   PUT https://retrieval-server-url/username
   Content-Length: 2048
   <mwuahahaha I store my warez on you>
   => 413 Request Entity Too Large
   DELETE https://retrieval-server-url/username
   => 204 No Content
   GET https://retrieval-server-url/username
   => 404 Not Found
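
The exchange above can be modelled with a small in-memory stand-in, which may help when reasoning about the size limit and the one-blob-per-user rule. This is only a sketch: the class name `RetrievalStore`, the choice of 201 and 204 as success codes, and the 404 on deleting a missing entry are assumptions made for illustration, not part of this proposal.

```python
MAX_BLOB_BYTES = 1024  # the 1 KB limit described above


class RetrievalStore:
    """In-memory stand-in for the retrieval service (hypothetical helper)."""

    def __init__(self):
        self._blobs = {}  # username -> stored bytes

    def get(self, username):
        """Return (status, body); 404 when nothing is stored."""
        if username not in self._blobs:
            return 404, None
        return 200, self._blobs[username]

    def put(self, username, body):
        """Store a single blob per username, rejecting oversized bodies."""
        if len(body) > MAX_BLOB_BYTES:
            return 413, None
        self._blobs[username] = body
        return 201, None  # assumption: success code for PUT

    def delete(self, username):
        """Remove the stored blob, if any."""
        if username not in self._blobs:
            return 404, None  # assumption: not specified above
        del self._blobs[username]
        return 204, None  # assumption: success code for DELETE
```

Note that a rejected oversized PUT leaves any previously stored blob untouched in this sketch.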

It's tempting to expand this API into something more generic, e.g. to provide multiple different keys for each username. But the less this service has to do, the less chance there is for something to go wrong.

If we can successfully bootstrap from the user's password into a strong crypto key, then anything else they might need to keep safe can be stored in standard sync storage with strong encryption.


Sync Key Encryption

Before uploading to the service, the client encrypts the sync key using its existing standard encryption routines. The encryption key is derived from the username and password using PBKDF2. The details that follow are just to explain the process - in the client code this should be a thin layer on top of existing methods such as Utils.deriveKeyFromPassphrase and CryptoWrapper.encrypt.

To encrypt the sync key for storage in the retrieval service, the client uses PBKDF2 to derive an appropriate encryption key from the user's account username and password:

   salt = get_random_bytes(16)
   enc_key = PBKDF2(username + password, salt, 4096, 32)
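
For illustration, the derivation step could look roughly like this using Python's standard library. The pseudocode does not name the PBKDF2 hash function or the text encoding, so HMAC-SHA256 and UTF-8 are assumptions here, as is the helper name `derive_enc_key`; the real client would go through Utils.deriveKeyFromPassphrase instead.

```python
import hashlib
import os

ITERATIONS = 4096  # iteration count from the pseudocode above
KEY_LENGTH = 32    # bytes, i.e. a 256-bit AES key


def derive_enc_key(username, password, salt):
    """Derive the encryption key from the account credentials.

    Assumption: PBKDF2 with HMAC-SHA256 as the PRF and UTF-8 encoded input.
    """
    secret = (username + password).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", secret, salt, ITERATIONS, dklen=KEY_LENGTH)


# Example: a fresh random 16-byte salt, as in the pseudocode.
salt = os.urandom(16)
enc_key = derive_enc_key("alice", "correct horse", salt)
```

The same salt must be stored alongside the ciphertext, since the key cannot be re-derived without it.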

This is then used to encrypt the sync key via AES-256, with a random IV and HMAC-SHA256:

   IV = get_random_bytes(16)
   ciphertext = AES-256-ENCRYPT(enc_key, IV, sync_key)
   hmac = HMAC-SHA256(enc_key, ciphertext)

The information necessary to decrypt the sync key is serialized into a JSON structure, which is sent to the key-retrieval service for storage:

   data = { 
     //  Parameters for key derivation, as used by deriveKeyFromPassphrase
     "salt":  "b64-encoded salt",
    
     //  Encrypted payload, same format as CryptoWrapper WBO output
     "ciphertext": "b64-encoded ciphertext",
     "IV": "b64-encoded IV",
     "hmac": "hex-encoded hmac",
   }
   HTTP.PUT(retrieval_url, JSON.stringify(data))
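
A rough sketch of the serialization step follows. The AES encryption itself is left out (the ciphertext is passed in as bytes), and the helper name `build_record` is hypothetical. It also assumes the HMAC is computed over the base64 text of the ciphertext, since the retrieval pseudocode feeds the stored field straight into HMAC-SHA256; if the intent is to cover the raw bytes, that one line changes.

```python
import base64
import hashlib
import hmac
import json


def build_record(enc_key, salt, iv, ciphertext):
    """Serialize the fields needed for later decryption into JSON text."""
    b64_ciphertext = base64.b64encode(ciphertext).decode("ascii")
    # Assumption: the HMAC covers the base64 text of the ciphertext.
    tag = hmac.new(enc_key, b64_ciphertext.encode("ascii"), hashlib.sha256).hexdigest()
    return json.dumps({
        "salt": base64.b64encode(salt).decode("ascii"),
        "ciphertext": b64_ciphertext,
        "IV": base64.b64encode(iv).decode("ascii"),
        "hmac": tag,  # hex-encoded, as in the structure above
    })
```

With a 16-byte salt, a 16-byte IV, and a sync key of a few dozen bytes, the resulting JSON stays well under the 1 KB limit of the server API.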

To retrieve the sync key, the client fetches the above JSON from the server and does:

   data = JSON.parse(HTTP.GET(retrieval_url))
   enc_key = PBKDF2(username + password, data["salt"], 4096, 32)
   if HMAC-SHA256(enc_key, data["ciphertext"]) != data["hmac"]:
       ABORT!
   sync_key = AES-256-DECRYPT(enc_key, data["IV"], data["ciphertext"])
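
The integrity check in the retrieval steps might be sketched as below. The names `check_record` and `TamperedRecordError` are hypothetical, and as in the pseudocode the HMAC is taken over the stored (base64) ciphertext field. The comparison uses a constant-time function, which is a suggestion rather than something this page specifies.

```python
import hashlib
import hmac
import json


class TamperedRecordError(Exception):
    """Raised when the stored HMAC does not match the ciphertext."""


def check_record(enc_key, record_json):
    """Parse the stored JSON and verify its HMAC before any decryption."""
    record = json.loads(record_json)
    expected = hmac.new(
        enc_key, record["ciphertext"].encode("ascii"), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("ascii"),
                               str(record["hmac"]).encode("utf-8")):
        raise TamperedRecordError("HMAC mismatch: wrong password or corrupt data")
    return record
```

A mismatch here is also what a client would see after a password reset, since the key derived from the new password no longer matches the stored data.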


Open questions:

  • is there something better than PBKDF2 for this purpose?
  • should we include the number of iterations in the stored PBKDF2 parameters?
  • should we mix the HMAC_INPUT string into the PBKDF2 inputs?

Authentication

Anyone who can access the stored key-retrieval data for a user can run a dictionary or brute-force attack against their password. So, we should only allow retrieval when authenticated as the user.

Since this service effectively reduces the security of the user's sync data to the security of their account password, we also need to consider the wider implications for password management across all Services products. Any breach in the auth system is an automatic breach in the key-retrieval system. For example, instead of stealing the encrypted key-retrieval data and trying to brute-force it, an attacker could steal the password database, dictionary-attack the weakly-hashed passwords, then use them to retrieve the key directly.

So if this key-retrieval service lives in a high-security cage or other special server environment, then the auth system should also live there.

Account management and authorization in Services currently uses HTTP-Basic-Auth, and hence transmits the password to Mozilla in the clear. Thus, users of the retrieval service are trivially vulnerable to us snooping on them, or to anyone who manages to compromise any of our servers. That's bad.

Ideally we would move to a system that can provide authentication without the server learning the user's password. See Services/Sync/SecureAuthentication for a proposal. Such a move will have to happen across the whole services infrastructure to be worthwhile.

Since the key-retrieval server component is intended to run from a high-security restricted-access environment, it should be as simple and light-weight as possible. It therefore requires that all calls be authenticated with the "Token Auth" credentials defined in the above proposal. Such tokens can be verified locally without having to call into the authentication API.

Keeping Things in Sync

The client will need to have some protocol for updating the stored key-retrieval data when the user changes their password, or for detecting when the stored data is stale due to a password reset. Will this fit into the existing "uh-oh your password seems to have changed" workflow on the client?

We could *try* to manage some of that automatically on the server but that sounds like trouble. Perhaps we need the ability for the account-management service to forcibly delete stored data when it knows a user's password has been changed.

This could be as simple as a function that sanity-checks the stored data and uploads a fresh version if the data is not usable. Call this function periodically just to check on things, and call it as part of the your-password-has-changed workflow to fix things up explicitly.
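
A minimal sketch of such a check, assuming it is only ever called for users who have opted in to key retrieval. The function name and its three callable parameters are hypothetical placeholders for whatever the client actually provides.

```python
def ensure_retrieval_data(fetch, is_usable, upload_fresh):
    """Re-upload the stored key-retrieval data when it is missing or stale.

    All three arguments are hypothetical callables supplied by the client:
      fetch()         -> the stored blob, or None when nothing is stored
      is_usable(blob) -> True if the blob verifies with the current password
      upload_fresh()  -> encrypt the local sync key and PUT it to the service
    Returns True when a fresh copy was uploaded, False otherwise.
    """
    blob = fetch()
    if blob is not None and is_usable(blob):
        return False  # stored data is fine, nothing to do
    upload_fresh()
    return True
```

The same function serves both the periodic check and the explicit password-change path, since in each case the only question is whether the stored blob still works with the current password.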