CloudServices/Sync/ExtensionStorage Design Doc: Difference between revisions

Revision as of 19:19, 24 January 2017

3,455 bytes added, 24 January 2017

This is why we don't encrypt record keys

Glasserc

@@ Line 14: / Line 14: @@
 When we sync, we map the local collection name to an "obfuscated" remote collection name. This is done so that metadata doesn't leak information about what extensions a user has installed.
+=== Encrypting records ===
+In chrome.storage.sync, each datum is a key-value pair. Keys can presumably be any string (for example, an extension might store a value ["yes", "I", "do"] under the key "I ♥ moz://a"). In Kinto, we represent this same datum as a JSON object like {"id": "key-I_20__2665__20_moz_3A__2F__2F_a", "key": "I ♥ moz://a", "data": ["yes", "I", "do"], "last_modified": 12345}. As stated above, this record is stored "in the clear" on the client. Note that we store the original key in the "key" field, as well as a Kinto-safe version of it, which uses a reduced character set, in the "id" field.
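The mapping from an extension's key to the Kinto-safe ID can be sketched as follows. This is a minimal illustration inferred only from the example record above: the function names `key_to_id` and `to_kinto_record` are hypothetical, and the rule that every character outside `a-z`, `A-Z`, `0-9` becomes `_<uppercase hex code point>_` is an assumption that happens to reproduce the example ID, not a statement of the actual implementation.

```python
import re


def key_to_id(key):
    """Derive a Kinto-safe ID from an arbitrary storage key.

    Assumption: each character outside [a-zA-Z0-9] is replaced by its
    code point in uppercase hex, wrapped in underscores.
    """
    def escape(match):
        return "_{:X}_".format(ord(match.group(0)))

    return "key-" + re.sub(r"[^a-zA-Z0-9]", escape, key)


def to_kinto_record(key, value, last_modified):
    """Build the cleartext record that is kept on the client."""
    return {
        "id": key_to_id(key),   # reduced character set, safe for Kinto
        "key": key,             # original key, kept so the extension can look it up
        "data": value,
        "last_modified": last_modified,
    }


record = to_kinto_record("I ♥ moz://a", ["yes", "I", "do"], 12345)
```

Under this assumed rule the underscore itself is also escaped, so two different keys never collapse onto the same ID.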
+When it's time to send this record to the server, it is encrypted using an EncryptionRemoteTransformer. The record is serialized to produce a plaintext. An IV is generated and used in conjunction with the extension key (above) to produce a ciphertext. An HMAC is computed over the record ID, IV, and ciphertext. The id and last_modified fields are copied over from the cleartext record so that syncing can work correctly. The encrypted record will then look like {"id": "key-I_20__2665__20_moz_3A__2F__2F_a", "ciphertext": "[some gibberish]", "IV": "[some gibberish]", "hmac": "[some gibberish]", "last_modified": 12345}.
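The shape of the outgoing record can be sketched as below. This is not the EncryptionRemoteTransformer itself: `encode_record` and `compute_hmac` are hypothetical names, the cipher is passed in as an `encrypt(plaintext, iv)` callable rather than implemented, and the choices of HMAC-SHA256, base64 encoding, a 16-byte IV, and plain concatenation of ID, IV, and ciphertext are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import os


def compute_hmac(hmac_key, record_id, iv_b64, ciphertext_b64):
    # Assumption: HMAC-SHA256 over the concatenation id + IV + ciphertext.
    message = (record_id + iv_b64 + ciphertext_b64).encode("utf-8")
    return hmac.new(hmac_key, message, hashlib.sha256).hexdigest()


def encode_record(record, encrypt, hmac_key, iv_size=16):
    """Turn a cleartext record into the envelope sent to the server."""
    # Serialize the whole cleartext record to produce the plaintext.
    plaintext = json.dumps(record).encode("utf-8")
    # Generate a fresh IV for this record.
    iv = os.urandom(iv_size)
    # `encrypt` stands in for the real cipher keyed with the extension key.
    ciphertext = encrypt(plaintext, iv)

    iv_b64 = base64.b64encode(iv).decode("ascii")
    ciphertext_b64 = base64.b64encode(ciphertext).decode("ascii")
    return {
        # id and last_modified stay in the clear so syncing keeps working.
        "id": record["id"],
        "ciphertext": ciphertext_b64,
        "IV": iv_b64,
        "hmac": compute_hmac(hmac_key, record["id"], iv_b64, ciphertext_b64),
        "last_modified": record["last_modified"],
    }
```

Note that the original "key" and "data" fields appear only inside the ciphertext; the envelope exposes nothing but the ID and the timestamp.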
+When the server provides this record to a client, the client decrypts it in the usual way: it verifies the HMAC first, and then uses the IV and the extension key to decrypt the ciphertext, producing a serialized record, which is then deserialized and used as the real record.
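The verify-then-decrypt order on the receiving client can be sketched like this. As before, this is only an illustration: `decode_record` and `HmacMismatch` are hypothetical names, the cipher is supplied by the caller as `decrypt(ciphertext, iv)`, and HMAC-SHA256 over the concatenated ID, IV, and ciphertext is an assumed construction.

```python
import base64
import hashlib
import hmac
import json


class HmacMismatch(Exception):
    """Raised when a record from the server fails HMAC verification."""


def decode_record(encrypted, decrypt, hmac_key):
    """Verify the HMAC, then decrypt and deserialize the real record."""
    # Recompute the HMAC over id + IV + ciphertext (assumed construction).
    message = (encrypted["id"] + encrypted["IV"] + encrypted["ciphertext"]).encode("utf-8")
    expected = hmac.new(hmac_key, message, hashlib.sha256).hexdigest()

    # Constant-time comparison; refuse to decrypt anything that fails.
    if not hmac.compare_digest(expected, encrypted["hmac"]):
        raise HmacMismatch("record %s failed HMAC verification" % encrypted["id"])

    iv = base64.b64decode(encrypted["IV"])
    ciphertext = base64.b64decode(encrypted["ciphertext"])
    # `decrypt` stands in for the real cipher keyed with the extension key.
    plaintext = decrypt(ciphertext, iv)
    return json.loads(plaintext.decode("utf-8"))
```

Because the record ID is part of the authenticated message in this sketch, a server that moved a ciphertext to a different ID would be detected before decryption.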
+This approach currently leaks metadata: specifically, the record IDs, which reveal the keys an extension uses. That information can itself be valuable, or can allow an attacker to infer what extension is being used. It would be accessible to anyone who had access to the Kinto database or an FxA token for the user. (Only having an FxA token wouldn't be enough to decrypt the data itself, since you'd still need kB.)
+Is it possible to hash the record IDs, so that we don't leak data in this way? If we do this, then we need to store the "true" record ID somewhere, for example in the ciphertext, so that when we get records from the server, we can store them under their true ID and the extension can access them. However, if one client deletes the record from the server, the server stops serving the complete record body. Instead it just serves a "tombstone", which contains nothing but the (hashed) ID. When we get one of these from the server, it's impossible for us to figure out what local record to delete, so syncing will break. In order to hash record IDs, we would have to modify Kinto to store deleted records forever, and serve them, rather than just tombstones.
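The tombstone problem can be made concrete with a small sketch. The names `hash_record_id` and `true_id_from_server` are hypothetical, SHA-256 is an arbitrary choice of hash, and `decrypt_body` stands in for whatever would recover the cleartext record from a live server record. The tombstone shape with a `deleted` flag follows how Kinto reports deleted records.

```python
import hashlib


def hash_record_id(true_id):
    # Assumption: SHA-256 hex digest; the design does not pick a hash.
    return hashlib.sha256(true_id.encode("utf-8")).hexdigest()


def true_id_from_server(server_record, decrypt_body):
    """Return the true record ID carried by a server record, if any."""
    if server_record.get("deleted"):
        # A tombstone has no ciphertext, so the true ID stored inside
        # the encrypted body is gone along with it.
        return None
    # A live record carries the true ID inside its encrypted body.
    return decrypt_body(server_record)["id"]


true_id = "key-I_20__2665__20_moz_3A__2F__2F_a"
hashed = hash_record_id(true_id)

# What the server would serve after another client deleted the record.
tombstone = {"id": hashed, "last_modified": 12346, "deleted": True}
```

For the tombstone, `true_id_from_server` returns `None`: the payload itself carries nothing from which the true ID can be read.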
+How about encrypting the record IDs in a reversible way? If we do this, we have to decide what keys to use to encrypt record IDs. We have to be careful with these keys, since if they ever change, we have to rename every record in the collection. Once we have keys that we can use, we'll have to decide what to do with the IVs that we used for each record. Because we can't afford to lose them either, we'd have to embed them in the Kinto record ID somehow too. Finally, once we've surmounted these obstacles, we find that we've opened ourselves up to known-plaintext attacks. Since the universe of WebExtensions is relatively small (a few thousand), it isn't that difficult to figure out which keys are in use by which extensions, and each such known key is an attack vector. This seems like a lot of complexity for not a lot of security.
 === Password changes ===