Identity/CryptoIdeas/03-ID-Attached-Data: Difference between revisions

no edit summary
No edit summary
 
(7 intermediate revisions by one other user not shown)
Line 1: Line 1:
{{LastUpdated}}
== ID-Attached Data ==


Line 27: Line 29:
bookmarks into class-A, but users should have the option of putting
everything into class-C if they like (to behave like current FF Sync).
These classes can be subdivided for other properties. For example, class-A
can be split into "A+" in which the data is encrypted by the
assertion-protected key before it is sent to the storage server, versus "A-"
in which the data is given to storage servers in the clear, and the server
only provides access to readers who present an assertion (or equivalent). In
both cases, the end user can recover their data with just an assertion. In
A+, the server doesn't see plaintext, so the user's reliance set (the list of
parties who can see the user's data) includes just the IdP and the Keyserver.
In A-, the storage server can manipulate the plaintext (perhaps to provide
merge/reconciliation, or search features), in exchange for which the
reliance set grows to include the storage server. "A-" can also be
accomplished on a user-by-user basis by delivering a decryption key to the
storage server.


== User Options ==
Line 69: Line 85:
used in a "key retrieval" step (probably using SRP) to safely obtain a
wrapped copy of "kB", which is then unwrapped with a different derivative of
the master key. The key retrieval step can use shared data to prevent
eavesdroppers (even those who break TLS) from learning anything about the
password or kB.

used in a "key retrieval" step (probably using SRP) to safely obtain a
wrapped copy of "kB", which is then unwrapped with a different derivative of
the master key. The key retrieval step can use the shared session key to
prevent eavesdroppers (even those who break TLS) from learning anything about
the password or kB. This kB is a full-strength random key, created on the
client, and never revealed (except in wrapped form) to the Key Server. As a
result, class-B data is fully protected against everyone but the Key Server,
and even the Key Server only gets a brute-force attack against the user's
master password.


[[File:PICL-03-setup-password.png|Setup With Password]]
Line 126: Line 146:
lifetime for this component. So using tokens instead of certificates is
simpler and achieves the same goals.
== Token Versions ==
For each account, the Key Server remembers an integer "version number", which
is incremented each time a user wants to revoke access (e.g. when they change
their master password, or click a "revoke access from all devices" button on
a control panel). This is delivered to the browser along with kA and
wrap(kB).
The version number is included in the key derivation function for tokens, but
not for the data-encryption key. The data-encryption key remains constant
forever (or until we design a more complicated re-encryption-based revocation
scheme). Browsers that have just received the master keys kA/kB/kC will be
able to compute any version of any token they wish, but once they have
successfully talked to their storage server, they will forget kA/kB/kC and
all older tokens, retaining only the most recent token version and the
domain-specific data-encryption key. In this post-login state, if the browser
needs to compute a newer token, it must "re-login" by submitting an assertion
to the Key Server to re-fetch the master keys.
Storage Servers remember just one token for each class of data.
[[File:PICL-06-domain-kdf.png|Token/Key Generation]]
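The split between versioned tokens and a constant data-encryption key can be sketched as follows. This is only an illustration: the page does not specify the KDF, so the single-step HMAC-SHA256 construction, the label strings, and the function names derive(), data_key() and token() are all assumptions.

```python
import hashlib
import hmac


def derive(master_key: bytes, label: bytes) -> bytes:
    """One-step HMAC-SHA256 derivation (an assumed stand-in for the real KDF)."""
    return hmac.new(master_key, label, hashlib.sha256).digest()


def data_key(master_key: bytes, domain: str) -> bytes:
    # No version number is mixed in, so this key stays constant across
    # revocations (hypothetical label format).
    return derive(master_key, b"data-key:" + domain.encode())


def token(master_key: bytes, domain: str, version: int) -> bytes:
    # The version number is part of the derivation input, so incrementing
    # it yields a new, unrelated token (hypothetical label format).
    label = b"token:" + domain.encode() + b":" + str(version).encode()
    return derive(master_key, label)
```

A browser holding a master key can compute token(k, domain, v) for any v, while a browser that has kept only one token and the data key cannot compute the next token without fetching the master keys again.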
== Storage Server Access Rules ==
For each class, a Storage Server will hold a row of data with (email, token,
ciphertext). Requests to read or write data must be accompanied by the
matching token, delivered over a confidential channel: Read(token) or
Write(token, new-ciphertext). If the token does not match any known row, an
"UnknownToken" error is returned.
[[File:PICL-08-proto-polling.png|Storage Server Protocol: Polling]]
Account creation and post-revocation update are managed with a second API:
UpdateToken(assertion, new-token, old-tokens). The storage server should
check to see if any of the old-tokens match the stored row: if so, "Success"
is returned, and the stored token is updated to new-token. Else the server
validates the assertion and checks to see if any row matches the included
email address, in which case it returns "KnownUserUnknownToken". If not, the
server creates a new row (with email from the assertion and new-token) and
returns "SlotCreated".
[[File:PICL-09-proto-login.png|Storage Server Protocol: Login]]
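The Read/Write/UpdateToken rules above can be sketched as a small in-memory server. This is a rough illustration under several assumptions: one row per email (a single class of data), tokens as byte strings, a caller-supplied verify_assertion() that returns the email or None, and an "InvalidAssertion" result that the page does not define.

```python
import hmac
from typing import Callable, Dict, List, Optional, Tuple


class StorageServer:
    def __init__(self, verify_assertion: Callable[[str], Optional[str]]) -> None:
        # verify_assertion is hypothetical: it returns the email on success.
        self.verify_assertion = verify_assertion
        self.rows: Dict[str, dict] = {}  # email -> {"token", "ciphertext"}

    def _row_for_token(self, token: bytes) -> Optional[dict]:
        for row in self.rows.values():
            if hmac.compare_digest(row["token"], token):
                return row
        return None

    def read(self, token: bytes) -> Tuple[str, Optional[bytes]]:
        row = self._row_for_token(token)
        if row is None:
            return ("UnknownToken", None)
        return ("Success", row["ciphertext"])

    def write(self, token: bytes, new_ciphertext: bytes) -> str:
        row = self._row_for_token(token)
        if row is None:
            return "UnknownToken"
        row["ciphertext"] = new_ciphertext
        return "Success"

    def update_token(self, assertion: str, new_token: bytes,
                     old_tokens: List[bytes]) -> str:
        # First try the old tokens; a match rotates the stored token.
        for old in old_tokens:
            row = self._row_for_token(old)
            if row is not None:
                row["token"] = new_token
                return "Success"
        # Otherwise fall back to the assertion.
        email = self.verify_assertion(assertion)
        if email is None:
            return "InvalidAssertion"  # assumed; not named in the text
        if email in self.rows:
            return "KnownUserUnknownToken"
        self.rows[email] = {"token": new_token, "ciphertext": b""}
        return "SlotCreated"
```

In this sketch the assertion is only checked when no old token matches, following the order in which the text describes the steps.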
The third API is DeleteData(assertion), which validates the assertion and
deletes any data with a matching email address. This allows users to delete
their data even if they cannot remember a password. This may need more
discussion: it seems like a useful feature, but it obviously allows IdPs to
clobber data they cannot read, which could be surprising.
If the server is eager to be RESTful and needs a distinct per-user identifier
to go into the URL for the Read() and Write() APIs, the best choice is to use
a hash of the token. This can be safe (even if we assume that URLs are not
secret) because tokens are derived from full-entropy keys, and thus not
vulnerable to dictionary attacks. When the token is set with UpdateToken(),
the server can compute the UID and record it in an index. The server must
check that both the UID and the token match the recorded data. (Less eager
server designs should just use a common POST URL for all APIs and put the
one token in the request body, omitting any sort of UID).
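As a minimal sketch of the hash-of-token UID idea, assuming SHA-256 as the hash and a simple in-memory index (the names uid_for_token and TokenIndex are hypothetical):

```python
import hashlib
import hmac
from typing import Dict


def uid_for_token(token: bytes) -> str:
    """URL-safe per-user identifier: a hex SHA-256 hash of the token."""
    return hashlib.sha256(token).hexdigest()


class TokenIndex:
    def __init__(self) -> None:
        self._by_uid: Dict[str, bytes] = {}

    def record(self, token: bytes) -> str:
        # Called when the token is set with UpdateToken().
        uid = uid_for_token(token)
        self._by_uid[uid] = token
        return uid

    def check(self, uid: str, token: bytes) -> bool:
        # Both the UID from the URL and the presented token must match.
        stored = self._by_uid.get(uid)
        return stored is not None and hmac.compare_digest(stored, token)
```

The UID only locates the row; access still depends on presenting the token itself.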


== Revocation ==
Line 132: Line 212:
access, and are likely to express this by changing their master password (if
any), and/or by going to a Key Server control panel of some sort and hitting
a "revoke devices" button. This will be implemented by changing kA/kB/kC,
leaving the user to update their remaining devices with the new credentials.
The Key Server will require a BrowserID assertion when performing this step.

access, and are likely to express this by changing their master password (if
any), and/or by going to a Key Server control panel of some sort and hitting
a "revoke devices" button (which will require a BrowserID assertion). This
will be implemented by incrementing the version number and updating tokens on
all storage servers, so all devices must construct a new token to access the
ciphertext. The user will have to re-log-in on all their devices (with a
current assertion, and the new password) to access the post-revocation data.


Browsers are expected to discard MK (the password-derived master key)
immediately after obtaining kB, to ensure that nothing is left in the browser
that could let an attacker brute-force the master password without online
help. But they will retain kA/kB/kC for a while, so that periodic data sync
can continue to occur in the background without user intervention.

Browsers are expected to discard MK (the password-derived master key)
immediately after obtaining kB, to ensure that nothing is left in the browser
that could let an attacker brute-force the master password without online
help. Browsers also discard kA/kB/kC after constructing valid domain-specific
tokens and keys for each known service (i.e. all add-ons that have registered
to get domain-specific keys). They will retain the domain-specific keys, and
domain-specific tokens for the current version number, so that periodic data
sync can continue to occur in the background without user intervention (until
revoked).
 
When a browser uses the "revoke device" button and increments the version
number, it will immediately re-login to all known services. As the storage
server will still have the old token, the Read(token) call will get an
UnknownToken error, prompting the client to use UpdateToken with the last 5
or 10 versions of the token. Since the user has just re-logged in, we're sure
to have an active BrowserID certificate, so the assertion generation should
not require additional user interaction. The first old-token (vernum-1) will
probably match, but if not (KnownUserUnknownToken) we'll try older and older
tokens until we run down to version=0, at which point we'll give up. When we
get Success, the storage-server has been updated to the newest token, and all
other devices will be unable to access ciphertext until they are updated too.
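The retry loop on the revoking device might look roughly like this. It is a sketch under assumptions: update_token is a callable with the UpdateToken(assertion, new-token, old-tokens) shape, make_token(version) computes the token for a version, the batch size of 5 is one of the values the text suggests, and "GaveUp" is a made-up result name.

```python
from typing import Callable, List


def push_new_token(update_token: Callable[[str, bytes, List[bytes]], str],
                   make_token: Callable[[int], bytes],
                   assertion: str,
                   vernum: int,
                   batch: int = 5) -> str:
    """Install the token for `vernum`, offering older tokens in batches."""
    new_token = make_token(vernum)
    older = vernum - 1  # start with the most likely match
    while True:
        lowest = max(older - batch + 1, 0)
        # Newest first: older, older-1, ..., lowest (empty when older < 0).
        old_tokens = [make_token(v) for v in range(older, lowest - 1, -1)]
        result = update_token(assertion, new_token, old_tokens)
        if result != "KnownUserUnknownToken":
            return result  # e.g. "Success" or "SlotCreated"
        older = lowest - 1
        if older < 0:
            return "GaveUp"  # ran down to version 0 without a match
```

With vernum=0 the loop makes a single call with an empty old-tokens list, which corresponds to the account-creation case.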
 
On the not-doing-the-revocation device, the periodic background poll will
suddenly get an UnknownToken error. This will prompt it to re-log-in to the
Key Server, which requires an assertion, which needs user interaction (if the
IdP credentials have expired). Once it learns the new version number, it
computes the latest token, and tries the Read() again. If that succeeds, it
forgets kA/kB/kC as usual. If it fails, the storage server may not yet be
updated (perhaps the original browser didn't hit all the necessary services),
and it uses UpdateToken() to update the server.


Browsers should check in with the Key Server every once in a while, and if
they learn that kA has changed, they should immediately delete kA/kB/kC.
Application code which uses the key-access API should not retain these tokens
and keys, but instead allow the API to regenerate them upon request. This
check should probably be fail-open (if the check cannot be made, retain the
keys).

When background polls get an UnknownToken error, they should probably display
an error indication to the user ("unable to sync") and offer a button to
re-login, rather than spontaneously popping a login dialog.


Uncooperative browsers (stolen devices which are not given the opportunity to
erase their memory) will still have these keys, and it'd be nice to revoke
them. When the revoking client tells the Key Server to change kA/kB, it
should also speak to all known Storage Servers (using either an assertion or
the old token), to replace the access tokens with new ones, and to re-encrypt
the data with the new key. There are fault conditions in which the client
will not be able to use the old token (e.g. if a storage server is offline
while the transition occurs, or the client crashes during the process), in
which case it should use an assertion to replace the tokens and then delete
the old encrypted data.

We should consider a mechanism by which browsers can poll to learn when
vernum has changed, and then proactively forget their domain-specific
decryption keys. This reduces the window for an attacker to extract keys from
a stolen device: the moment it learns that a revocation has taken place, it
wipes the decrypted plaintext and decryption keys. Then, if it can still
obtain an assertion, it can re-acquire the keys and start polling again. This
"am I revoked" mechanism should not depend upon having an assertion (since
these expire before the polling should stop), so perhaps it should use
another token held by the Key Server which can only be used for this sort of
poll.


Using server-data-replacement for revocation isn't the most pleasant scheme.
Line 190: Line 294:
they are the first to use class-C data and thus both responsible for creating
kC (creating incompatible keys).
== Jetpack Modules ==
The first implementation will probably be a cluster of Jetpack modules. The
"browserid" module must, for now, open a visible tab from an
externally-hosted site, which can use include.js to present the BrowserID
signin dialog. Once BrowserID accepts non-http/https schemes for the
audience= field, this can change to opening a visible tab from an
internally-provided resource: URL. Later, the module can use chrome UI for
the dialog, rather than opening and closing a regular browser tab. The API
should remain the same.
The KDF module is where all the protocol work happens. It should probably
have a register-callback method, then a "go" method (just like
navigator.id.watch() and navigator.id.request()). The callback should receive
the data-encryption keys, functions to encrypt/decrypt data with those keys,
and something to access tokens. We need a way to inform the KDF module that
we've successfully communicated with the storage server, and we no longer
need old tokens (so it can discard kA/kB/kC safely).
The application-specific module then talks to whatever internal resource is
being synchronized, and makes HTTPS requests to its storage server.
[[File:PICL-07-jetpack-modules.png|Jetpack Modules]]