Services/KeyExchange


This wiki page has been integrated in the Mozilla Services doc portal: http://docs.services.mozilla.com/keyexchange/index.html.


Overview

Explore using J-PAKE to securely pass credentials to another device.

Tracking bug is bug 601644.

Engineers Involved

  • Tarek (server)
  • Philipp (FxSync)
  • Stefan (FxHome)

User Requirements

  • Setting up a new mobile device should only involve entering a short code on the desktop device
  • Secondary request, not a hard requirement: if the user already has a mobile device and is setting up a desktop device, the flow should be similar and still involve entering the key on the desktop

Desired User Flow

  1. User chooses "quick setup" on new device
  2. Device displays a setup key that contains both the initial secret and a channel ID
  3. On a device that is authenticated, user chooses "add another device" and is prompted for that key
  4. The two devices exchange messages to build the secure tunnel
  5. The already-authenticated device passes all credentials (username/password/sync key) to the new device
  6. New device completes setup and starts syncing

Implementation

Terminology

  • Desktop: Client that has Fx Sync already set up
  • Mobile: Client that needs to be set up (of course this could be another desktop computer, too)
  • PIN: code that is displayed on Mobile and entered on Desktop
  • Secret: weak secret that is used to start the J-PAKE algorithm
  • Key: strong secret that both clients derive through J-PAKE

Overview

  • Mobile and Desktop complete the two roundtrips of J-PAKE messages to agree upon a strong secret K
  • A 256 bit key is derived from K using HMAC-SHA256 using a fixed extraction key.
  • The encryption and HMAC keys are derived from that 256 bit key using HMAC-SHA256.
  • In the third round trip:
    • Mobile encrypts the known message "0123456789ABCDEF" with the AES key and uploads it.
    • Desktop verifies that against the known message encrypted with its own key. It then encrypts the credentials with the encryption key and uploads them, adding an HMAC-SHA256 hash of the ciphertext (computed with the HMAC key).
    • Mobile verifies whether Desktop had the right key by checking the ciphertext against the HMAC-SHA256 hash.
    • If that verification is successful, Mobile decrypts ciphertext and applies credentials


Mobile                        Server                        Desktop
===================================================================
                                 |
retrieve channel <---------------|
generate random secret           |
show PIN = secret + channel      |                 ask user for PIN
upload Mobile's message 1 ------>|
                                 |----> retrieve Mobile's message 1
                                 |<----- upload Desktop's message 1
retrieve Desktop's message 1 <---|
upload Mobile's message 2 ------>|
                                 |----> retrieve Mobile's message 2
                                 |                      compute key
                                 |<----- upload Desktop's message 2
retrieve Desktop's message 2 <---|
compute key                      |
encrypt known value ------------>|
                                 |-------> retrieve encrypted value
                                 | verify against local known value
                                 |              encrypt credentials
                                 |<------------- upload credentials
retrieve credentials <-----------|
verify HMAC                      |
decrypt credentials              |

Key derivation

The AES encryption key T(1) and the HMAC key T(2) will be derived from J-PAKE's strong secret K as follows:

extraction_key = "\x00" * 32
key_string = HMAC-SHA256(K, extraction_key)
T(1) = HMAC-SHA256(key_string, "" + "Sync-AES_256_CBC-HMAC256" + 0x01)
T(2) = HMAC-SHA256(key_string, T(1) + "Sync-AES_256_CBC-HMAC256" + 0x02)

(See http://tools.ietf.org/html/rfc5869)
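
The derivation above can be sketched with Python's standard hmac and hashlib modules. This is a minimal sketch, not the shipped implementation: the function names are made up for illustration, K is assumed to be available as a byte string, and the extraction step is assumed to follow RFC 5869 (the all-zero extraction key acts as the HMAC key and K as the message), since the notation above does not fix the argument order.

```python
import hashlib
import hmac
from typing import Tuple

# Info string taken from the formulas above.
HMAC_INPUT = b"Sync-AES_256_CBC-HMAC256"


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256 with the key as first argument."""
    return hmac.new(key, message, hashlib.sha256).digest()


def derive_keys(k: bytes) -> Tuple[bytes, bytes]:
    """Return (T1, T2): the AES key and the HMAC key, 32 bytes each."""
    extraction_key = b"\x00" * 32
    # ASSUMPTION: RFC 5869 "extract" order, zero string as HMAC key, K as message.
    key_string = hmac_sha256(extraction_key, k)
    # RFC 5869 "expand": T(0) is the empty string.
    t1 = hmac_sha256(key_string, b"" + HMAC_INPUT + b"\x01")
    t2 = hmac_sha256(key_string, t1 + HMAC_INPUT + b"\x02")
    return t1, t2
```

Both sides would call derive_keys on their own copy of K; the keys only agree if the same PIN was used on both devices.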

To verify the key on both ends, the value

 0123456789ABCDEF

is encrypted with the AES key.

Data format

All request and response bodies are JSON objects as produced by python-jpake (messages 1 and 2) and specified below. An application/json HTTP Content-Type header is optional. Within the JSON objects,

  • the big numbers are encoded as hex strings (messages 1 and 2),
  • the ciphertext, IV and hmac are encoded in Base64 (message 3),
  • the credentials are a JSON object with the following properties (UTF-8 encoded strings):
    • account,
    • password,
    • synckey,
    • serverURL
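
As an illustration of these encoding rules, the following sketch assembles and checks the third Desktop message. The helper names are hypothetical, the AES step is left out (the ciphertext is taken as opaque bytes), and the HMAC is assumed to be computed over the Base64 text of the ciphertext; the page only says it is a hash of the ciphertext.

```python
import base64
import hashlib
import hmac
import json


def _b64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")


def build_sender3(ciphertext: bytes, iv: bytes, hmac_key: bytes) -> str:
    """Serialize Desktop's message 3 as a JSON string."""
    ct_b64 = _b64(ciphertext)
    # ASSUMPTION: the HMAC covers the Base64 form of the ciphertext.
    digest = hmac.new(hmac_key, ct_b64.encode("ascii"), hashlib.sha256).digest()
    return json.dumps({
        "type": "sender3",
        "payload": {"ciphertext": ct_b64, "IV": _b64(iv), "hmac": _b64(digest)},
    })


def verify_sender3(body: str, hmac_key: bytes) -> bytes:
    """Return the raw ciphertext if the HMAC matches, else raise ValueError."""
    message = json.loads(body)
    if message.get("type") != "sender3":
        raise ValueError("jpake.error.wrongmessage")
    payload = message["payload"]
    expected = hmac.new(
        hmac_key, payload["ciphertext"].encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, base64.b64decode(payload["hmac"])):
        raise ValueError("jpake.error.keymismatch")
    return base64.b64decode(payload["ciphertext"])
```

Mobile would only decrypt the returned ciphertext after verify_sender3 succeeds.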

Server API

The only valid HTTP response codes are 200 and 304, since both are part of the protocol and expected to happen (version 2 also treats a 412 on a conditional PUT as a success; see Detailed Flow). Anything else, such as 400, 403, 404 or 503, must result in a complete termination of the password exchange. The client can then retry the exchange at a later time, starting over with clean state.

Every call must be made with an X-KeyExchange-Id header containing a half-session identifier for the channel. This client ID must be a string of 256 chars. The server keeps track of the first two IDs used for a given channel, from its creation to its deletion, and will close the channel and issue a 400 if any request is made with an unknown ID or with no ID at all.

Last, if a given IP attempts to flood the server with a lot of calls in a short time, it will be blacklisted for 10 minutes, and any request made from that IP in the interim will receive a 403. When receiving this error code, legitimate clients can fall back to a manual transaction. A client that generates a lot of bad requests will also be blacklisted, but for an hour.
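
On the client side, these rules might be handled as in the sketch below. The names are illustrative, and using hex characters for the client ID is an assumption: the page only requires a 256-character string.

```python
import secrets

CLIENT_ID_LENGTH = 256


class ExchangeAborted(Exception):
    """Raised when the exchange must be terminated and restarted from scratch."""


def new_client_id() -> str:
    # token_hex(n) returns 2 * n hex characters.
    # ASSUMPTION: any 256-character string is acceptable; hex is one choice.
    return secrets.token_hex(CLIENT_ID_LENGTH // 2)


def check_get_status(status: int) -> bool:
    """Return True for fresh data (200), False for unchanged data (304).

    Any other status terminates the exchange, as described above.
    """
    if status == 200:
        return True
    if status == 304:
        return False
    raise ExchangeAborted("unexpected HTTP status %d" % status)
```

The client ID would be generated once per exchange and sent as the X-KeyExchange-Id header on every request.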


The server API calls are:

GET https://server/new_channel

 Returns in the response body a JSON-encoded random channel id of N chars 
 from [a-z0-9].
   
 When the API is called, the id returned is guaranteed to be unique. 
 The channel created will have a limited TTL (currently configured to 
 5 minutes).
     
 Return codes:
    - 200: channel created successfully  
    - 503: the server was unable to create a new channel.
    - 400: Bad or no client ID. The channel is deleted.
    - 403: the IP is blacklisted. 


GET https://server/channel_id

 Returns in the response body the content of the channel of id channel_id. 
 Returns an ETag response header containing a unique hash.
 
 The request can contain an If-None-Match header containing a hash.
 If the hash matches the current hash of the channel, the server 
 will return a 304 and an empty body.
 
 The number of GET calls for a given channel is limited to 6. The channel will
 be deleted by the server after 6 successful GETs.
   
 Return codes:
    - 200: data retrieved successfully  
    - 404: the channel does not exist. It was not created by a call 
           to new_channel or timed out.
    - 304: the data was not changed.
    - 400: Bad or no client ID. The channel is deleted.
    - 403: the IP is blacklisted. 


PUT https://server/channel_id

 Put in the channel of id channel_id the content of the request body. 
 Returns an ETag response header containing a unique hash.
    
 Return codes:
    - 200: data set successfully  
    - 404: the channel does not exist. It was not created by a call 
           to new_channel or timed out.
    - 400: Bad or no client ID. The channel is deleted.
    - 403: the IP is blacklisted. 
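
The GET and PUT semantics above can be modelled with a small in-memory class. This is only a sketch under stated assumptions: the class and method names are invented, the "unique hash" is modelled as a SHA-256 of the body, a 304 is assumed not to count toward the limit of 6 GETs, and client ID tracking and blacklisting are left out.

```python
import hashlib
import time


class Channel:
    """In-memory stand-in for one server-side channel (illustrative only)."""

    TTL_SECONDS = 5 * 60  # "currently configured to 5 minutes"
    MAX_GETS = 6          # channel is deleted after 6 successful GETs

    def __init__(self, now=None):
        self.created = time.time() if now is None else now
        self.body = b""
        self.etag = self._etag(self.body)
        self.gets = 0
        self.deleted = False

    @staticmethod
    def _etag(body):
        # ASSUMPTION: the ETag is a quoted SHA-256 hex digest of the body.
        return '"%s"' % hashlib.sha256(body).hexdigest()

    def _gone(self, now):
        return self.deleted or now - self.created > self.TTL_SECONDS

    def put(self, body, now=None):
        """Return (status, etag)."""
        now = time.time() if now is None else now
        if self._gone(now):
            return 404, None
        self.body = body
        self.etag = self._etag(body)
        return 200, self.etag

    def get(self, if_none_match=None, now=None):
        """Return (status, body, etag)."""
        now = time.time() if now is None else now
        if self._gone(now):
            return 404, b"", None
        if if_none_match == self.etag:
            return 304, b"", self.etag
        self.gets += 1  # ASSUMPTION: only 200 responses count as successful GETs
        if self.gets >= self.MAX_GETS:
            self.deleted = True
        return 200, self.body, self.etag
```

The optional now argument exists only so that expiry can be exercised without waiting five minutes.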


POST https://server/report

 Reports a log to the server, and optionally asks for a channel deletion. 
 
 The log is the body of the request. If the 
 request contains an X-KeyExchange-Log header, its value is prepended
 to the log provided in the body. In other words, the header can be used
 for small logs, and the body for more info. The body size is limited to 
 2000 chars.  If both body and headers are empty, a 400 is raised.
   
 Optionally, if the request contains the X-KeyExchange-Id header and a
 X-KeyExchange-Cid header containing the channel id, the channel will
 be deleted by the server.
 
 Return codes:
    - 200: logged successfully  
    - 403: the IP is blacklisted.
    - 400: bad request (missing log or bad ids)

The messages reported are described at "Suggestions for things we will log through the J-PAKE /report API".
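
The validation rules for /report could look roughly like this on the server. The function name is hypothetical, and two details are assumptions: the header value and the body are joined with a newline, and an oversized body is rejected with a 400 (the page only says the size is limited).

```python
MAX_BODY_CHARS = 2000


def handle_report(body="", header_log=""):
    """Return (status, log_text) for a POST /report request."""
    if not body and not header_log:
        # Nothing to log at all: bad request.
        return 400, ""
    if len(body) > MAX_BODY_CHARS:
        # ASSUMPTION: an oversized body is answered with 400.
        return 400, ""
    # The header value, if present, goes in front of the body.
    parts = [part for part in (header_log, body) if part]
    return 200, "\n".join(parts)
```

For example, handle_report("stack details", "jpake.error.timeout") yields a 200 and a log that starts with the short header message.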

Failure modes

jpake.error.timeout (Timeout)

Reported when the exchange is aborted due to timeouts.

jpake.error.invalid (Invalid message)

Reported when a malformed message is received. A malformed message is one that doesn't correctly parse as JSON.

jpake.error.wrongmessage (Wrong message)

Reported when the wrong message is received, as identified by the type property in the JSON blob.

jpake.error.internal (Internal J-PAKE failure)

Reported when a J-PAKE computation step or encryption/decryption step fails.

jpake.error.keymismatch (Key mismatch)

Reported when the SHA256d or HMAC verification fails, in other words when the PIN wasn't entered correctly and both sides ended up with different keys.

jpake.error.server (Unexpected Server Response)

Reported when an unexpected HTTP response is received from the J-PAKE server.

jpake.error.userabort (User Abort)

Reported when a client aborts the J-PAKE transaction; for example, when canceling a setup wizard. This is new as of 2011-04-14.
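
A client might map incoming messages onto the first two failure modes as sketched below. The exception class and function name are invented for illustration; treating valid JSON that is not an object as a wrong message is an assumption.

```python
import json


class JPakeError(Exception):
    """Carries one of the jpake.error.* codes listed above."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def parse_message(body, expected_type):
    """Parse a channel message and check its 'type' property."""
    try:
        message = json.loads(body)
    except ValueError:
        # Does not parse as JSON at all.
        raise JPakeError("jpake.error.invalid")
    # ASSUMPTION: non-object JSON counts as a wrong message, not a malformed one.
    if not isinstance(message, dict) or message.get("type") != expected_type:
        raise JPakeError("jpake.error.wrongmessage")
    return message
```

The resulting code is what the client would send to the /report API before giving up.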

Detailed Flow

  1. Mobile asks server for new channel ID (4 characters a-z0-9)
    C: GET /new_channel HTTP/1.1
    S: "a7id"
  2. Mobile generates PIN from random weak secret (4 characters a-z0-9) and the channel ID, computes and uploads J-PAKE msg 1. New for v2: To prevent double uploads in case of retries, the If-None-Match: * header is specified. This makes sure that the message is only uploaded if the channel is empty. If it is not, the request fails with a 412 Precondition Failed, which should be considered the same as 200 OK. The 412 will also contain the ETag of the data the client just uploaded.
    C: PUT /a7id HTTP/1.1
    C: If-None-Match: *
    C: 
    C: {
    C:    'type': 'receiver1',
    C:    'payload': {
    C:       'gx1': '45...9b',
    C:       'zkp_x1': {
    C:          'b': '09e22607ead737150b1a6e528d0c589cb6faa54a',
    C:          'gr': '58...7a',
    C:          'id': 'receiver'
    C:       },
    C:       'gx2': 'be...93',
    C:       'zkp_x2': {
    C:          'b': '222069aabbc777dc988abcc56547cd944f056b4c',
    C:          'gr': '5c...23',
    C:          'id': 'receiver'
    C:       }
    C:    }
    C: }
    

    Success response:

    S: HTTP/1.1 200 OK
    S: ETag: "etag-of-receiver1-message"
    

    Response that will be returned on retries if the Desktop already replaced the message.

    S: HTTP/1.1 412 Precondition Failed
    S: ETag: "etag-of-receiver1-message"
    
  3. Desktop asks user for the PIN, extracts channel ID and weak secret, fetches Mobile's msg 1
    C: GET /a7id HTTP/1.1
    

    Success response:

    S: HTTP/1.1 200 OK
    S: ETag: "etag-of-receiver1-message"
    
  4. Desktop computes and uploads msg 1. New in version 2: The If-Match header is set so that we only upload this message if the other side's previous message is still in the channel. This is to prevent double PUTs during retries. If a 412 is received, it means that our first PUT was correctly received by the server and that the other side has already uploaded its next message, so treat the 412 as a 200.
    C: PUT /a7id HTTP/1.1
    C: If-Match: "etag-of-receiver1-message"
    C: 
    C: {
    C:    'type': 'sender1',
    C:    'payload': {
    C:       'gx1': '45...9b',
    C:       'zkp_x1': {
    C:          'b': '09e22607ead737150b1a6e528d0c589cb6faa54a',
    C:          'gr': '58...7a',
    C:          'id': 'sender'
    C:       },
    C:       'gx2': 'be...93',
    C:       'zkp_x2': {
    C:          'b': '222069aabbc777dc988abcc56547cd944f056b4c',
    C:          'gr': '5c...23',
    C:          'id': 'sender'
    C:       }
    C:    }
    C: }
    

    Success response:

    S: HTTP/1.1 200 OK
    S: Etag: "etag-of-sender1-message"
    
    S: HTTP/1.1 412 Precondition Failed
    S: Etag: "etag-of-sender1-message"
    
  5. Mobile polls for Desktop's msg 1
    C: GET /a7id HTTP/1.1
    C: If-None-Match: "etag-of-receiver1-message"
    
    S: HTTP/1.1 304 Not Modified
    

    Mobile tries again after 1s

    C: GET /a7id HTTP/1.1
    
    S: HTTP/1.1 200 OK
    S: Etag: "etag-of-sender1-message"
    ...
    

    Mobile computes and uploads msg 2. New in version 2: The If-Match header is set so that we only upload this message if the other side's previous message is still in the channel. This is to prevent double PUTs during retries. If a 412 is received, it means that our first PUT was correctly received by the server and that the other side has already uploaded its next message, so treat the 412 as a 200.

    C: PUT /a7id HTTP/1.1
    C: If-Match: "etag-of-sender1-message"
    C: 
    C: {
    C:    'type': 'receiver2',
    C:    'payload': {
    C:       'A': '87...82',
    C:       'zkp_A': {
    C:          'b': '6f...08',
    C:          'id': 'receiver',
    C:          'gr': 'f8...49'
    C:       }
    C:    }
    C: }
    
    S: HTTP/1.1 200 OK
    S: ETag: "etag-of-receiver2-message"
    
    S: HTTP/1.1 412 Precondition Failed
    S: ETag: "etag-of-receiver2-message"
    
  6. Desktop polls for and eventually retrieves Mobile's msg 2
    C: GET /a7id HTTP/1.1
    C: If-None-Match: "etag-of-sender1-message"
    
    S: HTTP/1.1 200 OK
    S: Etag: "etag-of-receiver2-message"
    ...
    
    S: HTTP/1.1 412 Precondition Failed
    S: Etag: "etag-of-receiver2-message"
    ...
    

    Desktop computes key, computes and uploads msg 2. New in version 2: The If-Match header is set so that we only upload this message if the other side's previous message is still in the channel. This is to prevent double PUTs during retries. If a 412 is received, it means that our first PUT was correctly received by the server and that the other side has already uploaded its next message, so treat the 412 as a 200.

    C: PUT /a7id HTTP/1.1
    C: If-Match: "etag-of-receiver2-message"
    C: 
    C: {
    C:    'type': 'sender2',
    C:    'payload': {
    C:       'A': '87...82',
    C:       'zkp_A': {
    C:          'b': '6f...08',
    C:          'id': 'sender',
    C:          'gr': 'f8...49'
    C:       }
    C:    }
    C: }
    
    S: HTTP/1.1 200 OK
    S: ETag: "etag-of-sender2-message"
    
    S: HTTP/1.1 412 Precondition Failed
    S: ETag: "etag-of-sender2-message"
    
  7. Mobile retrieves Desktop's msg 2
    C: GET /a7id HTTP/1.1
    C: If-None-Match: "etag-of-receiver2-message"
    
    S: HTTP/1.1 200 OK
    S: Etag: "etag-of-sender2-message"
    { 'type': 'sender2', ... }
    
    S: HTTP/1.1 412 Precondition failed
    S: Etag: "etag-of-sender2-message"
    

    Mobile computes key, uploads encrypted known message "0123456789ABCDEF" to prove its knowledge (msg 3). New in version 2: The If-Match header is set so that we only upload this message if the other side's previous message is still in the channel. This is to prevent double PUTs during retries. If a 412 is received, it means that our first PUT was correctly received by the server and that the other side has already uploaded its next message, so treat the 412 as a 200.

    C: PUT /a7id HTTP/1.1
    C: If-Match: "etag-of-sender2-message"
    C: 
    C: {
    C:    'type': 'receiver3',
    C:    'payload': {
    C:       'ciphertext': "base64encoded=",
    C:       'IV': "base64encoded="
    C:    }
    C: }
    
    S: HTTP/1.1 200 OK
    S: Etag: "etag-of-receiver3-message"
    
    S: HTTP/1.1 412 Precondition failed
    S: Etag: "etag-of-receiver3-message"
    
  8. Desktop retrieves Mobile's msg 3 to confirm key
    C: GET /a7id HTTP/1.1
    C: If-None-Match: ""
    
    S: HTTP/1.1 200 OK
    S: ETag: "etag-of-receiver3-message"
    ...
    

    Desktop verifies it against its own version. If the hash matches, it encrypts and uploads Sync credentials.

    New in version 2: The If-Match header is set so that we only upload this message if the other side's previous message is still in the channel. This is to prevent double PUTs during retries. If a 412 is received, it means that our first PUT was correctly received by the server and that the other side has already uploaded its next message, so treat the 412 as a 200.

    C: PUT /a7id HTTP/1.1
    C: If-Match: "etag-of-receiver3-message"
    C: 
    C: {
    C:    'type': 'sender3',
    C:    'payload': {
    C:       'ciphertext': "base64encoded=",
    C:       'IV': "base64encoded=",
    C:       'hmac': "base64encoded="
    C:    }
    C: }
    
    
    S: HTTP/1.1 200 OK
    S: Etag: "etag-of-sender3-message"
    
    S: HTTP/1.1 412 Precondition failed
    S: Etag: "etag-of-sender3-message"
    

    If the hash does not match, the Desktop deletes the session.

    C: DELETE /a7id HTTP/1.1
    
    S: HTTP/1.1 200 OK
    ... 
    

    This means that Mobile will receive a 404 when it tries to retrieve the encrypted credentials.

  9. Mobile retrieves encrypted credentials
    C: GET /a7id HTTP/1.1
    C: If-None-Match: "etag-of-receiver3-message"
    
    S: HTTP/1.1 200 OK
    ... 
    

    Mobile verifies the HMAC and, if it matches, decrypts the Sync credentials.

  10. Mobile deletes the session [OPTIONAL]
    C: DELETE /a7id HTTP/1.1
    
    S: HTTP/1.1 200 OK
    ... 
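
Two client-side pieces of the flow above can be sketched briefly: composing and splitting the PIN, and choosing the precondition header for each PUT. All names are illustrative. The sketch assumes the PIN is the 4-character secret followed by the channel ID (as in the diagram: PIN = secret + channel) and that user input is normalized to lower case.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits  # a-z0-9
SECRET_LENGTH = 4


def make_pin(channel_id):
    """Return (weak_secret, pin) for display on Mobile."""
    secret = "".join(secrets.choice(ALPHABET) for _ in range(SECRET_LENGTH))
    return secret, secret + channel_id


def split_pin(pin):
    """Return (weak_secret, channel_id) from what the user typed on Desktop."""
    pin = pin.strip().lower()  # ASSUMPTION: input is case-insensitive
    if len(pin) <= SECRET_LENGTH or any(c not in ALPHABET for c in pin):
        raise ValueError("malformed PIN")
    return pin[:SECRET_LENGTH], pin[SECRET_LENGTH:]


def put_headers(previous_etag=None):
    """First upload: channel must be empty. Later uploads: the other side's
    previous message must still be in the channel."""
    if previous_etag is None:
        return {"If-None-Match": "*"}
    return {"If-Match": previous_etag}


def put_accepted(status):
    # A 412 on a retried PUT is treated the same as a 200.
    return status in (200, 412)
```

For instance, Mobile would call make_pin("a7id") and show the resulting 8-character PIN, and Desktop would recover both parts with split_pin.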
    

Security Considerations

Security Logging & Defense

DOS Defense

  • Least Recently Used (LRU) queue approach for monitoring IP addresses issuing frequent requests
    • Configurable threshold for adding IP address to Blacklist/Penalty Box
    • Configurable time-out for IP addresses added to Blacklist/Penalty Box
  • A single shared blacklist will exist within memcache
  • LRU queues will be unique to each server and will penalize an IP to the shared blacklist on memcache
  • All thresholds will be controlled via the configuration page
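
A rough sketch of the penalty-box idea follows. It is not the deployed code: the class name and the default threshold, window and queue size are hypothetical, and a plain dict stands in for the shared blacklist that would live in memcache. Only the 10-minute penalty comes from the Server API section.

```python
import time
from collections import OrderedDict


class PenaltyBox:
    def __init__(self, threshold=100, window=10.0, penalty=600.0, max_tracked=10000):
        self.threshold = threshold      # hypothetical: requests allowed per window
        self.window = window            # hypothetical: window length in seconds
        self.penalty = penalty          # 10 minutes, per the Server API section
        self.max_tracked = max_tracked  # size of the per-server LRU queue
        self.recent = OrderedDict()     # ip -> recent timestamps, in LRU order
        self.blacklist = {}             # ip -> expiry; stand-in for memcache

    def allow(self, ip, now=None):
        """Return False if the request should be answered with a 403."""
        now = time.time() if now is None else now
        expiry = self.blacklist.get(ip)
        if expiry is not None:
            if now < expiry:
                return False
            del self.blacklist[ip]  # penalty has elapsed
        hits = [t for t in self.recent.pop(ip, []) if now - t < self.window]
        hits.append(now)
        self.recent[ip] = hits  # re-inserted at the most recently used end
        if len(self.recent) > self.max_tracked:
            self.recent.popitem(last=False)  # evict the least recently used IP
        if len(hits) > self.threshold:
            self.blacklist[ip] = now + self.penalty
            self.recent.pop(ip, None)
            return False
        return True
```

Each server would keep its own PenaltyBox.recent queue while writing penalties to the one shared blacklist.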

TearDown DOS Defense

  • Tear down requires valid channel and valid x-keyexchange-id value
  • Guessing both is statistically unlikely: the channel is 4 characters and the keyexchange-id is 255 characters
  • Brute force attempts will generate lots of noise and will be limited per DOS defense

Logging Points

CEF Logging

  • Bad action taken against a valid channel id (denoted by 400 error code)
    • Examples: non-existent x-keyexchange-id, bad x-keyexchange-id
  • Action taken against an invalid channel id
    • Examples: request for properly formed, but not existing, channel id
  • IP address sent to black list due to DOS prevention controls
    • Examples: Flood of requests from a single IP
  • Client fallback to original sync method
    • Examples: Client unable to complete J-PAKE sync for any number of reasons and falls back to original sync approach
    • Reported by client to server via reporting API

Application Logging

  • Full application logging will be created to enable incident response review
  • Logged to application server and not via CEF
  • Logs will include:
    • Timestamp
    • IP address
    • Full URL
    • x-keyexchange-id
    • Event
    • Other non-essential headers will be discarded

Admin Web Page

  • A small web administrator page will be created which will allow an admin to view all IP addresses that are currently blacklisted.
  • The admin will be able to un-block any of the IP addresses through this page
  • Otherwise the IP address will be removed from the black list after the time has elapsed that is defined within the configuration file
  • Access to the web page will be password-protected with a simple .htaccess file and IP filtering access (10.*.*.*)

Brian's Notes

  • J-PAKE paper appears to have limited peer review. It was a workshop paper and it appears to only recently have been sent to a peer-reviewed journal. I could not find any citations of the paper except in papers by the author.
  • We would be the first and only known production deployment of J-PAKE (AFAICT).
  • 3 character channel ID means that there can be a maximum of 46,656 (36^3) active channels.
  • Very easy to DoS. How do we stop DoS?
  • How long do channels live? Long enough to start the process at work, then drive home and complete the process on my home computer means at least 2 hours? If we allow users a day or half a day to travel to their other computer, then we're looking at a limit of about 50K-100K transactions per day, worst case, assuming no DoS. Most likely channels will live just a minute or two.
  • Length of PIN determines resistance to online attacks by the server (Theorem 5 in the J-PAKE paper).
  • 3 character PIN => 1/46,656 chance of server guessing password correctly per sync attempt.
  • Attackers cannot hide attack from the user, but users are likely to retry many times. Is it possible to reliably detect an attack vs. user error, server failure, or network error?
  • If we assume the user is willing to retry 10 times, there is roughly a 1/4,666 chance of a successful online attack by the server.
  • What is an acceptable rate of successful online attack?
  • Do we need mutual authentication? It seems one-way authentication and/or key transport ala PKCS#12 would suffice.
  • It isn't obvious to me what modulus/order is appropriate. The paper used p=1024 and q=160 in a sample implementation, but that wasn't even a recommendation.
  • A full 128-bit AES key can be entered in 25 characters (base 36). Using PBKDF2 or similar, we can (probably) safely reduce that to ~23.
  • Key transport (ala PKCS#12) has equal complexity for online and offline attacks.
  • Increasing the channel ID length and/or the password length reduces the usability benefit of J-PAKE. How much more usable is a 12 or 16 character password vs a 23 character password?
  • J-PAKE server legal jurisdiction? Separate server from data?
  • J-PAKE server doesn't need to know / should not know the identity (email/username) of the user?
  • Stefan Arentz wrote: "For example, I was just reading a discussion about J-PAKE and found out that there is a possible weakness in the password hashing that we currently use. (Where Hash(password) % q might be 0)." See discussion in http://www.lightbluetouchpaper.org/2008/05/29/j-pake/
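
The figures in these notes follow from the 36-symbol alphabet (a-z plus 0-9). A short check, with function names chosen only for this example:

```python
import math

ALPHABET_SIZE = 36  # a-z plus 0-9


def combinations(length):
    """Number of distinct codes of the given length."""
    return ALPHABET_SIZE ** length


def guess_probability(length, attempts):
    """Chance that one of `attempts` distinct guesses hits a code of `length`."""
    return attempts / combinations(length)


def chars_for_bits(bits):
    """Characters needed to carry at least `bits` bits of entropy."""
    return math.ceil(bits / math.log2(ALPHABET_SIZE))


print(combinations(3))                   # channels / PINs with 3 characters
print(1 / guess_probability(3, 10))      # odds denominator after 10 retries
print(chars_for_bits(128))               # characters for a full 128-bit key
```

The same functions make it easy to see how quickly the odds improve as the PIN grows, for example combinations(4) versus combinations(3).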

Meeting Notes

  1. Easy setup (substantially easier than now) is a blocker for Fennec 4 since Sync will be marketed as **the** feature for Firefox Mobile. We need a solution for Firefox + Fennec 4.
  2. J-PAKE algorithm as proposed here allows exchange by only requiring typing of relatively short PIN on device that's already set up (though a trivial change could allow the typing to always be on the desktop machine, no matter whether it's the receiving or sending end).
  3. Did we look at alternatives to J-PAKE?
    1. QR codes: necessary platform work not possible on all platforms in the given time frame
    2. Bonjour/Zeroconf: same as above
  4. Concerns
    1. Confidence in J-PAKE: paper submitted to journal for official publication only recently, no peer review yet.
    2. Short PIN as proposed by UX makes channel hijacking, guessing easier
      1. Additionally, large quantities of channels to support user volume will either significantly lengthen this string, or reduce PIN space and strength
    3. Firefox 4 timeframe short for implementation + crypto review
    4. DoS
    5. Changes to marketing messages necessary, are we willing to qualify our statements about Sync security + privacy?
  5. Proposals to prevent simple attacks
    1. Connections to the PAKE server should be over SSL, eliminates man-in-the-middle attacks.
    2. Channel exhaustion from DoS: need effective IP blocker
    3. Only allow client that requests channel + the next client that connects to it to use the channel (limits eavesdropping/manipulation attacks)
    4. Only allow a limited number of attempts to use transfer via J-PAKE, fall back to traditional account setup.
    5. Client flags channel deletes that happen because of an abort.
  6. Potential attacks (after above measures)
    1. Compromised server does an online attack
    2. Hijack channel before user enters the PIN. Need to guess whole PIN (channel + secret) to do harm.
  7. Alternative suggestions
    1. Various word- or sentence-based methods, all of which are pretty much impossible to localize.
      1. Maybe: could localize the PGP bio word list (http://en.wikipedia.org/wiki/PGP_word_list)
    2. Have the mobile display a ~20 a-z character key (~100 bits of entropy) which user enters on the desktop. This 20 char key is used to make a 128 bit AES key, the hash of the key is the channel ID on the server. Mobile encrypts data, uploads to channel, desktop downloads and decrypts. UX is worse (but still better because you enter ~20 chars on desktop rather than email + password + Sync Key on mobile), security is better.
      1. Arguable that users are already familiar with entering separated alphanumeric sequences: WEP keys, registration keys, license codes, phone numbers, ZIP codes, credit card numbers...