Identity/CryptoIdeas/05-Queue-Sync

From MozillaWiki
Revision as of 20:26, 7 June 2013 by Warner (talk | contribs) (→‎Server Data Model: update examples, explain version number better)

Queue-Based Data Synchronization

  • Chris Karlof, Brian Warner, May-2013


Summary: like Identity/CryptoIdeas/04-Delta-Sync, but more stream-oriented than whole-version-oriented.

(this borrows ideas liberally from Chromium, so there is some terminology overlap with that project)

Syncable Service, Sync Mediator, Registration

A "Syncable Service" is any service that wants to synchronize data with a PICL account (and thus with corresponding services on other devices). Bookmarks, passwords, open-tabs, etc, are all examples of Syncable Services.

Each Syncable Service is required to register with the "Sync Mediator" at browser startup. In the registration call, the service includes parameters to identify:

  • the name of the service (this allows a service on one device to connect with the same service on other devices: both must use the same name)
  • whether this service uses one-collection-per-device or one-shared-collection
  • whether this service's data goes into class-A or class-B


The service also provides the following callback functions:

  • mergeDataAndStartSyncing
  • something for downstream changes


Registration returns a function for the service to call when it has upstream changes that need delivery to the server.
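
The registration flow above might look like the following sketch. All names and the change format here are illustrative assumptions, not the actual PICL or Mediator API:

```python
# Hypothetical sketch of Syncable Service registration with the Sync
# Mediator. Names, parameters, and the change format are illustrative.

class SyncMediator:
    def __init__(self):
        self.services = {}

    def register(self, name, per_device_collection, data_class,
                 merge_data_and_start_syncing, apply_downstream_changes):
        """Register a Syncable Service; returns the function the service
        calls when it has upstream changes to deliver."""
        upstream_queue = []
        self.services[name] = {
            "per_device": per_device_collection,   # one-collection-per-device?
            "data_class": data_class,              # "A" or "B"
            "merge": merge_data_and_start_syncing,
            "downstream": apply_downstream_changes,
            "upstream_queue": upstream_queue,
        }
        def deliver_upstream(change):
            upstream_queue.append(change)
        return deliver_upstream

mediator = SyncMediator()
send = mediator.register(
    name="bookmarks",                # same name on every device
    per_device_collection=False,     # one shared collection
    data_class="B",
    merge_data_and_start_syncing=lambda server_data, local_data: None,
    apply_downstream_changes=lambda changes: None,
)
send({"op": "ADD/SET", "key": "guid-1", "value": "A"})
```

The service keeps the returned `send` function and calls it whenever the native datastore changes.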

Changes vs Records

PICL models the local datastore as a collection of "records", each of which has a globally-unique key (GUID) and some arbitrary value. The server must be able to supply a full set of (encrypted) records at any time (both for new clients which are not yet in sync, and for existing clients that fall out-of-sync for whatever reason).

Once clients are "in sync", they exchange "changes" instead of records. However, if we use a record-based storage format (as opposed to a field-based format, see below), then these are almost identical. The significant difference is that "changes" can also represent a deleted record, and the memory of that deletion needs to stick around for a while.

Suppose the client browser performs the following actions:

  • step 1: create record key=1, value=A
  • step 2: create record key=2, value=B
  • step 3: create record key=3, value=C
  • step 4: modify record key=1 to value=D
  • step 5: delete record key=3
  • step 6: modify record key=1 to value=E


The "current dataset" after each step is:

  • after step 1: (1=A)
  • after step 2: (1=A, 2=B)
  • after step 3: (1=A, 2=B, 3=C)
  • after step 4: (1=D, 2=B, 3=C)
  • after step 5: (1=D, 2=B)
  • after step 6: (1=E, 2=B)


The changes we deliver to implement each step are:

  • step 1: ADD/SET 1=A
  • step 2: ADD/SET 2=B
  • step 3: ADD/SET 3=C
  • step 4: ADD/SET 1=D
  • step 5: DELETE 3
  • step 6: ADD/SET 1=E
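
Replaying that change log against an empty dataset reproduces the final state from step 6. This sketch assumes changes are simple dicts (an illustrative format, not the wire format):

```python
# Replaying the example change log: each ADD/SET writes a key, each
# DELETE removes one. The dataset after all six changes should match
# the "after step 6" state above.

def apply_change(dataset, change):
    if change["op"] == "DELETE":
        dataset.pop(change["key"], None)
    else:  # ADD/SET
        dataset[change["key"]] = change["value"]

changes = [
    {"op": "ADD/SET", "key": 1, "value": "A"},
    {"op": "ADD/SET", "key": 2, "value": "B"},
    {"op": "ADD/SET", "key": 3, "value": "C"},
    {"op": "ADD/SET", "key": 1, "value": "D"},
    {"op": "DELETE",  "key": 3},
    {"op": "ADD/SET", "key": 1, "value": "E"},
]

dataset = {}
for change in changes:
    apply_change(dataset, change)
# dataset is now {1: "E", 2: "B"}
```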

Queues

For each service, the Mediator maintains two queues. The "upstream" or "outbound" queue contains local changes that were made to the native datastore (e.g. the Places database), in response to user actions. The upstream queue holds these changes until:

  • network connectivity is available
  • some batching timeout has expired (a Nagle-like algorithm to improve efficiency by sending infrequent large updates instead of frequent tiny updates)
  • any pending downstream changes have been applied and merged into the local datastore


After upstream entries have been sent to the server, they may remain in the queue until the server acknowledges receipt, at which point they are finally deleted. If the server receives an update from some other device (which has not yet been seen by the local device), the server sends a NACK instead, at which point the client will try to merge the other change into the local datastore. Entries in the upstream queue may be removed or modified before transmission as a result of merge actions.

The "downstream" or "inbound" queue contains changes that arrive from the server which have not yet been fully applied to the local datastore.

Each queue contains plaintext changes. The client exchanges only encrypted records/changes with the server. Upstream changes are encrypted just before transmission, and downstream changes are decrypted before being added to the queue.
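
The three hold conditions on the upstream queue can be sketched as a predicate. This is illustrative (real code would be event-driven inside the browser, not polled):

```python
# Sketch of the upstream-queue gating logic: changes are held until the
# network is up, a Nagle-like batching timeout has expired, and pending
# downstream changes have been merged.

import time

class UpstreamQueue:
    def __init__(self, batch_timeout=30.0):
        self.pending = []              # plaintext changes awaiting delivery
        self.batch_timeout = batch_timeout
        self.first_enqueued_at = None

    def enqueue(self, change):
        if not self.pending:
            self.first_enqueued_at = time.monotonic()
        self.pending.append(change)

    def ready_to_flush(self, network_up, downstream_drained, now=None):
        """All three conditions from the text must hold before sending."""
        if not self.pending:
            return False
        now = time.monotonic() if now is None else now
        timed_out = (now - self.first_enqueued_at) >= self.batch_timeout
        return network_up and timed_out and downstream_drained
```

Encryption happens outside the queue: entries stay plaintext until the flush actually transmits them.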

Server Data Model

Concepts: collection version numbers, combined change/record rows, tombstones, "fetch changes since X", hash chain, conflict detection

Browsers send encrypted change records up to the server. Each change record contains the following fields:

  • record id: hash of header
  • "header": (version number, PreviousRecordId, unencrypted key (GUID), hash of encrypted value)
  • encrypted value (or "DELETE")
  • signature: HMAC (using a client-managed key) of record id

Each change record represents an ADD/SET or a DELETE of a specific key. A complete collection is represented by a bunch of ADD/SET changes for non-overlapping keys (exactly one change per key).

The "collection version number" is a collection-wide sequential integer, incremented with each change. It will eventually get get very large (think 8-byte storage). For any given version number, there is a specific set of key/value pairs which make up that version (although the server may not be able to produce that set for arbitrary version numbers). Each version differs by exactly one key (note: this is a significant constraint, and needs more discussion). Each change record has a copy of the collection version number that first includes the new change. Version numbers are generated by the client, when it produces (and hashes/signs) a new change record for delivery to the server.

These change records form a hash chain: each record id validates all previous records back to the very first one (which has a PreviousRecordId of all zeros). Clients which are "in-sync" and receiving only new records will require valid signatures, sequential version numbers, and matching PreviousRecordId values. These clients cannot be made to accept false records, or be tricked into omitting a valid record (some attacks are still possible during a "resync", see below).
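
The chain construction and the in-sync client's checks can be sketched as follows. The field encodings (JSON header, hex digests) are assumptions for illustration; the document does not specify a serialization:

```python
# Sketch of building and verifying the change-record hash chain.
# record id = hash(header); header = (version, PreviousRecordId, guid,
# hash(encrypted value)); signature = HMAC(client key, record id).

import hashlib, hmac, json

MAC_KEY = b"client-managed-key"   # stands in for the real client key

def make_record(version, prev_record_id, guid, encrypted_value):
    header = json.dumps([version, prev_record_id, guid,
                         hashlib.sha256(encrypted_value).hexdigest()])
    record_id = hashlib.sha256(header.encode()).hexdigest()
    signature = hmac.new(MAC_KEY, record_id.encode(),
                         hashlib.sha256).hexdigest()
    return {"id": record_id, "header": header,
            "value": encrypted_value, "sig": signature}

def verify_chain(records):
    """An in-sync client requires valid signatures, sequential version
    numbers, and PreviousRecordId linkage back to the all-zeros sentinel."""
    prev_id = "0" * 64
    for expected_version, rec in enumerate(records, start=1):
        version, prev, guid, value_hash = json.loads(rec["header"])
        if version != expected_version or prev != prev_id:
            return False
        if hashlib.sha256(rec["header"].encode()).hexdigest() != rec["id"]:
            return False
        good_sig = hmac.new(MAC_KEY, rec["id"].encode(),
                            hashlib.sha256).hexdigest()
        if not hmac.compare_digest(rec["sig"], good_sig):
            return False
        if hashlib.sha256(rec["value"]).hexdigest() != value_hash:
            return False
        prev_id = rec["id"]
    return True

r1 = make_record(1, "0" * 64, "guid-1", b"ciphertext-A")
r2 = make_record(2, r1["id"], "guid-2", b"ciphertext-B")
```

Note how any tampering with a record's value, header, or position breaks either its own hash/signature or the next record's PreviousRecordId link.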

The server receiving upstream records cannot check the (symmetric) signature, but it validates all the other fields. It then stores the various fields in a database. The server schema needs to support two kinds of read operations:

  • read op 1: "please give me all changes from version number N to the present"
  • read op 2: "please give me all current records"


If the server does not have enough history to answer the first kind of read with a contiguous set of changes for the requested range, it should return a distinctive error, causing the client to perform a resync. But the server must always be able to answer the second kind of read.

The server should use whatever schema proves to be most efficient, but one possible approach would be:

  • columns: userid, collectionId, buildNumber, guid, isCurrent (bool), recordId, header, encryptedValueOrDELETE, signature
  • UNIQUE INDEX (userid, collectionId, buildNumber)
  • for any (userid, collectionId, guid), there is exactly one row with isCurrent=True
  • read #1: SELECT buildNumber>N ORDER BY buildNumber
  • read #2: SELECT isCurrent=True and value!=DELETE
  • deleted keys are represented by "tombstones" with encryptedValue set to a special "DELETE" token
  • writes that modify or delete an existing key will clear isCurrent from the old row and set isCurrent on the new row.
  • servers can periodically compress their datasets. They can remove rows where isCurrent=false, and also rows where isCurrent=true and encryptedValue=DELETE. Ideally, servers will retain enough history (including encryptedValue=DELETE) to return changes to bring all known clients up to date, but if a client has not caught up for a long time, servers can compress away their history anyway, and that client will perform a resync if/when it comes back.

With such a schema, the example above would result in the following rows:

  • ver=1 current=False key=1 val=A
  • ver=2 current=True key=2 val=B
  • ver=3 current=False key=3 val=C
  • ver=4 current=False key=1 val=D
  • ver=5 current=True key=3 val=DELETE
  • ver=6 current=True key=1 val=E


When the server compresses out old rows, it would be left with just:

  • ver=2 current=True key=2 val=B
  • ver=6 current=True key=1 val=E


Servers must be able to answer these queries with paginated responses, rather than attempting to hold the entire response in memory or transmitting it in a single network message.
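
The proposed schema and both read operations can be sketched with SQLite, populated with the worked example above (column names follow the list; everything else is illustrative):

```python
# Sketch of the possible server schema and its two read operations,
# using the six rows from the worked example. Tombstones store the
# literal "DELETE" token in encryptedValueOrDELETE.

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE records (
    userid TEXT, collectionId TEXT, buildNumber INTEGER,
    guid TEXT, isCurrent INTEGER, recordId TEXT,
    header TEXT, encryptedValueOrDELETE TEXT, signature TEXT,
    UNIQUE (userid, collectionId, buildNumber))""")

rows = [  # (buildNumber, guid, isCurrent, value) from the example
    (1, "1", 0, "A"), (2, "2", 1, "B"), (3, "3", 0, "C"),
    (4, "1", 0, "D"), (5, "3", 1, "DELETE"), (6, "1", 1, "E"),
]
for ver, guid, current, val in rows:
    db.execute(
        "INSERT INTO records VALUES ('u', 'bookmarks', ?, ?, ?, '', '', ?, '')",
        (ver, guid, current, val))

# read op 1: "all changes from version N to the present" (here N=4)
since_4 = db.execute("""SELECT buildNumber, guid, encryptedValueOrDELETE
                        FROM records WHERE buildNumber > 4
                        ORDER BY buildNumber""").fetchall()

# read op 2: "all current records" (current rows, tombstones excluded)
current = db.execute("""SELECT guid, encryptedValueOrDELETE FROM records
                        WHERE isCurrent = 1
                        AND encryptedValueOrDELETE != 'DELETE'""").fetchall()
```

Read #1 returns the DELETE tombstone for key 3 (so an out-of-date client learns about the deletion), while read #2 filters tombstones out.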

[rfkelly]: it's not clear how the "build number" is generated? Is it sequential per record, or global to the collection? Subtext: I want to understand if the server has to read the previous version of each record before writing the new version.

Server Conflict Detection

When the server receives upstream records, it compares the submitted versionNumber and PreviousRecordId against its current record: the versionNumber should be one greater, and the PreviousRecordId should match the current RecordId. The server will NACK the upstream message and discard the record unless these match. If both match, the server will store the upstream record (thus incrementing its current buildNumber) and ACK the message.

When two clients race to submit different records, the first will be accepted, and the second will be NACKed. The second client will receive both the NACK and the announcement of the other record, causing it to merge the other record and then re-submit its own (possibly modified as a result of the merge).
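
The server's accept/NACK decision reduces to two comparisons against its current state. A minimal sketch, assuming the server tracks only the current version number and record id per collection:

```python
# Sketch of server-side conflict detection: a record is accepted only if
# it extends the chain by exactly one step; otherwise it is NACKed and
# discarded, and the client must merge and re-submit.

class Collection:
    def __init__(self):
        self.version = 0
        self.current_record_id = "0" * 64   # all-zeros sentinel

    def receive_upstream(self, record):
        if (record["version"] != self.version + 1 or
                record["prev_record_id"] != self.current_record_id):
            return "NACK"
        self.version = record["version"]
        self.current_record_id = record["record_id"]
        return "ACK"

coll = Collection()
first  = {"version": 1, "prev_record_id": "0" * 64, "record_id": "aaa"}
racing = {"version": 1, "prev_record_id": "0" * 64, "record_id": "bbb"}
```

In a race, whichever record arrives first wins; the loser's submission still names the old chain head, so both checks fail and it is NACKed.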

Downstream Change Application

race detection and merge, scanning/modifying the upstream queue. Filtering downstream changes from the upstream observer.

Upstream Change Delivery

new build-number calculation, hash-chain calculation, ACK/NACK.

Initial Merge

mergeDataAndStartSyncing, per-datatype merge functions, race detection and merge. Large upstream change stream.

Downstream Cache

For simplicity (in particular to decouple transactionality between the native datastore and the downstream queue), we may have the browser perform a "resync" at every boot. To avoid re-fetching the entire server dataset each time, we can maintain a full copy of the server's dataset in a "Downstream Cache". This is updated when we receive downstream changes, with a transaction that simultaneously updates the cached data and the new build number. With this, we can safely request only new changes each time. In the ideal case (where nothing has changed on the server), a single roundtrip (returning or confirming the current build number) is enough to make sure we're up-to-date.
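
The boot-time resync against the Downstream Cache might look like this sketch (the server interface and change format are assumptions for illustration):

```python
# Sketch of the Downstream Cache resync: the cache stores the full
# dataset plus the last-seen version number, so each boot only fetches
# changes newer than that version. In a real store, applying the changes
# and recording the new version would be one transaction.

class DownstreamCache:
    def __init__(self):
        self.version = 0
        self.records = {}     # guid -> plaintext value

    def resync(self, server):
        changes, new_version = server.changes_since(self.version)
        for change in changes:
            if change["value"] == "DELETE":
                self.records.pop(change["key"], None)
            else:
                self.records[change["key"]] = change["value"]
        self.version = new_version

class FakeServer:
    """Stand-in for read op 1 ("give me all changes since N")."""
    def __init__(self, log):
        self.log = log
    def changes_since(self, version):
        newer = [c for c in self.log if c["version"] > version]
        latest = self.log[-1]["version"] if self.log else version
        return newer, latest

server = FakeServer([
    {"version": 1, "key": 1, "value": "A"},
    {"version": 2, "key": 2, "value": "B"},
    {"version": 3, "key": 1, "value": "DELETE"},
])
cache = DownstreamCache()
cache.resync(server)   # full catch-up from version 0
cache.resync(server)   # already current: one "changes since 3" roundtrip
```

The second resync is the ideal case from the text: a single roundtrip that merely confirms the current version number.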