Apps/Sync/Spec2

From MozillaWiki

Revision as of 04:30, 13 December 2011

This document serves as an entirely speculative revision to the Apps/Sync/Spec specification, with an attempt to make it a bit cleaner and more general.

Expectations

The model of sync is a stream of updates. All clients both put their local updates into this stream, and read the collective stream. Everything has to be represented as a concrete item in the stream, meaning that delete actions are also present in the stream.

There is no conflict resolution, so clients must make sure they do not overwrite each other's updates. If a conflict cannot be resolved without interaction (e.g., simple overwrite is not considered acceptable, and automatic merging is not possible) then it must be possible to represent the conflicted state directly, and at some point some client can resolve the conflict (possibly with user interaction) and put the unconflicted object into the stream.

The stream is ordered, along a single timeline. The timeline markers should not be seen as based on any time or clock, as this leads to confusion and it's not clear whose "now" we are talking about. Instead the server has a counter, and all clients work from that counter. (The counter need not be an uninterrupted stream of integers, just increasing.)
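
As an illustration of the stream model described above, here is a minimal in-memory sketch. It is not part of the spec: the class name `UpdateStream`, its methods, and the `deleted` marker used to represent a delete action are all assumptions made for the example, and the counter is assumed to be a plain increasing integer.

```python
import itertools

class UpdateStream:
    """Hypothetical in-memory stream; the server counter only ever increases."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._entries = []  # list of (counter, object) pairs, oldest first

    def append(self, obj):
        """Add an item to the stream and return the counter assigned to it."""
        counter = next(self._counter)
        self._entries.append((counter, obj))
        return counter

    def since(self, counter):
        """Return every entry newer than the given counter."""
        return [entry for entry in self._entries if entry[0] > counter]


stream = UpdateStream()
first = stream.append({"type": "note", "id": "n1", "data": "hello"})
# Assumption: a delete is just another item in the stream, flagged as deleted.
second = stream.append({"type": "note", "id": "n1", "deleted": True})
newer = stream.since(first)
```

A client that has seen everything up to `first` receives only the delete item, which is how deletions reach clients without any clock being involved.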

All interaction between client and server should happen without user intervention. Everything is expected to be highly asynchronous, and the server may reject requests or be unavailable for short periods of time, and this should not affect user experience.

We expect a new client to be able to create a good-enough duplicate of the data in other clients. "Good-enough" because some data might be kept by clients but expired by the server because it was marked as not being permanently interesting.

For "known" datatypes the sync server ensures the integrity of data, according to the most up-to-date notion of correctness for the data type. As such the sync server must be updated frequently, but clients will be protected from some other rogue clients. (Note: not sure if this is a practical expectation?)

Protocol

We'll go out-of-order time-wise, and forget about authentication for now.

Everything happens at a single URL endpoint, which we'll call /USER.

Objects

Each object looks like this:

   {type: "type_name",
    id: "unique identifier among objects of this type",
    expires: timestamp,
    data: {the thing itself}
   }

Note the data can be any JSONable object, including a string.

The expires key is entirely optional, and allows the server to delete the item (if it has not otherwise been updated).

The id key must be unique for the type (submitting another object with the same type and id means that you are overwriting that object).
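
The object shape above could be built and keyed roughly as follows. This is a sketch under assumptions: the helper names `make_object` and `object_key` are invented for the example, and `expires` is treated as a numeric timestamp because the spec does not say what format it takes.

```python
import json

def make_object(type_name, obj_id, data, expires=None):
    """Build a sync object; the optional `expires` key is left out when not given."""
    if not type_name or not obj_id:
        raise ValueError("type and id are required")
    json.dumps(data)  # raises TypeError if data is not JSON-serializable
    obj = {"type": type_name, "id": obj_id, "data": data}
    if expires is not None:
        obj["expires"] = expires  # assumption: a numeric timestamp
    return obj

def object_key(obj):
    """An object is identified by (type, id); the same key means an overwrite."""
    return (obj["type"], obj["id"])


store = {}
a = make_object("bookmark", "b1", {"url": "http://example.com/"})
b = make_object("bookmark", "b1", "a plain string", expires=1323750600)
store[object_key(a)] = a
store[object_key(b)] = b  # same type and id, so this replaces `a`
```

Keying by the `(type, id)` pair rather than by `id` alone reflects that uniqueness is only required among objects of the same type.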

Requests

You can retrieve and send updates. The first time there's not a lot to do; you can just do:

   GET /USER

This returns the response document:

   {collection_id: "string_id",
    objects: [[counter1, object1], [counter2, object2]]
   }

The collection_id key is only in there because it was not sent with the request; it's a kind of "hello" query.

Subsequent requests look like:

   GET /USER?since=counter2&collection_id=string_id

You get the objects back, but with no collection_id (you already know it!).

If objects is empty, you start with a counter of 0.

If the collection has changed, and your string_id doesn't match the server anymore, then you'll get:

   {collection_changed: true,
    collection_id: "new_id",
    objects: [[counter1, ...], ...]
   }

You should then forget your remembered since value and all the updates you have sent to the server.
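
The GET flow above (the initial "hello" request, later requests carrying since and collection_id, and the collection_changed reset) might be tracked on the client like this. It is only a sketch: `SyncState` and both functions are hypothetical names, counters are assumed to be integers, and "forgetting the updates you have sent" is interpreted here as clearing a record of already-pushed keys so they get pushed again.

```python
from urllib.parse import urlencode

class SyncState:
    """Client-side bookkeeping (all attribute names are hypothetical)."""

    def __init__(self):
        self.collection_id = None  # unknown until the first "hello" GET
        self.since = 0             # counter of the newest object seen so far
        self.objects = {}          # (type, id) -> latest object
        self.sent = set()          # keys of updates already pushed to the server

def build_get_url(state, base="/USER"):
    """The first request is a bare GET; later ones carry since and collection_id."""
    if state.collection_id is None:
        return base
    query = urlencode({"since": state.since, "collection_id": state.collection_id})
    return base + "?" + query

def apply_get_response(state, response):
    """Fold a decoded response document into the client state."""
    if response.get("collection_changed"):
        state.since = 0      # forget the remembered since value
        state.sent.clear()   # forget which updates were sent
    if "collection_id" in response:
        state.collection_id = response["collection_id"]
    for counter, obj in response.get("objects", []):
        state.objects[(obj["type"], obj["id"])] = obj
        state.since = max(state.since, counter)
```

The functions take and return plain dictionaries and strings, so any HTTP library can be used to perform the actual request.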

When you have updates you want to send, you do:

   POST /USER?since=counter2&collection_id=string_id
   
   [{id: "my obj1", type: "thingy", data: {...}}, ...]

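This may return a collection_changed error. It may also be rejected because another client has posted an update since you last retrieved objects, in which case you would be writing over changes you have not seen. This will not do! The since=counter2 parameter tells the server when you last got something. If there have been updates since then, you get a GET-like response instead: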

   {since_invalid: true,
    objects: [[counter3, object]]
   }

You should incorporate the new object (which might conflict with your own objects, which is why we do all this!), and then resubmit the request:

   POST /USER?since=counter3&collection_id=string_id
   ...
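
The resubmit cycle above can be sketched as a retry loop. Everything named here is an assumption: `post` stands in for whatever transport performs the POST and returns the decoded response, `merge` is the client's own logic for incorporating a remote object into its pending updates, and `max_attempts` is an arbitrary safety limit. The spec does not describe what a successful POST returns, so the response is handed back untouched.

```python
def push_updates(post, since, collection_id, updates, merge, max_attempts=5):
    """Post updates, folding in remote objects until the server accepts them.

    `post(since, collection_id, updates)` is a hypothetical callable that
    returns the decoded response document as a dict.
    """
    for _ in range(max_attempts):
        response = post(since, collection_id, updates)
        if response.get("collection_changed"):
            raise RuntimeError("collection changed; re-fetch before pushing")
        if not response.get("since_invalid"):
            return since, response
        # The server had newer objects: incorporate them, then try again.
        for counter, obj in response["objects"]:
            updates = merge(updates, obj)
            since = max(since, counter)
    raise RuntimeError("gave up after repeated since_invalid responses")


calls = []

def fake_post(since, collection_id, updates):
    """Stand-in server that holds one object at counter 3."""
    calls.append(since)
    if since < 3:
        return {"since_invalid": True,
                "objects": [[3, {"type": "t", "id": "x", "data": "remote"}]]}
    return {}

def drop_conflicting(updates, remote):
    """Example merge policy: the remote object wins, so drop the local edit."""
    key = (remote["type"], remote["id"])
    return [u for u in updates if (u["type"], u["id"]) != key]

pending = [{"type": "t", "id": "x", "data": "local"},
           {"type": "t", "id": "y", "data": 1}]
final_since, result = push_updates(fake_post, 2, "c1", pending, drop_conflicting)
```

Passing the merge policy in as a function keeps the loop itself free of any conflict-handling decisions.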

Conflicts

We do not resolve conflicts as part of sync, and we strongly recommend that you not burden your users with conflicts as part of your sync schedule.

In some cases you can resolve conflicts yourself. For instance, if the data is not very interesting, you can just choose a winner.

If you can't resolve the conflicts automatically, you must incorporate all your conflicting edits into a new object. When the user can attend to that object at some later point, you can show them the conflicts, ask for a resolution, and put the resolved object onto the server.
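
One way to carry conflicting edits inside a single object is sketched below. The spec does not define how a conflicted state is represented, so the `conflict` and `versions` keys, the function name, and the optional `auto_resolve` callback are all assumptions made for illustration.

```python
def resolve_or_mark(local, remote, auto_resolve=None):
    """Return the object to store after comparing a local and a remote edit.

    If the edits agree, nothing needs doing. If an `auto_resolve(local_data,
    remote_data)` callback is supplied, its result is used. Otherwise both
    versions are kept inside one object so a user can resolve them later.
    """
    if local["data"] == remote["data"]:
        return remote
    if auto_resolve is not None:
        return dict(remote, data=auto_resolve(local["data"], remote["data"]))
    # Hypothetical representation of a conflicted state.
    conflicted = {"conflict": True, "versions": [local["data"], remote["data"]]}
    return dict(remote, data=conflicted)


mine = {"type": "note", "id": "n1", "data": "buy milk"}
theirs = {"type": "note", "id": "n1", "data": "buy bread"}
marked = resolve_or_mark(mine, theirs)
picked = resolve_or_mark(mine, theirs, auto_resolve=lambda a, b: b)
```

The marked object keeps the original type and id, so once the user picks a version the resolved object simply overwrites it.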

Partial Results

You may not want too many results. In this case add to your GET requests:

   GET /USER?...&limit=10

This will return at most 10 items. The server may also choose not to return a full set of items. In either case the result object will have incomplete: true. You can make another request and get more items.
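
Fetching a full set of results in pieces could look like the loop below. It is a sketch with assumptions: `get` is a hypothetical callable that performs the GET and returns the decoded response, counters are assumed to be integers, and "make another request" is taken to mean repeating the request with since advanced to the newest counter received.

```python
def fetch_all(get, since, limit=10):
    """Keep requesting until the server stops reporting `incomplete: true`.

    `get(since, limit)` is a hypothetical callable returning the response dict.
    """
    items = []
    while True:
        response = get(since, limit)
        batch = response.get("objects", [])
        items.extend(batch)
        if batch:
            since = max(counter for counter, _ in batch)
        # Stop when the result is complete, or when an empty batch means
        # another request could not make progress.
        if not response.get("incomplete") or not batch:
            return since, items


stream = [[i, {"type": "t", "id": str(i), "data": i}] for i in range(1, 26)]

def fake_get(since, limit):
    """Stand-in server holding 25 objects."""
    newer = [entry for entry in stream if entry[0] > since]
    return {"objects": newer[:limit], "incomplete": len(newer) > limit}

last_counter, everything = fetch_all(fake_get, 0, limit=10)
```

The empty-batch check guards against looping forever if a server reports an incomplete result but returns no objects.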

Typed Results

Sometimes you only care about a subset of objects. The stream can have any number of types of objects, and while a full client may handle everything, a more limited client may not care about some items. In this case do:

   GET /USER?...&include=type1&include=type2

This gives you only type1 and type2 objects. Instead of opting in to some objects, you can also opt out with exclude=type1&exclude=type2.

The response may include until: "counter3", which might be newer than the newest item that was returned (this happens when the newest item is not of the type you requested).
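
A typed request and the handling of until might be sketched as follows. The function names are invented for the example, counters (including until) are assumed to be comparable integers, and rejecting a request that mixes include and exclude is an assumption, since the spec only presents them as alternatives.

```python
from urllib.parse import urlencode

def build_typed_url(since, collection_id, include=(), exclude=(), base="/USER"):
    """Build a GET URL that repeats include= or exclude= once per type."""
    if include and exclude:
        # Assumption: the two filters are alternatives, not combined.
        raise ValueError("use either include or exclude, not both")
    params = [("since", since), ("collection_id", collection_id)]
    params += [("include", t) for t in include]
    params += [("exclude", t) for t in exclude]
    return base + "?" + urlencode(params)

def next_since(since, response):
    """Pick the counter for the next request, honouring `until` when present."""
    candidates = [since] + [counter for counter, _ in response.get("objects", [])]
    if "until" in response:
        candidates.append(response["until"])
    return max(candidates)


url = build_typed_url(2, "c1", include=["type1", "type2"])
advanced = next_since(2, {"objects": [[4, {"type": "type1", "id": "a"}]], "until": 7})
```

Using until as the next since value keeps a filtered client from repeatedly re-scanning items of types it never asked for.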