CloudServices/NativeSync/Bookmarks and History

From MozillaWiki

Impedance mismatching for bookmarks sync on Android

We have two main problems here, and a scattering of smaller ones. These are:

  • Sync has a broader schema than Android. The extra columns, such as loadInSidebar and tags, come from Places, and they must be preserved across sync operations.
  • Sync requires a way of knowing which entries have changed (typically a last modified timestamp). (The alternative is a potentially messy full sync to and from the server on every sync.)
  • Ideally it also needs fine-grained change tracking to enable reconciliation.
  • Some records (such as Places queries) will have no analogue on the Android side, and must simply be preserved without being applied.

It turns out that the Android system bookmarks store shares a schema with history, and this schema is anemic¹. A bookmark is simply a history item with no visit, which means no timestamp. When we sync, then, how do we know which items have changed?

The obvious solution to all of these problems is to keep a store on the device which mimics the schema that Sync expects. This store persists, and represents a combination of upstream and downstream information as of the last sync. Because we have full control over this store, we can use it to maintain and extract additional knowledge from the system store over time.

We are consequently (albeit at some expense²) able to answer the question of what records have changed: the items that have changed are the ones for which the existing stored row no longer matches the system row. To make reference to the desktop Firefox Sync codebase, this is a kind of “retrospective tracking” — we can compute the set of changes since the last sync by comparing to a snapshot. The same applies for history: we can deduce the changed records by looking for a mismatch of visit counts.
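
The comparison described above can be sketched as a plain diff between the last snapshot and the current system rows. This is a minimal illustration only: `SystemRow`, `changed_ids`, and the dictionary-keyed-by-Android-ID layout are hypothetical names chosen for the example, not part of any existing codebase.

```python
from typing import Dict, NamedTuple, Set


class SystemRow(NamedTuple):
    """Hypothetical stand-in for the columns the Android system store exposes."""
    title: str
    url: str


def changed_ids(snapshot: Dict[int, SystemRow],
                system: Dict[int, SystemRow]) -> Dict[str, Set[int]]:
    """Compare the last-sync snapshot with the current system rows."""
    snap_ids, sys_ids = set(snapshot), set(system)
    return {
        # Present in the system store but never seen before.
        "added": sys_ids - snap_ids,
        # Tracked in the snapshot but gone from the system store.
        "deleted": snap_ids - sys_ids,
        # Present in both, but the stored row no longer matches.
        "modified": {i for i in snap_ids & sys_ids if snapshot[i] != system[i]},
    }


snapshot = {
    1: SystemRow("Mozilla", "https://www.mozilla.org/"),
    2: SystemRow("Wiki", "https://wiki.mozilla.org/"),
}
system = {
    1: SystemRow("Mozilla Home", "https://www.mozilla.org/"),
    3: SystemRow("MDN", "https://developer.mozilla.org/"),
}
changes = changed_ids(snapshot, system)
```

The cost is a full pass over both stores on every sync, which is the expense mentioned above.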

We can make a reasonable approximation of a modification time, too: we'll take the time at which this sync operation began, which is an upper bound on the true modification time. (Unless we failed to sync the item up to the server on the last sync, of course. In that case we use the timestamp from the sync in which we discovered the change…)
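
One way to read the timestamp rule above, as a sketch: changes discovered in this sync are stamped with the time this sync began, while items still waiting from an earlier, failed upload keep the timestamp of the sync that discovered them. The `pending` map and the `stamp_changes` function are hypothetical, and the sketch assumes the snapshot is updated when a change is discovered, so an item only reappears in `changed` if it changed again.

```python
from typing import Dict, Iterable


def stamp_changes(changed: Iterable[int],
                  pending: Dict[int, int],
                  sync_started_at: int) -> Dict[int, int]:
    """Return item ID -> approximate modification time (epoch milliseconds)."""
    stamped = dict(pending)  # items that failed to upload keep their old stamp
    for item_id in changed:
        # Upper bound on the true modification time of a change found now.
        stamped[item_id] = sync_started_at
    return stamped


# Item 7 was discovered at t=1000 but its upload failed; item 9 is new now.
stamps = stamp_changes(changed=[9], pending={7: 1000}, sync_started_at=2000)
```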

(A similar approach allows us to deduce visit instances in a time range for history, and thus match Places' schema.)
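
As a rough sketch of that idea for history: if the visit count has grown since the snapshot, the extra visits must have happened between the previous sync and the start of this one. The function name and the returned fields are hypothetical, and the sketch only bounds the visits in time; it does not try to recover exact timestamps.

```python
from typing import Dict, Optional


def new_visit_window(snapshot_visits: int,
                     system_visits: int,
                     last_sync: int,
                     sync_started_at: int) -> Optional[Dict[str, int]]:
    """Describe visits that occurred since the last sync, if any."""
    new_visits = system_visits - snapshot_visits
    if new_visits <= 0:
        return None  # no mismatch in visit counts, so nothing to report
    return {
        "count": new_visits,
        "after": last_sync,            # visits happened later than this
        "not_after": sync_started_at,  # and no later than this
    }


window = new_visit_window(snapshot_visits=3, system_visits=5,
                          last_sync=1000, sync_started_at=2000)
```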

This combined data store also allows us to solve the schema mismatch problem. Changes on the Android side don't affect the additional columns in our store, so we can round-trip.
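
The round-trip can be illustrated with a small sketch: when a system row changes, only the columns Android knows about are copied into the rich row, and everything else is left alone. `ANDROID_COLUMNS` is an assumed subset for illustration, and `absorb_system_change` is a hypothetical helper.

```python
from typing import Any, Dict

# Assumption: the columns that the Android store and our store share.
ANDROID_COLUMNS = ("title", "bmkURI")


def absorb_system_change(rich_row: Dict[str, Any],
                         system_row: Dict[str, Any]) -> Dict[str, Any]:
    """Copy Android-side values into the rich row, preserving extra columns."""
    updated = dict(rich_row)
    for column in ANDROID_COLUMNS:
        updated[column] = system_row[column]
    return updated


rich = {"guid": "abcdefghijkl", "title": "Mozilla",
        "bmkURI": "https://www.mozilla.org/",
        "tags": ["work"], "loadInSidebar": 0}
edited_on_device = {"title": "Mozilla Home", "bmkURI": "https://www.mozilla.org/"}
round_tripped = absorb_system_change(rich, edited_on_device)
```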

Concrete algorithms

Still sketching these out!

Initial setup

  • Browser content provider can be empty or full.
  • Upstream Sync account can be empty or full.
    • If full, user can specify wipe local, wipe remote, or merge.
    • If empty, equivalent to wipe remote.
  • Wipe remote: construct local rows with current time as modification time. Generate GUIDs. Upload all.
  • Wipe local: wipe local store. Download rows and write-through insert into local stores.
  • Merge:
    • Compute local rich store. No GUIDs.
    • Download remote records. Reconcile each using only Android columns.
      • Remote wins, new record. Insert locally.
      • Remote wins, existing record.
        • Add new columns to existing record. Apply remote GUID.
      • Local wins. Always a new record. Generate GUID and upload.
    • Flag records for uploading.
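
The merge case above might look something like the following sketch. It assumes that "reconcile using only Android columns" means matching on equal title and URI, which is only one possible interpretation, and the GUID generator is a placeholder rather than the real Sync GUID format. All names are hypothetical.

```python
import uuid
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def android_key(record: Record) -> Tuple[str, str]:
    """Assumed identity of a record as seen through Android columns only."""
    return (record["title"], record["bmkURI"])


def initial_merge(local_rows: List[Record], remote_records: List[Record]):
    by_key = {android_key(row): row for row in local_rows}
    merged, insert_locally, matched = [], [], set()
    for remote in remote_records:
        key = android_key(remote)
        if key in by_key:
            # Remote wins, existing record: add new columns, apply remote GUID.
            merged.append({**by_key[key], **remote})
            matched.add(key)
        else:
            # Remote wins, new record: insert locally.
            merged.append(dict(remote))
            insert_locally.append(remote)
    upload = []
    for key, local in by_key.items():
        if key not in matched:
            # Local wins: always a new record, so generate a GUID and upload.
            record = {**local, "guid": uuid.uuid4().hex[:12]}  # placeholder GUID
            merged.append(record)
            upload.append(record)
    return merged, insert_locally, upload


local = [
    {"androidId": 1, "title": "Mozilla", "bmkURI": "https://www.mozilla.org/"},
    {"androidId": 2, "title": "Local only", "bmkURI": "https://example.com/"},
]
remote = [
    {"guid": "remoteGUID01", "title": "Mozilla",
     "bmkURI": "https://www.mozilla.org/", "tags": ["work"]},
    {"guid": "remoteGUID02", "title": "MDN",
     "bmkURI": "https://developer.mozilla.org/", "tags": []},
]
merged, insert_locally, upload = initial_merge(local, remote)
```

Here the first remote record is attached to the existing Android row, the second is inserted locally, and the unmatched local row is flagged for upload with a fresh GUID.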

Subsequent syncs

  • Changed records computed by exhaustive search. (Tracker populated.)
  • Remote changes retrieved.
  • Local new records merged as in initial setup.
  • Deltas for old local and remote records reconciled and applied (either locally or uploaded).
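
A sketch of how the two change sets might be partitioned on a subsequent sync. It deliberately stops short of a conflict policy, since none is specified here; `plan_sync` and the GUID values are hypothetical.

```python
from typing import Dict, Set


def plan_sync(local_changed: Set[str], remote_changed: Set[str]) -> Dict[str, Set[str]]:
    """Split changed GUIDs by which side needs to act on them."""
    return {
        # Changed only on the device: upload.
        "upload": local_changed - remote_changed,
        # Changed only on the server: apply to the local stores.
        "apply_locally": remote_changed - local_changed,
        # Changed on both sides: needs reconciliation.
        "reconcile": local_changed & remote_changed,
    }


plan = plan_sync(local_changed={"guidA", "guidB"},
                 remote_changed={"guidB", "guidC"})
```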

Database Schemas

Bookmarks

This is the schema for our snapshot database for Bookmarks. The tags and children fields may need to change, as these are actually arrays of strings; for now, though, this works fine.

Field Name       Field Type
id               INTEGER PRIMARY KEY AUTOINCREMENT
guid             TEXT
androidId        INTEGER
title            TEXT
bmkURI           TEXT
description      TEXT
loadInSidebar    INTEGER
tags             TEXT
keyword          TEXT
parentID         TEXT
parentName       TEXT
type             TEXT
generatorUri     TEXT
staticTitle      TEXT
folderName       TEXT
queryId          TEXT
siteUri          TEXT
feedUri          TEXT
pos              TEXT
children         TEXT
modified         INTEGER
deleted          INTEGER DEFAULT 0
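
For illustration, the schema can be exercised against an in-memory SQLite database. The table name `bookmarks` and the JSON encoding of the tags array are assumptions made for this sketch; neither is specified above.

```python
import json
import sqlite3

# Table name "bookmarks" is assumed; columns follow the schema listed above.
SCHEMA = """
CREATE TABLE bookmarks (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    guid          TEXT,
    androidId     INTEGER,
    title         TEXT,
    bmkURI        TEXT,
    description   TEXT,
    loadInSidebar INTEGER,
    tags          TEXT,
    keyword       TEXT,
    parentID      TEXT,
    parentName    TEXT,
    type          TEXT,
    generatorUri  TEXT,
    staticTitle   TEXT,
    folderName    TEXT,
    queryId       TEXT,
    siteUri       TEXT,
    feedUri       TEXT,
    pos           TEXT,
    children      TEXT,
    modified      INTEGER,
    deleted       INTEGER DEFAULT 0
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Assumption: arrays of strings are stored as JSON text in the TEXT column.
conn.execute(
    "INSERT INTO bookmarks (guid, title, bmkURI, tags) VALUES (?, ?, ?, ?)",
    ("abcdefghijkl", "Mozilla", "https://www.mozilla.org/",
     json.dumps(["work", "browsers"])),
)
row = conn.execute(
    "SELECT tags, deleted FROM bookmarks WHERE guid = ?", ("abcdefghijkl",)
).fetchone()
stored_tags = json.loads(row[0])
```

Serialising the arrays into a single TEXT column keeps the one-row-per-record layout, at the cost of not being able to query individual tags in SQL.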

History

TBD

Alternatives / futures

One approach to this whole thing is to implement an “EnhancedBrowser” content provider (see this article). This would extend the existing Browser content provider, seamlessly tracking changes along the way. This still requires the same additional storage and logic — if we merely tracked calls through our own API then use of the existing Browser content provider would cause us to drift out of sync — so I view it as an extension which can provide additional benefits, including real-time modification time tracking.

The mobile team does not currently plan to support tags, notes, etc., for bookmarks, but might be interested if it were free… and if we offered a content provider then it would be pretty cheap to expose the additional data.


Footnotes

¹ Browser.BookmarkColumns.

² We would implement this with a managedQuery cursor over the user's entire bookmarks list, comparing each item to our tracked row. See this article.