Places/AsyncAPIsForSync
Contents
Overview
Tracking bug for Sync: bug 606353
Current situation
Right now Sync calls various synchronous Places API methods to read and write records, with the exception of history reads where it rolls its own asynchronous SQL queries.
Problem
Synchronous sqlite I/O is hurting us a lot on mobile.
Proposed solution
Provide powerful asynchronous methods to replace Sync's many synchronous calls. So instead of adding a bookmark and then adding a bunch of annotations to it, it would be great if Places had an API that would let us do it one go, do all the writes async and then call us back.
Bookmarks
Description of Sync's bookmark record (includes folders, separators, livemarks, etc.): https://wiki.mozilla.org/Labs/Weave/Developer/BrowserObjects#bookmark
Read
For syncing out as well as conflict resolution, right now we call
- getFolderIdForItem()
- getBookmarkURI()
- getKeywordForBookmark()
- and get the various "bookmarkProperties/" annotations (description, loadInSidebar, staticTitle, etc.).
I imagine that this information could be retrieved with one query which could be wrapped into getBookmarkInfoAsync() or similar API call.
Write
We currently use the following nsINavBookmarksService methods to create items:
- insertBookmark()
- createFolder()
- insertSeparator()
- as well as nsILivemarkService::createLivemark()
When we make these or a subset of these async, how will be notified of the ID (and later, when we add GUIDs, their GUID)? Through an nsINavBookmarkObserver? Or will there be a callback we can pass into? We need the ID / GUID so that we can set various "bookmarkProperties/" annotations (description, loadInSidebar, staticTitle, etc.) as well as keywords via setKeywordForBookmark(). Ideally, of course, we could follow aforementioned theme and have the insert*Async() methods allow us to pass those in as well. Then we'd only need a callback for special cases such as microsummaries and tags.
To update already existing items, we call setItemTitle(), changeBookmarkURI(), setKeywordForBookmark() and update various annations. Perhaps there could be an updateBookmarkAsync() akin to insertBookmarkAsync() to save us these various separate method calls?
Detailed Proposal
The new methods can be added to a new interface. Maybe nsIBookmarksService (and it only ever does async stuff)?
The only thing not covered here so far is livemarks. Not sure if it's worth an API for them or not (can we get data on how many people store now?).
nsIAnnotationInfo
/** * Interface that describes an annotation. */ interface nsIAnnotationInfo : nsISupports { readonly AUTF8String name; readonly nsIVariant value; readonly long flags; readonly unsigned short expiration; }
Ideally, we could use this for a future async annotation service. For now, it's just purposed for nsIBookmarkInfo.
nsIBookmarkInfo
/** * Interface that describes a bookmark. */ interface nsIBookmarkInfo : nsISupports { readonly ACString guid; readonly long long id; readonly unsigned short type; readonly long long parentId; readonly ACString parentGuid; readonly long index; readonly nsIURI uri; readonly AString keyword; readonly AString title; readonly long long frecency; /** * An array of nsIAnnotationInfo objects for the bookmark. These are item annotations from nsIAnnotationService. */ readonly nsIVariant annotations; }
The idea here is that this interface will be returned and given to places APIs when you ask for information about a bookmark, and when you want to add a bookmark. Because we have the type present, this will handle bookmarks, folders, separators, and dynamic containers (which, if I recall correctly, handles microsummaries and live bookmarks).
I included guids here, which makes this gated (possibly) on bug 607117.
Not clear to me if we need tags on here or not. Need input from the Sync team.
nsIBookmarkInfoCallback
interface nsIBookmarkInfoCallback : nsISupports { /** * Called when one of the bookmark methods is done with its work. * * @param aBookmarkInfo * The information about the bookmark, or undefined if nothing was found. */ void onComplete(in nsIBookmarkInfo aBookmarkInfo); }
This should be marked with [function] so JS consumers can just pass a function in if they want.
getBookmarkInfo
/** * Gets all information known about a bookmark. Callers must specify the id, guid, or the URI of the bookmark in question. * * @param aInfo * The bookmark info object that contains the id xor the guid xor the URI of the bookmark. * @param aCallback * The object/function to notify when we have the information about the bookmark. */ void getBookmarkInfo(in nsIBookmarkInfo aInfo, in nsIBookmarkInfoCallback aCallback);
Consumers would do something like this: bs.getBookmarkInfo({guid:"..."}, function(aInfo) { ... });
C++ consumers won't be happy, but I'm not sure I care.
insertItem
/** * Inserts a bookmark item. Required fields on nsIBookmarkInfo are: * - parentId or parentGuid * - uri * - title * Everything else is optional. * * @param aInfo * The bookmark info object that contains the data needed. * @param aCallback * The object/function to notify when we have added the bookmark. */ void insertItem(in nsIBookmarkInfo aInfo, [optional] in nsIBookmarkInfoCallback aCallback);
Need to document about everything that would make us throw. Also, how do we handle errors? Want to keep the callback simple, ideally.
insertItems
/** * Inserts many bookmark items. Just like insertItem otherwise. * * @param aInfo * The bookmark info objects that contain the data needed. * @param aCallback * The object/function to notify when we have added each bookmark. */ void insertItems([array, size_is(aLength) in nsIBookmarkInfo aInfo, in unsigned long aLength, [optional] in nsIBookmarkInfoCallback aCallback);
Just like insertBookmarkWithInfo, but takes a big array of bookmark info and does it all at once. This is basically the batch mode version. I suspect mak and I are going to debate how to best do batch mode, so this may change a lot still.
updateItem
/** * * Update the information about a bookmark item. Callers must specify the id, guid, xor the URI of the bookmark in question. * * @param aIdentifier * The bookmark info object that contains the id xor the guid xor the URI of the bookmark. * @param aInfo * The information to update about the bookmark. * @param aCallback * The object/function to notify when we have updated the information about the bookmark. */ void updateItem(in nsIBookmarkInfo aIdentifier, in nsIBookmarkInfo aInfo, [optional] in nsIBookmarkInfoCallback aCallback);
updateItems
/** * Updates many bookmark items. Just like updateItem otherwise. aIdentifiers and aInfo must have a 1:1 mapping. * * @param aIdentifiers * Array of bookmark info objects that contains the id xor the guid xor the URI of the bookmarks. * @param aInfo * Array of information to update about the bookmark. * @param aCallback * The object/function to notify when we have added each bookmark. */ void updateItems([array, size_is(aLength) in nsIBookmarkInfo aIdentifiers, [array, size_is(aLength) in nsIBookmarkInfo aInfo, in unsigned long aLength, [optional] in nsIBookmarkInfoCallback aCallback);
Notes and feedback on Proposal
No need for "plural" methods
Due to the way Sync engines work, we have no need for the batch operation methods (insertBookmarks, updateItems) as we deal with each bookmark record individually. We're running this in batchmode anyway. --philikon
We didn't want runInBatchMode to have magic flags that deal with these async methods. Shouldn't be a problem for the Sync team to deal with this new way. --sdwilsh
So the bookmarks store will just accumulate the incoming objects in memory and then the bookmarks engine, at the end of _processIncoming(), will flush the records to disk, using the async call. To avoid hogging too much memory when lots of bookmarks are involved, the store should flush after a certain number of records have accumulated as well. 50 seems like a good number since that's incidentally also the batch size for network fetches on mobile. --philikon
Types of places items covered
Just so we're on the same page, these are the data types Sync supports:
- bookmarks (TYPE_BOOKMARK)
- folders (TYPE_FOLDER)
- separators (TYPE_SEPARATOR)
- microsummary (TYPE_BOOKMARK with microsummary)
- livemark (TYPE_FOLDER that's a livemark)
- query (TYPE_BOOKMARK with a place: URI)
Note that Sync does not support "dynamic containers" (TYPE_DYNAMIC_CONTAINER). --philikon
We don't care about dynamic containers so far, mainly because the API lies and they just don't work. --mak
All these are covered, but you may have to roll your own way with annotations. --sdwilsh
To eloborate: microsummaries and livemarks are regular bookmarks and folders, respectively, with some annotations added to them. So instead of going through their APIs we're just going to add the necessary annotations. We might want to expose some of the internal constants these services use via their interfaces so that we don't have to copy & paste them. --philikon
Interfaces
The separate interface is the best bet, both for compatibility and clear sync/async separation. Name could either be nsIBookmarksService (drop ns?) or IAsyncBookmarks. --mak
Don't have a strong preference. I say just defer to sr on this. --sdwilsh
Dietrich suggested to just do a AsyncBookmarks.jsm module, and completely drop xpcom. I mostly agree, this interface is not going to have cpp users, and we are also making it hard for cpp users (the arrays and so on...). We'd just need some private interface in nsPIPlacesDatabase to forward notifications. What would block us from doing that? --mak
Parent Name
Sync seem to use parentName for reconciliation, but does not look like something that should be part of the API, both the parent and the name can change easily and having 2 volatile informations seem fragile. On the other side caching parent names on the fly in a local hash is probably as efficient. --mak
I think the whole setup is a bit volatile. We should see what the Sync team says about it and if they care that they might have to do a second query to get the actual name. It's not hard to add it if need be, it just feels awkward. --sdwilsh
Having the parent name in there feels awkward but it would make our code much easier while making the places code only slightly more complicated (I hope). I will have to think about alternatives to our current duping algorithm, but there's probably little alternative to having it match folder names. --philikon
Decision: keep the parentName. --sdwilsh
Predecessor
This is similar to the parent name issue: In order to preserve ordering, Sync maintains the GUID of an item's predecessor in each record. So it'd be good if nsIBookmarkInfo were to contain that info. Bonus points for making updateItem
understand this and move the item appropriately. --philikon
We'd like to just keep it at itemIndex, due to more complicated queries or more queries. philikon says it's possible, but will follow up. --sdwilsh
It's possible but all options I can think of will require serious change:
- We could make bookmarks have the itemIndex instead of predecessorid. However, adding a new bookmark in a folder or moving a bookmark would invalidate a lot of other bookmarks, potentially even all, in that folder.
- We could store the predecessor GUID in annotations since we now have async APIs for querying and storing these. But we would have to ensure that this annotation exists on all bookmarks first which would mean a lot of I/O on first sync, pretty much exactly the same kind of I/O that we have to do for GUIDs right now.
- Instead of maintaining the ordering on the bookmarks, maintain it on their folders. Inserting or moving a bookmark would invalidate one more object, its folder (as opposed to its successor as it does now). This would mean having a childrenGUIDs array on nsIBookmarkInfo that would get/set the order of a folder's children.
Both option 1 and 3 would mean a version bump for the bookmark storage. These are similarly problematic as global storage version bumps. Fortunately, we're doing one very very soon, so we could just move to the new bookmark storage version now so that it's in place when we get the async APIs. I'm personally leaning towards option 3 --philikon
Changing bookmarks table to drop position could not be that easy, I have to check, but iirc we rely pretty much on that positioning and on it not having "holes". So if that's the idea, we have to be cautious here, if the "bookmark storage" is something on Sync side, then nevermind! --mak
Yes, "bookmark storage" is a Sync component. --philikon
Alternate proposal: Sync adds an annotation on the folder containing the list of guids (in order). This results in no extra API frontage and solves all current issues. --sdwilsh
Tags
nsIBookmarkInfo should most likely include tags or we need a way to set tags async. It's indeed impossible to create a bookmark async and tag it synchronously. it could be possible to add tags in the bookmarks added notification, but is it worth it? --mak
I thought about this, but I'm pretty sure that because we store tags as a bookmark in a different root, Sync will just work magically anyway. We should confirm this with them though. I'd rather not add it yet if we can avoid it because this is already a lot of work. --sdwilsh
We explicitly ignore the tags folder right now, so it doesn't magically work. Sync uses the tagging service API explicitly. So either we should perhaps start syncing the tags folder (if that does indeed magically work) or nsIBookmarkInfo should contain the tags. --philikon
OK, we could add a getBookmarkInfoWithTags (or have another argument to getBookmarkInfo) that could obtain tags too. Syncing the bookmarg tags folder wouldn't work out right for things like the iphone client. --sdwilsh
Take in count in future tags will have to move out of bookmarks folders/table to their own storage, thus it is not sane to request syncs of the tags root. What Sync does using taggign service is correct. Not working here though since it's sync. What's the cost of just returning tags always? If I use this interface is pretty clear I want to know EVERYTHING on a bookmark. --mak
nsIBookmarkInfo readonly
getBookmarkInfo takes a nsIBookmarkInfo object, thus we probably don't want all of its properties being readonly. --mak
It shouldn't matter. We don't have to return the same nsIBookmarkInfo object that we are passed in (in fact, for threadsafety reasons, we shouldn't!) --sdwilsh
Query by URI
Notice that while passing an id or guid will return a single bookmark, passing an uri is not guaranteed to do so. the same uri can have multiple bookmarks. what to do here? either don't allow by uri, or return last modified bookmark (that is what we do today) --mak
Last modified probably works fine, but we should check with the Sync team. I suspect they might actually care about all cases, in which case, we need to change our approach a bit. --sdwilsh
Pretty sure we don't have any need for querying bookmarks by URI. --philikon
Let's just drop support for querying on URI. --sdwilsh
I guess mobile could find querying by uri useful instead... but we can fix this later. --mak
Missing attributes ignored on update
updatebookmarkWithInfo same as above for name. Actually how to handle missing information in the info object? Does the implementer have to collect all info, change and then submit? Or things that should not be changed must be null? How to set a null value then? Having to collect info to be able to change them is going to be more expensive than changing them one by one. --mak
How about updateItem then? The implementer should only have to pass in the id or the guid. I envisioned them also being able to specify only the parts that changed:
updateItem({id:2}, {guid:"GUID_HERE"});
This would set the guid on item with id of 2 to "GUID_HERE". Easy to implement because things will just throw NS_ERROR_NOT_IMPLEMENTED. I hope this doesn't get logged to the error console though... --sdwilsh
Yes, it would be great if missing properties on the nsIBookmarkInfo object would simply be ignored in the update. The same should go for annotations. This means that this API wouldn't allow us to remove annotations, but that's ok since we have no need for it. But we do avoid having to first get all the information before updating it. --philikon
This shouldn't be a problem. We can add an API in the future for removal of data, but we do not need to in this cycle (2.0). --sdwilsh
History
Input History
Turns out we want to sync input history too, see bug 597874. It would be great if we could add the following to nsIPlaceInfo
:
/** * An array of nsIInputHistory objects for the place. */ readonly nsIVariant inputs;
with nsIInputHistory
:
interface nsIInputHistory : nsISupports { ACString input; double use_count; }