Places:BookmarksComments

From MozillaWiki

NOTE: This is currently in draft. Please put commentary and critique on the discussion page. Also, I'm still learning the particulars of Places, so please feel free to correct any gross errors or mis-representations :)

Goals

  • Determine whether the existing data model is the right approach for bookmarks for Firefox 3.
  • Determine the requirements for sync, and other types of bookmarks integration, and define the changes needed to meet those requirements.

Approach

  • Make as few changes as are necessary to the core to achieve our goals for Firefox 3.
  • Don't create a bunch of stop-energy up front by rebuilding Places from scratch. Instead we should aggressively refactor as necessary, keeping our goal of a short development cycle in mind.
  • "Perfect is the enemy of good" (Er, sorry for repeating the new Mozilla dev motto.)

General Requirements

  • Create a solid and flexible bookmarks data model to build on for Firefox 3.
  • Reduce complexity, relative to the old bookmarks code (reducing bug introduction risk and bug time-to-close, attracting community developers)
  • Improved APIs for add-ons and partners: Provide flexibility and ease-of-use for common bookmark integration scenarios.

Some Problems with Places Bookmarks

The Singleton Model

  • The Places data model is centered around the originating URI. Each URI encountered via history or bookmarks has a single entry in the moz_history table.
  • Bookmarks are basically pointers to that history entry, with a few properties to determine their placement in the bookmarks folder hierarchy. The design is such that most properties of a bookmark are derived from the moz_history entry, and not the bookmark.
  • This is problematic as soon as the same URI is bookmarked more than once. For example, if you change a bookmark's title, the title changes everywhere you've bookmarked that URI. Microsummaries and livemarks are examples of valid use-cases for bookmarking the same URI more than once, for different purposes.
  • I believe there's a patch, or was a plan, to handle bookmark titles as a special case. However, if one of a bookmark's primary properties has to be special-cased, it's likely we'll keep encountering this problem moving forward.
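
To make the problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. Apart from the moz_history table name, everything in it (the bookmarks table, the column names, the folder names) is a simplified stand-in chosen for illustration, not the actual Places schema.

```python
import sqlite3

def demo_shared_title():
    """Show how two bookmarks of one URI end up sharing a single title."""
    db = sqlite3.connect(":memory:")
    # Simplified stand-ins: one row per URI, plus bookmarks that only
    # point at that row and record where they sit in the hierarchy.
    db.executescript("""
        CREATE TABLE moz_history (id INTEGER PRIMARY KEY, url TEXT UNIQUE, title TEXT);
        CREATE TABLE bookmarks (history_id INTEGER, folder TEXT);
    """)
    db.execute(
        "INSERT INTO moz_history (id, url, title) VALUES (1, 'http://example.com/', 'Example')"
    )
    db.executemany("INSERT INTO bookmarks VALUES (?, ?)", [(1, "Toolbar"), (1, "News")])

    # "Rename the bookmark in Toolbar": the only place the title can be
    # stored is the shared history row, so the change is not local.
    db.execute("UPDATE moz_history SET title = 'My renamed bookmark' WHERE id = 1")

    rows = db.execute("""
        SELECT b.folder, h.title
        FROM bookmarks b JOIN moz_history h ON h.id = b.history_id
        ORDER BY b.folder
    """).fetchall()
    db.close()
    return rows
```

Both rows returned by demo_shared_title() carry the new title, even though only one of the two bookmarks was meant to be renamed.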

Sync and Integration Problems

There's a serious need to provide partners and add-on developers better tools for bookmarks integration. Sync in particular has special needs. Some basic requirements for sync:

  • Must be able to uniquely identify each object across different datastores. This includes bookmarks, folders and separators.
  • Must be able to determine whether anything in the datastore has changed since an arbitrary point.
  • Must be able to get all bookmarks changed since an arbitrary point.
  • Must be able to access a bookmark's full and complete state in order to replicate that exact view of the bookmark in another profile.
    • Looking forward, we should include Microsummary data and any related annotations.

Some of these requirements are not really possible using the existing data model, and some are just difficult.

Places URIs

  • Places currently has a mechanism for generating URIs for bookmark folders and livemarks.
  • These URIs are then annotated via the annotation service for things like livemarks.
  • However, some of these URIs are mutable: If you move a folder's position, or move it to another folder, the URI changes. At that point, any annotations of the old URI must be deleted, and re-added with the new URI.
  • This is a fragile approach: any modification of stored items invalidates existing URIs for those items that may be stored elsewhere - annotations, extensions, etc.
  • It's database- and code-heavy: modifications of stored items require checking and updating all instances of internal usage of place: URIs, along with all the accompanying add/update/delete queries. For example, if you move a folder to a position below its next sibling, all annotations of that folder must be updated (I'm not sure if the current implementation deletes and re-adds, or just updates).
  • There's no way to annotate a *bookmark* distinct from the bookmarked URI.
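
The fragility described above can be sketched in a few lines of Python. The URI format and the annotation name below are invented for illustration (the real place: URI syntax is not reproduced here); the only property taken from the text is that the URI is derived from the item's position, so moving the item changes it.

```python
def position_uri(parent_id, index):
    """Build a position-derived URI (hypothetical format, for illustration only)."""
    return f"place:folder?parent={parent_id}&index={index}"

def rekey_annotations(annotations, old_uri, new_uri):
    """Move every annotation stored under old_uri to new_uri.

    `annotations` maps URI -> {annotation name: value}. Returns the number
    of annotations that had to be rewritten because of the move.
    """
    moved = annotations.pop(old_uri, {})
    if moved:
        annotations.setdefault(new_uri, {}).update(moved)
    return len(moved)
```

Every move needs a fix-up like rekey_annotations() for each internal consumer, and any copy of the old URI held outside the datastore (in an extension, say) is silently invalidated.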

Recommendations

Summary

All bookmarks, folders and separators in the bookmarks datastore...

  • Should have a unique identifier to be used internally instead of the moz_places row id.
  • Should have a GUID, to be used in Places URIs for annotations and external global identification.
  • Should have a version number that increments with each change.

Also:

  • Extended bookmark properties such as microsummaries should be annotations of the bookmark's own Places URI instead of the bookmarked URI.

This limited set of changes should solve, or at least mitigate the problems listed above, while still providing the benefits of tighter integration between history and bookmarks.

My conclusion is that the Places bookmarks framework is viable given the changes detailed above, and will allow us to provide both the traditional bookmarks functionality in Firefox 3, as well as innovative new interactions and presentation of that data. However, its specialization on the bookmark metaphor seems limiting when confronting a future that consists of unlimited variation in structured metadata, and it's likely to be an interim step toward, or to exist in parallel with, a proper structured metadata platform in Mozilla 2.

Discussion of implementation details follows.

Identity and URIs

  • Singleton model: At a minimum, we should use sqlite auto-incrementing primary key IDs to identify bookmarks internally, so that the same URI bookmarked in different locations in the bookmarks hierarchy can be told apart. Using the bookmarked URI as a singleton leads us down a road of endless "special-cases" as discussed previously.
  • Global Uniqueness: However, the issue of identity expands beyond the repercussions of the singleton model. Auto-incrementing IDs are not globally unique, and cannot be used for identification outside of the local bookmark set. We should use GUIDs for identification, which will provide local and global disambiguation of bookmarks.
  • Places URIs: Places URIs should be used as identifiers for external use, and internally, when using annotations as an extensibility mechanism. The existing Places code does this when annotating livemarks, folders and queries. We should add APIs to generate URIs for bookmarks and separators, and update folder URIs to use a GUID instead of the auto-incrementing ID. Global use requires persistent identifiers, but the current Places URIs are mutable, which makes them fragile, high-maintenance and not persistent. How should we construct the URI? place:{GUID}? Should we include the type, eg: bookmark, folder, livemark, tag? place:{type}:{GUID}?
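
As a rough illustration of the place:{type}:{GUID} option, here is a minimal Python sketch. The function names, the set of allowed types, and the choice of random (version 4) UUIDs are all assumptions made for the example, and the braces in the proposed format are read as placeholders rather than literal characters. The URI format itself is still an open question above.

```python
import uuid

# Hypothetical set of item types, taken from the examples in the text.
ITEM_TYPES = {"bookmark", "folder", "separator", "livemark", "tag"}

def new_guid():
    """Generate a GUID once, when the item is created; it never changes afterwards."""
    return str(uuid.uuid4())

def make_place_uri(item_type, guid):
    """Build a URI that depends only on type and GUID, not on position."""
    if item_type not in ITEM_TYPES:
        raise ValueError(f"unknown item type: {item_type!r}")
    return f"place:{item_type}:{guid}"

def parse_place_uri(uri):
    """Split a URI built by make_place_uri back into (type, guid)."""
    parts = uri.split(":", 2)
    if len(parts) != 3 or parts[0] != "place" or parts[1] not in ITEM_TYPES:
        raise ValueError(f"not a typed place URI: {uri!r}")
    uuid.UUID(parts[2])  # raises ValueError if the GUID is malformed
    return parts[1], parts[2]
```

Because nothing in the URI is derived from the item's parent or index, moving the item leaves the URI, and therefore every annotation keyed on it, untouched.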

Versioning

  • The datastore should have a global revision number, which is incremented anytime anything in it changes.
  • Each object should have a revision number, which increments each time the object is modified.
  • Each object stores the global revision number's value at the time of the modification. This provides the ability to get granular change-sets:
    • "give me all items changed since global revision 328"
    • "give me bookmark foo if revision is > 92"
  • This style of versioning is relatively simple to implement, and doesn't radically increase the overhead or complexity of local bookmark operations.
  • Could the bookmarks root's revision number act as the global change flag?
  • Do changes bubble up? Eg: If a container's children change, does that up the version of the container?
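
The scheme above can be sketched with Python's sqlite3 module. The table and column names here (meta, items) are placeholders rather than the proposed schema, and the sketch deliberately leaves the two open questions alone: it keeps a separate global counter and does not bubble changes up to containers. Deletions are not handled.

```python
import sqlite3

def open_store():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE meta (global_revision INTEGER NOT NULL);
        INSERT INTO meta VALUES (0);
        CREATE TABLE items (
            id              INTEGER PRIMARY KEY AUTOINCREMENT,
            title           TEXT,
            revision        INTEGER NOT NULL,  -- per-object counter
            global_revision INTEGER NOT NULL   -- global counter at last change
        );
    """)
    return db

def _bump_global(db):
    """Increment the datastore-wide revision and return the new value."""
    db.execute("UPDATE meta SET global_revision = global_revision + 1")
    return db.execute("SELECT global_revision FROM meta").fetchone()[0]

def add_item(db, title):
    with db:  # one transaction: the counter and the row change together
        g = _bump_global(db)
        cur = db.execute(
            "INSERT INTO items (title, revision, global_revision) VALUES (?, 1, ?)",
            (title, g),
        )
        return cur.lastrowid

def rename_item(db, item_id, title):
    with db:
        g = _bump_global(db)
        db.execute(
            "UPDATE items SET title = ?, revision = revision + 1, global_revision = ? "
            "WHERE id = ?",
            (title, g, item_id),
        )

def changed_since(db, global_revision):
    """'Give me all items changed since global revision N.'"""
    rows = db.execute(
        "SELECT id FROM items WHERE global_revision > ? ORDER BY id", (global_revision,)
    )
    return [r[0] for r in rows]

def item_if_newer(db, item_id, revision):
    """'Give me item X if its revision is > N'; returns None otherwise."""
    return db.execute(
        "SELECT id, title, revision FROM items WHERE id = ? AND revision > ?",
        (item_id, revision),
    ).fetchone()
```

A sync client would remember the global revision it last saw and pass it to changed_since() on its next run; item_if_newer() covers the case where it only cares about one object.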

Extensibility

These things are outside the immediate scope of bookmarks, but we should keep them in mind:

  • We should be able to sync Microsummaries, tags, Microformats or annotation data for that matter. Foxmarks recently added support for Microformats. As we move forward we should make sure that new features are rolled into our query or export APIs to make things easy for these kinds of consumers.
  • We can assume that new metadata and data-transfer formats will continue to show up, so we should have a mechanism for introducing and supporting structured metadata. This is likely a Mozilla 2 issue.
  • Looking forward: Say we've tagged a URI, and want to annotate that tag with occurrence data, rating, etc. Should that tag have its own URI for use in annotations? Heh, we've come full-circle back to RDF at this point :)

Places ERD

This diagram reflects the following naming changes:

  • rename moz_history to moz_places
  • standardize on pluralized table names
  • standardize on naming PKs "id", and FKs "{table name}_id"
  • rename moz_anno_names to moz_anno_attributes

and the following architectural changes:

  • add id column to moz_bookmarks, auto-incrementing PK
  • add GUID column to moz_bookmarks, moz_bookmarks_folders
  • add revision column to moz_bookmarks
  • add global_revision to moz_bookmarks
  • move moz_places.user_title to moz_bookmarks.title
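
A minimal sketch of what these changes could look like for moz_bookmarks, again using Python's sqlite3 module. Only the table names and the id, GUID, revision, global_revision and title columns come from the list above; the column types, constraints, defaults, the place_id foreign key name and the helper functions are assumptions, and the folder tables are omitted.

```python
import sqlite3
import uuid

PROPOSED_SCHEMA = """
CREATE TABLE moz_places (
    id  INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT NOT NULL UNIQUE
);
CREATE TABLE moz_bookmarks (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,  -- new: local identity
    guid            TEXT NOT NULL UNIQUE,               -- new: global identity
    place_id        INTEGER REFERENCES moz_places(id),  -- FK name is an assumption
    title           TEXT,                               -- moved from moz_places.user_title
    revision        INTEGER NOT NULL DEFAULT 1,         -- new
    global_revision INTEGER NOT NULL DEFAULT 0          -- new
);
"""

def add_bookmark(db, place_id, title):
    """Insert a bookmark with its own identity and title; returns its id."""
    cur = db.execute(
        "INSERT INTO moz_bookmarks (guid, place_id, title) VALUES (?, ?, ?)",
        (str(uuid.uuid4()), place_id, title),
    )
    return cur.lastrowid

def titles_for_place(db, place_id):
    """List the titles of every bookmark that points at one place."""
    rows = db.execute(
        "SELECT title FROM moz_bookmarks WHERE place_id = ? ORDER BY id", (place_id,)
    )
    return [r[0] for r in rows]
```

With the title stored on the bookmark row, two bookmarks of the same place can carry different titles without any special-casing.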

[Figure: Places ERD]

Tasks

  • Places schema needs naming and consistency improvements
    • Change file name to places.sqlite
    • Update the db table initialization code to reflect the schema changes listed in the ERD section
    • Update all queries to reflect the schema changes
    • Update interfaces, implementors and callers affected by the schema changes
  • Bookmarks must be internally uniquely identifiable
    • add auto-incrementing ID col to moz_bookmarks
    • make interface changes to support bookmark IDs instead of place IDs
    • update component and caller code to use bookmark IDs instead of place IDs
  • Bookmarks must be externally uniquely identifiable
    • Add GUID column to moz_bookmarks, moz_bookmarks_folders
    • Add the ability for bookmarks and separators to have Places URIs
    • Change the place: URI generation code to use GUIDs for non-dynamic objects
  • Bookmarks, Folders and Separators should have a simple versioning scheme
    • Add revision column to moz_bookmarks, moz_bookmarks_folders
    • Add the revision management code to all insert/update operations on bookmarks, folders and separators.

Open Questions

  • Can the revision number of the bookmarks root folder be used as the global change identifier?
  • Given the added level of indirection between bookmarks and their bookmarked URI, we need to determine what changes are required to keep that URI accessible, in query results for example.

TODO

  • What are the pros/cons of the hierarchy storage algorithm in moz_bookmarks? It may be desirable to add a type col (eg: folder/bookmark/separator), collapse the folder and bookmarks tables, and move to MPTT or another method of storing hierarchical relationships in SQL.
    • Todd Agulnick recommended a simple single-table approach in his newsgroup post: consolidate into a single table, specify type and parent ID. This removes the need for the item_child/folder_child distinction, as well as the 2 extra tables for managing folders. Need to figure out what the common use-case queries look like, and how performant they are.
    • Sayrer pointed to this article on modeling hierarchies in relational DBs using MPTT: http://www.sitepoint.com/article/hierarchical-data-database/2
  • The changes here are to provide a foundation on which we can build good sync and query APIs. The specifics of these APIs need to be defined and implemented.
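
As a starting point for evaluating the single-table approach from the first TODO item, here is a minimal sketch of one table with a type and a parent ID, plus two queries that seem likely to be common. The table name, column names and query choices are assumptions, and the subtree query relies on recursive common table expressions, which need a reasonably recent SQLite build (3.8.3 or later).

```python
import sqlite3

def open_tree():
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE items (
            id        INTEGER PRIMARY KEY,
            type      TEXT NOT NULL CHECK (type IN ('folder', 'bookmark', 'separator')),
            parent_id INTEGER REFERENCES items(id),  -- NULL for the root
            position  INTEGER NOT NULL,              -- order within the parent
            title     TEXT
        )
    """)
    return db

def children(db, folder_id):
    """Direct children of a folder, in display order."""
    return db.execute(
        "SELECT id, type, title FROM items WHERE parent_id = ? ORDER BY position",
        (folder_id,),
    ).fetchall()

def descendants(db, folder_id):
    """IDs of everything below a folder, at any depth."""
    rows = db.execute("""
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM items WHERE parent_id = ?
            UNION ALL
            SELECT i.id FROM items i JOIN subtree s ON i.parent_id = s.id
        )
        SELECT id FROM subtree ORDER BY id
    """, (folder_id,))
    return [r[0] for r in rows]
```

Listing a folder is a single indexed lookup in this layout, while whole-subtree reads need the recursive query; MPTT makes the opposite trade, with cheap subtree reads and more expensive inserts and moves.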