From MozillaWiki


This document gives a little background about couchdb's view concept and the raindrop implementation of a 'mega-view' which allows most of our query requirements to be expressed in a single view.

Introduction to CouchDB's views

CouchDB implements 'views' (usually called 'queries' in other databases) somewhat differently than most databases. Each view is defined by a 'map' function and an optional 'reduce' function. CouchDB calls the map function once for every document in the database, and once the result for a document is known, that result is stored with the database in a view index. Subsequent queries which match this document do not cause the map function to be re-executed for that document - the previous result is found and used. For more information, see the couchdb wiki pages.
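
As a rough illustration of the behaviour described above, here is a toy model in javascript. It is only a sketch: the viewIndex map, the updateIndex helper and the mapByTimestamp function are invented for this example, and real couchdb tracks changes by update sequence rather than with a per-document cache like this one. In couchdb itself, emit is a global provided by the view server; here it is passed in so the example can run on its own.

```javascript
// Toy stand-in for a view index: document id -> rows emitted for that document.
const viewIndex = new Map();
let mapCalls = 0;

// A map function in the couchdb style. `emit` is passed in explicitly here;
// in couchdb it is a global supplied by the view server.
function mapByTimestamp(doc, emit) {
  if (doc.timestamp !== undefined) {
    emit(doc.timestamp, doc._rev);
  }
}

// Run the map function only for documents whose current revision has not
// been indexed yet; results for unchanged documents are reused.
function updateIndex(docs) {
  for (const doc of docs) {
    const cached = viewIndex.get(doc._id);
    if (cached && cached.rev === doc._rev) continue; // result already stored
    const rows = [];
    mapCalls += 1;
    mapByTimestamp(doc, (key, value) => rows.push({ key, value }));
    viewIndex.set(doc._id, { rev: doc._rev, rows });
  }
}

const docs = [
  { _id: "a", _rev: "1-x", timestamp: 10 },
  { _id: "b", _rev: "1-y", timestamp: 20 },
];
updateIndex(docs); // first query: both documents are mapped
updateIndex(docs); // second query: nothing is re-executed
console.log(mapCalls); // logs 2
```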

The end result of this scheme is that performing queries on a database is very fast - nothing needs to be dynamically queried as the results are already in the view index. However, it also has a number of limitations, such as:

  • Documents are indexed as the first query is made on them. If many documents have been added to couchdb since the last query for a particular view, that view needs to execute the javascript function over each of those documents. This process is relatively slow, particularly when the number of documents is large.
  • As a special case of the above, if you ever need to modify an existing view or introduce a new view, the view index needs to be built for every document already in the database. As above, this can take a significant amount of time with large document counts.
  • Ad-hoc querying is difficult. If you need to make a query that is not expressed in one of your existing views, you need to define a new view. If your database already has many documents, you hit the issues described above.
  • Because the keys and values emitted by the javascript function are stored with the DB, the disk-space required becomes significant. In many cases the views are simply emitting fields which already exist in the document, causing those fields to be stored multiple times.
  • The 'map' function can only look at a single document; the concept of a 'join' simply does not exist in couchdb.

While couch's indexing performance increases all the time, a database with 100,000 documents can still take many many minutes to index a single view. During this time, couch will make heavy use of the CPU, and all other queries for that view will be blocked until the index is rebuilt. While couch does offer the ability to return 'stale' data (ie, ignoring those new documents), some users of these views must have the correct results for them to function correctly. If an index fundamental to the operation of the back or front ends needs rebuilding, raindrop is effectively unavailable for use until those indexes are rebuilt.

Raindrop uses a model which results in a number of small documents existing in the couch for each message - in other words, raindrop is fairly 'document heavy'. As a result, it tends to feel the above problems fairly acutely.

The challenges for raindrop

The particular concerns for raindrop relate to the desired extensibility of raindrop (ie, the 'everything is an extension' model). Even if the raindrop developers were smart enough to author views which meet the specific requirements of the raindrop core without needing it to be modified in the future (ha!), this doesn't help extensions which are yet to be written.

Many back-end extensions need to create new schema definitions, and they emit schemas of that type. For example, the youtubed extension has invented a new schema with metadata about any youtube video links found in messages. It is reasonable to assume that some extensions will then have the requirement of performing queries against this data (eg, show me all messages with youtube links; all messages which reference a particular video; etc). If we require each extension to define the views it needs, then when a new extension is introduced it may be many minutes before the extension can actually do anything while its views are being built. In the meantime, not only is the user unable to interact with the new extension, their computer is suddenly working very hard for no apparent reason. Managing the user's expectations in this scenario will be tricky.

Enter the megaview

Early iterations of raindrop used 'normal' custom views but ran into the problems described above. After some iterations, we came up with a concept we call the 'mega-view'.

The mega-view pushes the couchdb view model to its limits in an attempt to overcome these problems, and in an attempt to never (or at least very rarely) rebuild the megaview. In summary:

  • Every document in the database emits multiple key/value pairs for the index. The key is always a list of 3 elements, while the value is identical for every key emitted by a single document.
  • The key is made up of a list of [schema_id, field_name, field_value]. Thus, every field in the document has the field value emitted as part of the key. Note however that long strings and attachments are not emitted (in the case of long strings, a null is emitted as the value).
  • Additional meta-data about each document, such as the 'raindrop key' and the 'extension id' which created the document, is also emitted for each document. However, this meta-data is always emitted using the exact same [schema_id, field_name, field_value] format, but using the pseudo-schema called 'rd.core.content'.
  • Raindrop takes advantage of couchdb's 'view collation' semantics so that in some cases, only portions of the emitted key are matched. For example, if all you need to know is what documents have a particular field regardless of its value, you can query with startkey=[schema_id, field_name] and endkey=[schema_id, field_name, {}] - see the couchdb wiki pages on view collation for more details.
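
As a sketch of the partial-key query in the last point, the following javascript builds a view URL with JSON-encoded startkey and endkey parameters. The database name (mydb), design document name (example) and view name (megaview) are placeholders invented for this example, not raindrop's actual names. The range works because, in couchdb's view collation, an object such as {} sorts after every other JSON type.

```javascript
// Build the query string for "all documents which have this field, whatever
// its value". The path components below are hypothetical placeholders.
function megaviewFieldQuery(schemaId, fieldName) {
  const params = new URLSearchParams({
    // Two-element key: sorts before any three-element key with this prefix.
    startkey: JSON.stringify([schemaId, fieldName]),
    // {} collates after all other JSON values, so this closes the range.
    endkey: JSON.stringify([schemaId, fieldName, {}]),
  });
  return `/mydb/_design/example/_view/megaview?${params}`;
}

console.log(megaviewFieldQuery("rd.msg.body", "timestamp"));
```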

For example, consider the following document representing the rd.msg.body schema for a tweet:

  {
    "_id": "rc!tweet.NDg1NzA3NzQ3NQ==!rd.ext.core.msg-tweet-to-common!rd.msg.body",
    "_rev": "1-f2fbb8b6461ff3707aaa817ccf1a2442",
    "body": "Relaxville - a CouchDB test suite report browser with social replication, that uses Mustache.js and CouchApp http://bit.ly/4tAlv2",
    "rd_schema_id": "rd.msg.body",
    "rd_megaview_expandable": ["to", "to_display", "cc", "cc_display"],
    "timestamp": 1255505465,
    "from_display": "J Chris Anderson",
    "body_preview": "Relaxville - a CouchDB test suite report browser with social replication, that uses Mustache.js and CouchApp http://bit.ly/4tAlv2",
    "rd_key": [ "tweet", 4857077475 ],
    "rd_source": [ "rc!tweet.NDg1NzA3NzQ3NQ==!proto.twitter!rd.msg.tweet.raw", "1-81184153373e2352719660f7e89de2ad" ],
    "from": [ "twitter", "jchris" ],
    "rd_ext_id": "rd.ext.core.msg-tweet-to-common"
  }

This will cause the following keys to be emitted:

  • ['rd.msg.body', 'body', null], // long string has been emitted as null
  • ['rd.msg.body', 'body_preview', null], // as above
  • ['rd.msg.body', 'timestamp', 1255505465],
  • and so on for the 'from' fields (but see the 'Value expansion' section below)
  • ['rd.core.content', 'key', [ "tweet", 4857077475 ]]
  • ['rd.core.content', 'schema_id', 'rd.msg.body']
  • ['rd.core.content', 'key-schema_id', [[ "tweet", 4857077475 ], 'rd.msg.body']]
  • etc for some other rd.core.content items
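
The key generation shown above might be sketched in javascript roughly as follows. This is not the actual raindrop megaview source: the megaviewKeys name, the 64-character cut-off for 'long' strings, and the rule of skipping fields starting with '_' or 'rd_' are all assumptions made for illustration. Value expansion (described later in this document) is deliberately left out, and only the three rd.core.content keys listed above are produced.

```javascript
// Assumed threshold: the real cut-off for "long strings" is not stated here.
const LONG_STRING_LIMIT = 64;

// Return the list of [schema_id, field_name, field_value] keys for a document.
function megaviewKeys(doc, limit = LONG_STRING_LIMIT) {
  const keys = [];
  const schemaId = doc.rd_schema_id;
  for (const [name, value] of Object.entries(doc)) {
    // Assumption: couchdb internals (_id, _rev) and raindrop metadata (rd_*)
    // are not emitted as ordinary schema fields.
    if (name.startsWith("_") || name.startsWith("rd_")) continue;
    const isLong = typeof value === "string" && value.length > limit;
    keys.push([schemaId, name, isLong ? null : value]); // long strings become null
  }
  // Metadata is emitted under the pseudo-schema 'rd.core.content'.
  keys.push(["rd.core.content", "key", doc.rd_key]);
  keys.push(["rd.core.content", "schema_id", schemaId]);
  keys.push(["rd.core.content", "key-schema_id", [doc.rd_key, schemaId]]);
  return keys;
}

// A cut-down version of the tweet document above.
const tweet = {
  _id: "rc!tweet.NDg1NzA3NzQ3NQ==!rd.ext.core.msg-tweet-to-common!rd.msg.body",
  _rev: "1-f2fbb8b6461ff3707aaa817ccf1a2442",
  body: "Relaxville - a CouchDB test suite report browser with social replication, that uses Mustache.js and CouchApp http://bit.ly/4tAlv2",
  rd_schema_id: "rd.msg.body",
  timestamp: 1255505465,
  from: ["twitter", "jchris"],
  rd_key: ["tweet", 4857077475],
};

for (const key of megaviewKeys(tweet)) {
  console.log(JSON.stringify(key));
}
```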

The value emitted by the view

As the field values themselves are already emitted as part of the key, there is no need to emit them as part of the query 'value'. As a result, a single document always emits the same value for every key, and that value holds meta-data about the document itself. At time of writing, this value is a json object with the fields '_rev', 'rd_key', 'rd_ext', 'rd_source' and 'rd_schema_id' (note that couchdb itself already makes the '_id' field available). For example, the above document would always emit the following value for each of the keys:

  {
    "_rev": "1-f2fbb8b6461ff3707aaa817ccf1a2442",
    "rd_key": [ "tweet",4857077475 ],
    "rd_ext": "rd.ext.core.msg-tweet-to-common",
    "rd_schema_id": "rd.msg.body",
    "rd_source": ["rc!tweet.NDg1NzA3NzQ3NQ==!proto.twitter!rd.msg.tweet.raw","1-81184153373e2352719660f7e89de2ad"]
  }

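The shared value described above could be built along these lines. This is a sketch only: megaviewValue and megaviewRows are names invented for this example, and filling 'rd_ext' from the document's 'rd_ext_id' field is an assumption based on the example document and value shown above.

```javascript
// Build the single value that accompanies every key emitted for a document.
function megaviewValue(doc) {
  return {
    _rev: doc._rev,
    rd_key: doc.rd_key,
    rd_ext: doc.rd_ext_id, // assumption: 'rd_ext' comes from 'rd_ext_id'
    rd_schema_id: doc.rd_schema_id,
    rd_source: doc.rd_source,
  };
}

// Pair every key with the same value, mimicking the rows a view would return
// (couchdb adds the document id to each row itself).
function megaviewRows(doc, keys) {
  const value = megaviewValue(doc);
  return keys.map((key) => ({ id: doc._id, key, value }));
}

const exampleDoc = {
  _id: "rc!tweet.NDg1NzA3NzQ3NQ==!rd.ext.core.msg-tweet-to-common!rd.msg.body",
  _rev: "1-f2fbb8b6461ff3707aaa817ccf1a2442",
  body: "not part of the emitted value",
  timestamp: 1255505465,
  rd_schema_id: "rd.msg.body",
  rd_key: ["tweet", 4857077475],
  rd_source: ["rc!tweet.NDg1NzA3NzQ3NQ==!proto.twitter!rd.msg.tweet.raw", "1-81184153373e2352719660f7e89de2ad"],
  rd_ext_id: "rd.ext.core.msg-tweet-to-common",
};

const exampleRows = megaviewRows(exampleDoc, [
  ["rd.msg.body", "timestamp", 1255505465],
  ["rd.core.content", "schema_id", "rd.msg.body"],
]);
console.log(JSON.stringify(exampleRows[0].value));
```
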
Value expansion

Some field values are actually lists of values - for example, the 'to' field in an 'rd.msg.body' schema is defined as a list of identity IDs. With the treatment described above, the entire list would be emitted as the third element of the key, which would make it impossible to efficiently search for a value inside that list. However, not all lists are able to be treated this way - for example, an identity ID itself is a list (of [identity_type, identity_value]). So some lists we need treated as a simple value, while other lists need to be treated as an array of simple values.

As a result, there is a concept of certain fields being 'expandable'. While this is technically an attribute of the schema itself, each document is allowed to have a special field called 'rd_megaview_expandable'. This is expected to be a list of field names, and if a field is in this list, each value in the list gets emitted as a single key.

You can see this in the example document above - the 'rd_megaview_expandable' field is declaring that 4 fields in the schema need this special treatment. For example, an email message may have a body schema which looks something like:

  {
    "rd_key": ["email", "somemessageid@somewhere"],
    "to": [["email", "mhammond@somewhere"], ["email", "someoneelse@gmail.com"]],
    "rd_megaview_expandable": ["to", "to_display", "cc", "cc_display"]
  }

Note how the 'to' field is a list of identity IDs and that it also appears in 'rd_megaview_expandable'. Given this, the document would cause the following keys to be emitted for the 'to' field:

  • ["rd.msg.body", "to", ["email", "mhammond@somewhere"]]
  • ["rd.msg.body", "to", ["email", "someoneelse@gmail.com"]]

Obviously, if 4 items were in the 'to' list, four keys relating to that field would be emitted.
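
The expansion rule can be sketched in javascript as below. The keysForField helper is a name invented for this example, not part of raindrop. The 'rd_schema_id' and 'from' fields on the sample document are added here purely for illustration, to show an identity ID (itself a list) which is not expandable and so is emitted whole.

```javascript
// Return the keys emitted for one field, expanding it only when the document
// lists that field in 'rd_megaview_expandable'.
function keysForField(doc, fieldName) {
  const expandable = doc.rd_megaview_expandable || [];
  const value = doc[fieldName];
  if (expandable.includes(fieldName) && Array.isArray(value)) {
    // One key per item in the list.
    return value.map((item) => [doc.rd_schema_id, fieldName, item]);
  }
  // Everything else, including lists such as identity IDs, is a single value.
  return [[doc.rd_schema_id, fieldName, value]];
}

const email = {
  rd_schema_id: "rd.msg.body", // added for illustration
  rd_key: ["email", "somemessageid@somewhere"],
  from: ["email", "sender@somewhere"], // hypothetical field, not expandable
  to: [["email", "mhammond@somewhere"], ["email", "someoneelse@gmail.com"]],
  rd_megaview_expandable: ["to", "to_display", "cc", "cc_display"],
};

console.log(JSON.stringify(keysForField(email, "to")));
console.log(JSON.stringify(keysForField(email, "from")));
```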

While technically the 'rd_megaview_expandable' field is part of the schema definition rather than specific schema instances, the field value is written to each document. In other words, all documents holding a particular schema ID should have identical 'rd_megaview_expandable' field values. The populating of this field is handled automatically when extensions emit a schema (although for this to be truly integrated we need to move schema definitions themselves into couch documents, as described in Raindrop/BackEndRoadmap#Formal_Schema_Definitions).

However, even this scheme has drawbacks - the separation of the 'to' and 'to_display' fields etc is a direct result of it - things would not work correctly if we attempted to roll the display name portion into the same list as the identity IDs.

Erlang megaview

There are currently two versions of the megaview - one in javascript and the other in erlang. The only advantage of the erlang one over the javascript one is speed - the erlang version is more than twice as fast as the javascript version due to the removal of the erlang -> javascript interprocess communication. The erlang one has some downsides though, including:

  • erlang is less likely to be already understood by developers, making the erlang megaview far harder to understand than the equivalent javascript version.
  • potential security holes: while the javascript engine used by couch can't access the file-system etc, erlang implemented views can. If someone were to maliciously post a modification to the erlang view source code they could do all sorts of nasty things which are beyond the reach of javascript. That being said though, malicious code in javascript could still wreak havoc, such as DOS attacks, leakage of private information (javascript can post data to other URLs), etc, so future consideration needs to be given to the security of view source code regardless of implementation language.

Problems with the megaview

There are a number of practical problems with the megaview. In no particular order:

  • The megaview is slow; around 20 times slower than the most trivial view implementation (eg, emit(null, doc._rev)).
  • The megaview uses a huge amount of disk-space. For example, if I look at my current raindrop database, the database itself is 77MB while the view index is 664MB.
  • Complex queries can't be performed. For example, it would be trivial for a custom view to emit two separate fields as a key value, whereas that can not be expressed in the megaview. Similarly, a view which emitted each word in a field as a key would be trivial with a custom view but can not be expressed by the megaview.
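
To make the last point concrete, here are sketches of the two kinds of custom map function mentioned above. They are illustrative only: the function names and the collect helper are invented for this example, and emit is passed in as a parameter so the code runs on its own (in couchdb it is a global).

```javascript
// Emits two separate fields together as one key: [from, timestamp].
function mapFromAndTimestamp(doc, emit) {
  if (doc.rd_schema_id === "rd.msg.body") {
    emit([doc.from, doc.timestamp], null);
  }
}

// Emits each word of the body as its own key.
function mapBodyWords(doc, emit) {
  if (doc.rd_schema_id === "rd.msg.body" && typeof doc.body === "string") {
    for (const word of doc.body.toLowerCase().split(/\W+/)) {
      if (word) emit(word, null); // skip empty strings from leading/trailing separators
    }
  }
}

// Helper for this example: collect the rows a map function emits for one document.
function collect(mapFn, doc) {
  const rows = [];
  mapFn(doc, (key, value) => rows.push({ key, value }));
  return rows;
}

const msg = {
  rd_schema_id: "rd.msg.body",
  from: ["twitter", "jchris"],
  timestamp: 1255505465,
  body: "CouchDB test suite",
};

console.log(JSON.stringify(collect(mapFromAndTimestamp, msg)));
console.log(JSON.stringify(collect(mapBodyWords, msg)));
```

Neither key shape fits the megaview's fixed [schema_id, field_name, field_value] format, which is why such queries still need a custom view and the index build that comes with it.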