Services/KeyValueStorage

From MozillaWiki
Revision as of 08:34, 13 September 2011

Planning Questions

How simple can we get away with while still providing useful functionality?

  • maximum key size, value size?
  • key => single value?
  • key => set of values? (like e.g. riak with siblings enabled)
  • key + column => value? (like e.g. bigtable or cassandra)
  • keys in sorted order? (i.e. hash or btree?)
  • handling of concurrent edits, conflicts?
  • bulk insert, update or delete operations?

Crypto

  • can it sensibly be done at this layer, or do we need to defer to the application?
  • encrypted keys would make key-ordering useless
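
To make the key-ordering point concrete, here is a minimal sketch. It assumes keys are blinded with a keyed hash (HMAC-SHA256) before they reach the server; that choice, the SECRET value and the blind_key helper are all hypothetical, and any deterministic encryption of keys would behave the same way for this purpose.

```python
import hashlib
import hmac

# Hypothetical secret held by the client; the server never sees it.
SECRET = b"example-secret"


def blind_key(key):
    """Return a deterministic, opaque stand-in for a plaintext key."""
    return hmac.new(SECRET, key.encode("utf-8"), hashlib.sha256).hexdigest()


plain_keys = ["apple", "banana", "cherry", "damson", "elderberry", "fig"]

# What the server stores: blinded key -> (client-side) plaintext key.
blinded = {blind_key(k): k for k in plain_keys}

# The order the server would produce if it sorted what it can actually see.
server_order = [blinded[b] for b in sorted(blinded)]

print(plain_keys)
print(server_order)
```

The server still returns every key, but its sort order follows the hash output, so it is very unlikely to match the plaintext order. Range scans and prefix queries over the original keys would have to be done by the application.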

Authentication to App

  • can AppKeys be used for authentication, e.g. some sort of request signing with the app key?
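
One possible shape for request signing, as a sketch only. It assumes each app key comes with a secret shared between the app and the server, which is not something this page establishes; sign_request, verify_request and the layout of the signed message are made up for illustration.

```python
import hashlib
import hmac


def sign_request(app_secret, method, path, body):
    """Return a hex HMAC-SHA256 signature over method, path and body digest."""
    body_digest = hashlib.sha256(body).hexdigest()
    message = "\n".join([method.upper(), path, body_digest]).encode("utf-8")
    return hmac.new(app_secret, message, hashlib.sha256).hexdigest()


def verify_request(app_secret, method, path, body, signature):
    """Recompute the signature on the server and compare in constant time."""
    expected = sign_request(app_secret, method, path, body)
    return hmac.compare_digest(expected, signature)


secret = b"hypothetical-app-secret"
path = "/appkey/bucketname/items/my%20key"
signature = sign_request(secret, "PUT", path, b"my exciting value")
```

This sketch has no protection against replayed requests; a real scheme would probably also sign a timestamp or nonce.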

We will probably want some sort of partitioning or "buckets".

  • AppKey => list of buckets?
  • AppKey + UserID => list of buckets?
  • can a bucket be shared between multiple apps? multiple users?

It would be good to find and isolate some use cases in the existing Services apps.

  • Build a SyncStorage plugin that stores data in the KVStore?


Management features

  • built-in quota system? Will be more efficient than simulating it at a higher level.
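
A rough sketch of why a built-in quota could be cheap: the store already has the old value in hand when it handles a write, so it can adjust a running total in the same step. QuotaBucket and QuotaExceededError are hypothetical names, and counting only value bytes (not keys) is an assumption.

```python
class QuotaExceededError(Exception):
    """Raised when a write would push the bucket over its byte limit."""


class QuotaBucket:
    """In-memory stand-in for a bucket that tracks its own usage."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items = {}

    def put(self, key, value):
        data = value.encode("utf-8")
        old = self._items.get(key)
        old_size = len(old) if old is not None else 0
        # Only the size difference matters, so no scan over other keys is needed.
        new_used = self.used_bytes - old_size + len(data)
        if new_used > self.max_bytes:
            raise QuotaExceededError(key)
        self._items[key] = data
        self.used_bytes = new_used

    def delete(self, key):
        data = self._items.pop(key)
        self.used_bytes -= len(data)
```

A layer above the store would need an extra read (or its own bookkeeping) before every write to get the same answer.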

Strawman Proposal

Level 0:  Basic Key-Value Storage

This is the most primitive level of functionality: a simple map from keys to values.  Nothing fancy, but even this can be hard to work with in a distributed environment.

Python API
    bucket.put("my key","my exciting value")

    bucket.get("my key")
    => "my exciting value"

    bucket.delete("my key")

    bucket.get("my key")
    => KeyError
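
For experimenting with the calls above, a dictionary-backed stand-in is enough. This is only a sketch of the interface, not a proposed backend; in particular, raising KeyError when deleting a missing key is an assumption, since the listing only shows get raising it.

```python
class Bucket:
    """In-memory stand-in for a bucket, good enough to try the API shape."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key):
        # A missing key raises KeyError, as in the listing above.
        return self._items[key]

    def delete(self, key):
        # Assumption: deleting a missing key also raises KeyError.
        del self._items[key]


bucket = Bucket()
bucket.put("my key", "my exciting value")
```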

HTTP API
    PUT /appkey/bucketname/items/my%20key
    Content-Length: 17
    my exciting value


    GET /appkey/bucketname/items/my%20key
    =>  200 OK
        Content-Length: 17
        my exciting value

    GET /appkey/bucketname/items/otherkey
    =>  404 Not Found
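
The URL layout implies that keys are percent-encoded into the path. A small sketch of that mapping, using only the standard library; item_path is a hypothetical helper, and escaping "/" inside keys is an assumption about how such keys should be treated.

```python
from urllib.parse import quote, unquote


def item_path(appkey, bucketname, key):
    """Build the request path for an item, percent-encoding each segment."""
    # safe="" so that a "/" inside a key is escaped instead of being read
    # as a path separator.
    return "/{}/{}/items/{}".format(
        quote(appkey, safe=""),
        quote(bucketname, safe=""),
        quote(key, safe=""),
    )


path = item_path("appkey", "bucketname", "my key")
body = "my exciting value".encode("utf-8")
content_length = len(body)  # Content-Length counts bytes, not characters
```

With these inputs the path is /appkey/bucketname/items/my%20key and the body is 17 bytes long, matching the listing above.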

Level 1: Atomic Compare-and-Swap

Well, as atomic as is reasonable once e.g. vector clocks are taken into account.  This is provided in lieu of transactions, so that the application can check that it is not e.g. overwriting or deleting updates made by another process.

Python API
    item = bucket.getitem("my key")
    item.value    # the value previously stored
    item.version  # opaque version identifier

    bucket.put("my key", "new value", ifmatch="WRONGVERSION")
    =>  VersionMismatchError

    bucket.put("my key", "new value", ifmatch=item.version)
    =>  OK
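
A single-process sketch of the compare-and-swap behaviour, to pin down the semantics. It assumes versions are opaque strings handed out by the store and that a lock is enough to make check-and-write atomic; a distributed backend would need something stronger. VersionedBucket and the counter-based version strings are hypothetical.

```python
import itertools
import threading
from collections import namedtuple

Item = namedtuple("Item", ["value", "version"])


class VersionMismatchError(Exception):
    """Raised when ifmatch does not name the currently stored version."""


class VersionedBucket:
    def __init__(self):
        self._items = {}
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def getitem(self, key):
        return self._items[key]

    def put(self, key, value, ifmatch=None):
        # The version check and the write happen under one lock, so no other
        # writer in this process can slip in between them.
        with self._lock:
            current = self._items.get(key)
            if ifmatch is not None:
                if current is None or current.version != ifmatch:
                    raise VersionMismatchError(key)
            item = Item(value, "v{}".format(next(self._counter)))
            self._items[key] = item
            return item


bucket = VersionedBucket()
bucket.put("my key", "my exciting value")
item = bucket.getitem("my key")
```

An application would typically wrap this in a read, modify, put loop and retry when VersionMismatchError is raised.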


HTTP API
    GET /appkey/bucketname/items/my%20key
    =>  200 OK
        Content-Length: 17
        X-Weave-Version: XXXYYY
        my exciting value

    PUT /appkey/bucketname/items/my%20key
    X-Weave-If-Match: WRONGVALUE
    Content-Length: 2
    Hi
    =>  412 Precondition Failed

    PUT /appkey/bucketname/items/my%20key
    X-Weave-If-Match: XXXYYY
    Content-Length: 2
    Hi
    =>  204 No Content
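
The server side of that exchange could look roughly like this. It is a sketch with no real HTTP framework behind it: handle_get and handle_put are hypothetical functions, headers are a plain dict with exact header names, versions are random hex strings, and returning 204 for a newly created item is an assumption.

```python
import uuid


def handle_get(store, key):
    """Return (status, headers, body) for a GET on an item."""
    if key not in store:
        return 404, {}, b""
    value, version = store[key]
    headers = {"Content-Length": str(len(value)), "X-Weave-Version": version}
    return 200, headers, value


def handle_put(store, key, body, headers):
    """Apply a PUT, honouring X-Weave-If-Match if present. Returns a status."""
    expected = headers.get("X-Weave-If-Match")
    current = store.get(key)
    if expected is not None:
        if current is None or current[1] != expected:
            return 412  # Precondition Failed: stored version has moved on
    store[key] = (body, uuid.uuid4().hex)
    return 204


store = {}
first_status = handle_put(store, "my key", b"my exciting value", {})
status, response_headers, response_body = handle_get(store, "my key")
```

As written, the check and the write are not atomic across concurrent requests; a real server would need to do both in one step in the backend.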

We could treat the version number like an ETag and use standard HTTP headers for it, but I haven't checked whether this would violate anything in the RFC.  Riak uses a custom header (X-Riak-Vclock) for its vclock.
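
For comparison, here is roughly what the standard-header route would involve. HTTP does define ETag, If-Match and the 412 status for conditional requests, and entity tags are written as quoted strings. The sketch below assumes version identifiers contain no quotes or commas; format_etag and if_match_satisfied are hypothetical helpers, not an existing API.

```python
def format_etag(version):
    """Wrap an opaque version identifier as a strong entity tag."""
    return '"{}"'.format(version)


def if_match_satisfied(header_value, current_version):
    """Decide whether an If-Match header allows the request to proceed."""
    if current_version is None:
        # Nothing is stored, so no If-Match condition can hold.
        return False
    if header_value.strip() == "*":
        # "*" matches any existing item.
        return True
    candidates = [tag.strip() for tag in header_value.split(",")]
    # Exact comparison, so weak tags such as W/"XXXYYY" never match.
    return format_etag(current_version) in candidates
```

A failed check would map to 412 Precondition Failed, the same status the custom-header listing above already uses.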