Identity/AttachedServices/StorageServiceArchitecture

491 bytes added, 06:08, 11 June 2013
== Types of Cluster ==
We'll likely start with a single cluster into which all users are assigned. But here are some ideas for how we could implement different types of cluster with different performance, costs, tradeoffs, etc.
=== Massively-Shared MySQL ===
There's a commercial software product called "ScaleBase" that implements much of this functionality off the shelf. We should start there, but keep in mind the possibility of a custom dbrouter process.
 
'''Pros''': Well-known and well-understood technology. No-one ever got fired for choosing MySQL.
'''Cons''': Lots of moving parts. MySQL may not be very friendly to our write-heavy performance profile.
=== Cassandra Cluster ===
 
Another promising storage option is Cassandra. It provides a rich-enough data model and automatic cluster management, at the cost of eventual consistency and the vague fear that it will try to do something "clever" when you really don't want it to. To get strong consistency back, we'd use a locking layer such as ZooKeeper or memcached.
* The webheads also have a shared ZooKeeper or memcached install, which they use to serialize operations on a per-user basis.
* Cassandra is periodically snapshotted into S3 for extra durability.
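
The per-user serialization described above could look roughly like the following. This is only a sketch under loud assumptions: <code>PerUserLocks</code>, <code>append_item</code> and the <code>store</code> dict are hypothetical names, the lock table lives in a single process rather than in ZooKeeper or memcached, and a plain dict stands in for Cassandra. It is meant to show the shape of the read-modify-write guard, not a real client.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class PerUserLocks:
    """In-process stand-in for a shared lock service (hypothetical helper).

    A real deployment would take the lock in ZooKeeper or memcached so that
    every webhead sees it; here it is just a table of threading.Lock objects.
    """

    def __init__(self):
        self._guard = threading.Lock()              # protects the lock table
        self._locks = defaultdict(threading.Lock)   # one lock per user id

    @contextmanager
    def hold(self, user_id, timeout=5.0):
        with self._guard:
            lock = self._locks[user_id]
        if not lock.acquire(timeout=timeout):
            raise TimeoutError(f"could not lock user {user_id!r}")
        try:
            yield
        finally:
            lock.release()


locks = PerUserLocks()
store = {}  # stand-in for the eventually-consistent storage cluster


def append_item(user_id, item):
    """Read-modify-write for one user, serialized by that user's lock."""
    with locks.hold(user_id):
        items = list(store.get(user_id, []))
        items.append(item)
        store[user_id] = items
        return len(items)
```

Operations for different users never contend with each other; only concurrent writes for the same user are forced into a sequence.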
 
'''Pros''': Easy management and scalability. Very friendly to write-heavy workloads.
'''Cons''': Unknown and untrusted. Harder to hire expertise. Eventual consistency scares me.
 
=== Hibernation Cluster ===
If they come back and try to use their data again, we immediately trigger a migration back to one of the active clusters.
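
As a rough illustration of the "come back" path, the routing layer might do something like this. Everything here is assumed for the sake of the sketch: the <code>ClusterRouter</code> class, the cluster names, and the least-loaded choice of target are hypothetical, and the migration is only recorded rather than performed.

```python
HIBERNATION = "hibernation"


class ClusterRouter:
    """Hypothetical user-to-cluster lookup with wake-on-access."""

    def __init__(self, active_clusters):
        self.active_clusters = list(active_clusters)
        self.assignment = {}   # user_id -> cluster name
        self.migrations = []   # (user_id, from_cluster, to_cluster) records

    def hibernate(self, user_id):
        # Called by whatever monitors usage and decides a user is inactive.
        self.assignment[user_id] = HIBERNATION

    def cluster_for(self, user_id):
        current = self.assignment.get(user_id)
        if current is not None and current != HIBERNATION:
            return current
        target = self._pick_active()
        if current == HIBERNATION:
            # The user came back: trigger a migration to an active cluster.
            self.migrations.append((user_id, HIBERNATION, target))
        self.assignment[user_id] = target
        return target

    def _pick_active(self):
        # Assumption: pick the active cluster with the fewest assigned users.
        counts = {name: 0 for name in self.active_clusters}
        for name in self.assignment.values():
            if name in counts:
                counts[name] += 1
        return min(self.active_clusters, key=lambda name: counts[name])


router = ClusterRouter(["mysql-1", "cassandra-1"])
router.cluster_for("alice")   # new user, assigned to an active cluster
router.hibernate("alice")     # later marked inactive
router.cluster_for("alice")   # access wakes the user and records a migration
```

In practice the request that triggers the wake-up would probably have to wait for, or retry after, the data copy; that part is left out here.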
 
'''Pros''': Massive cost savings.
'''Cons''': Have to actually monitor usage and implement this.
== Things To Think About ==