For this scheme to work, every Shard Proxy process in every DC needs to agree on where the master is for every shard. That's not trivial. But it should be pretty doable with some replication between the respective ZooKeeper masters in each DC. Remember: this data is small and changes infrequently.
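To make that concrete, here's a minimal sketch of what the proxy side could look like. It assumes the kazoo client library and a hypothetical znode layout of one node per shard at /shards/<shard_id>/master, holding a "dc:host:port" string; none of that is decided yet, it just shows how small the shared state is and how cheaply each proxy can keep it in sync.

<syntaxhighlight lang="python">
# Minimal sketch: a Shard Proxy keeping an in-memory shard -> master map in
# sync with ZooKeeper. Assumes kazoo and a hypothetical znode layout of
# /shards/<shard_id>/master holding "dc:host:port".
from kazoo.client import KazooClient

SHARD_ROOT = "/shards"  # hypothetical; the real path is still to be decided

zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
zk.start()
zk.ensure_path(SHARD_ROOT)

shard_masters = {}  # shard_id -> "dc:host:port", read by the query-routing code

def watch_shard(shard_id):
    """Register a watch so the local map updates whenever a master is repointed."""
    @zk.DataWatch(f"{SHARD_ROOT}/{shard_id}/master")
    def _on_change(data, stat):
        if data is not None:
            shard_masters[shard_id] = data.decode("utf-8")

for shard_id in zk.get_children(SHARD_ROOT):
    watch_shard(shard_id)
</syntaxhighlight>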
We should probably not try to automate whole-DC failover, so that Ops can ensure consistent state before anything tries to send writes to a new location.
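The Ops-facing side of that manual failover could be as small as the sketch below: a script that repoints a single shard's master znode once they've verified the new location is safe, and every proxy's watch picks up the change. Path layout and value format are the same assumptions as in the sketch above.

<syntaxhighlight lang="python">
# Sketch of the manual step Ops might run during a DC failover: repoint one
# shard's master to a host in the surviving DC. The /shards/<shard_id>/master
# layout and the "dc:host:port" value format are assumptions, not decisions.
import sys
from kazoo.client import KazooClient

def promote(shard_id, new_master):
    zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
    zk.start()
    path = f"/shards/{shard_id}/master"
    zk.ensure_path(path)
    # Proxies watching this znode re-route writes once the set() lands;
    # nothing here verifies replication state -- that check stays with Ops.
    zk.set(path, new_master.encode("utf-8"))
    zk.stop()

if __name__ == "__main__":
    promote(sys.argv[1], sys.argv[2])  # e.g. promote("42", "dc2:db17:3306")
</syntaxhighlight>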
'''TODO:''' How many DCs? The principle should be the same regardless of how many we have. Nested Star Topology FTW.
== Things To Think About ==
I've tried to strike a balance between operational simplicity, application simplicity, and functionality here. We pay a price for it, though:
* There are quite a few moving parts here. ZooKeeper in particular is a beast. The proxy process has a few different, interacting responsibilities that would have to be carefully modeled and managed.
* There's the potential for severe cross-DC latency if you receive an HTTP request in one data-center but have to forward all the MySQL queries over to the master in another data-center. I don't think there's any way around this without going to an eventually-consistent model, which would complicate the client API.
* There's a lot of redundancy here, which will cost a lot to run. Are our uptime requirements really so tight that we need a warm standby in a separate DC? Could we get away with just the hot standby and periodic database dumps into S3, from which we can (slowly) recover from meteor-hit-the-data-center scale emergencies?
* How will we cope with moving shards between database hosts, or replacing dead hosts with fresh machines that have to catch up to the master? Checkpointing for faster recovery?