Identity/AttachedServices/StorageServiceArchitecture

126 bytes removed, 00:42, 1 May 2013
Things To Think About
* There's only a single active DC which has to handle all traffic. That's the price we pay for using MySQL and exposing a strongly-consistent client API.
** We ''could'' have multiple DCs active and serving web traffic, routing read queries to the local replica and write queries over to the proper master. Seems like an unnecessary pain and expense though, esp. with the possibility of losing read-your-own-writes consistency.
** It's not like that DC is going to run out of capacity, right?
** Since this is not a user-facing API, I think this is overall a good trade-off. We don't care quite as much about the perceived latency and responsiveness of these requests, and we don't need location-based routing or any such fanciness.
* There's a lot of redundancy here, which will cost a lot to run. Are our uptime requirements really so tight that we need a warm-standby in a separate DC? Could we get away with just the hot standby and periodic database dumps into S3, with which we can (slowly) recover from meteor-hit-the-data-center scale emergencies?
* Needs a detailed and careful plan for how we'll bring up new DBs for existing shards, how we'll move shards between DBs, and how we'll split shards if that becomes necessary. All very doable, just fiddly.
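
To make the read-your-own-writes concern concrete, here is a minimal toy sketch of the multi-DC alternative discussed above, where reads go to a local replica and writes go to the master. It is not the service's actual code: the <code>Database</code>, <code>Router</code> and <code>replicate</code> names are hypothetical, and plain dicts stand in for MySQL servers with asynchronous replication.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    """Toy stand-in for one MySQL server: a dict of key -> value."""
    name: str
    rows: dict = field(default_factory=dict)


@dataclass
class Router:
    """Hypothetical per-DC router: reads stay local, writes go to the master."""
    master: Database
    local_replica: Database

    def write(self, key, value):
        # Writes always cross over to the single master.
        self.master.rows[key] = value

    def read(self, key):
        # Reads are served by the replica in the local DC.
        return self.local_replica.rows.get(key)


def replicate(master, replica):
    """Simulate asynchronous replication catching up (assumed, not real MySQL)."""
    replica.rows = dict(master.rows)


master = Database("dc1-master")
replica = Database("dc2-replica")
router = Router(master=master, local_replica=replica)

router.write("user:1", "v2")
stale = router.read("user:1")   # replica has not caught up: the write is invisible
replicate(master, replica)
fresh = router.read("user:1")   # visible only after replication completes
```

In this sketch a client that writes and then immediately reads through the same router does not see its own write until replication catches up. That gap is what a single active DC avoids.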