We'll replicate the entire stack into several data-centers, each of which will maintain a full copy of all shards.
One DC will be the active master for each shard. All reads and writes for that shard will be forwarded into that DC and routed to the master. (This will save us a ''world of pain'' by not having multiple conflicting writes going into different DCs). Other DCs are designated as warm-standby hosts for that shard, configured for asynchronous WAN replication. They can be failed-over to if there is a serious outage in the master DC, but this will almost certainly result in the loss of some recent transactions:
+----------------------------------------------------------------------------------+