Community Ops/paas/Backups
Latest revision as of 12:35, 20 September 2016
Infrastructure backup
Assumptions
- The majority of our infrastructure is based on the idea of having as many immutable parts as possible.
  - Docker images
  - Marathon deployed apps
  - Software stack
    - Mesos
    - Marathon
    - Zookeeper
  - Consul checks
- We should be comfortable with the loss of some of our EC2 instances
- Our EC2 based infra is HA
- "Backups" refer to a point-in-time copy of a service or resource
- We should utilize AWS hosted services to avoid maintenance overhead
- All the backups should be encrypted
Mutable part
- Persistent storage
  - EFS
- Marathon app definitions
- Chronos task definitions
- Databases
- Consul KV
- WP sites
External dependencies & redundancy
At deploy time we should not rely on a single external (3rd party) service, because it’s a SPOF that we don’t control. We need redundant access to the data living in external dependencies.
- Docker images
Backup implementation
EFS
- Backup is going to live in S3/Glacier
- Implement a script to do scheduled backups based on a backup tool
- Deploy it in chronos
- Schedule policy
  - 7 times a week
    - Lives in S3
  - 4 times per month
    - Lives in Glacier
  - 12 times per year
    - Lives in Glacier
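
The schedule policy above can be read as a daily / weekly / monthly rotation. The following is a minimal sketch under assumptions that are not in the original plan: "7 times a week" is taken to mean one backup per day, "4 times per month" one per week (on Sundays), and "12 times per year" one on the first day of each month. The tier names and the `classify_backup` helper are hypothetical.

```python
from datetime import date
from typing import Tuple

# Storage target per tier, taken from the schedule policy in this section.
STORAGE_BY_TIER = {"daily": "S3", "weekly": "Glacier", "monthly": "Glacier"}


def classify_backup(day: date) -> Tuple[str, str]:
    """Return (tier, storage) for a backup taken on `day`.

    Assumption: monthly backups run on the 1st, weekly ones on Sundays,
    and every other day produces a daily backup.
    """
    if day.day == 1:
        tier = "monthly"
    elif day.weekday() == 6:  # Monday is 0, Sunday is 6
        tier = "weekly"
    else:
        tier = "daily"
    return tier, STORAGE_BY_TIER[tier]
```

A scheduled job could call `classify_backup(date.today())` to decide where the archive it just produced should be stored.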
Marathon/Chronos definitions
- Backup is going to live in a versioned S3 bucket
- Implement a script to do scheduled backups using marathon/chronos HTTP API
- Deploy it in chronos
- Schedule policy
  - 7 times a week
  - 4 times per month
  - 12 times per year
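
The export step of such a script might look like the sketch below. It assumes that Marathon lists app definitions at `/v2/apps` and that Chronos lists jobs at `/scheduler/jobs`; the Chronos path in particular differs between versions, so both constants should be checked against the deployed services. The `export_definitions` and `http_get` helpers are hypothetical, and uploading the result to the versioned S3 bucket is left out.

```python
import json
from typing import Callable
from urllib.request import urlopen

# Assumed endpoints; adjust to the deployed Marathon and Chronos versions.
MARATHON_APPS_PATH = "/v2/apps"
CHRONOS_JOBS_PATH = "/scheduler/jobs"


def http_get(url: str) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def export_definitions(
    marathon_url: str,
    chronos_url: str,
    fetch: Callable[[str], bytes] = http_get,
) -> str:
    """Return one JSON document holding all app and job definitions."""
    # Marathon wraps the list of apps in an object under the "apps" key.
    apps = json.loads(fetch(marathon_url + MARATHON_APPS_PATH))["apps"]
    # Chronos is assumed to return a plain JSON list of job definitions.
    jobs = json.loads(fetch(chronos_url + CHRONOS_JOBS_PATH))
    snapshot = {"marathon_apps": apps, "chronos_jobs": jobs}
    # Sorted keys keep successive snapshots easy to diff between versions.
    return json.dumps(snapshot, indent=2, sort_keys=True)
```

The `fetch` parameter exists so the function can be exercised without a running cluster; in the scheduled job the default `http_get` would be used.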
Databases
- Already backed by RDS
- Current policy
  - 7 times a week
- Future policy
  - 7 times a week on RDS
  - 12 times a year on S3/Glacier
Consul K/V
- Backup is going to live in a versioned S3 bucket
- Implement a script to do scheduled backups using consul HTTP API
- Deploy it in chronos
- Schedule policy
  - 7 times a week
  - 4 times per month
  - 12 times per year
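
As an illustration of the Consul step, the sketch below reads the whole K/V store through `GET /v1/kv/?recurse=true`, which returns a JSON list of entries whose `Value` field is base64 encoded. It assumes that all stored values are UTF-8 text; the function names are hypothetical, and error handling (for example an empty store) is omitted.

```python
import base64
import json
from typing import Dict, List, Optional
from urllib.request import urlopen


def fetch_kv_entries(consul_url: str) -> List[dict]:
    """Fetch every K/V entry from Consul as a list of raw entry objects."""
    with urlopen(consul_url + "/v1/kv/?recurse=true", timeout=30) as response:
        return json.loads(response.read())


def decode_kv_entries(entries: List[dict]) -> Dict[str, Optional[str]]:
    """Map each key to its decoded value (None for keys without a value)."""
    snapshot = {}
    for entry in entries:
        raw = entry["Value"]
        if raw is None:
            snapshot[entry["Key"]] = None
        else:
            # Assumption: values are UTF-8 text, not arbitrary binary data.
            snapshot[entry["Key"]] = base64.b64decode(raw).decode("utf-8")
    return snapshot
```

Serialising `decode_kv_entries(fetch_kv_entries(...))` with `json.dumps` gives a single file that can be stored in the versioned bucket.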
WP sites
- Backup is going to live in S3
- Use MainWP native backup functionality
- Schedule policy
  - Once per week
3rd party services
Docker
- Docker registry mirror
  - Maybe a hosted one
    - EC2 container registry is not the best one but it’s hosted by AWS
Restoring from backup
Infrastructure
- Ansible playbooks for config management
- Terraform for resources management
Storage
- Use the backup tool to revert to a point in time
- Implementation
  - Native tool functionality
Marathon/Chronos/Consul
- Redeploy the definition
- Implementation
  - Write a script to populate the service definitions using HTTP API
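
A restore script of this kind could build one HTTP request per saved definition. The sketch below assumes Marathon's `PUT /v2/apps/{appId}` (create or update an app) and Consul's `PUT /v1/kv/{key}` (write a raw value); Chronos is left out. The helper names are hypothetical, and a definition exported from a live API may carry server-generated fields that need to be stripped before it is resubmitted, which is not handled here.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def marathon_restore_request(marathon_url: str, app: dict) -> Request:
    """Build a PUT request that recreates one Marathon app definition."""
    # Marathon app ids start with "/", which is already part of the path.
    app_id = app["id"].lstrip("/")
    return Request(
        url=f"{marathon_url}/v2/apps/{quote(app_id)}",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def consul_restore_request(consul_url: str, key: str, value: str) -> Request:
    """Build a PUT request that writes one key back to the Consul K/V store."""
    return Request(
        url=f"{consul_url}/v1/kv/{quote(key)}",
        data=value.encode("utf-8"),  # Consul stores the request body as the value
        method="PUT",
    )
```

Each request would then be sent with `urllib.request.urlopen(request)`. Building the requests separately from sending them makes it possible to review or dry-run a restore first.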
WP Sites
- Native restore functionality in MainWP
- Implementation
  - Native tool functionality
Databases
- Restore from snapshot
- Implementation
  - Native tool functionality