Services/Sync/2011Meltdown/ActionItems
| Action item | Owner | Status |
| Build and document model of current behaviour, expected traffic | Ops/Eng | |
| Bring back and enforce system-review requirement for all client behaviour changes | Eng | |
| Identify required metrics for reliability, create dashboard to compare Nightly/Aurora/Beta/Release channels | Ops | |
| Enhanced User-Agent data required | Eng | |
| Standing review, one week before Rapid Release cutover day, of all changes landed and all metrics from the dashboard action above | Eng/Ops | |
| Fix backoff handling for Fx8 + add automated tests | Eng | Done: bug 691612, bug 691663, bug 691988 |
| Add backoff handling to pre-Aurora-merge extended QA validation | Eng/QA | |
| Document and make available current system capacity numbers | Ops | |
| Determine and implement standard for required headroom on each server | Ops | |
| Coordinate Metrics/Ops observation of Aurora/Beta traffic patterns | Ops/Metrics | |
| Zeus log analysis and reporting improvements | Ops | |
| Deploy TTL pruning cron job | Ops | |
| Push 2.6.x server-core update | Eng/Ops | |
| Improve understanding of platform structure outside of Ops | Ops | |
| Automated stack trace processing and reporting: a server-focused equivalent of the top-crasher list | Ops/Eng | |
| Re-triage of all outstanding bugs to identify future lurking issues | Ops/Eng | |
| 90-second minimum sync interval | Eng | Done: bug 694149 |
| Create and document engineering contact list + escalation path for all applications | Eng | |
| Update new user activation load test, identify optimal+maximum user uptake rate per node | Ops/Eng | |
| Gradual user reassignment per node, not per DB | Ops | |
| Ensure ongoing reliability work, and any deferral of maintenance, is accounted for in planning | Eng/Ops | Ongoing |
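The two client-side items marked done above (backoff handling and the 90-second minimum sync interval) can be pictured together with a small sketch. This is an illustration only, not the Firefox client code: the function name `next_sync_delay` is hypothetical, and the use of the `X-Weave-Backoff` and `Retry-After` response headers as the server's backoff signal is an assumption that this page does not state. Header lookup here is also case-sensitive, which a real HTTP client would not be.

```python
import math

# The 90-second floor comes from the action item above.
MIN_SYNC_INTERVAL = 90  # seconds

# Assumed names of the headers a server uses to request backoff.
BACKOFF_HEADERS = ("X-Weave-Backoff", "Retry-After")


def next_sync_delay(scheduled_interval, headers):
    """Return the number of seconds to wait before the next sync.

    scheduled_interval: the delay the client would otherwise use, in seconds.
    headers: dict of response headers from the most recent server response.
    """
    # Never sync more often than the minimum interval.
    delay = max(scheduled_interval, MIN_SYNC_INTERVAL)

    for name in BACKOFF_HEADERS:
        value = headers.get(name)
        if value is None:
            continue
        try:
            requested = float(value)
        except ValueError:
            # Not a number of seconds (for example an HTTP-date); skip it.
            continue
        if not math.isfinite(requested):
            continue
        # Honour the longest wait the server asked for.
        delay = max(delay, requested)

    return delay
```

The design choice worth noting is that a server-requested backoff can only lengthen the wait, never shorten it below the client's own schedule or the 90-second floor. That property is what the automated tests called for above would most naturally check.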