Services/Sync/2011Meltdown/ActionItems

From MozillaWiki
Build and document a model of current behaviour and expected traffic Ops/Eng
Bring back and enforce system-review requirement for all client behaviour changes Eng
Identify required metrics for reliability, create dashboard to compare Nightly/Aurora/Beta/Release channels Ops
Enhanced User-Agent data required Eng
Standing review, one week before Rapid Release cutover day, of all changes landed and all metrics from above action Eng/Ops
Fix backoff handling for Fx8 and add automated tests Eng Done: bug 691612, bug 691663, bug 691988
Add backoff handling to pre-Aurora-merge extended QA validation Eng/QA
Document and make available current system capacity numbers Ops
Determine and implement standard for required headroom on each server Ops
Coordinate metrics-ops observation of Aurora/Beta traffic patterns Ops/Metrics
Zeus log analysis and reporting improvements Ops
Deploy TTL pruning cron job Ops
Push 2.6.x server-core update Eng/Ops
Improve understanding of platform structure outside the Ops team Ops
Automated stack trace processing and reporting (a server-focused equivalent of the top-crasher list) Ops/Eng
Re-triage of all outstanding bugs to identify future lurking issues Ops/Eng
90-second minimum sync interval Eng Done: bug 694149
Create and document engineering contact list + escalation path for all applications Eng
Update new user activation load test, identify optimal+maximum user uptake rate per node Ops/Eng
Gradual user reassignment per node, not per DB Ops
Ensure ongoing reliability work, and any deferral of maintenance, is accounted for in planning Eng/Ops ongoing
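
Two of the client-side items above, backoff handling and the 90-second minimum sync interval, could interact roughly as in the following sketch. It is illustrative only and is not the code landed in the referenced bugs. The function name next_sync_delay is hypothetical, and the header names X-Weave-Backoff and Retry-After are assumed here to be the server's backoff signals, each carrying a number of seconds. Only the 90-second floor comes from the list above.

```python
# Minimal sketch (assumptions labeled): choose the delay before the next sync.
# MIN_SYNC_INTERVAL comes from the "90-second minimum sync interval" item.
# The header names are assumed backoff signals carrying a number of seconds.
MIN_SYNC_INTERVAL = 90  # seconds

BACKOFF_HEADERS = ("x-weave-backoff", "retry-after")  # assumed names


def next_sync_delay(scheduled_interval, headers):
    """Return the number of seconds to wait before the next sync.

    scheduled_interval: the interval the client would otherwise use.
    headers: mapping of response header names to string values.
    """
    # Never sync more often than the minimum interval.
    delay = max(scheduled_interval, MIN_SYNC_INTERVAL)

    # Header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}

    for name in BACKOFF_HEADERS:
        value = normalized.get(name)
        if value is None:
            continue
        try:
            requested = float(value)
        except ValueError:
            # For example an HTTP-date Retry-After; ignored in this sketch.
            continue
        if requested > 0:
            # A server-requested backoff can only lengthen the wait.
            delay = max(delay, requested)

    return delay
```

In this sketch a server backoff never shortens the client's own schedule, and a missing or unparseable header falls back to the larger of the scheduled interval and the 90-second floor.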