Services/NOC (proposal)
SREs
- Minimum 7-person team for 24/7 coverage.
- 8-9 is better (due to built-in turnover, addressed below).
- Should be "Tier 1+"
  - will have root
  - should have reasonable associate-level Linux/Juniper/something-useful skills
- There must never be fewer than one SRE (or more senior temporary coverage, in emergencies) in the NOC.
- Position is intended as a "gateway" position into Mozilla
- 18-24mo minimum tour-of-duty
- After 9-12mo, NOC staff are expected to work on projects for the external team they hope to reorg into
- A rotational Swing-shift will be offered to facilitate this work.
Duties
- Acknowledge (Ack) issues in <5 min
- Unacknowledged pages escalate to Secondary after 5 minutes
- Monitoring of key IRC channels
- Communicate large-scale issues to IRC.
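The ack and escalation timing above can be sketched as a small lookup. This is only an illustration under stated assumptions: the proposal gives the 5-minute window and the escalation to Secondary, but the function name `page_target`, the idea that the first page goes to Primary, and the absence of any further escalation step are assumptions made here.

```python
from typing import Optional

# From the proposal: issues are acked in under 5 minutes,
# and pages escalate to Secondary at the 5-minute mark.
ACK_DEADLINE_MIN = 5


def page_target(minutes_since_page: float, acked: bool) -> Optional[str]:
    """Return who should currently be paged, or None once the page is acked.

    Assumption (not stated in the proposal): the initial page goes to Primary.
    """
    if acked:
        return None
    if minutes_since_page < ACK_DEADLINE_MIN:
        return "Primary"
    return "Secondary"
```

For example, `page_target(2, acked=False)` returns `"Primary"`, while `page_target(5, acked=False)` returns `"Secondary"`.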
Personnel
- Initial team will consist of externally-hired consultants
- Half (Secondary coverage) *may* be remote, as long as they're in a single facility for hand-offs
- As we hire FT SREs, we will eliminate on-site consultants (let their contracts expire)
- We may decide to keep secondary coverage as a dedicated external consultancy if it works well for us
Scheduling
Assignment
- SREs will rotate as follows: B/Secondary (2 weeks) -> A/Primary (2 weeks) -> C/Tertiary (2 weeks) -> S/Swing (1 week)
- SREs will stay on either the 1st or 2nd shift, except while on Swing, which has only one shift.
- If the SRE prefers off-shift and the Swing tasks allow it, Swing shift timing is adjustable.
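The rotation above works out to a 7-week cycle, which can be sketched as a weekly lookup. This is a minimal illustration rather than part of the proposal: the names `ROTATION`, `WEEKLY_ROLES`, and `role_for`, the 0-based week numbering, and the per-SRE `offset` used to stagger people are all assumptions.

```python
# (role, weeks) in the order given in the proposal.
ROTATION = [
    ("B/Secondary", 2),
    ("A/Primary", 2),
    ("C/Tertiary", 2),
    ("S/Swing", 1),
]

# One entry per week of the cycle: 2 + 2 + 2 + 1 = 7 weeks.
WEEKLY_ROLES = [role for role, weeks in ROTATION for _ in range(weeks)]


def role_for(week: int, offset: int = 0) -> str:
    """Role held in a given 0-based week.

    `offset` is a hypothetical per-SRE stagger, in weeks, so that
    different SREs sit at different points of the same cycle.
    """
    return WEEKLY_ROLES[(week + offset) % len(WEEKLY_ROLES)]


if __name__ == "__main__":
    # Print one full cycle for seven SREs staggered by one week each.
    for sre in range(7):
        print(sre, [role_for(week, offset=sre) for week in range(7)])
```

If seven SREs are staggered by one week each, every week ends up with two people on each of B, A, and C and one on Swing. That would line up with one person per role on each of the two shifts, though the proposal does not spell this out.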
Facility
- segregated physically
  - allows closed-door "war room" focus for events
  - default, however, is door-open, if the person on-staff wants this (social).
- dedicated attached conference room.
- Multiple (6+) large-screen displays
  - metrics (2+)
  - monitoring (2)
    - internal (nagios)
    - external (watchmouse)
  - video conference (1)
  - satellite-fed news (1)
  - key twitter feeds (1?)
  - IRC (1?)
- 2-3 "NOC" desks
  - Each with a desktop system with 2-4 vertically-oriented screens
  - Each with a very good speaker phone
- 2 "hotel" desks (don't necessarily need a good view of the screens)
- dual discrete ethernet feeds
- dedicated ("real") VPNs to prod
- require cellular modem for backup connectivity