Services/NOC (proposal)

From MozillaWiki

Revision as of 23:05, 2 February 2012

SREs

  • Minimum 7 person team.
    • 8-9 is better (due to built-in turnover).
  • Should be "Tier 1+"
    • will have root
    • should have reasonable associate-level Linux/Juniper/something-useful skills
    • There must never be fewer than one SRE (or more senior temp coverage, in emergencies) in the NOC.
  • Position is intended as a "gateway" position into Mozilla
    • 18-24mo minimum tour-of-duty
    • After 9-12mo, NOC staff are expected to work on projects for the external teams they hope to reorg to
    • A rotational Swing-shift will be offered to facilitate this work.

Duties

  • Ack issues in <5 min
    • Pages escalate to Secondary in 5 minutes
  • Monitoring of key IRC channels
    • Communicate large-scale issues to IRC.
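
The escalation rule above can be sketched in a few lines of Python. This is only an illustration, not part of the proposal: the function name `page_target` and the assumption that pages go to the Primary first are hypothetical, and only the 5-minute threshold comes from the list above.

```python
from datetime import datetime, timedelta

# From the Duties list: an unacknowledged page escalates after 5 minutes.
ACK_DEADLINE = timedelta(minutes=5)


def page_target(raised_at, now, acked):
    """Return who should be paged right now, or None once the page is acked.

    Assumption (not stated in the proposal): pages go to the Primary first.
    """
    if acked:
        return None
    if now - raised_at >= ACK_DEADLINE:
        return "secondary"
    return "primary"


raised = datetime(2012, 2, 2, 23, 0)
print(page_target(raised, raised + timedelta(minutes=3), acked=False))
print(page_target(raised, raised + timedelta(minutes=5), acked=False))
```

Whether escalation fires at exactly 5 minutes or only after it is a detail the proposal leaves open; the sketch treats the 5-minute mark itself as escalated.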

Scheduling

  • A-shift (Primary) start/quit times (PST): 0000, 0800, 1600
  • B-shift (Secondary) start/quit times (PST): 0200, 1000, 1800
  • Swing-shift time: ~0900-1700

Assignment

  • SREs will rotate as follows: Secondary (2 weeks) -> Primary (2 weeks) -> Swing (1 week)
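
The rotation above amounts to a 5-week cycle per SRE. A minimal sketch, assuming whole-week granularity and a hypothetical per-person `start_offset` used to stagger team members across the cycle:

```python
# One full cycle: Secondary (2 weeks) -> Primary (2 weeks) -> Swing (1 week).
ROTATION = ["Secondary"] * 2 + ["Primary"] * 2 + ["Swing"]


def role_for_week(week, start_offset=0):
    """Return the role an SRE holds in a given week (0-based).

    `start_offset` is a hypothetical knob for staggering SREs so that
    they do not all occupy the same slot at once.
    """
    return ROTATION[(week + start_offset) % len(ROTATION)]


for week in range(6):
    print(week, role_for_week(week))
```

The sketch only tracks one person's sequence of roles. How the three daily shifts within each role are distributed across the team is not specified in the proposal and is not modeled here.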

Facility

  • segregated physically
    • allows closed-door "war room" focus for events
    • the default, however, is door open, if the person on shift prefers it (social).
    • dedicated attached conference room.
  • Multiple (6+) large-screen displays
    • metrics (2+)
    • monitoring (2)
      • internal (nagios)
      • external (watchmouse)
    • video conference (1)
    • satellite-fed news (1)
    • key twitter feeds (1?)
  • 2-3 "NOC" desks
  • 2 "hotel" desks (don't necessarily need a good view of the screens)
  • dual discrete ethernet feeds
    • dedicated ("real") VPNs to prod
    • requires a cellular modem for backup connectivity