ReleaseEngineering/Mozpool/Handling Panda Failures/Long-Term Process
Latest revision as of 17:24, 12 March 2013
Handling Panda Failures
Failing pandas are passed between three groups.
Release Engineering
- Check for causes in release engineering automation
- Hand to relops using <<bug process>>
Release Operations
- Look for new problems
- Look for and work around known but unfixed failure states (e.g., sut_verify)
- Hand off to DCOps using <<bug process>>
DC Operations
- Check for power
  - Inspect the green lights on the panda board.
  - If no lights are on, unplug the power cable and check it with a voltmeter: insert the positive probe inside the barrel plug and touch the negative probe to the outside of the barrel plug. The reading should be approximately 5 volts.
    - DO NOT LET THE BARREL PLUG TOUCH THE CHASSIS OR OTHER PARTS. This will cause a short and blow the fuse.
    - If no voltage is present, check the fuse. If the fuse is blown, remove it and file a bug with relops.
- Check cat5 cables
  - Inspect the internal and external cat5 cables (use a Fluke).
  - Make sure the cables are securely inserted into the RJ45 jacks.
- Check SD card
  - If power and cat5 show no obvious problems, proceed with replacing the SD card.
    - The replacement SD card should be new and have a fresh preseed image installed.
    - Make sure the SD card is securely inserted.
    - Deliver used SD cards to relops for testing or decommissioning.
If all of the above has been performed and the panda board still shows problems, reopen the tracking bug to have DCOps replace the panda board. The failed panda should be given to relops for further diagnostics or decommissioning.
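
The DC Operations checks can be read as an ordered decision procedure. The following is a minimal sketch of that ordering only, not part of Mozpool or lifeguard. The names `PandaObservations` and `next_action`, all field names, and the 0.5 volt tolerance are hypothetical: the page says only "approx 5 volts", and it does not say what to do when voltage is present or the fuse is intact, so the sketch simply continues with the remaining checks in those cases.

```python
from dataclasses import dataclass
from typing import Optional

EXPECTED_VOLTS = 5.0    # from the checklist: "approx 5 volts"
TOLERANCE_VOLTS = 0.5   # hypothetical; the page gives no tolerance


@dataclass
class PandaObservations:
    """What a technician has observed so far (all field names are illustrative)."""
    green_lights_on: bool
    barrel_plug_volts: Optional[float] = None  # None if not yet measured
    fuse_blown: Optional[bool] = None          # None if not yet checked
    cables_ok: bool = True                     # cat5 inspected and seated in the RJ45 jacks
    sd_card_replaced: bool = False


def next_action(obs: PandaObservations) -> str:
    """Return the next step, following the order of the DC Operations checklist."""
    if not obs.green_lights_on:
        if obs.barrel_plug_volts is None:
            return "measure voltage at the barrel plug"
        # Assumption: a reading outside the tolerance is treated like "no voltage".
        if abs(obs.barrel_plug_volts - EXPECTED_VOLTS) > TOLERANCE_VOLTS:
            if obs.fuse_blown is None:
                return "check the fuse"
            if obs.fuse_blown:
                return "remove the fuse and file a bug with relops"
        # The page does not cover the remaining power cases; continue with the other checks.
    if not obs.cables_ok:
        return "inspect and reseat the cat5 cables"
    if not obs.sd_card_replaced:
        return "replace the SD card with a new, freshly imaged one"
    return "reopen the tracking bug so DCOps replaces the panda board"
```

For example, `next_action(PandaObservations(green_lights_on=False))` asks for a voltage measurement first, while a board that still fails with lights on, good cables, and a replaced SD card reaches the final board-replacement step.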