ReleaseEngineering/Mozpool/Handling Panda Failures/Long-Term Process

Revision as of 20:13, 30 November 2012 by Djmitche


The first level of diagnosis for a panda is the failed state that lifeguard has chosen for it. The next level is usually to look at the logs in lifeguard.

...

Check the logs for any entries from the board, which are tagged "syslog". If you see any, then the board is booting and has network. Otherwise, you'll need to investigate, starting at the beginning:

  • does the board have power? (blinkenlights)
  • does the board have link? (lights on the switch)
  • does the board have an sdcard?
  • does the board become pingable even briefly if you force a power-cycle from the BMM UI?
    • if so, then power, link, and sdcard are all working at least a little bit - try a new sdcard

After all of that, if you haven't found the problem, then it's time for some serial diagnostics.
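
The checklist above is essentially a short decision procedure, and writing it out can make the ordering easier to follow. The following is only a sketch: the BoardObservations type, its field names, and the returned labels are hypothetical names made up for illustration and are not part of Mozpool or lifeguard. It assumes a person has already made each observation by hand.

```python
from dataclasses import dataclass


@dataclass
class BoardObservations:
    """Manual observations about one panda (hypothetical structure)."""
    syslog_in_logs: bool              # any "syslog"-tagged entries in the lifeguard logs?
    has_power: bool                   # blinkenlights on the board?
    has_link: bool                    # link lights on the switch?
    has_sdcard: bool                  # is an sdcard inserted?
    pingable_after_power_cycle: bool  # pingable, even briefly, after a forced power-cycle?


def next_step(obs: BoardObservations) -> str:
    """Return an illustrative label for the next thing to look at."""
    if obs.syslog_in_logs:
        # The board is booting and has network, so the basic checks are unnecessary.
        return "booting-with-network"
    if not obs.has_power:
        return "check-power"
    if not obs.has_link:
        return "check-link"
    if not obs.has_sdcard:
        return "check-sdcard"
    if obs.pingable_after_power_cycle:
        # Power, link, and sdcard all work at least a little bit.
        return "try-new-sdcard"
    # Nothing above explained the failure.
    return "serial-diagnostics"


if __name__ == "__main__":
    obs = BoardObservations(
        syslog_in_logs=False,
        has_power=True,
        has_link=True,
        has_sdcard=True,
        pingable_after_power_cycle=True,
    )
    print(next_step(obs))
```

The checks run in the same order as the list, so the first failing observation decides the outcome; serial diagnostics is reached only when every earlier check passes and the board never becomes pingable.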

... failed_*_downloading

Most of the time, this will be either a corrupt or dead sdcard. Try swapping in another card.

Let's also try re-writing the u-boot image to the card, in case it was corrupted, and marking the sdcard somehow so we can tell which cards this was tried on. If it turns out that the u-boot image sometimes gets corrupted and re-writing it fixes the problem, then we can avoid trashing a lot of good sdcards. If it never helps, delete this paragraph.