Buildbot/OutageReports/20070911-01

From MozillaWiki

2007-09-11

On 2007-09-11 at 15:42 PDT, try1-win32-slave experienced a service outage for 3 hours.

Tracked in bug 394841.

What was affected:

Windows try builds on trunk.

What was the cause of the outage:

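The build stopped producing output and the slave was unable to kill it. The last log output before the timeout was:

Building deps for /cygdrive/d/buildbot/sendchange-slave/sendchange-win32/mozilla/security/manager/ssl/src/nsSSLSocketProvider.cpp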

command timed out: 1200 seconds without output
SIGKILL failed to kill process
using fake rc=-1
program finished with exit code -1

remoteFailed: [Failure instance: Traceback from remote host -- Traceback (most recent call last):
Failure: buildbot.slave.commands.TimeoutError: SIGKILL failed to kill process
]

Subsequent build attempt failed with:

cvs [checkout aborted]: could not chdir to mozilla/build/autoconf: Permission denied


Has this type of outage happened before?

Yes.

What was done to repair:

The shell was killed and reopened.

What will be done to prevent this in the future:

  • Upgrade buildbot to trunk code?
  • Use KillableProcess.py?
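
Both options target the failure in the log above, where the timed-out build could not be killed. As an illustration only, here is a minimal sketch of one common way to make a timed-out command reliably killable: start it in its own process group and kill the whole group, so that children spawned by the build die along with it. This is not buildbot's code and not KillableProcess.py; the function name run_with_timeout is hypothetical. It relies on POSIX process groups, so it would not run as-is on the win32 slave, where something like a job object or `taskkill /T /F` would be needed instead. It also enforces a limit on total runtime, not buildbot's "seconds without output" limit.

```python
import os
import signal
import subprocess


def run_with_timeout(argv, timeout_s):
    """Run argv; if it exceeds timeout_s, SIGKILL its whole process group.

    Hypothetical helper for illustration. Returns (exit_code, timed_out).
    """
    # start_new_session=True makes the child the leader of a new session
    # and process group (POSIX only), so its pid is also the group id.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        try:
            # Kill the child and every descendant still in its group.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            # The group exited between the timeout and the kill.
            pass
        # Reap the child so it does not linger as a zombie.
        return proc.wait(), True
```

A caller would treat timed_out as a build failure. On POSIX, a process ended by a signal reports a negative exit code equal to the signal number.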