Sheriffing/How To/Hangs

From MozillaWiki

Revision as of 12:10, 14 June 2017

Hang/Timeout Test Failures need special attention

When a test hangs and the log says "application timed out after 330 seconds with no output," we kill the process and, just in case something interesting was happening, put a "crash" stack in the log.

There's almost never anything significant at the top of the stack. Even when we were actually up to something, the (rare) significance is more likely to be buried in some other thread. But most of the time, we're just sitting spinning the event loop, waiting for something to happen that isn't ever going to happen, and the "crash" signature is CrashingThread(void *), or libSystem.B.dylib + 0xd7a, or linux-gate.so + 0x424.

Those signatures mean absolutely nothing beyond what you're already saying in the summary: "application timed out after 330 seconds with no output." But thanks to the power of Treeherder's bug suggestions, if you look at hang bugs, you'll see that when we get in a hurry, we happily star things like "test_HTMLElement58.html | application timed out after 330 seconds with no output" as "Windows mochitest-1,2,3 hangs on Shutdown | application timed out after 330 seconds with no output".
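To make the mis-starring problem concrete, here is a minimal sketch of the check a sheriff does by eye before accepting a suggestion. It is not part of Treeherder; the function names are made up for illustration, and it assumes the failure line has the "test name | message" shape shown above and that a correctly filed hang bug mentions the test name somewhere in its summary.

```python
TIMEOUT_TEXT = "application timed out after 330 seconds with no output"


def failing_test(failure_line):
    """Return the text before the first ' | ' in a failure line, or None.

    Assumption: failure lines look like '<test name> | <message>'.
    """
    name, sep, _rest = failure_line.partition(" | ")
    name = name.strip()
    return name if sep and name else None


def suggestion_matches(failure_line, bug_summary):
    """True only if both are hang timeouts and the bug names the same test.

    Sharing the generic timeout text is not enough: every hang shares it.
    """
    if TIMEOUT_TEXT not in failure_line or TIMEOUT_TEXT not in bug_summary:
        return False
    name = failing_test(failure_line)
    return name is not None and name in bug_summary
```

With the example above, the test_HTMLElement58.html failure does not match the "hangs on Shutdown" bug, because the only thing the two summaries share is the timeout text.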

Please don't put any of those three signatures in bug summaries, please remove them when you see them, and please don't star a new, unfiled hang as some utterly different hang in a different test. The only time we need them in the summary is when we don't have a test name, which should only be the case for Shutdown hangs. There, we just need to gather up the strength not to call an unfiled test timeout a shutdown timeout, even if Treeherder suggests it.
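The cleanup asked for above could be sketched like this. The helper names are hypothetical, and the sketch assumes the signatures appear in the summary as the plain strings quoted earlier and that summary fields are separated by "|". It deliberately does not decide whether a bug is a Shutdown hang with no test name; leave those summaries alone.

```python
# The three signatures named above that carry no information for a hang.
MEANINGLESS_SIGNATURES = (
    "CrashingThread(void *)",
    "libSystem.B.dylib + 0xd7a",
    "linux-gate.so + 0x424",
)


def has_meaningless_signature(summary):
    """True if the bug summary contains any of the three signatures."""
    return any(sig in summary for sig in MEANINGLESS_SIGNATURES)


def strip_signatures(summary):
    """Remove the signatures and drop any '|' fields left empty."""
    for sig in MEANINGLESS_SIGNATURES:
        summary = summary.replace(sig, "")
    fields = [field.strip() for field in summary.split("|")]
    return " | ".join(field for field in fields if field)
```

For example, a summary ending in "| CrashingThread(void *)" comes back with just the test name and the timeout message.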