Mobile/Testing/08 14 13

From MozillaWiki

Previous Action Items

  • address roundtable item from this week in next week's meeting

Status reports

Dev team

Rel Eng

  • kmoir bug 881293 Run xpcshell tests on Android 4.0 try only, hidden - merged into production today, reallocating b2g pandas to android 4.0, testing android talos patches in staging

IT

  • PO for X86 infrastructure has been submitted to the vendor; they estimate an end-of-August delivery. Melissa is following up for actual dates.
  • Starting planning discussions for movement of infrastructure out of SCL1 and into SCL3. SCL1 lease is up on 7/31/2014.

A Team

  • tegra failure rate: [0.76%]
  • panda failure rate: [26.99%]
    • excluding retries: 3.17%
    • over half of our retries occur while running robocop-2
  • Mobile Bugs in Orange Factor Top 10
    • bug 690176 Intermittent Android "talosError: 'initialization timed out'"
    • bug 807230 Intermittent DMError: Automation Error: Timeout in command {ls,ps,isdir,mkdr}
  • Work in progress on bug 858622 to run jit-tests on mobile
  • Landed patch for bug 900542 to fix up panda manifests
    • We are getting many failures due to test suite running too long (need to break up into more chunks)
    • Some new non-intermittent failures, some are bugs, some need tweaks to range for fuzzy-if

x86 automation

  • armenzg hopes to have x86 unit tests running on ash any day now (one emulator per machine, some tests failing, etc.)
  • still investigating a few xpcshell and robocop test failures
  • recently noticed crashes during robocop tests

Autophone

Eideticker

(wlach not available for meeting so read-only)

  • LG-G2X still not posting results to dashboard. [ctalbert] and [wlach] tried to figure out what's up with LG-G2X, and still have no clear ideas. Tracking future work in bug 904727
  • Galaxy Nexus is faring better, but still having occasional hangs when profiler is enabled. Tracking in bug 903011

Round Table

  • It appears that the current version of sutagent (1.18) in the tree does not work reliably in production. Since our last aborted attempt to upgrade (see bug 885155), some code was added which should help diagnose the problems we thought we were having with finding the testroot (see bug 885365), but we have no evidence that the actual issue was fixed. What are the next steps here?
    • [gbrown] See bug 894454 -- it *might* help.
    • [bc] bug 879489 moved the creation of /data/local/tmp/tests to after the check if the external storage is available. There may be a timing issue which was papered over by the creation of /data/local/tmp/tests prior to the check for external storage. We can move it back and see if this helps.
  • [dminor] Any theories as to why we have so many retries while running robocop-2 on pandas?
  • [blassey] update b2g emulator to take Benoit's patch (bug 905141)?
  • [blassey] test suite visibility/enabled in a way that will migrate?
  • [blassey] heads up for bug 905217 - Expand the canvas2D mochitests to test more than just one canvas size

Action Items

  • kmoir to run the 1.19 sutagent in staging - point her to the zip file of it
  • panda issue - probably better to fix by breaking robocop into more chunks so that they can run faster, going to need to look at that.
    • gbrown will file a bug on splitting robocop out
    • gbrown, dividehex, ctalbert, and blassey will discuss experimenting with updating the android image the pandas use so that we mitigate some of these issues. (This might be something for a Q4 bucket list)
  • Ctalbert to help blassey & co update b2g emulator to turn on hw accel and not have the reftests fail. bug 905141
  • dminor and ctalbert to look into bug 690176 as to why it suddenly spiked (possibly mozharness related?)
  • gbrown to file bugs for crashes on x86 robocop in libgl. Maybe skia related?
  • Wlach is going to look into the issues with the watcher on the g2x. Interestingly, Patrick McManus has a blog post about android power saving mode disabling the wifi radio - see if his change to prevent this might have adversely affected this phone, by comparing the time of this change with the increase in dropped network issues with the phone (verifiable through our nagios alerts).
    • Looks like the watcher is not being restarted by android such that it runs through its normal startup sequence, going to investigate and fix.