Mobile/Testing/07 25 12


Previous Action Items

  • Jake to get an estimate for new tegras to be online - Response: https://bugzilla.mozilla.org/show_bug.cgi?id=767658#c4
  • jmaher to figure out if turning off tests on m-c improved failure rate
  • armen to send blassey a list of things on releng's plate
    • blassey to put that list in priority order (may take more than a week)
  • ateam and releng to figure out the reboots

Status reports

Dev team

  • Bug 775227 OutOfMemory or "out of memory" during mochitest

Rel Eng

Linux Foopy

  • WIP

(status from last week)

  • Linux Foopy --
    • Created a new tegra-host-utils for Linux, based on a very recent m-c (cf. bug 742597 comment 32).
    • Working out load requirements with RelOps (disk I/O looks like our largest concern, and it comes in spikes rather than continuously; evaluating hardware vs. VMs).
    • In staging these can currently connect to buildbot and take a job, but xpcshell fails to run for unit tests and Talos tests also fail to run properly.
    • Will be working more on this today/this week to try to work out the remaining issues.

Tegra

  • reboot logging landed
  • reboot logging code backed out
    • it was taking the pool down - this morning we were down to 49 running tegras
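
For context on what "reboot logging" covers, here is a minimal sketch, not the backed-out patch itself, of recording a timestamped entry each time a tegra reboot is requested; the log path and record fields are assumptions:

    # Illustrative only: append a timestamped record describing why a
    # device was rebooted, so reboot frequency and causes can be reviewed.
    import json
    import time

    LOG_PATH = "/tmp/tegra_reboots.log"   # assumed location, not the real one

    def log_reboot(device_name, reason):
        """Append one JSON line per reboot request to LOG_PATH."""
        record = {
            "device": device_name,
            "reason": reason,
            "timestamp": time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime()),
        }
        with open(LOG_PATH, "a") as log_file:
            log_file.write(json.dumps(record) + "\n")

    if __name__ == "__main__":
        log_reboot("tegra-042", "verify failed twice in a row")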

Panda

  • Nothing new
  • bug 725544 - (android_4.0_testing) [tracking bug] Android 4.0 testing
    • bug 773517 completed - IT to verify PDU setup (to be done this week)
    • bug 776728 - chassis acceptance testing (ateam on point)
    • bug 769428 - provide usable image that can work (a-team)
    • bug 725845 - run 10 panda boards through buildbot staging systems

Beagle

  • Nothing new
  • bug 767499 - (track-armv6-testing) Beagle Board Arm v6 Support Tracking
    • blocked on working image (a-team)

Other

IT

A Team

  • 28.46% failure rate - an improvement from last week!
    • tests with 50% or greater failure rates: M8, rc, R3, rp, rck, rck2
  • deployed Talos changes today to add the telemetry preference and to actually fail on __FAIL (see the sketch after this list)
  • a few tests were disabled and then turned back on
  • panda update
    • stability seems to be impacted by screen resolution; adjusting it to 1024x768 resolves almost all stability problems
    • reftests are stable (except when using --ignore-window-size)
    • mochitests fail 100% due to running in an iframe
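
To illustrate the "actually fail on __FAIL" change, a small sketch in Python (not the actual Talos code): scan the harness output for the __FAIL marker and exit non-zero instead of reporting success. Only the __FAIL string comes from the notes above; the file handling and output message are assumptions:

    # Illustrative only: make a run fail when the log contains __FAIL,
    # instead of silently reporting success.
    import sys

    def run_failed(log_lines):
        """Return True if any line of harness output contains __FAIL."""
        return any("__FAIL" in line for line in log_lines)

    if __name__ == "__main__":
        with open(sys.argv[1]) as log_file:
            lines = log_file.readlines()
        if run_failed(lines):
            print("unexpected failure: __FAIL marker found in log")
            sys.exit(1)
        print("log clean")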

AutoPhone / S1/S2 Automation

  • Finished a simple smoke test for AutoPhone (see the sketch below). A patch to switch all adb usage to SUT is nearly landed; it should improve reliability.
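
As a rough illustration of the smoke test flow, a sketch written against a stand-in device client (the real AutoPhone code drives SUTAgent via mozdevice; none of the class or method names below are the actual API): install the build, launch Fennec, and confirm the process stays up.

    # Hypothetical sketch of an AutoPhone-style smoke test.  DeviceClient
    # is a stand-in, not the real SUT/mozdevice API.
    import time

    class DeviceClient:
        """Stand-in for a SUT-style device connection."""
        def install_apk(self, apk_path):
            print("installing %s" % apk_path)

        def launch_app(self, app_name):
            print("launching %s" % app_name)

        def process_running(self, app_name):
            return True  # a real client would query the device

    def smoke_test(device, apk_path, app_name, timeout=60):
        """Return True if the build installs, launches, and stays running."""
        device.install_apk(apk_path)
        device.launch_app(app_name)
        deadline = time.time() + timeout
        while time.time() < deadline:
            if device.process_running(app_name):
                return True
            time.sleep(5)
        return False

    if __name__ == "__main__":
        ok = smoke_test(DeviceClient(), "fennec.apk", "org.mozilla.fennec")
        print("smoke test %s" % ("passed" if ok else "failed"))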

Eideticker

  • Progress slowed somewhat this week, as time was spent unwinding/debugging tegra and panda issues
  • Misc. stability / documentation fixes
  • Dashboard actually seems to be running fairly stably, except last night when we ran out of disk space (will fix the dashboard to save less historical image analysis data, as it's clearly not of much value; see the sketch after this list)
  • Found a potential regression in the CNN test via the dashboard, filed as bug 777357
  • (patch to import backdated info still under development)
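
The planned cleanup could look something like this sketch; the directory path and 14-day retention period are assumptions, not Eideticker's actual configuration:

    # Illustrative cleanup: prune image-analysis artifacts older than
    # RETENTION_DAYS so the dashboard stops filling the disk.
    import os
    import time

    DATA_DIR = "/var/www/eideticker/capture-analysis"   # hypothetical path
    RETENTION_DAYS = 14

    def prune_old_files(data_dir, retention_days):
        """Delete files under data_dir last modified more than retention_days ago."""
        cutoff = time.time() - retention_days * 24 * 60 * 60
        removed = 0
        for root, _dirs, files in os.walk(data_dir):
            for name in files:
                path = os.path.join(root, name)
                if os.path.getmtime(path) < cutoff:
                    os.remove(path)
                    removed += 1
        return removed

    if __name__ == "__main__":
        print("removed %d old files" % prune_old_files(DATA_DIR, RETENTION_DAYS))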

Round Table

  • (armenzg/jmaher) we would like to focus on the stability of the current setup and work towards pandas as we get out of the forest
  • The failures are killing reliability, causing changes to back up on try (due to retriggers both on Try and elsewhere), and no one is paying any attention to the failures. What do we do with this?
    • Hide the particular suites with >20-30% failure rates? (see the sketch after this list)
    • Disable more tests to get to a reasonable failure rate?
    • Fix the reboot issue so that we can get to a reasonable failure rate?
  • bug 748488 taras/glandium: Incremental decompression is the next big startup win. It needs releng infrastructure to gather profiles.
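
To make the "hide suites above a threshold" option concrete, a small sketch that computes per-suite failure rates from run counts and flags anything over a cutoff; the counts are invented examples, and only the 20-30% range comes from the discussion above:

    # Illustrative: flag suites whose failure rate exceeds a threshold so
    # they could be hidden.  Counts are invented examples.
    HIDE_THRESHOLD = 0.30   # upper end of the 20-30% range discussed above

    runs = {
        # suite: (failed runs, total runs) -- example numbers only
        "M8":   (31, 50),
        "rck2": (28, 52),
        "rp":   (12, 48),
    }

    def suites_to_hide(run_counts, threshold):
        """Return (suite, rate) pairs at or above threshold, worst first."""
        hide = []
        for suite, (failed, total) in run_counts.items():
            rate = failed / float(total)
            if rate >= threshold:
                hide.append((suite, rate))
        return sorted(hide, key=lambda item: item[1], reverse=True)

    if __name__ == "__main__":
        for suite, rate in suites_to_hide(runs, HIDE_THRESHOLD):
            print("%s: %.0f%% failure rate -> candidate for hiding" % (suite, rate * 100))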

Action Items

  • clint to get list of top failures
  • hal to get estimates for bug 748488