Auto-tools/Goals/2012Q2: Difference between revisions

Revision as of 16:00, 11 June 2012

Official Q2 Goals

  • [ON TRACK] Extend Mobile Platform Automation for B2G and Fennec Native to extend our automation systems to work with specific phone hardware and new development boards for both products.
  • [MISSED] Deploy Datazilla new graph server UI into production to make it easier to track all our performance numbers across the growing sets of performance and endurance data we collect on our products.
  • [ON TRACK] Signal From Noise Phase II - Make the same noise-reduction changes we made on Tp5 on all the other page-load tests and ensure all performance tests are sending raw observations to Datazilla.
  • [AT RISK] Enhance Bugzilla Performance by upgrading hardware and software dependencies as well as fixing a major performance bug in BMO (a bug related to how we, as Mozilla, use Bugzilla).
  • [ON TRACK] Reduce android test automation instability and make it easier for the web QA and desktop QA teams to write and run automated tests.

Official Q2 Goals

These projects must be completed to achieve the above goals.

GOAL: Extend Mobile Platform Automation for B2G and Fennec Native

  • Q2 Outcomes
    • [ON TRACK] Create build-and-flash automation for b2g builds on a specific hardware platform (likely nexus s)
      • deferred
    • [ON TRACK] Run Mochitest and reftest on b2g hardware machines
      • reftest at risk, mochi ok
    • [ON TRACK] Write tutorial on how to write and run marionette tests
      • Need to update docs to reflect new test style introduced by philikon
    • [DONE] Write 3 simple performance tests based on Marionette for b2g
      • Need to post to datazilla
    • [ON TRACK] Add Fennec Native support for marionette (stretch)
      • deferred
    • [ON TRACK] Create power-profile tests (For fennec native and b2g)
      • deferred?
    • [AT RISK] Allow Panda ES Boards to be used as automation platforms (for fennec native)
    • [DONE] Increase stability of Noah's Ark (automation on mobile phones for fennec native)
    • [ON TRACK] Add power test analysis to Noah's Ark
      • deferred
    • [ON TRACK] Architect a system for B2G crash reporting
      • Follow up with ted
  • Stakeholders
    • B2G Team
    • Fennec Native Developers
  • Depends On
    • IT for obtaining and deploying phones into haxxor
    • B2G team to define hardware automation platform
    • Datazilla (to capture data from Noah's Ark/power tests)
    • "Taming Panda" project
    • Noah's Ark

GOAL: Deploy Datazilla new graph server UI into production

  • Q2 Outcomes
    • [ON TRACK] Provide generic interfaces/web app pluggability for new harnesses to reuse the same infrastructure backend
    • [AT RISK] Provide Compare-Talos tools to drill into talos regressions (from OS/Changeset to individual page contributions to overall Talos metric)
    • [MISSED] Ensure that new UI is based on extensible statistics package that can be used both by developers and the graphserver UI.
      • deferred in favor of new goal to detect regression per push
  • Stakeholders
    • Every automation project producing performance related data
    • Every developer at Mozilla - particularly Firefox/Platform developers consuming Talos data
  • Depends On
    • IT for deployment of VMs/pushing to production via puppet
    • Infrasec for sec review
    • Firefox/Platform dev focus group for early UI review and feedback
    • Signal From Noise Phase II
    • Support from Metrics to ensure visualizations in UI are accurate

GOAL: Signal From Noise Phase II

  • Q2 Outcomes
    • [ON TRACK] Perform experiments to extend row-major methods to other page-load style tests
    • [ON TRACK] Implement changes to pave the way for mozharness on mobile and desktop
    • [ON TRACK] Create tools to monitor noise in talos numbers so that we know when talos numbers become unacceptably noisy again
    • [ON TRACK] Ensure that all raw data values from all talos tests flow into Datazilla database backend
  • Stakeholders
    • Datazilla Project
    • Firefox/Platform developers that depend on talos data
  • Depends On
    • Support from Metrics to analyze the data of the experiments to extend row-major run order
    • Support from Releng to deploy changes to talos automation infrastructure

GOAL: Enhance Bugzilla Performance

  • Q2 Outcomes
    • [ON TRACK] Refactor how rapid-release tracking flags are implemented for improved performance and maintainability
    • [AT RISK] Upgrade bugzilla.mozilla.org to Bugzilla version 4.2
    • [DONE] Pulse/Push API Completion -- stretch goal
      • code complete - but not yet deployed (still testing)
  • Stakeholders
    • Entire Mozilla project
  • Depends On
    • IT for software upgrades/testing on bugzilla system
    • IT for move to SCL3 colo
    • IT for 4.2 test site deployment

GOAL: Reduce test instability and make it easier to write automated tests

  • Q2 Outcomes
    • [DONE] Fix top 3 Android infrastructure related oranges.
      • gone from 33% failure rate to 10% failure rate
    • [AT RISK] Create an on-demand VM system for selenium grid (increases stability of web QA automation)
    • [ON TRACK] Complete refactor of Mozmill API/Automation wrappers (increase stability/ease of use of Desktop QA automation)
  • Stakeholders
    • Fennec Native Developers
    • Web QA Team
    • Desktop QA Team
  • Depends On
    • Releng aid to deploy changes to Android toolchains
    • IT to provision on-demand VMs (maybe - we may be able to create VMs ourselves, since we own these ESXi hosts)
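
The Android orange outcome above reports a drop from a 33% failure rate to a 10% failure rate. As a rough worked example of that arithmetic (the helper name `relative_reduction` is hypothetical and is not part of any team tool), the change can be expressed both as an absolute drop in percentage points and as a relative reduction:

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Failure rates as reported in the outcome above.
before, after = 0.33, 0.10

print(f"absolute drop: {(before - after) * 100:.0f} percentage points")
print(f"relative drop: {relative_reduction(before, after):.0%}")
```

Under these figures the absolute drop is 23 percentage points, which works out to a relative reduction of about 70%.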

P1 Projects

These are projects that we desperately need to finish in Q2 because, like the high-level goals, they open doors for us, but they are smaller in scope, so they did not attain "official goals" status. They are listed from highest to lowest priority.

Stone Ridge

  • Aid the Necko team to deploy their network test system and run on-change builds through it, allowing it to report to either a templeton dashboard or the Datazilla database (depending on whether Datazilla's generic interface is online in time)
  • Q2 Outcomes
    • [ON TRACK] Wrap Necko Tests in mozbase script so that they can be easily automated - may not be needed if necko tests are run via xpcshell or some variant of existing harness
    • [ON TRACK] Create Pulse listener to download builds for testing
    • [ON TRACK] Create JSON upload to templeton/datazilla
    • [ON TRACK] Create dashboard for analysis of results (either as standalone templeton or as Datazilla plugin view)
  • Stakeholders
    • Necko Team
  • Depends On
    • Datazilla

Pulse Enhancements

  • Improve stability, performance, and security of Pulse system.
  • Q2 Outcomes
    • [DONE] Move to new Pulse hardware in PHX
    • [ON TRACK] Improve durable Queue system
      • added monitoring to handle it but the true fix is in the new library which is not done
    • [SKIPPED] Create new library for Mozilla pulse not dependent on carrot
      • deferred
  • Stakeholders
    • All Pulse based Automation systems
  • Depends On
    • IT Support for new hardware move

Taming Panda Boards

  • Ensure that panda boards are a stable, viable automation solution for Fennec Native
  • Q2 Outcomes
    • [AT RISK] Run mochitest from end to end
    • [AT RISK] Resolve MAC address issue
    • [AT RISK] Resolve reboot issues
  • Stakeholders
    • Fennec Native Developers
    • Releng
    • IT

NSS Automation

  • Aid the NSS team so that their tests can be automated in our existing automation systems
  • Q2 Outcomes
    • [ON TRACK] Define and provide tools to ensure an acceptable workflow as we transition to more modern SCM system
    • [ON TRACK] Work with releng and NSS developers to tailor short-running tests for inclusion in buildbot automation
    • [ON TRACK] Create pulse-based system for occasional execution of long-running NSS tests (outside of buildbot automation)
    • [ON TRACK] Deploy NSS short-running tests to full automation (stretch)
      • If we can define the system to run the short-running tests in buildbot that is enough. It is pushing the envelope to code these tests as well as deploy them into buildbot in one quarter. Nonetheless, that is our stretch goal.
  • Stakeholders
    • NSS Team
    • Security Team
  • Depends On
    • Releng availability for deployment of tests into buildbot automation
    • Defining workflows to support

Noah's Ark

  • Provide a stable automation system for "on-phone" automation across hardware types for Fennec Native
  • Q2 Outcomes
    • [DONE] Identify instabilities and fix them
    • [DONE] Achieve 90% uptime
  • Stakeholders
    • B2G automation
    • Fennec Native Developers

Mozharness Support

  • Provide the building blocks required for Mozharness support on android and desktop environments to simplify deployments and reduce the occurrence of infrastructure oranges.
  • Q2 Outcomes
    • [MISSED] Create a device checkout system for foopy-less management of systems
      • was a nice to have, deferred
    • [DONE] Aid with cultivating best of breed SUT management tools (in coordination with releng)
    • [ON TRACK] Fix dependencies so that Mozbase tools (like talos) can be easily integrated with Mozharness
      • have done work - is it done? FOLLOW UP
    • [ON TRACK] Deploy simple pypi server inside our infrastructure so that slaves can easily perform dependency management at runtime (versus at slave-image time via puppet)
      • will have a solution by end of quarter
  • Stakeholders
    • Releng - we will work closely with releng to aid them in achieving this goal

Mobile Blockers Dashboard

  • Provide a simple high-level dashboard indicating the current status of Fennec, sourced from Bugzilla.
  • Q2 Outcomes
    • [DONE] Public web pages, updated daily, indicating total number of open and closed Fennec blockers by day, with links to Bugzilla.
    • [DONE] Public web pages, updated daily, indicating total number of closed blockers and nonblockers for each person involved in Fennec, split by team, with links to Bugzilla.
      • [SKIPPED] Include number of comments-on-blocker-bugs per person. This is not easily obtained via Bugzilla API, so this is a stretch goal.
  • Stakeholders
    • Engineering management, particularly damons.
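
The per-day open and closed blocker totals described above can be sketched as a simple tally. This is only an illustration under stated assumptions: the record shape (a dict with `opened` and `closed` dates), the function name `blockers_by_day`, and the sample bugs are all hypothetical, and fetching the data from Bugzilla is left out entirely.

```python
from datetime import date, timedelta


def blockers_by_day(bugs, start, end):
    """Count open and closed blockers for each day from start to end.

    Each bug is a hypothetical dict: {"opened": date, "closed": date or None}.
    """
    counts = {}
    day = start
    while day <= end:
        open_count = 0
        closed_count = 0
        for bug in bugs:
            if bug["opened"] > day:
                continue  # not filed yet on this day
            if bug["closed"] is not None and bug["closed"] <= day:
                closed_count += 1
            else:
                open_count += 1
        counts[day] = {"open": open_count, "closed": closed_count}
        day += timedelta(days=1)
    return counts


# Made-up sample records, for illustration only.
sample_bugs = [
    {"opened": date(2012, 6, 1), "closed": date(2012, 6, 3)},
    {"opened": date(2012, 6, 2), "closed": None},
]

for day, tally in blockers_by_day(sample_bugs, date(2012, 6, 1), date(2012, 6, 3)).items():
    print(day.isoformat(), tally)
```

A daily job could render these totals into the public pages, with each count linking to the matching Bugzilla query.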

Supporting Projects

Many of our Goals and P1 Projects depend on several building blocks. These are important, but if we find ourselves needing to prioritize our time, we should prioritize time on these such that they serve the goals above, and push any further advancements on these projects to future quarters. The projects here are listed in no particular order.

Bughunter

  • Teach QA and the Crash Kill team to effectively use Bughunter to diagnose and discover reproducible crashes.
  • Q2 Outcomes
    • [DONE] Create VMs for people to use the system safely
    • [DONE] Evangelize the use of the system
    • [SKIPPED] Attend crash-kill work week to find out how to best support that team with BugHunter
      • did not get invited soon enough to book travel
  • Stakeholders
    • Crash kill/Project Mgmt
    • QA

MozTrap - Manual Test Case Management System

  • Teach and help migrate QA teams to the new test case management system
  • Q2 Outcomes
    • [ON TRACK] Automation API integration so automated tests can track their actions in the system
    • [DONE] Integrate with browserID for user creation
    • [DONE] Complete security review and roll out to production
    • [DONE] Aid QA in migrating to system and evangelize use (create tutorials etc)
  • Stakeholders
    • QA

A11y Automation

  • Mentor the A11y developers as they use our tools to fit the Speclinium accessibility testing framework into the Mozilla automation infrastructure.
  • Q2 Outcomes
    • [DONE] Aid the A11y team with using mozbase as a basis for their automation
    • [DONE] Aid the A11y team with using marionette
    • [SKIPPED] Create VMs for running the automation using pulse
    • [SKIPPED] Tailor automation to be buildbot-ready by end of quarter
    • [SKIPPED] Deploy into buildbot automation (stretch)
      • burns has been helping, but their automation won't be ready to deploy by end of quarter.
  • Stakeholders
    • A11y developers
    • A*team (the A11y developers will find bugs in our software we'll need to fix)

JetPerf Deployments

  • Complete deployment of Addon SDK project JetPerf. (Talos performance metrics for AddonSDK developed addons)
  • Q2 Outcomes
    • [DONE] Implement enough of Mozharness infrastructure to deploy jetperf on desktop talos
    • [ON TRACK] Deploy jetperf talos into buildbot automation
  • Stakeholders
    • Addon SDK (jetpack) team
  • Depends On
    • Releng to deploy into buildbot automation

Eideticker

  • Use the Eideticker automation to track our progress on rendering and checkerboarding performance, particularly compared to our competition.
  • Q2 Outcomes
    • [DONE] Get chrome checkerboarding measurements working with galaxy nexus
    • [DONE] Automate checkerboarding analysis
  • Stakeholders
    • Fennec Native Dev Team
    • Fennec Marketing team

WOO

  • Add the orange seed feature to help developers discover when a test first went intermittent. Also, move to a modern staging/production system.
  • Q2 Outcomes
    • [ON TRACK] Implement Orange Seed feature
    • [MISSED] Deploy to production using staging/production VMs
      • Pending transfer of ES control to IT so that a development ES database is available for OF to use. Need IT to decide whether to upgrade capacity on IT's dev ES cluster or create a dev ES cluster on the metrics ES instance.
      • ES also needs to be upgraded (IT needed)
      • Also needs to be re-indexed (IT needed)
  • Stakeholders
    • Platform developers trying to Juice Oranges

Speedtests

  • Enhance the speedtests with the Kraken JS test as well as mobile measurements.
  • Q2 Outcomes
    • [DONE] Add Kraken to the test matrix
    • [DONE] (new) Add V8 to the test matrix
    • [SKIPPED] Add mobile support to the tests so that we run the canvas demo on phones
      • Skipped in favor of getting autophone more reliable to be used as the basic platform here.
  • Stakeholders
    • JS team
    • Fennec Native team

Peptest & Telemetry

  • Wire peptest and telemetry together so that we can use telemetry data to annotate what happened during the unresponsive moments that peptest detected.
  • Q2 Outcomes
    • [DROPPED] Integrate Peptest with telemetry probes to better measure responsiveness
      • Dropped in favor of other higher priority work for mobile
    • [ON TRACK] Aid developers with writing peptest patches
      • Dependent on datazilla
    • [ON TRACK] Complete peptest-talos style reporting system
    • [ON TRACK] Integrate Peptest reporting into Datazilla
  • Stakeholders
    • Platform/Firefox Developers
    • Snappy team
  • Depends On
    • Datazilla project

Mozhttpd

  • Investigate whether android test stability and mochitest turn around time can be improved by replacing httpd.js with a python webserver.
  • Q2 Outcomes
    • [DONE] Investigate using Mozhttpd for mochitest webserver to see if turnaround time can be decreased
      • Have a POC working for some directories. Still investigating.
  • Stakeholders
    • Releng (frees up slave time if this works)
    • Developers (improves end-to-end result time if it works)
    • Fennec Developers (significantly simplifies running Fennec tests by hand)

W3C Test Mirroring for CSS WG

  • Provide a semi-automated mechanism to help the Layout team submit reftests to the CSS working group as well as enable them to easily incorporate the CSS working group's tests in our on-change testing.
  • Q2 Outcomes
    • [AT RISK] Complete code for mirroring solution
    • [AT RISK] Deploy to VM for automation
  • Stakeholders
    • Layout Team (fantasai is our main customer)

Powerball

  • Analyze game design as a means to improve community engagement across development/qa.
  • Q2 Outcomes
    • [ON TRACK] Create design plan for community building game
  • Stakeholders
    • Ourselves, at the moment

Addons Automation

  • We have two separate systems for automated addon testing. Neither system solves the entire problem. We need a plan in place to correct this. Intended as preparation for a Q3 goal.
  • Q2 Outcomes
    • [ON TRACK] Drive consensus around a comprehensive architecture for addon test automation
  • Stakeholders
    • AMO team
    • AMO developers
    • Platform developers
    • Releng

BuildFaster

  • The BuildFaster dashboard is currently down. It's an important tool for tracking whether our end-to-end build/test times are regressing, so we should try to get it back up.
  • Q2 Outcomes
    • [DONE] The GoFaster dashboard (in particular the end-to-end times and buildcharts view) should be publicly accessible again.
  • Stakeholders
    • Releng
    • Developers

Community Involvement Goals

  • [DONE] Establish best practices to become the best community integrated development team at Mozilla
  • [ON TRACK] Blog about those practices (as we prove them)
  • [DONE] Promote two community folks to Mentor status.
  • [ON TRACK] Set individual blogging targets and meet them.