Auto-tools/Meetings/2015-03-09

Notices, Highlights, Roundtable

Newsgroup and Blog Posts

Goal Updates

Note: Items belonging to Supporting Tasks and Backlog are not part of Q1 goals and may not necessarily be completed this quarter.

Also see our Trello board

Marionette

[DONE] Implement the set of features needed to support conversion of all prioritized mozmill tests to Marionette [chmanchester, AutomatedTester]

  • details: implement modal dialog support, separation of Marionette client into a separate package, release of older version of Marionette client to support update tests, modification of harness to dynamically load appropriate version of Marionette client
  • bugs: bug 906712, bug 1109183, bug 1107336
  • progress since last update: All Done!

Support the conversion of targeted P1 mozmill tests to Marionette and get them running in CI [hskupin]

  • details: Perform the CI work necessary to get mozmill tests that have been converted to Marionette running for update tests and security tests. This entails writing new Marionette test libraries for these features and getting the converted tests running on the existing mozmill CI systems.
  • stretch goal: get test results reported to Treeherder
  • stretch goal: support the conversion of Search tests
  • progress since last update:

Resolve P1 bugs blocking the release of Marionette 1.0 [AutomatedTester, ato, jgraham]

Supporting Tasks

  • train QA on writing Marionette Greenlight Tests

MozReview and Autoland (joint with RelEng)

Add support for autolanding from MozReview to try [mcote, dminor, mdoglio]

  • details: Allow developers to trigger try runs from directly within MozReview
  • bugs: bug 1109218, bug 1121616
  • progress since last update:
    • Reworked Autoland docker images to work inside version-control-tools test framework, to allow testing of autoland UI elements without an external server
    • Added mach commands and additional tests for autoland

[DONE] Better Bugzilla integration with MozReview [mcote]

  • details: Make MozReview data in Bugzilla more useful by creating a Bugzilla field that contains dynamic information about a MozReview review request
  • bug: bug 1102428
  • progress since last update:
    • Back end deployed to MozReview; CORS enabled.
    • Bugzilla extension reviewed, committed, and will be deployed the night of March 9.

Supporting Tasks

  • continue improvements to the multi-commit UI

Perfherder

Ingest all Talos data with Treeherder and develop a UI that can be used to view current and historical data [wlach]

  • details: We want to use Treeherder to store and display Talos performance data, because Datazilla is being deprecated and Graphserver doesn't support the kind of performance analysis we'd like to perform on the data.
  • progress since last update:
    • Work on summary series with geometric mean landed last week (bug 1108832); this is an important feature for Graphserver parity. Unfortunately it needed to be backed out, but I'm hoping to fix the problems and land a new version this week.
    • Various work on the GUI, including a neat feature to highlight a revision by mishravikas (not yet landed, but close)
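
As a rough illustration of what a geometric-mean summary series involves, here is a minimal sketch. It is not the Treeherder implementation from bug 1108832; the function names and the input shape (a mapping from revision to per-subtest values) are assumptions made for the example.

```python
import math


def geometric_mean(values):
    """Return the geometric mean of a non-empty sequence of positive numbers."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    # Average the logarithms, then exponentiate. This avoids overflow from
    # multiplying many values together.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summary_series(subtests_by_revision):
    """Collapse per-subtest results into one summary point per revision.

    subtests_by_revision: hypothetical mapping {revision: {subtest: value}},
    iterated in insertion order.
    """
    return [
        (revision, geometric_mean(subtests.values()))
        for revision, subtests in subtests_by_revision.items()
    ]
```

The geometric mean is a common choice for this kind of summary because a proportional change in any one subtest moves the summary by the same factor, regardless of that subtest's absolute magnitude.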

Treeherder

Distinguish between Tier 1 and Tier 2 jobs [mdoglio]

  • details: Tier 1 and Tier 2 jobs will have different sheriffing guidelines and different expectations. Accordingly, we need to display them and allow users to interact with them differently.
  • bug: bug 1113322
  • progress since last update:

Develop back end and API for a structured log viewer [camd]

  • details: Develop a back end that ingests structured logs from test harnesses, creates a structured summary, and makes them available via a REST API.
  • bug: bug 1113873
  • progress since last update:
    • [mcote] Scoped goal down to just the back end in order to focus more on general Treeherder bugs and performance issues.
    • Pull Request in review for merge.
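
To make the "structured summary" idea concrete, here is a minimal sketch of summarizing a structured log. It assumes the input is mozlog-style JSON lines in which each finished test emits a record with "action": "test_end", a "status", and an "expected" field only when the result was unexpected. The summary shape and the function name are hypothetical and are not taken from the pull request.

```python
import json
from collections import Counter


def summarize_structured_log(lines):
    """Build a small summary dict from an iterable of JSON log lines."""
    counts = Counter()
    unexpected = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Only test-level results are summarized in this sketch.
        if record.get("action") != "test_end":
            continue
        status = record.get("status")
        counts[status] += 1
        # Assumption: "expected" is present only for unexpected results.
        if "expected" in record and record["expected"] != status:
            unexpected.append(
                {
                    "test": record.get("test"),
                    "status": status,
                    "expected": record["expected"],
                }
            )
    return {"counts": dict(counts), "unexpected": unexpected}
```

A REST endpoint could then return this dict serialized as JSON for each job.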

[DEFER] Develop a minimal UI that sheriffs can use to file new intermittents [edmorley]

  • details: develop a usable but minimal UI that sheriffs can use to quickly file new intermittent issues; the UI should automatically fill out some of the details that sheriffs normally have to find manually. Future iterations can improve on this by auto-filling more fields.
  • bug: bug 1117583
  • deferred to allow edmorley to focus on Treeherder operational issues and TBPL EOL

Supporting Tasks

  • Continue to improve the performance and operational aspects of the system
  • Identify and resolve remaining issues blocking TBPL EOL (Treeherder parts: bug 1059400; other parts: bug 1054977)

Backlog

  • Change the authentication model to separate credentials from repos
  • Implement OrangeFactor within Treeherder
  • Create a developer-centric view

Bugzilla

Implement an alternate bug view [glob]

  • details: Implement an alternative view of bugs, to provide UX and responsiveness improvements, and as a foundation for task-/team-centric views.
  • bug: bug 1068655
  • progress since last update:
    • ready for review!
    • well... actually waiting for a review on an upstream backport before I can finalise my patch (bug 1139749)

Implement versioning framework for REST API [dkl]

  • details: Version the REST API to provide stable endpoints for users and a place for unstable development.
  • bug: bug 1051056
  • progress since last update:
    • First patch under review

GitHub authentication [dylan]

  • details: Allow users to authenticate with Bugzilla using their GitHub account. This will encourage more contributors and allow us to better integrate GitHub into the Mozilla workflow.
  • bug: bug 1118365
  • progress since last update:
    • First patch under review. It remembers the original page (the one that triggered the login) better than Persona does.

DevTools Harness

[AT RISK] Get the DevTools harness running in continuous integration [ted]

  • details: Take the prototype that was developed in 2014 Q4 (https://github.com/luser/luciddream) and get it running in continuous integration and visible in Treeherder. It's TBD whether this will run in Buildbot or TaskCluster, but we should get it running somewhere per-commit this quarter against Linux desktop Firefox and a B2G emulator.
  • progress since last update:

CloudServices Automation

Supporting Tasks

  • Take existing client/server automation that is already running and get it reporting to Treeherder using the Tier 2 UI/workflow; bug 1108259

Test Infrastructure

Define and document Tier-2 and non-Buildbot jobs [bc]

  • details: Define and document all aspects of Tier-2 and non-Buildbot jobs: rationale, criteria, necessary enhancements to Treeherder and automation frameworks.
  • bug: bug 1121655
  • progress since last update:
    • Working on bug 1133580 (Autophone: support job retriggers and cancel notifications from Treeherder) to finalize the details required to support Treeherder reporting for non-Buildbot jobs. Patches for Autophone should be finalized early this week.
    • After bug 1133580 is completed, I will be devoting my time to writing documentation.

Prototype a retrigger-based bisection tool [armenzg, jmaher]

  • details: Create a prototype of a command-line tool that can be used by sheriffs and others to automate retrigger-based bisection. This could be used to help bisect new intermittent oranges, and to backfill jobs that have been skipped due to coalescing. Integration with Treeherder or other service will be done later.
  • progress since last update:
    • Roadmap/milestones
    • Use cases
    • Backfilling blog post
    • Big thanks to adusca and vaibhav1994 for their endless contributions.
    • We can now *backfill*: given a known bad job, we can trigger every job that is missing between it and the last known good job.
      • Sheriffs to try it once the need arises
    • Basic ability to find intermittent oranges is also available (generate_cli.py); however, polishing is needed
      • In the long term, developers want a "one-button" approach without needing to run anything on their machine
      • They just want to receive an email pointing to the changeset that introduced an intermittent issue
    • All the basic pieces of the project are in place; lots of bugs have been fixed and assumptions have been put to the test
    • The project is now in "beta"
    • The project has a lot of potential. It is now a matter of getting users for it and meeting new needs
    • Work is needed in future quarters to improve the internal parts and add test coverage
    • Lots of feedback and ideas from mconley/ehsan
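
The backfilling step described above can be sketched as follows. This is an illustration only, not the tool's code: the function name, the (revision, status) input shape, and the status values are assumptions, and actually triggering the jobs is left out.

```python
def revisions_to_backfill(pushes, bad_revision):
    """Return revisions where the job never ran, between the last known
    good job and the given bad one.

    pushes: list of (revision, status) tuples, oldest first, where status is
            "good", "bad", or None if the job was skipped (e.g. coalesced).
    """
    revisions = [revision for revision, _ in pushes]
    if bad_revision not in revisions:
        raise ValueError("unknown revision: %s" % bad_revision)
    bad_index = revisions.index(bad_revision)

    missing = []
    # Walk backwards from the bad job until the last known good job.
    for revision, status in reversed(pushes[:bad_index]):
        if status == "good":
            break
        if status is None:
            missing.append(revision)
    missing.reverse()  # report oldest first
    return missing
```

For example, with pushes a (good), b (skipped), c (skipped), d (bad), calling the function with "d" returns b and c, which are the pushes whose jobs would need to be triggered to narrow down the regression.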

Store high-resolution testcase data ("ActiveData") [ekyle, ahal]

  • details: Create a Proof of Concept “big data” project which will store information about every test file we run: test status, error details, test machine and test duration to begin with. We will use this project to develop schemas and queries that work with data this large, and we will use this data to normalize chunk sizes and provide details about which tests never fail.
  • progress since last update:
    • The culprit for ETL slowness turned out to be ElasticSearch bulk loading speed: with ES loading disabled, the process can run on one machine (instead of three). A second machine is still running to perform backfills and to catch up when bugs are discovered.
    • ETL changed to put fully transformed records back to S3. This supports loading Redshift (via the COPY command), and hopefully can be used by an ElasticSearch River plugin for faster loading.
    • Redshift filled with half a billion records. ODBC transfer proved too slow, so the Redshift COPY command was used (great for single-shot loading, but untested for continuous streaming).
    • Queries for chunk test timing written for both ActiveData and Redshift.
  • Next steps:
    • Review Spark: an in-memory ETL solution (on top of Hadoop?).
    • Summarize the benefits and detriments of the three technologies.

Implement the ability to normalize chunk durations in mochitest [ahal]

  • details: For mochitest variants on desktop and B2G, modify manifestparser and the test harnesses to be able to specify which tests are run in specific chunks.
  • stretch goal: Implement the same feature for Android mochitest, which still uses old-style JSON manifests.
  • bug: bug 1124182
  • progress since last update:
    • (landed) chunk-by-runtime algorithm - bug 1137339
    • (landed) bug fix related to build system + manifestparser interaction - bug 1134395
    • (in progress) getting close to making mochitest use manifestparser chunking - bug 1131098
    • (in progress) started work on generating + landing runtimes.json files - bug 1139904
    • (not yet started) add ability to use runtime chunking to mochitest harness
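
One way to balance chunks by runtime is a greedy heuristic: take tests from longest to shortest and always add the next one to the chunk with the smallest total so far. The sketch below illustrates that idea only. It is not necessarily the algorithm that landed in bug 1137339, and the function signature and the runtimes mapping (standing in for data such as a runtimes.json file) are assumptions.

```python
import heapq


def chunk_by_runtime(runtimes, total_chunks):
    """Split tests into total_chunks lists with roughly equal total runtime.

    runtimes: hypothetical mapping {test_path: runtime_in_seconds}.
    """
    if total_chunks < 1:
        raise ValueError("total_chunks must be at least 1")

    # Min-heap of (accumulated_runtime, chunk_index).
    heap = [(0.0, index) for index in range(total_chunks)]
    heapq.heapify(heap)
    chunks = [[] for _ in range(total_chunks)]

    # Longest tests first; ties broken by name so the result is deterministic.
    ordered = sorted(runtimes.items(), key=lambda item: (-item[1], item[0]))
    for test, runtime in ordered:
        total, index = heapq.heappop(heap)
        chunks[index].append(test)
        heapq.heappush(heap, (total + runtime, index))
    return chunks
```

A real harness would likely also need to keep tests from the same directory or manifest together, which this sketch ignores.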

[DONE] Create Android 4.3 emulator image for automated tests [gbrown]

  • details: Continue the work in bug 1062365 to build an emulator image based on Android 4.3 that is capable of running automated tests.
    Deliverable includes:
    • a prototype image
    • instructions for re-creating the image
    • demonstration that tests can be run on the image
    NOT included in this deliverable:
    • tests running in continuous integration
    • "greening" of tests
  • progress since last update:
    • [mcote] We decided to go with 4.3, since it makes little difference for mobile dev but is significantly easier to get going than 4.4.
    • can run all test suites against an opt build on the Android 4.3 image in the r24 emulator on AWS (try push against custom mozharness)
    • emulator update deployed
    • will continue to work on greening of tests, deploying image, scheduling tests

Help Releng reduce test load [jmaher]

  • details: This quarter, we’ll validate the data from SETA and provide some recommendations to Releng about which jobs/platforms we could schedule less often in order to reduce test load. We’ll monitor the impact of these changes in terms of sheriffing burden and the number of retriggers this demands, and may adjust as needed. In subsequent quarters, we’ll use additional data from the high-resolution testcase data project and OrangeFactor to provide more finely-tuned scheduling changes.
  • progress since last update:
    • A Treeherder API change caused data loss; working on backfilling the missing data
    • waiting on Releng to deploy

Supporting Tasks

  • Help green up tests on OSX 10.10
  • Apply --run-by-dir to all mochitest harnesses
  • Remove legacy JSON manifests in favor of manifestparser manifests
  • Provide alternate solutions for the last consumers of Datazilla and work to decommission it
  • Work with devs to introduce more dynamic analyzers (like Ehsan’s setTimeout check) in test harnesses
  • Automate Windows symbol fetching, bug 1117741 [ted]
  • Add ssltunnel support to Android tests, bug 1084614

Performance Testing

Deliver training to at least 2 people for Talos performance sheriffing [jmaher]

  • details: We want to expand the pool of people who can perform performance sheriffing to make it scale better, and to reduce the bus factor problem.
  • progress since last update:
    • DONE

Supporting Tasks

  • Continue sheriffing Talos performance regressions
  • Add new benchmarks as needed to mozbench
  • Create a new UI for mozbench results that doesn’t require Datazilla
  • Improve e10s support for Talos tests and infrastructure
  • Move Talos into the tree
  • Get rid of talos.zip
  • Make running Talos locally easier
  • resources: jmaher, dminor (for mozbench primarily), and contributors

Platform QA

  • [ON TRACK] [sydpolk, maja_zf] Eliminate Flash issues on YouTube by supporting the media team in shipping MSE on YouTube for Windows Vista and higher through manual and automated testing

Supporting Tasks

  • [DONE] Run web-platform tests for MSE in continuous integration for all supported platforms.

Community

Increase 'contributor friendliness' of our projects [jmaher, all]

Supporting Tasks

  • Start tracking at least three community-related metrics over time

Other Project Updates

Holidays and Trips

Misc