Auto-tools/Projects/Marionette update tests

The Mozmill tests were rewritten using Marionette (the m21s project). The current code lives in the [1] repository. The update tests were also ported and are to be run as part of the automated release process (bug 1148546).

The harness has been developed as a mozharness script firefox_ui_updates.py.

The work remaining before these update tests can replace manual testing is tracked in bug 1182796.

Support

Only Gecko 38 and higher is supported.

Testing matrix

NOTE: We currently test all releases for all locales starting from Gecko 38. This should be reduced in a follow up bug.

The following block needs to be reviewed so that it represents what was agreed here: https://github.com/mozilla/mozmill-ci/issues/535

There have been some discussions about:

  • testing all locales for the last X betas (maybe 3)
  • testing all locales for 1 beta for the last Y versions (maybe 3)
  • testing all locales for the latest esr

For example:

  • If on gecko 40, beta2:
    • 39.0b6, 39.0b7, 40.0b1
    • 35.0bX, 36.0bY, 37.0bZ (X, Y & Z belonging to the range of 1 to the latest beta version for that gecko version)
      • To make things easy we might just test the last beta of the specific gecko version
    • 38.0.latest_esr

Beta users on RC

When we're creating the RC builds, we update *beta* users to it (even though an RC, at the end, is meant for the release users).

This means that after we're done with the RC, we get them back into the betas rather than leave them with the release users.

Here's an example of how to test the update path for a *beta* user on an RC:

python scripts/firefox_ui_updates.py \
--installer-url http://ftp.mozilla.org/pub/mozilla.org/firefox/candidates/38.0-candidates/build3/mac/en-US/Firefox%2038.0.dmg \
--firefox-ui-branch mozilla-beta \
--update-channel beta-localtest \
--cfg developer_config.py

Without --update-channel, the build would update to 38.0.5 instead of the latest beta.

Setup steps

hg clone http://hg.mozilla.org/releases/mozilla-beta   # or mozilla-release
cd mozilla-beta/testing/mozharness

Specific tests

Currently we run two tests for each locale: direct update and fallback update.

You can append --update-direct-only or --update-fallback-only to run only one of the two.

Chunking

In automation we run the scripts with --this-chunk/--total-chunks to parallelize the execution. To see what would run on which chunk, you can use --dry-run True.
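
As a rough illustration of what chunking means here, the sketch below splits a list of locales into contiguous chunks. It is not the mozharness implementation: the helper name get_chunk and the locale list are made up for the example, and the real script may distribute the work differently.

```python
def get_chunk(items, this_chunk, total_chunks):
    """Return the slice of items that belongs to a 1-based chunk number.

    Hypothetical helper: any remainder is spread over the first chunks,
    so chunk sizes differ by at most one.
    """
    if not 1 <= this_chunk <= total_chunks:
        raise ValueError("this_chunk must be between 1 and total_chunks")
    base, extra = divmod(len(items), total_chunks)
    start = (this_chunk - 1) * base + min(this_chunk - 1, extra)
    end = start + base + (1 if this_chunk <= extra else 0)
    return items[start:end]


locales = ["de", "en-US", "fr", "hu", "it", "ja", "pl"]  # example data only
for n in range(1, 4):
    print(n, get_chunk(locales, n, 3))
```

Every locale lands in exactly one chunk, which is the property the parallel jobs rely on.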

Configuration files

There is a generic_releng_config.py with basic values to run inside of Release Engineering's infrastructure. There are also platform-specific configs (think 32-bit vs 64-bit on the same host) that set the MINIDUMP_STACKWALK variable for crash dumps (see patch). Finally, developer_config.py allows you to run the harness on your local machine.

Status summary of failures

Search the logs for:

SUMMARY - Firefox UI update tests failed locales:
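
If you have a saved log, a small script can pull out those lines. This is a minimal sketch: only the marker string comes from this page, while the "INFO -" prefix and the locale list after the colon in the sample log are assumptions made for the example.

```python
MARKER = "SUMMARY - Firefox UI update tests failed locales:"


def failed_locales_lines(log_text):
    """Return whatever follows the summary marker on each matching line."""
    results = []
    for line in log_text.splitlines():
        _, found, rest = line.partition(MARKER)
        if found:
            results.append(rest.strip())
    return results


# Invented sample log; the real log layout may differ.
sample_log = (
    "INFO - running tests\n"
    "INFO - SUMMARY - Firefox UI update tests failed locales: de, hu\n"
)
print(failed_locales_lines(sample_log))
```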

Harness options

You can see available options by running this:

python scripts/firefox_ui_updates.py --help

Internally

The mozharness script takes care of setting up the environment, checking out repositories, updating to the right branches, setting the right environment variables and executing the tests.

Internally, we call the [2] binary that gets generated from the firefox ui tests repository (the runner is defined in [3]).

A basic call to the binary looks similar to this:

firefox-ui-update --installer /path/to/installer/Firefox%2038.0.dmg --gecko-log=/path/to/cwd/build/gecko.log

The actual tests are defined in [4].

Run on your local machine

NOTE: If you see a failure in the release logs, the log also prints the exact command you need to run.

Here's a sample command from one of the release jobs:

# This is a sample command for running Linux64 jobs
python scripts/scripts/firefox_ui_updates.py --cfg generic_releng_config.py \
--cfg generic_releng_linux64.py --firefox-ui-branch mozilla-beta \
--update-verify-config mozBeta-firefox-linux64.cfg \
--tools-tag FIREFOX_40_0b7_RELEASE_RUNTIME --total-chunks 6 \
--this-chunk 1 --build-number 1

All you have to do is:

  • copy a command from a log (use sample command above)
  • append --cfg developer_config.py to run it locally on your machine
  • remove one of the two "scripts/" in the command
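
The same rewrite can be scripted. The sketch below is only an illustration of the three steps above, and the helper name make_local_command is made up for this example.

```python
import shlex


def make_local_command(logged_command):
    """Rewrite a command copied from a release log so it can run locally."""
    # Join shell line continuations before tokenizing.
    args = shlex.split(logged_command.replace("\\\n", " "))
    # Drop the duplicated "scripts/" directory from the script path.
    args = [arg.replace("scripts/scripts/", "scripts/", 1) for arg in args]
    # Append the developer config unless it is already present.
    if "developer_config.py" not in args:
        args += ["--cfg", "developer_config.py"]
    return " ".join(shlex.quote(arg) for arg in args)


logged = """python scripts/scripts/firefox_ui_updates.py --cfg generic_releng_config.py \\
--cfg generic_releng_linux64.py --firefox-ui-branch mozilla-beta \\
--update-verify-config mozBeta-firefox-linux64.cfg \\
--tools-tag FIREFOX_40_0b7_RELEASE_RUNTIME --total-chunks 6 \\
--this-chunk 1 --build-number 1"""
print(make_local_command(logged))
```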

The code has been optimized to run on your local machine.

Developer mode

To run anything on your local machine you will need to append --cfg developer_config.py. The script will tell you if you don't. This config clears any hardcoded paths that are used for Release Engineering machinery.

Without using releng configs

Instead of using the Release Engineering update verify configuration files, you can also run the harness in these different ways:

--installer-url

python scripts/firefox_ui_updates.py --firefox-ui-branch mozilla-beta --installer-url http://ftp.mozilla.org/pub/mozilla.org/firefox/candidates/38.0-candidates/build3/mac/en-US/Firefox%2038.0.dmg --cfg developer_config.py

--installer-path

python scripts/firefox_ui_updates.py --firefox-ui-branch mozilla-beta --installer-path `pwd`/Firefox%2038.0.dmg --cfg developer_config.py

Running it more than once

The harness allows you to run specific steps in isolation. You can see the available actions like this:

python scripts/firefox_ui_updates.py --list-actions
Actions available:
    * clobber
    * checkout
    * create-virtualenv
    * determine-testing-configuration
    * run-tests

Once you have run the script, you can run it again and execute only certain steps or skip certain ones.

For instance, you can say --no-clobber, which will not remove the repositories that the script has checked out.

If you specify specific actions (e.g. --run-tests), it will *only* execute those actions and ignore the others.

Test your changes on the Try server

The Firefox UI tests are jobs that run when a release is running. Unfortunately, you can't land changes on the various repositories and easily test your new changes until the next release happens. This makes the development and testing cycle very difficult.

There is a hacky way to run these jobs on treeherder. What we're going to do is hijack a job on buildbot to execute the firefox_ui_updates.py script and fake which arguments the script was called with.

Follow these steps:

  1. have a checkout of mozilla-central
  2. cd mozilla-central/testing/mozharness/scripts
  3. rm desktop_unittest.py; ln firefox_ui_updates.py desktop_unittest.py
  4. apply this patch
    • Just before calling super() it modifies the sys.argv values
    • You need to specify through ENABLE_BITS whether to test 32-bit or 64-bit builds
  5. after that you can push to try with this commit message (try: -b o -p android-x86 -u none -t none)
    • The try message should contain at least one platform that would be triggered on buildbot, otherwise, we won't be able to use BuildApi since the revision won't exist unless there is at least one platform to schedule
  6. once you push to try, you will have to schedule a job with Mozilla CI tools
    • pip install -U mozci
    • export REVISION=your_try_revision_here
    • cancel on TH the running Android x86 build
    • See section below on how to trigger the tests

Selecting the right platform and the right command

In order to set the right command on a job, you will have to grab it from these two sets of commands. On a push we can either test the 64-bit platforms or the 32-bit ones.

Use one of these two:

  • Ubuntu 64-bit, Windows 8 64-bit and Mac OS X
ENABLE_BITS=64
export REVISION=03abd4caa54e && \
mozci-trigger -b "Ubuntu VM 12.04 x64 try opt test mochitest-1,Rev5 MacOSX Yosemite 10.10 try opt test mochitest-1,Windows 8 64-bit try opt test mochitest-1" -r $REVISION --file http://dummy.com
  • Ubuntu 32-bit and Windows 7 32-bit
ENABLE_BITS=32
export REVISION=20bbdfa06139 && \
mozci-trigger -b "Ubuntu VM 12.04 try opt test mochitest-1,Windows 7 32-bit try opt test mochitest-1" -r $REVISION --file http://dummy.com

NOTE: Using --file is a necessary hack that avoids checking whether there is an existing build.

Using your own repositories

You can append these values to the command:

  • --firefox-ui-repo <git_repo>
  • --firefox-ui-branch <git_branch>
  • --tools-repo <hg_repo>
  • --tools-tag <hg_tag>

Re-triggering jobs

You don't need to push more changes to try if all you require is further changes to a Firefox UI repo or a tools repo. You can simply push to your non-Gecko repos and re-trigger the job on Treeherder.

Extra notes

The update tests allow you to specify some more options. Please see all the --update-* options of the "firefox-ui-update --help" command. What you definitely also need is --update-allow-mar-channel, --update-target-version, and --update-target-buildid. The latter two are for final checks that Firefox has been updated to the correct target version. The MAR channel option is for updating a release build to a beta build, which happens for RC builds on the beta channel.

Here are the two examples:

1. Ensure that the update has been applied correctly by checking the target version and build id

./firefox-ui-update --installer %Firefox38.0b8% \
--update-target-version=38.0b9 --update-target-buildid=20150429135941

Even if the tests pass without those options defined, it does not mean that the update was successful. You really want to check that Firefox got updated to the target build.

2. A beta user got updated to an RC build and has to be brought back to a beta build. For this you need --update-allow-mar-channel.

Let's say a beta user was on 38.0b9 and got updated to 38.0RC. This RC build is on the release channel, but beta users have to be brought back to the beta channel. Given that we currently cannot test updates in multiple steps (vXb9 -> RC -> v(X+1)b1), we have to run vXb9 -> RC and RC -> v(X+1)b1 separately. The RC build used as the source build only has the release channel active; the option above additionally enables the beta channel via update-settings.ini.

./firefox-ui-update --installer %Firefox38.0% --update-target-version=39.0b1 \
--update-target-buildid=%whateveritwillbe% \
--update-allow-mar-channel=firefox-mozilla-beta
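
When scripting such runs, the argument list can be assembled programmatically. The following is a sketch only: it uses just the options named in this section, and the helper name build_update_command and the installer path are made up for the example.

```python
def build_update_command(installer, target_version=None, target_buildid=None,
                         allow_mar_channel=None):
    """Assemble an argument list for firefox-ui-update (hypothetical helper)."""
    cmd = ["firefox-ui-update", "--installer", installer]
    if target_version:
        cmd.append("--update-target-version=%s" % target_version)
    if target_buildid:
        cmd.append("--update-target-buildid=%s" % target_buildid)
    if allow_mar_channel:
        cmd.append("--update-allow-mar-channel=%s" % allow_mar_channel)
    return cmd


# Example 1 from above, with a placeholder installer path.
print(build_update_command("/path/to/installer",
                           target_version="38.0b9",
                           target_buildid="20150429135941"))
```

The resulting list can be handed to subprocess.call() without any shell quoting.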

Archive - Failure modes

The failures reported below are known issues when the infrastructure does not behave as expected. All other harness issues have been dealt with.

AUS is temporarily unavailable

firefox-ui-update raises this exception:

    self.assertTrue(update_available)
AssertionError: False is not true

In the gecko log you will see Error: Unexpected node name, expected: updates, got: parsererror:

*** AUS:SVC Checker:checkForUpdates - sending request to: https://aus4.mozilla.org/update/3/Firefox/38.0/20150330154247/Linux_x86_64-gcc3/hu/beta-localtest/Linux%202.6.32-504.3.3.el6.x86_64%20(GTK%202.20.1)/default/default/update.xml?force=1
*** AUS:SVC Checker:onLoad - request completed downloading document
*** AUS:SVC Checker:_updates get - unexpected node name!
*** AUS:SVC Checker:onLoad - there was a problem checking for updates. Exception: Error: Unexpected node name, expected: updates, got: parsererror
*** AUS:SVC Checker:onLoad - request.status: 503
*** AUS:SVC getStatusTextFromCode - transfer error: A frissítés XML-fájlja nem található. (404), default code: 404
*** AUS:SVC recordInHealthReport - updateCheckFailed - 1503

The "unexpected node name" comes from here: https://hg.mozilla.org/mozilla-central/annotate/50b95032152c/toolkit/mozapps/update/nsUpdateService.js#l3577

Originally discovered on bug 1152460.
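
To spot this failure mode quickly in a saved Gecko log, you can extract the HTTP status that the update checker reported. A minimal sketch, assuming the log lines have the same shape as the excerpt above; the helper name aus_request_statuses is made up for the example.

```python
import re

STATUS_RE = re.compile(r"AUS:SVC Checker:onLoad - request\.status: (\d+)")


def aus_request_statuses(gecko_log_text):
    """Return every request.status value logged by the update checker."""
    return [int(m.group(1)) for m in STATUS_RE.finditer(gecko_log_text)]


# Lines copied from the excerpt above.
sample = (
    "*** AUS:SVC Checker:onLoad - request completed downloading document\n"
    "*** AUS:SVC Checker:_updates get - unexpected node name!\n"
    "*** AUS:SVC Checker:onLoad - request.status: 503\n"
)
print(aus_request_statuses(sample))
```

A 5xx value here points at the update server rather than at the build under test.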

Running out of space

If the machine running the tests runs out of space we can see this issue revealed on the Gecko log:

*** AUS:SVC readStatusFile - status: failed: 7, path: /tmp/tmpAyKRdJ.binary-update-tests/updates/0/update.status

Originally discovered on bug 1154060.

FTP/stage/S3 is unavailable temporarily OR wrong update channel

firefox-ui-update will raise an exception:

    self.assertTrue(update_available)
AssertionError: False is not true

Socket 2828 is unavailable

The socket for Marionette was left open from a previous run. Unlike the other failures, this is not an infra issue. It needs to be fixed in Marionette (bug 1141519), and a workaround will be put in place through bug 1156475.

Exception raised:

   assert(self.wait_for_port()), "Timed out waiting for port!"
AssertionError: Timed out waiting for port!

You can see the socket by running this:

$ netstat -anp | grep ':2828 ' | grep TIME_WAIT
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 127.0.0.1:2828          127.0.0.1:47185         TIME_WAIT   -

This happens when "the Python client shuts down the socket abruptly (due to a global timeout on the operation, for instance)" (bug 1141519).

The only way to get back to a good known state is by waiting for that socket to timeout on its own.

Socket 2828 is unavailable

If you're running the suite by hand and you have aborted a previous run, you might have left Firefox running. If so, you will see:

MarionetteException: MarionetteException: localhost:2828 is unavailable.

To see the socket held by Firefox, run this command:

$ netstat -anp | grep ':2828 '

You can fix it by running `killall -9 firefox`.
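
Before starting a new run you can also check from Python whether the Marionette port can be bound. This is a sketch using only the standard library; the helper name port_is_free is made up, and the assumption is that a plain bind (without SO_REUSEADDR) fails both when Firefox is still listening and, on Linux, while a socket on that port lingers in TIME_WAIT.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a plain TCP bind to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


# 2828 is the Marionette port mentioned above.
print(port_is_free(2828))
```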