CI Automation/windows10 aarch64

Overview

Since mid-January 2019 the CI-A team has been working to enable existing test harnesses, continuous integration tests and other tools to run on Windows 10 ARM64, aka aarch64.

General Information

Hardware

  • Make: Lenovo
  • Model: C630 YOGA
  • Processor: Qualcomm Snapdragon 850 3.0GHz
  • Cores: 8
  • Memory: 8GB
  • Disk: 128GB SSD

Hosting

Currently, an array of ~30 machines is hosted at Bitbar in the United States.

Setup - local environment

Developers wishing to run tests locally can choose between two methods.

Prerequisites

  1. download and install Mozilla-Build 2.2.0

Using mozilla-build

This method uses a script to download test archives in order to run tests locally.

  1. download script for running mozharness on Yoga from bug 1520867
  2. place the test runner script in the C:\mozilla-build directory
  3. from treeherder, identify a changeset that contains a successful build-win64-aarch64/opt
  4. copy the task ID of the build
  5. invoke start-shell.bat, which will launch a bash-like commandline
  6. from mozilla-build directory, run the test runner script as follows:

bash script.sh task_id test_type <chunk_to_run> <total_chunks>

Example: bash script.sh Q-CE8DFvSAWmc08vw6bd6A xpcshell 1 8
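
The invocation above can be sketched as a small helper that checks its arguments before assembling the command line. This is a minimal illustration only: the helper name build_runner_command is hypothetical, script.sh stands in for the runner script from the bug, and treating the chunk number as 1-based and no larger than the total is an assumption.

```python
def build_runner_command(task_id, test_type, chunk, total_chunks, script="script.sh"):
    """Return the argv list for the test runner script (hypothetical helper)."""
    if not task_id or not test_type:
        raise ValueError("task_id and test_type are required")
    # Assumption: chunks are numbered from 1 up to total_chunks.
    if not 1 <= chunk <= total_chunks:
        raise ValueError("chunk must be between 1 and total_chunks")
    return ["bash", script, task_id, test_type, str(chunk), str(total_chunks)]


if __name__ == "__main__":
    argv = build_runner_command("Q-CE8DFvSAWmc08vw6bd6A", "xpcshell", 1, 8)
    print(" ".join(argv))
```

The resulting list can be passed to subprocess.run from within the mozilla-build directory.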

Using mozilla-central

This method is taken from this guide and uses mozilla-central with a build artifact.

  1. invoke start-shell.bat, which will launch a bash-like commandline
  2. clone the repository using hg clone https://hg.mozilla.org/mozilla-central/
  3. run ./mach bootstrap and pick artifact build
  4. download python3 embeddable zip, then extract to mozilla-build/ directory
  5. remove this line
  6. download 32bit NodeJS zip and extract to .mozbuild/node
  7. inside mozilla-build, remove the directory named watchman
  8. rerun ./mach bootstrap
  9. run ./mach build

After the artifact build succeeds, it is possible to run most suites of tests as normal: ./mach mochitest <test_file>

CI environment

Tests that run in the Taskcluster environment against windows10-aarch64 execute using Taskcluster Generic-Worker, which is installed as a service via OpenCloudConfig.

Using OpenCloudConfig

This is the method used in production.

Steps originally taken from bug 1520432.

$gitBranchOrRef = 'master'
Invoke-Expression (New-Object Net.WebClient).DownloadString(('https://raw.githubusercontent.com/mozilla-releng/OpenCloudConfig/{0}/userdata/rundsc.ps1?{1}' -f $gitBranchOrRef, [Guid]::NewGuid()))
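
The PowerShell snippet above formats a download URL from a branch name and a freshly generated GUID. The same URL construction, sketched in Python for illustration: the function name rundsc_url is hypothetical, and reading the GUID as a cache-busting query string is an assumption rather than something the source states.

```python
import uuid

RUNDSC_URL_TEMPLATE = (
    "https://raw.githubusercontent.com/mozilla-releng/OpenCloudConfig/"
    "{0}/userdata/rundsc.ps1?{1}"
)


def rundsc_url(git_branch_or_ref="master"):
    """Build the rundsc.ps1 URL for a branch or ref (hypothetical helper)."""
    # A new random GUID is appended on every call, mirroring [Guid]::NewGuid().
    return RUNDSC_URL_TEMPLATE.format(git_branch_or_ref, uuid.uuid4())


if __name__ == "__main__":
    print(rundsc_url("master"))
```

Changing the argument is the equivalent of setting $gitBranchOrRef to a different branch or ref.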

Manually install Generic-Worker [Not recommended]

Follow these steps to install Taskcluster Generic-Worker on the hardware and have it launch as a service.

Instructions originally from bug 1522997.

Prerequisites

  • disable Windows S mode
  • disable User Account Control
  • disable Windows Firewall
  • download NSSM to C:\nssm-2.24\
  • create "Remote Desktop Users" group:
net localgroup "Remote Desktop Users" /add
  • log in to Taskcluster
  • request scope `assume:project:taskcluster:generic-worker-tester`

Steps

  1. download the current 386 release of `generic-worker-windows-386.exe` from taskcluster generic-worker.
  2. download the latest 386 version of livelog.exe and taskcluster-proxy.exe.
  3. create new directory C:\generic-worker.
  4. move the three executable files under C:\generic-worker.
  5. rename generic-worker-windows-386.exe to generic-worker.exe.
  6. generate two signing keys:
generic-worker new-openpgp-keypair --file <unique_file_name>
generic-worker new-ed25519-keypair --file <unique_file_name>
  7. create generic-worker.config and include the following:
{
"accessToken":                "<access token tied to taskcluster>",
"clientId":                   "<client ID tied to taskcluster>",
"ed25519SigningKeyLocation":  "<file location you wrote ed25519 private key in step 6>",
"livelogSecret":              "<any text>",
"provisionerId":              "test-provisioner",
"publicIP":                   "<ideally an IP address of one of your network interfaces>",
"rootURL":                    "https://taskcluster.net",
"workerGroup":                "test-worker-group",
"workerId":                   "test-worker-id",
"workerType":                 "<a unique string that only you will use for your test worker(s)>"
}
  8. launch cmd.exe with Administrator rights.
  9. cd c:\generic-worker
  10. generic-worker.exe install service --config generic-worker.config --nssm c:\nssm-2.24\win32\nssm.exe
  11. reboot once installed.
  12. launch cmd.exe with Administrator rights.
  13. sc query "Generic Worker"
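
Before installing the service, it can help to sanity-check generic-worker.config. The sketch below assumes the file is a single JSON object and that every key listed in the steps above is required; both are assumptions, and the helper name missing_config_keys is hypothetical. It is not part of generic-worker itself.

```python
import json

# Keys taken from the configuration listing in the steps above.
EXPECTED_KEYS = (
    "accessToken",
    "clientId",
    "ed25519SigningKeyLocation",
    "livelogSecret",
    "provisionerId",
    "publicIP",
    "rootURL",
    "workerGroup",
    "workerId",
    "workerType",
)


def missing_config_keys(config_text):
    """Return the expected keys that are absent or empty in the JSON text."""
    config = json.loads(config_text)
    if not isinstance(config, dict):
        raise ValueError("config must be a JSON object")
    return [key for key in EXPECTED_KEYS if not config.get(key)]


if __name__ == "__main__":
    with open("generic-worker.config") as handle:
        print(missing_config_keys(handle.read()))
```

An empty list means every expected key is present with a non-empty value; it does not verify that the credentials themselves are valid.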

Currently running on CI

Currently, all tests are running regularly on mozilla-central and try.

Run on try

This is probably what you came to this document for: how to run tests against the windows10-aarch64 hardware currently available.

Hardware is limited, so please exercise caution when scheduling tests! A careless try push can block many others. Only schedule jobs that are absolutely necessary.

Prerequisites

  • try access (commit access level 1)
  • up-to-date mozilla-central codebase

Steps

Note that on try, windows10-aarch64 is hidden by default; please use ./mach try fuzzy --full to schedule jobs.

  1. ./mach try fuzzy --full
  2. select tests that need to be run (e.g. 'windows10-aarch64 xpcshell')
  3. enter

Tests will appear in Treeherder under the heading Windows 10 AArch64 opt.
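
For repeated pushes, the interactive selection can be replaced by passing the query on the command line. A minimal sketch, assuming the -q (query) option of ./mach try fuzzy; the helper name try_fuzzy_command is hypothetical. It refuses an empty query so that a push cannot accidentally select every job on the limited hardware.

```python
def try_fuzzy_command(query):
    """Return the argv list for a non-interactive try push (hypothetical helper)."""
    query = query.strip()
    if not query:
        # An empty query would match far more jobs than intended.
        raise ValueError("a non-empty query is required")
    # --full is needed because windows10-aarch64 is hidden by default on try.
    return ["./mach", "try", "fuzzy", "--full", "-q", query]


if __name__ == "__main__":
    print(try_fuzzy_command("'windows10-aarch64 xpcshell"))
```

The list is meant to be run from the root of an up-to-date mozilla-central checkout, for example with subprocess.run.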

Greening tests

Since Windows on ARM64 is a new platform/architecture combination, failures unique to this combination are to be expected. It will be necessary to fix, correct or update the tests in order to obtain a green run.

Example 1

As part of bug 1525743, the timeout for mochitest-browser-chrome was extended to 4x the default value if the platform combination of Windows and ARM64 is detected.

See change: https://phabricator.services.mozilla.com/D19882

This change greened the test that was previously failing due to a timeout.
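
The idea behind this kind of fix can be sketched as follows. This is not the code from the change linked above: the function names, the "win" and "aarch64" identifiers, and the 90-second default used in the example are assumptions chosen for illustration.

```python
WINDOWS_ARM64_TIMEOUT_FACTOR = 4  # 4x the default, as described above


def timeout_factor(os_name, processor):
    """Return the timeout multiplier for a platform (hypothetical helper)."""
    if os_name == "win" and processor == "aarch64":
        return WINDOWS_ARM64_TIMEOUT_FACTOR
    return 1


def effective_timeout(default_timeout_seconds, os_name, processor):
    """Scale the default timeout by the platform-specific factor."""
    return default_timeout_seconds * timeout_factor(os_name, processor)


if __name__ == "__main__":
    # The 90-second default is an arbitrary example value.
    print(effective_timeout(90, "win", "aarch64"))
```

Keeping the multiplier in one place means other platforms keep their default timeout unchanged.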

Example 2

Some tests provide a manifest file in the form of <test_category>.ini, such as mochitest.ini.

For bug 1525665, it was decided to disable a certain a11y test while windows10-aarch64 a11y support was being investigated.

See change: https://phabricator.services.mozilla.com/D22363

With this change the failing test is disabled for windows10-aarch64; the run would have been green had it not been for another failure elsewhere.
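
A manifest entry of this kind typically pairs a test name with a skip-if condition. The sketch below renders such an entry as text; the condition string (os == 'win' && processor == 'aarch64'), the test file name, and the helper name skip_if_stanza are assumptions for illustration, so check the linked change for the exact form used.

```python
# Assumed condition for Windows on ARM64 in manifest expression syntax.
WIN_AARCH64 = "os == 'win' && processor == 'aarch64'"


def skip_if_stanza(test_file, condition=WIN_AARCH64, reason=""):
    """Render an .ini manifest entry that skips a test (hypothetical helper)."""
    line = "skip-if = {}".format(condition)
    if reason:
        # A trailing comment conventionally records why the test is disabled.
        line += " # {}".format(reason)
    return "[{}]\n{}\n".format(test_file, line)


if __name__ == "__main__":
    # test_example.html is a placeholder name, not a real test.
    print(skip_if_stanza("test_example.html", reason="bug 1525665"))
```

Recording the bug number next to the condition makes it easier to re-enable the test once the underlying issue is fixed.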

Example 3

Another example of manipulating the manifest of a category of tests, this time with web-platform-tests.

For bug 1533912, the manifest was modified to disable the test if it was running on aarch64 hardware.

See change: https://phabricator.services.mozilla.com/D23003

Note that web-platform-tests use a slightly different manifest format.

Example 4

Certain test cases in reftest/crashtest/jsreftest had unexpected outcomes on windows10-aarch64.

For bug 1536365 and bug 1536363, the requirement was to adjust the pixel-difference values such that tests will pass.

See change: https://phabricator.services.mozilla.com/D25113
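
Adjusting pixel-difference values amounts to widening the tolerance within which a rendering still counts as a match. A simplified sketch of that check, assuming a tolerance expressed as a maximum per-channel color difference and a maximum number of differing pixels; the function and parameter names are hypothetical and this is not the reftest harness code.

```python
def within_fuzz(observed_max_diff, observed_pixels, allowed_max_diff, allowed_pixels):
    """Return True if an observed image difference is within the tolerance."""
    if min(observed_max_diff, observed_pixels, allowed_max_diff, allowed_pixels) < 0:
        raise ValueError("differences and tolerances must be non-negative")
    # Both limits must hold: the color difference and the pixel count.
    return observed_max_diff <= allowed_max_diff and observed_pixels <= allowed_pixels


if __name__ == "__main__":
    # Example values are made up: a difference of 2 across 40 pixels
    # passes only if the tolerance allows at least that much.
    print(within_fuzz(2, 40, allowed_max_diff=1, allowed_pixels=40))
    print(within_fuzz(2, 40, allowed_max_diff=2, allowed_pixels=50))
```

Tolerances should stay as tight as the observed differences allow, so that genuine rendering regressions are still caught.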

Bugs

These are the top-level tracking bugs; the recommended view is tree (login required).

The CI-A team will make an effort to re-test disabled tests on a semi-regular basis, or whenever fixes are committed to components that had tests disabled.

ID | Summary | Priority | Status
1520867 | Investigate running tests on Windows / arm64 | P1 | RESOLVED
1523722 | Run gtest using generic-worker on Windows/aarch64 | P3 | RESOLVED
1524114 | Run xpcshell-test using generic-worker on Windows/aarch64 | P3 | RESOLVED
1524400 | Run mochitest using generic-worker on windows/aarch64 | P3 | RESOLVED
1524410 | Run reftest suites using generic-worker on windows/aarch64 | P3 | RESOLVED
1525118 | [meta] Run taskcluster task from mach try on Bitbar | -- | RESOLVED
1525434 | Run web-platform-test suite using generic-worker on windows/aarch64 | P3 | RESOLVED
1526015 | Run cppunit, jittest, marionette using generic-worker on Windows/aarch64 | P3 | RESOLVED
1527177 | Intermittent [taskcluster:error] [mounts] reading file in zip archive: file already exists: Z:\task_1549919043\mozharness\LICENSE | P5 | RESOLVED
1527469 | Enable windows10-aarch64 build and tests on try server | -- | RESOLVED
1530737 | unable to run talos/raptor on win/aarch64 builds in CI | -- | RESOLVED
1531876 | run talos/raptor tests on windows10 aarch64 laptops | P1 | RESOLVED
1531878 | [taskcluster:error] [mounts] reading file in zip archive: file already exists: C:\tasks\task_1551392763\mozharness\LICENSE | P1 | RESOLVED
1531927 | [meta] windows/aarch64 - skipped/disabled media tests | P5 | REOPENED
1533114 | [meta] windows/aarch64 - skipped/disabled a11y tests | P5 | NEW
1533880 | [meta] windows/aarch64 - skipped/disabled web-platform-tests | P5 | NEW
1534823 | [meta] windows/aarch64 - skipped/disabled mochitest tests | P5 | NEW
1535467 | windows/aarch64 - test screenshots sometimes show "Windows Defender Firewall has blocked some features of this app" | P3 | ASSIGNED
1536208 | [meta] windows/aarch64 - skipped/disabled xpcshell tests | P5 | NEW
1536283 | [meta] windows/aarch64 - skipped/disabled marionette tests | P5 | NEW
1536354 | [meta] windows/aarch64 - skipped/disabled reftests | P5 | NEW
1538785 | windows/aarch64 - plugin tests failing on windows10-aarch64 | -- | RESOLVED
1539693 | windows/aarch64 - re-enable/adjust web-platform-tests results based on new timeout multiplier | -- | RESOLVED
1540213 | windows/aarch64 - enable tests for windows10-aarch64 on taskgraph | -- | RESOLVED
1543521 | windows/aarch64 - lower windows10-aarch64 to tier 2 on try | -- | RESOLVED
1545810 | windows/aarch64 - web platform test chunk investigation | -- | RESOLVED
1546532 | windows/aarch64 - enable mochitest-a11y | -- | RESOLVED
1546728 | windows/aarch64 - enable cppunit | -- | RESOLVED
1546732 | windows/aarch64 - enable jittest | -- | RESOLVED
1547820 | windows/aarch64 - testing/web-platform/tests/media-source crashes on ARM64 | -- | RESOLVED
1552051 | windows/aarch64 - run SM(p) instead of jittest | P2 | RESOLVED
1572185 | Re-enable CSS web-platorm-tests for windows10-aarch64 | -- | RESOLVED

32 Total; 8 Open (25%); 24 Resolved (75%); 0 Verified (0%);