QA/Platform/Graphics/Quantum/Renderer


Documentation

Components

Component             | OKRs | Target | Risks | Coverage
Layer Manager         |      |        |       | Automation: TBD; Manual: TBD
Windows Support       |      |        |       | Automation: TBD; Manual: TBD
Texture Sharing       |      |        |       | Automation: TBD; Manual: TBD
APZ                   |      |        |       | Automation: TBD; Manual: TBD
OMTC                  |      |        |       | Automation: TBD; Manual: TBD
OMTA                  |      |        |       | Automation: TBD; Manual: TBD
Images                |      |        |       | Automation: TBD; Manual: TBD
Video                 |      |        |       | Automation: TBD; Manual: TBD
Compositor Process    |      |        |       | Automation: TBD; Manual: TBD
Display Items         |      |        |       | Automation: TBD; Manual: TBD
Android Support       |      |        |       | Automation: TBD; Manual: TBD
Software Fallback     |      |        |       | Automation: TBD; Manual: TBD
Performance / Memory  |      |        |       | Automation: TBD; Manual: TBD
Improvements          |      |        |       | Automation: TBD; Manual: TBD
WebGL                 |      |        |       | Automation: TBD; Manual: TBD

Automation Review

Status
  • As of 2017-03-01 there are 138 reftests failing with WebRender enabled (tracking - metabug - wiki - treeherder). Once these are fixed we'll want to run all reftests both with WebRender enabled and with it disabled; because WebRender depends on e10s, that means running the reftests three times (non-e10s, e10s+WR, e10s-WR). A sketch of driving these three configurations follows this list.
  • From a platform perspective we are only concerned with Windows 7 Platform Update and later with D3D11 enabled. This means that, at least for now, we will not need to double up tests on Mac, Linux, Android, and older Windows environments. However, it also means we will have to keep doubling up reftests until we drop support for hardware/platforms that WebRender does not support.
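
To make the three-configuration run above concrete, here is a minimal sketch in Python, assuming a local mozilla-central checkout. The --setpref flag and the gfx.webrender.enabled / browser.tabs.remote.autostart pref names are assumptions used for illustration, not confirmed harness options or CI configuration.

#!/usr/bin/env python
"""Illustrative driver for the three reftest configurations discussed above.

Assumptions (not confirmed against the real harness): './mach reftest'
accepts repeated '--setpref' arguments, 'gfx.webrender.enabled' toggles
WebRender, and 'browser.tabs.remote.autostart' toggles e10s.
"""
import subprocess

# (label, e10s enabled, WebRender enabled)
CONFIGS = [
    ("non-e10s", False, False),
    ("e10s+WR",  True,  True),
    ("e10s-WR",  True,  False),
]

def run_reftests(suite="layout/reftests/reftest.list"):
    for label, e10s, webrender in CONFIGS:
        cmd = [
            "./mach", "reftest", suite,
            "--setpref", "browser.tabs.remote.autostart=%s" % str(e10s).lower(),
            "--setpref", "gfx.webrender.enabled=%s" % str(webrender).lower(),
        ]
        print("[%s] %s" % (label, " ".join(cmd)))
        subprocess.call(cmd)

if __name__ == "__main__":
    run_reftests()

In CI the equivalent would more likely be three separate test jobs per push rather than a local loop like this.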
Modeling based on e10s

Long before we enabled e10s by default for eligible Nightly users, we ran all automated tests in both e10s and non-e10s mode. A huge number of tests had to be rewritten to be e10s-compatible, because many of them called APIs that would become asynchronous or would no longer be allowed from the content process. After we launched e10s, many users were still not eligible because they used add-ons or a11y. As we've increased the number of eligible users, we've had to keep running tests in both e10s and non-e10s mode because both are supported configurations.

QR will only support certain GPUs (D3D11+) and OS versions (Win7PU+), so we will need to run some subset of tests in both QR-enabled and QR-disabled mode, perhaps indefinitely. The QR test plan should document which platforms won't support QR and which subset of tests will be most useful to run in both QR-enabled and QR-disabled modes.

Builds
  • Kats added a QR-enabled Linux build on mozilla-central for a subset of gfx tests.
  • Building WebRender into nightly/local builds is in progress via bug 1342450
  • Need to get tests working on Windows
Tests
  • Mochitests: GPU/GL/APZ related only
  • Reftests: all existing plus any new WR-related tests
  • Talos: not needed until correctness is achieved
  • Fuzzing: needed
  • Bughunter: ?
  • Crashtests: needed
  • XPCShell: n/a
Considerations
  • How long do graphics-related tests take to run on average?
    Unknown; checking with catlee
  • How much overhead will QR tests impose?
    Near-zero
  • Which platforms need to be supported?
    Prefer Windows 10 but can accept Windows 8 as Tier-1; Linux and macOS are Tier-2
  • Which build types need to be supported?
    Debug, Opt
  • Which hardware needs to be supported?
    Prefer real hardware (Intel, AMD, and NVIDIA)
  • Which HWA modes need testing?
    WebRender, DirectX, Basic (an illustrative enumeration of the platform/build/mode matrix follows this list)
  • How long will we need to support hardware/platforms unsupported by QR?
    ...?
  • How long will we need to support the existing (non-WR) Gecko compositing path on QR-supported platforms/hardware?
    ...?
  • How much does Bughunter testing cost us and do we need to test in QR/non-QR builds?
    ...?
  • How much does performance testing cost us and do we need to test in QR/non-QR builds?
    ...?
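
The considerations above imply a combinatorial matrix of configurations. The snippet below is purely illustrative: it enumerates the platforms, build types, and HWA modes named in the answers above to show how the job count scales; none of the names correspond to real CI job definitions.

from itertools import product

# Names taken from the answers above; treat them as planning placeholders,
# not real CI platform or job identifiers.
platforms = ["windows10", "windows8", "linux64", "macosx64"]  # Windows Tier-1, Linux/macOS Tier-2
build_types = ["debug", "opt"]
hwa_modes = ["webrender", "directx", "basic"]

matrix = list(product(platforms, build_types, hwa_modes))
for platform, build, mode in matrix:
    print("%-10s %-6s %s" % (platform, build, mode))
print("total configurations: %d" % len(matrix))  # 4 * 2 * 3 = 24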
Fuzzing
Relevant Links

Testplan Template

Based on Stylo Plan

Risk Analysis

Web Compatibility
  Risk: (list any risks to web-compat that need to be mitigated)
  Mitigation: (eg. automation, fuzzing, manual testing, a/b testing)
  Timeline: (eg. when to implement and start monitoring each mitigation strategy)

Performance
  Risk: (list any risks to user-perceived performance and justification to switch to new system)
  Mitigation: (eg. automated tests, benchmarks, manual testing, user studies, measuring against "frame budget")
  Timeline: (eg. when to implement and start monitoring each mitigation strategy)

Stability
  Risk: (list any risks to crash rate, data loss, rendering correctness, etc)
  Mitigation: (eg. automated tests, data monitoring, static analysis, fuzzing, crawling, etc)
  Timeline: (eg. when to implement and start monitoring each mitigation strategy)

Memory
  Risk: (list any risks to memory footprint, installer size, etc)
  Mitigation: (eg. tests, data monitoring, etc)
  Timeline: (eg. when to implement and start monitoring each mitigation strategy)

Hardware Compatibility
  Risk: (list any risks to reduction in accelerated content related to hardware and blocklisting)
  Mitigation: (eg. automated tests, manual testing, data monitoring, etc)
  Timeline: (eg. when to implement and start monitoring each mitigation strategy)

Scope of Testing

  • Platform coverage
  • Hardware coverage
  • Usecase coverage

Automated Testing

  • Test suites (eg. reftests, mochitests, xpcom, crash tests, code coverage, fuzzing, perf, code size, etc)
  • Benchmarking (eg. first-party, third-party, comparison to non-WR, comparison to competition); a frame-budget measurement sketch follows this list
  • What are the questions we want to answer, why do we care, and how will we measure?
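
One concrete way to answer "how will we measure?" is the "frame budget" mentioned under the Performance risk above: at 60 fps each frame has roughly 16.67 ms. Below is a minimal sketch, assuming per-frame (or per-composite) times in milliseconds are available from a Talos-style log or a profiler export; the sample values are made up.

# Fraction of frames that fit inside the 60 fps budget (1000 ms / 60 ~= 16.67 ms).
# 'sample' is invented data; real frame times would come from whatever harness
# or profiler records per-frame (or per-composite) durations.
FRAME_BUDGET_MS = 1000.0 / 60.0

def frames_within_budget(frame_times_ms):
    if not frame_times_ms:
        return 0.0
    within = sum(1 for t in frame_times_ms if t <= FRAME_BUDGET_MS)
    return within / float(len(frame_times_ms))

sample = [12.1, 15.8, 17.3, 14.0, 22.5, 16.2]
print("%.0f%% of frames within budget" % (100 * frames_within_budget(sample)))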

Manual Testing

  • Exploratory (eg. top-sites, UI elements, high-contrast themes, hiDPI displays, switching GPUs, zooming & scrolling, a11y, rtl locales, printing, addon compat, security review, etc)
  • A/B testing of new vs old
  • New use cases which might apply
  • Hardware/driver/platform compatibility to inform expanding/retracting ship targets

Integration Testing

  • Criteria for enabling on Nightly (eg. all automation passing)
  • Telemetry experimentation (eg. crash rates, user engagement via page views or scrolling, WR-specific probes); a crash-rate comparison sketch follows this list
  • Any blockers for running tests
  • Ensuring RelMan / RelQA sign-off test plan and execution prior to riding the trains
  • Does it impact other project areas (eg. WebVR, Stylo, etc)?
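
For the telemetry experimentation point above, one simple comparison is crashes per 1,000 usage hours in QR-enabled vs QR-disabled cohorts. The sketch below uses invented cohort numbers; real figures would come from telemetry / crash-stats, ideally restricted to QR-eligible hardware.

# Crash rate per 1,000 usage hours for two hypothetical cohorts.
# The counts below are invented for illustration; real numbers would come
# from telemetry / crash-stats, broken down by QR-eligible hardware.
def crash_rate_per_khour(crashes, usage_hours):
    return 1000.0 * crashes / usage_hours

cohorts = {
    "QR enabled": {"crashes": 42, "usage_hours": 180000},
    "QR disabled": {"crashes": 35, "usage_hours": 200000},
}

for name, data in cohorts.items():
    rate = crash_rate_per_khour(data["crashes"], data["usage_hours"])
    print("%-12s %.3f crashes per 1,000 usage hours" % (name, rate))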

Out of Scope

  • What is not in scope for testing and/or release criteria?
  • Are there things we won't do, like Test Pilot or Shield studies?
  • Are there things we won't test, like specific hardware we don't have access to?
  • Will we do a staged rollout vs normal rollout?
  • Do we care about edge-case behaviours and/or user profiles (eg. addons, themes, configurations, etc)?
  • Do we care about edge-case environments (eg. VMs, Boot Camp, old drivers, etc)?