QA/Platform/Graphics/Quantum/Renderer


Documentation

Components

Component             | OKRs | Target | Risks | Coverage
Layer Manager         |      |        |       | Automation: TBD; Manual: TBD
Windows Support       |      |        |       | Automation: TBD; Manual: TBD
Texture Sharing       |      |        |       | Automation: TBD; Manual: TBD
APZ                   |      |        |       | Automation: TBD; Manual: TBD
OMTC                  |      |        |       | Automation: TBD; Manual: TBD
OMTA                  |      |        |       | Automation: TBD; Manual: TBD
Images                |      |        |       | Automation: TBD; Manual: TBD
Video                 |      |        |       | Automation: TBD; Manual: TBD
Compositor Process    |      |        |       | Automation: TBD; Manual: TBD
Display Items         |      |        |       | Automation: TBD; Manual: TBD
Android Support       |      |        |       | Automation: TBD; Manual: TBD
Software Fallback     |      |        |       | Automation: TBD; Manual: TBD
Performance / Memory  |      |        |       | Automation: TBD; Manual: TBD
Improvements          |      |        |       | Automation: TBD; Manual: TBD
WebGL                 |      |        |       | Automation: TBD; Manual: TBD

Automation Review

Status
  • As of 2017-03-01 there are 138 reftests failing with WebRender enabled (tracking - metabug - wiki - treeherder). Once these are fixed we'll want to run all reftests both with WebRender enabled and with it disabled; because WebRender depends on e10s, that means running the reftests three times (non-e10s, e10s+WR, e10s-WR). A sketch of driving these three configurations follows this list.
  • From a platform perspective we are only concerned with Windows 7 Platform Update and later with D3D11 enabled. This means that, at least for now, we will not need to double up tests on Mac, Linux, Android, and older Windows environments. However, it also means we will have to keep doubling up reftests until we drop support for hardware/platforms that WebRender does not support.
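
To make the three-configuration run above concrete, here is a minimal sketch in Python, assuming a local mozilla-central checkout. The --setpref flag and the gfx.webrender.enabled / browser.tabs.remote.autostart pref names are assumptions used for illustration, not confirmed harness options or CI configuration.

#!/usr/bin/env python
"""Illustrative driver for the three reftest configurations discussed above.

Assumptions (not confirmed against the real harness): './mach reftest'
accepts repeated '--setpref' arguments, 'gfx.webrender.enabled' toggles
WebRender, and 'browser.tabs.remote.autostart' toggles e10s.
"""
import subprocess

# (label, e10s enabled, WebRender enabled)
CONFIGS = [
    ("non-e10s", False, False),
    ("e10s+WR",  True,  True),
    ("e10s-WR",  True,  False),
]

def run_reftests(suite="layout/reftests/reftest.list"):
    for label, e10s, webrender in CONFIGS:
        cmd = [
            "./mach", "reftest", suite,
            "--setpref", "browser.tabs.remote.autostart=%s" % str(e10s).lower(),
            "--setpref", "gfx.webrender.enabled=%s" % str(webrender).lower(),
        ]
        print("[%s] %s" % (label, " ".join(cmd)))
        subprocess.call(cmd)

if __name__ == "__main__":
    run_reftests()

In CI the equivalent would more likely be three separate test jobs per push rather than a local loop like this.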
Modeling based on e10s

Long before we enabled e10s by default for eligible Nightly users, we ran all automated tests in both e10s and non-e10s mode. A huge number of tests had to be rewritten to be e10s-compatible, because many of them called APIs that would become asynchronous or would no longer be allowed from the content process. After we launched e10s, many users were still not eligible because they used add-ons or a11y. As we've increased the number of eligible users, we've had to keep running tests in both e10s and non-e10s mode because both are supported configurations.

QR will only support certain GPUs (D3D11+) and OS versions (Win7PU+), so we will need to run some subset of tests in both QR-enabled and QR-disabled mode, perhaps indefinitely. The QR test plan should document which platforms won't support QR and which subset of tests will be most useful to run in both QR-enabled and QR-disabled modes.

Builds
  • Kats added a QR-enabled Linux build on mozilla-central for a subset of gfx tests.
  • Building WebRender into nightly/local builds is in progress via bug 1342450
  • Need to get tests working on Windows
Tests
  • Mochitests: GPU/GL/APZ related only
  • Reftests: all existing plus any new WR-related tests
  • Talos: not needed until correctness is achieved
  • Fuzzing: needed
  • Bughunter: ?
  • Crashtests: needed
  • XPCShell: n/a
Considerations
  • How long do graphics-related tests take to run on average?
    Unknown; checking with catlee
  • How much overhead will QR tests impose?
    Near-zero
  • Which platforms need to be supported?
    Prefer Windows 10 but can accept Windows 8 as Tier-1; Linux and macOS are Tier-2
  • Which build types need to be supported?
    Debug, Opt
  • Which hardware needs to be supported?
    Prefer real hardware (Intel, AMD, and NVIDIA)
  • Which HWA modes need testing?
    WebRender, DirectX, Basic (an illustrative enumeration of the platform/build/mode matrix follows this list)
  • How long will we need to support hardware/platforms unsupported by QR?
    ...?
  • How long will we need to support the existing (non-WR) Gecko compositing path on QR-supported platforms/hardware?
    ...?
  • How much does Bughunter testing cost us and do we need to test in QR/non-QR builds?
    ...?
  • How much does performance testing cost us and do we need to test in QR/non-QR builds?
    ...?
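
The considerations above imply a combinatorial matrix of configurations. The snippet below is purely illustrative: it enumerates the platforms, build types, and HWA modes named in the answers above to show how the job count scales; none of the names correspond to real CI job definitions.

from itertools import product

# Names taken from the answers above; treat them as planning placeholders,
# not real CI platform or job identifiers.
platforms = ["windows10", "windows8", "linux64", "macosx64"]  # Windows Tier-1, Linux/macOS Tier-2
build_types = ["debug", "opt"]
hwa_modes = ["webrender", "directx", "basic"]

matrix = list(product(platforms, build_types, hwa_modes))
for platform, build, mode in matrix:
    print("%-10s %-6s %s" % (platform, build, mode))
print("total configurations: %d" % len(matrix))  # 4 * 2 * 3 = 24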
Fuzzing
Relevant Links

Testplan Template

Based on Stylo Plan

Risk Analysis

Web Compatibility
  Risk: (list any risks to web-compat that need to be mitigated)
  Mitigation: (eg. automation, fuzzing, manual testing, a/b testing)
  Timeline: (eg. when to implement and start monitoring each mitigation strategy)

Performance
  Risk: (list any risks to user-perceived performance and justification to switch to new system)
  Mitigation: (eg. automated tests, benchmarks, manual testing, user studies, measuring against "frame budget")
  Timeline: (eg. when to implement and start monitoring each mitigation strategy)

Stability
  Risk: (list any risks to crash rate, data loss, rendering correctness, etc)
  Mitigation: (eg. automated tests, data monitoring, static analysis, fuzzing, crawling, etc)
  Timeline: (eg. when to implement and start monitoring each mitigation strategy)

Memory
  Risk: (list any risks to memory footprint, installer size, etc)
  Mitigation: (eg. tests, data monitoring, etc)
  Timeline: (eg. when to implement and start monitoring each mitigation strategy)

Hardware Compatibility
  Risk: (list any risks to reduction in accelerated content related to hardware and blocklisting)
  Mitigation: (eg. automated tests, manual testing, data monitoring, etc)
  Timeline: (eg. when to implement and start monitoring each mitigation strategy)

Scope of Testing

  • Platform coverage
  • Hardware coverage
  • Usecase coverage

Automated Testing

  • Test suites (eg. reftests, mochitests, xpcom, crash tests, code coverage, fuzzing, perf, code size, etc)
  • Benchmarking (eg. first-party, third-party, comparison to non-WR, comparison to competition); a frame-budget measurement sketch follows this list
  • What are the questions we want to answer, why do we care, and how will we measure?
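
One concrete way to answer "how will we measure?" is the "frame budget" mentioned under the Performance risk above: at 60 fps each frame has roughly 16.67 ms. Below is a minimal sketch, assuming per-frame (or per-composite) times in milliseconds are available from a Talos-style log or a profiler export; the sample values are made up.

# Fraction of frames that fit inside the 60 fps budget (1000 ms / 60 ~= 16.67 ms).
# 'sample' is invented data; real frame times would come from whatever harness
# or profiler records per-frame (or per-composite) durations.
FRAME_BUDGET_MS = 1000.0 / 60.0

def frames_within_budget(frame_times_ms):
    if not frame_times_ms:
        return 0.0
    within = sum(1 for t in frame_times_ms if t <= FRAME_BUDGET_MS)
    return within / float(len(frame_times_ms))

sample = [12.1, 15.8, 17.3, 14.0, 22.5, 16.2]
print("%.0f%% of frames within budget" % (100 * frames_within_budget(sample)))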

Manual Testing

  • Exploratory (eg. top-sites, UI elements, high-contrast themes, hiDPI displays, switching GPUs, zooming & scrolling, a11y, rtl locales, printing, addon compat, security review, etc)
  • A/B testing of new vs old
  • New use cases which might apply
  • Hardware/driver/platform compatibility to inform expanding/retracting ship targets

Integration Testing

  • Criteria for enabling on Nightly (eg. all automation passing)
  • Telemetry experimentation (eg. crash rates, user engagement via page views or scrolling, WR-specific probes); a crash-rate comparison sketch follows this list
  • Any blockers for running tests
  • Ensuring RelMan / RelQA sign-off test plan and execution prior to riding the trains
  • Does it impact other project areas (eg. WebVR, Stylo, etc)?
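
For the telemetry experimentation point above, one simple comparison is crashes per 1,000 usage hours in QR-enabled vs QR-disabled cohorts. The sketch below uses invented cohort numbers; real figures would come from telemetry / crash-stats, ideally restricted to QR-eligible hardware.

# Crash rate per 1,000 usage hours for two hypothetical cohorts.
# The counts below are invented for illustration; real numbers would come
# from telemetry / crash-stats, broken down by QR-eligible hardware.
def crash_rate_per_khour(crashes, usage_hours):
    return 1000.0 * crashes / usage_hours

cohorts = {
    "QR enabled": {"crashes": 42, "usage_hours": 180000},
    "QR disabled": {"crashes": 35, "usage_hours": 200000},
}

for name, data in cohorts.items():
    rate = crash_rate_per_khour(data["crashes"], data["usage_hours"])
    print("%-12s %.3f crashes per 1,000 usage hours" % (name, rate))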

Out of Scope

  • What is not in scope for testing and/or release criteria?
  • Are there things we won't do, like Test Pilot or Shield studies?
  • Are there things we won't test, like specific hardware we don't have access to?
  • Will we do a staged rollout vs normal rollout?
  • Do we care about edge-case behaviours and/or user profiles (eg. addons, themes, configurations, etc)?
  • Do we care about edge-case environments (eg. VMs, Boot Camp, old drivers, etc)?