Project Fission/Memory

Goals

Due to the drastic increase in the number of processes required to support Project Fission, we must focus on reducing the overhead of each content process. As a baseline, we are working on reducing the amount of memory used to load an about:blank page.

We should also look at the per-content-process overhead in the parent process. One example of this is IPDL protocols. We are not currently measuring this.

Measurement tools

about:memory

This is the default tool for looking at memory usage. One approach to finding targets for memory reduction is to look at the about:memory report for a small content process (for instance, example.com) and see what is using up memory. In addition to using it to guide memory reduction work, we can work to reduce the heap-unclassified number (memory allocated by Firefox but not reported in about:memory) in tests of interest. Adding new reporters often leads to finding things to improve.

DMD

DMD (aka the Dark Matter Detector) complements about:memory. When DMD mode is enabled, Firefox tracks the address and allocation stacks of all live blocks of memory and can save that information to a log for offline analysis by dmd.py. The primary use is investigating heap-unclassified, but with alternate modes enabled it can be used for other tasks like investigating leaks.
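
A rough sketch of the workflow, assuming a build configured with DMD support (ac_add_options --enable-dmd in your mozconfig); the log file name below is illustrative:

# Launch the DMD-enabled build
./mach run --dmd
# Generate a DMD log (for example from about:memory), then analyze it offline
python dmd.py dmd-<pid>.json.gz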

GC logs

GC logs, which can be saved via the about:memory page, record information about live JS objects. As JS is a major source of memory overhead, having a detailed understanding of where the memory is going is very useful. There are a number of scripts available to help analyze these log files.

One interesting variation is to run with MOZ_GC_LOG_SIZE=1, which leverages the DevTools UbiNode work to record the size of individual objects in the GC log. You can then use those logs with the dominator-tree-based analysis in dom_tree.py to get fine-grained information about the memory overhead of chrome JS, down to the individual function level.
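
For example, size-annotated GC logs can be captured and analyzed roughly like this (the log file name is illustrative and the exact dom_tree.py invocation may differ):

# Record per-object sizes in GC logs
MOZ_GC_LOG_SIZE=1 ./mach run
# Save the content process GC log via about:memory, then run the dominator tree analysis
python dom_tree.py gc-edges.<pid>.log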

Metrics

Binary Section Sizes

Any writable data increases the overhead of each content process; this is particularly problematic on Linux and OSX. This data is tracked via section_size entries in build_metrics in perfherder.
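
As a rough local check, the writable sections of libxul can be inspected with standard binary tools (the object directory path is illustrative):

# The data and bss columns are writable and are therefore paid once per process
size <objdir>/dist/bin/libxul.so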

Are We Slim Yet Tests

The primary test suite used by MemShrink is Are We Slim Yet (AWSY). Metrics are available in perfherder under the awsy framework.

AWSY (ab)

Our main focus for Fission is the simpler about:blank SY-e10s(ab) test. This gives a less noisy baseline metric that allows us to focus on incremental improvements. This is essentially our best-case metric.

Metrics

Resident Unique is our main measure of success. It is the total amount of memory used by the content process that is not shared with other processes.

JavaScript is the memory used by our JavaScript engine, both for its internal state and for any loaded scripts. It is one of the largest contributors to memory overhead, has very little noise, and has been a focus for memory reduction.

Testing your changes

To run locally, use:

./mach awsy-test testing/awsy/awsy/test_base_memory_usage.py

To run on try for comparisons, use:

hg up base_revision
./mach try -b o -p linux64 -u awsy-base-e10s -t none --rebuild 5
hg up new_revision
./mach try -b o -p linux64 -u awsy-base-e10s -t none --rebuild 5

AWSY (sy)

The original test suite, SY-e10s(sy), is a stress test that loads 100 pages into 30 tabs, three times over, and measures memory at various points. It is useful for detecting regressions in startup memory usage, longer-term leaks, leaked windows, and so on. This is essentially our worst-case metric.

Metrics

after tabs opened is one of several data points gathered; it is the most relevant to this project.

Testing your changes

To run locally, use:

./mach awsy-test

To run on try for comparisons, use:

hg up base_revision
./mach try -b o -p linux64 -u awsy-e10s -t none --rebuild 5
hg up new_revision
./mach try -b o -p linux64 -u awsy-e10s -t none --rebuild 5

AWSY (tp6)

An updated test suite, SY-e10s(tp6), loads the entire tp6 pageset and measures memory at various points. The tp6 pageset uses mitmproxy to simulate live connections, which will allow us to properly test out-of-process iframes.

Metrics

We are not currently tracking these metrics; they will become more interesting once we enable Fission.

Testing your changes

To run locally, use:

./mach awsy-test --tp6 --preferences testing/awsy/conf/tp6-prefs.json

To run on try for comparisons, use:

hg up base_revision
./mach try fuzzy --rebuild 5 -q "'linux64/opt-awsy-tp6"
hg up new_revision
./mach try fuzzy --rebuild 5 -q "'linux64/opt-awsy-tp6"

Bug Tracking

Tracking: bug 1436250

Triage

We use the [overhead] whiteboard tag to flag items for triage. Additionally, any bug in the bug 1436250 dependency tree is generally triaged. The triage process attempts to estimate the impact a bug will have on reducing the overhead of a content process. If we think a bug will reduce per-process memory usage by 30KB, we update the tag with the expected win: [overhead:30K]. We try to use a reasonable guess for this value; it does not need to be exact.

Progress

Measurements from the beginning of each quarter.

Linux

Quarter     Resident Unique   JS
Q2 2018     34.3MB            8.5MB
Q3 2018     33.6MB            8.0MB
Q4 2018     20.7MB            5.3MB
Q4.5 2018   17.8MB            4.6MB
Q1 2019*    19.5MB            5.0MB
Q2 2019     19.5MB            4.1MB
Q3 2019     19.5MB            4.1MB

Windows

Quarter     Resident Unique   JS      Resident Unique (WebRender)
Q2 2018     21.5MB            8.5MB   N/A
Q3 2018     20.8MB            8.0MB   N/A
Q4 2018     15.0MB            5.3MB   17.1MB
Q4.5 2018   15.0MB            4.8MB   17.1MB
Q1 2019*    17.8MB            5.1MB   14.5MB
Q2 2019     15.1MB            4.1MB   14.8MB
Q3 2019     15.1MB            4.1MB   14.8MB

OSX

Quarter     Resident Unique   JS
Q2 2018     32.7MB            8.5MB
Q3 2018     31.7MB            8.0MB
Q4 2018     21.4MB            5.3MB
Q4.5 2018   20.4MB            4.6MB
Q1 2019*    25.2MB            5.0MB
Q2 2019     24.6MB            4.1MB
Q3 2019     24.6MB            4.1MB
  • Q1 2019 note: We switched to VMs with GPUs to better simulate real-world conditions. This resulted in apparent regressions across the board.