JavaScript:SpiderMonkey:GC Futures

JS GC Futures

Tracking bug is bug 505308

Brain-dump of work items:

  • Speed up allocator
    • Remove reserved objects and doubles stuff in the tracer bug 508140
    • Use one single GC heap chunk, avoiding frequent mmap and malloc calls bug 508707
  • Speed up collector
    • Allocate short-enough strings from GC heap, not malloc heap bug 402614
    • Based on Gregor's stats, consider making 32-byte and 64-byte JSObjects so that most cases, all but large objects, need no dslots bug 508357
    • Schedule GC based on memory pressure bug 506125
  • Conservative stack scanning to avoid temp-value rooting overheads? bug 516832
  • Go to beach

Compartments and per-compartment GC

Motivation

  • More robust thread safety API
  • Reduce GC pauses
  • Minor performance win from eliminating object locking

Threading. In short, SM's support for sharing objects among threads has not gotten the maintenance attention it needs. It's accumulating bugs. At the same time, it is now widely acknowledged that shared-everything threads are a problematic programming model. Better ideas (like HTML5 Workers) are emerging. We need new JSAPI support for shared-nothing threads.

GC. Meanwhile, our GC performance is not where we want it. There are still direct optimization opportunities. Beyond that, we want to be able to perform GC on a single browser tab, for shorter pause times when only one tab is active.

What these two things have in common is that they both involve dividing the JSRuntime's object graph into separate, mostly-isolated compartments, such that every object is in a compartment, and an ordinary object cannot have a direct reference to an object in a different compartment. (We will support special wrapper objects that provide transparent access to objects in other compartments.)

The changes we need to make for single-tab GC will simultaneously make it easier to add assertions enforcing the new rules about sharing objects across threads. Conveniently, in the browser, wrapper objects already exist in almost all the right places: they sit on a critical security boundary, and the wrappers impose various security policies.

Plans

These plans are limited to bugs that are on the critical path to per-tab GC, which we hope will dramatically reduce pause time.

Name                                  Size (weeks)             Assigned to (guess)
Benchmarks                            1                        gwagner
Benchmark automation                  0.5                      jorendorff
Compartments and wrappers API         1                        jorendorff
Compartmentalize Gecko                3                        jorendorff
GCSubheaps                            2-3                      gwagner
MT wrappers                           3                        gal
Lock-free allocation and slot access  1                        gal
Compartmental GC                      done by the end of June  gwagner

Benchmarks — Gregor is building a GC benchmark suite. We need it checked in. (bug 548388)

Benchmark automation — We need to be able to turn a crank and get GC performance numbers. Talos needs to run this automatically. This means we need to be able to measure GC performance in opt builds. Total time spent in GC and max pause time are cheap enough to collect. (bug 561486)
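The two metrics named above, total time spent in GC and max pause time, need little more than a pair of counters. The following is a minimal sketch, not the engine's instrumentation: GCTimingStats, recordPause, and AutoGCTimer are hypothetical names invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical accumulator for the two cheap metrics: total GC time and
// the longest single pause. Not part of SpiderMonkey.
struct GCTimingStats {
    std::chrono::microseconds total{0};
    std::chrono::microseconds maxPause{0};
    uint64_t collections = 0;

    void recordPause(std::chrono::microseconds pause) {
        assert(pause.count() >= 0);
        total += pause;
        maxPause = std::max(maxPause, pause);
        ++collections;
    }
};

// RAII timer: measures one collection and records it when the scope exits.
class AutoGCTimer {
    GCTimingStats &stats_;
    std::chrono::steady_clock::time_point start_;
  public:
    explicit AutoGCTimer(GCTimingStats &stats)
      : stats_(stats), start_(std::chrono::steady_clock::now()) {}
    ~AutoGCTimer() {
        auto elapsed = std::chrono::steady_clock::now() - start_;
        stats_.recordPause(
            std::chrono::duration_cast<std::chrono::microseconds>(elapsed));
    }
    AutoGCTimer(const AutoGCTimer &) = delete;
    AutoGCTimer &operator=(const AutoGCTimer &) = delete;
};
```

Placing an AutoGCTimer at the top of the collection entry point would cost two clock reads per GC, which is why these numbers are plausible to collect in opt builds.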

Compartments and wrappers API — Add API for creating a global object and associating it with a compartment. Add minimal API for a special kind of object that is allowed to hold a strong reference across compartments (a wrapper object). Add assertions within the engine that there are no direct references across compartments. Add assertions at API boundaries that all the gc-things provided as arguments come from the same compartment. (bug 563099)
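The in-engine assertion about cross-compartment references could look roughly like the check below. This is a sketch under assumptions: Compartment, Obj, and EdgeIsAllowed are hypothetical stand-ins, only object-to-object edges are modeled, and shared things in the default compartment (interned strings and the like) are left out.

```cpp
#include <cassert>

// Hypothetical stand-ins for engine types; the real layout is not specified.
struct Compartment {};

struct Obj {
    Compartment *compartment;
    bool isWrapper;  // true for the special cross-compartment wrapper kind
};

// Returns true if `from` may hold a strong reference to `to`:
// same compartment always, different compartments only via a wrapper.
inline bool EdgeIsAllowed(const Obj &from, const Obj &to) {
    if (from.compartment == to.compartment)
        return true;
    return from.isWrapper;
}

// What a debug-build assertion at a slot write might do (assumption).
inline void AssertEdgeIsAllowed(const Obj &from, const Obj &to) {
    assert(EdgeIsAllowed(from, to) && "direct cross-compartment reference");
}
```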

Compartmentalize Gecko — Use the compartment API to divide up Gecko so that objects with different principals are always in different compartments. Use the wrapper API in XPConnect for our security wrappers. Fix what breaks. In particular, wrappers and the structured clone algorithm will need to copy strings and doubles instead of passing them freely from one compartment to another. (bug 563106)

GCSubheaps — Factor GC-related code into a class, js::GCHeap (bug 556324). Carve out a second class, GCSubheap, so that a single GCHeap can have several GCSubheaps, each of which handles its own set of VM pages from which individual GC things may be allocated. Give each compartment its own GCSubheap. Allocate every object, double, and string from the GCSubheap for the compartment where it will live.
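One way to picture the GCHeap/GCSubheap split is sketched below. Only the two class names come from the plan; the members, the 4096-byte page size, and the bump allocator are assumptions chosen to keep the example short, not the intended design.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Each subheap owns its own pages; things are carved out of them.
class GCSubheap {
    static constexpr size_t PageSize = 4096;  // assumption for the sketch
    std::vector<std::unique_ptr<unsigned char[]>> pages_;
    size_t usedInLastPage_ = PageSize;  // forces a page on first allocation
  public:
    // Bump-allocates from this subheap's pages, rounding up to 8 bytes.
    void *allocate(size_t nbytes) {
        nbytes = (nbytes + 7) & ~size_t(7);
        assert(nbytes > 0 && nbytes <= PageSize);
        if (usedInLastPage_ + nbytes > PageSize) {
            pages_.emplace_back(new unsigned char[PageSize]);
            usedInLastPage_ = 0;
        }
        void *p = pages_.back().get() + usedInLastPage_;
        usedInLastPage_ += nbytes;
        return p;
    }

    // True if `p` points into one of this subheap's pages.
    bool contains(const void *p) const {
        uintptr_t addr = reinterpret_cast<uintptr_t>(p);
        for (const auto &page : pages_) {
            uintptr_t base = reinterpret_cast<uintptr_t>(page.get());
            if (addr >= base && addr < base + PageSize)
                return true;
        }
        return false;
    }

    size_t pageCount() const { return pages_.size(); }
};

// A single heap holding several subheaps, one per compartment.
class GCHeap {
    std::vector<std::unique_ptr<GCSubheap>> subheaps_;
  public:
    GCSubheap *newSubheap() {
        subheaps_.emplace_back(new GCSubheap());
        return subheaps_.back().get();
    }
    size_t subheapCount() const { return subheaps_.size(); }
};
```

Because every page belongs to exactly one subheap, mapping a gc-thing back to its compartment is a page lookup rather than a per-thing field.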

MT wrappers — Implement an automatically-spreading membrane of proxy objects that allow one thread to access objects from another compartment that is running in another thread. There is some risk here because it's unclear how this should work regarding objects on the scope chain (global objects, Call and Block objects). See bug 558866 comments 1-4. See also bug 566951, which has a prototype patch that addresses the scope chain issue.

Lock-free allocation and slot access — Remove locking from allocation paths. Remove scope locking everywhere. (bug 558866)

Compartmental GC — Support collecting garbage in one GCSubheap without walking the rest of the graph (bug 558861) and without stopping other threads.



Compartments and wrappers - API

Each runtime has a default compartment which contains interned strings, the empty string, +Inf, -Inf, NaN—and, in non-compartment-aware embeddings, everything else.

Each context has a current compartment, initially the default compartment, and normally the compartment of JS_GetScopeChain(cx). So, for example, js_Atomize(cx, name, strlen(name), 0) allocates the new string from cx->currentCompartment().

A tricky consequence of this is that in an API call JS_SetProperty(cx, obj, name, vp), obj must be in cx->currentCompartment(), because JS_SetProperty calls js_Atomize to create the property id. Otherwise obj could end up with property ids that reside in cx->currentCompartment() rather than its own compartment: a violation of the rules that will lead to a crash in GC.

When does a context's current compartment need to change? Only when setting up a new compartment and when calling across compartment boundaries. The latter always happens in wrapper code. So this will be rare and we can require an API call and fall off trace when it happens.

JSCompartment *
JS_GetDefaultCompartment(JSRuntime *rt);

JSCompartment *
JS_NewCompartment(JSRuntime *rt, JSPrincipals *principals);

JSCompartment *
JS_GetCurrentCompartment(JSContext *cx);

void
JS_SetCurrentCompartment(JSContext *cx, JSCompartment *compartment);
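Wrapper code that crosses a compartment boundary would presumably pair each switch with a restore. A scoped guard makes that hard to get wrong. The sketch below mocks the two proposed calls so it compiles on its own; the struct layouts and the AutoSwitchCompartment helper are hypothetical.

```cpp
#include <cassert>

// Mock definitions so the sketch compiles on its own. In the engine these
// would be the proposed JSAPI entry points; the layouts are assumptions.
struct JSCompartment {};
struct JSContext {
    JSCompartment *compartment;
};

inline JSCompartment *JS_GetCurrentCompartment(JSContext *cx) {
    return cx->compartment;
}

inline void JS_SetCurrentCompartment(JSContext *cx, JSCompartment *compartment) {
    cx->compartment = compartment;
}

// Hypothetical RAII guard: enters `target` on construction and restores the
// previous compartment on destruction, even on early return.
class AutoSwitchCompartment {
    JSContext *cx_;
    JSCompartment *saved_;
  public:
    AutoSwitchCompartment(JSContext *cx, JSCompartment *target)
      : cx_(cx), saved_(JS_GetCurrentCompartment(cx)) {
        assert(target);
        JS_SetCurrentCompartment(cx_, target);
    }
    ~AutoSwitchCompartment() { JS_SetCurrentCompartment(cx_, saved_); }
    AutoSwitchCompartment(const AutoSwitchCompartment &) = delete;
    AutoSwitchCompartment &operator=(const AutoSwitchCompartment &) = delete;
};
```

A wrapper's call hook would construct the guard, forward the call to the wrappee, and let the destructor restore the caller's compartment.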

All existing APIs could just assert that cx->currentCompartment() agrees with all the arguments that happen to be gc-things. The precise rule is: each gc-thing passed in must either be in cx->currentCompartment() or be a string or double in cx->runtime->defaultCompartment.
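The precise rule translates into a small predicate. The sketch below uses mock types: ThingKind, GCThing, and ThingIsUsableFrom are hypothetical, and the field names only loosely mirror cx->currentCompartment() and cx->runtime->defaultCompartment.

```cpp
#include <cassert>

// Mock types for illustration; real engine layouts differ.
enum class ThingKind { Object, String, Double };

struct JSCompartment {};

struct GCThing {
    ThingKind kind;
    JSCompartment *compartment;
};

struct JSRuntime {
    JSCompartment *defaultCompartment;
};

struct JSContext {
    JSRuntime *runtime;
    JSCompartment *current;
};

// A gc-thing argument is acceptable if it lives in the context's current
// compartment, or if it is a string or double in the default compartment.
inline bool ThingIsUsableFrom(const JSContext *cx, const GCThing &thing) {
    if (thing.compartment == cx->current)
        return true;
    bool shareableKind = thing.kind == ThingKind::String ||
                         thing.kind == ThingKind::Double;
    return shareableKind &&
           thing.compartment == cx->runtime->defaultCompartment;
}
```

Each API entry point would then assert ThingIsUsableFrom(cx, arg) for every gc-thing argument in debug builds.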

GC roots will be per-compartment. This means JS_AddGCRootRT will add a root to the default compartment, which may be unexpected. To help embeddings get this right, GC should assert that each gc-thing pointed to by a root is in the expected compartment.
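The root check could be as simple as the loop below, run at the start of marking. Everything here is a hypothetical sketch: the per-compartment root vector, AddRoot, and RootsAreConsistent are illustrative names, not the planned data structures.

```cpp
#include <cassert>
#include <vector>

struct Compartment;

// Mock gc-thing that knows which compartment it lives in (assumption).
struct GCThing {
    Compartment *compartment;
};

// Each compartment keeps its own list of root locations.
struct Compartment {
    std::vector<GCThing **> roots;
};

inline void AddRoot(Compartment &c, GCThing **location) {
    assert(location);
    c.roots.push_back(location);
}

// Returns true if every non-null rooted thing lives in `c`. A debug build
// would assert this before marking from the roots.
inline bool RootsAreConsistent(const Compartment &c) {
    for (GCThing **location : c.roots) {
        GCThing *thing = *location;
        if (thing && thing->compartment != &c)
            return false;
    }
    return true;
}
```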

API functions related to the GC heap. Several API functions do something that involves the GC heap: JS_GC and friends; JS_SetGCCallback; JS_SetGCParameter; JS_TraceChildren and friends; JS_DumpHeap; JS_SetGCZeal. These will need to have a mode for collecting/walking the entire heap and a separate mode where they apply to just one compartment. TBD.

Wrapper API. Since the cross-compartment reference from a wrapper to the wrappee is so special, we will need API for it. TBD.

Emerging Invariants

This section describes invariants and rules which have emerged during initial development of the conservative GC and the compartments code. They are not likely to change, but still may.

  • The C stack is not scanned for GC roots when there are no contexts (suspended or otherwise) in requests on a given thread
  • When doing a single-compartment GC, only the current thread's stack is scanned (unless there are no contexts in requests on that thread)
  • A context's compartment is equal to JS_GetScopeChain(cx)->getCompartment(). A NULL scope chain indicates the default compartment.
  • Corollary: All non-default compartments have at least a global object.
  • Only one thread per compartment may be in a request at any given time
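The scope-chain invariant above can be written down as a single lookup. This is a sketch with mock types: the field names and the ContextCompartment helper are hypothetical, and only getCompartment and the default-compartment fallback come from the list.

```cpp
#include <cassert>

// Mock types for illustration; real engine layouts differ.
struct JSCompartment {};

struct JSObject {
    JSCompartment *compartment;
    JSCompartment *getCompartment() const { return compartment; }
};

struct JSRuntime {
    JSCompartment *defaultCompartment;
};

struct JSContext {
    JSRuntime *runtime;
    JSObject *scopeChain;  // what JS_GetScopeChain(cx) would return; may be null
};

// The context's compartment is the compartment of its scope chain; a null
// scope chain means the runtime's default compartment.
inline JSCompartment *ContextCompartment(const JSContext *cx) {
    assert(cx && cx->runtime);
    return cx->scopeChain ? cx->scopeChain->getCompartment()
                          : cx->runtime->defaultCompartment;
}
```

Read this way, the corollary follows directly: a context can only be in a non-default compartment through a scope chain object, so such a compartment must contain at least a global.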