JSThreadsAndGC

From MozillaWiki

Revision as of 13:51, 23 April 2010

Motivation

  • More robust thread safety API
  • Reduce GC pauses
  • Minor performance win from eliminating object locking

Threading. In short, SpiderMonkey's (SM's) support for sharing objects among threads has not gotten the maintenance attention it needs, and it is accumulating bugs. At the same time, it is now widely acknowledged that shared-everything threads are a problematic programming model. Better ideas (like HTML5 Workers) are emerging. We need new JSAPI support for shared-nothing threads.

GC. Our GC performance is also not where we want it. There are still direct optimization opportunities. Beyond that, we want to be able to perform GC on a single browser tab, for shorter pause times when only one tab is active.

What these two things have in common is that they both involve dividing the JSRuntime's object graph into separate, mostly-isolated compartments, such that every object is in a compartment, and an ordinary object cannot have a direct reference to an object in a different compartment. (We will support special wrapper objects that provide transparent access to objects in other compartments.)
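The compartment rule above can be sketched as a toy model. This is only an illustration under stated assumptions: `Object`, `Compartment`, `edgeAllowed`, `addRef`, and `assertIsolated` are hypothetical names invented here, not JSAPI or engine code.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical toy model; none of these names are SpiderMonkey API.
struct Compartment;

struct Object {
    Compartment* compartment = nullptr;  // every object lives in exactly one compartment
    bool isWrapper = false;              // only wrappers may point across compartments
    std::vector<Object*> refs;           // outgoing references
};

struct Compartment {
    std::vector<std::unique_ptr<Object>> objects;

    Object* newObject(bool isWrapper = false) {
        objects.push_back(std::make_unique<Object>());
        Object* obj = objects.back().get();
        obj->compartment = this;
        obj->isWrapper = isWrapper;
        return obj;
    }
};

// An edge is legal if it stays inside one compartment or starts at a wrapper.
bool edgeAllowed(const Object& from, const Object& to) {
    return from.isWrapper || from.compartment == to.compartment;
}

// Adds the reference only if it respects the rule; returns false otherwise.
bool addRef(Object& from, Object& to) {
    if (!edgeAllowed(from, to)) return false;
    from.refs.push_back(&to);
    return true;
}

// Debug-style check: walk one compartment and assert the invariant holds.
void assertIsolated(const Compartment& c) {
    for (const auto& obj : c.objects)
        for (const Object* to : obj->refs)
            assert(edgeAllowed(*obj, *to));
}
```

In this model an ordinary object in one compartment reaches a foreign object only by pointing at a local wrapper, which in turn holds the cross-compartment edge.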

The changes we need to make for single-tab GC will simultaneously make it easier to add assertions enforcing the new rules about sharing objects across threads. Conveniently, in the browser, wrapper objects already exist in almost all the right places: they sit on a critical security boundary and impose various security policies.

Per-tab GC

These plans are limited to bugs that are on the critical path to per-tab GC, which we hope will dramatically reduce pause time.

The next big step after that is generational mostly-copying GC. We want that too, but I think it is a similar amount of work again.

There will also be opportunities to optimize the GC independent of these major overhauls. I don't know what they all are. We'll find them as the benchmark suite gets finished and we start measuring. For example, Gregor noticed that we spend a lot of time in object finalization. Part of that time is to deal with custom finalizers, which most objects don't have. It might be faster if we allocated objects with custom finalizers in a separate GC arena. And Igor has suggested sweeping the non-custom objects in the background thread.
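The separate-arena idea might look roughly like the sketch below. It assumes a simple mark bit per thing and models each arena as a vector. `PlainThing`, `FinalizableThing`, `Arenas`, and `Finalizer` are hypothetical names for illustration only. Background-thread sweeping is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types; not SpiderMonkey's real GC-thing layout.
using Finalizer = void (*)(int id);

struct PlainThing { int id; bool marked; };
struct FinalizableThing { int id; bool marked; Finalizer finalize; };

struct Arenas {
    std::vector<PlainThing> plain;              // common case: no custom finalizer
    std::vector<FinalizableThing> finalizable;  // rare case: custom finalizer

    // Frees every unmarked thing and returns how many were freed.
    std::size_t sweep() {
        std::size_t before = plain.size() + finalizable.size();

        // Hot path: no per-object finalizer check at all.
        plain.erase(std::remove_if(plain.begin(), plain.end(),
                                   [](const PlainThing& t) { return !t.marked; }),
                    plain.end());

        // Cold path: run each dead thing's finalizer, then drop it.
        for (const FinalizableThing& t : finalizable) {
            if (!t.marked) {
                assert(t.finalize != nullptr);
                t.finalize(t.id);
            }
        }
        finalizable.erase(
            std::remove_if(finalizable.begin(), finalizable.end(),
                           [](const FinalizableThing& t) { return !t.marked; }),
            finalizable.end());

        return before - (plain.size() + finalizable.size());
    }
};
```

Whether this split is actually faster is exactly the kind of question the benchmark suite would have to answer; the sketch only shows where the finalizer check disappears from the common path.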

Name                                    Size (weeks)    Assigned to (guess)
Benchmarks                              1?              jorendorff/gwagner
Compartments and wrappers API           0.5             jorendorff
Compartment assertions                  1               jorendorff
GCSubheaps                              ?               gwagner
MT wrappers                             3               jorendorff
Lock-free allocation and slot access    1               jorendorff
Compartmental GC                        ?               gwagner

Benchmarks — In bug 548388, Gregor is building a GC benchmark suite. We need a way to turn a crank and get GC performance numbers from that. sayrer should get mail when we make these numbers move. This means we need to be able to measure GC performance in opt builds. Total time spent in GC and max pause time are cheap enough to collect.
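As a rough illustration of how cheap the two numbers are to collect, the sketch below accumulates total GC time and maximum pause with one clock read on either side of a collection. `GCStats` and `AutoGCTimer` are hypothetical names, not existing engine classes.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// Hypothetical accumulator for the two cheap metrics named above.
struct GCStats {
    using Duration = std::chrono::nanoseconds;

    Duration total{0};     // total time spent in GC
    Duration maxPause{0};  // longest single collection
    unsigned collections = 0;

    void record(Duration pause) {
        assert(pause.count() >= 0);
        total += pause;
        maxPause = std::max(maxPause, pause);
        ++collections;
    }
};

// RAII helper: times the enclosing scope and records it as one GC pause.
class AutoGCTimer {
  public:
    explicit AutoGCTimer(GCStats& stats)
        : stats_(stats), start_(std::chrono::steady_clock::now()) {}

    ~AutoGCTimer() {
        stats_.record(std::chrono::duration_cast<GCStats::Duration>(
            std::chrono::steady_clock::now() - start_));
    }

    AutoGCTimer(const AutoGCTimer&) = delete;
    AutoGCTimer& operator=(const AutoGCTimer&) = delete;

  private:
    GCStats& stats_;
    std::chrono::steady_clock::time_point start_;
};
```

A collection would be wrapped in a scope that declares an `AutoGCTimer` before doing the work, so the cost in an opt build is two clock reads per GC.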

Compartments and wrappers API — Add API for creating a global object and associating it with a compartment. Use it in Gecko everywhere we create a global. Add minimal API for a special kind of object that is allowed to hold a strong reference across compartments (a wrapper object). Use it in XPConnect for our security wrappers.

Compartment assertions — Add assertions that there are no direct references across compartments, both at API boundaries and within the engine. Fix what breaks. In particular, wrappers and the structured clone algorithm will need to copy strings and doubles instead of passing them freely from one compartment to another.

GCSubheaps — Factor GC-related code into a class, js::GCHeap (bug 556324). Carve out a second class, GCSubheap, so that a single GCHeap can have several GCSubheaps, each of which handles its own set of VM pages from which individual GC things may be allocated. Give each compartment its own GCSubheap. Allocate every object, double, and string from the GCSubheap for the compartment where it will live.
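One possible shape for the heap/subheap split is sketched below. It is an assumption-laden toy: the class names follow the text, but the bump allocator, the 4096-byte page size, and the `owns` helper are invented here for illustration and say nothing about how bug 556324 actually does it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

constexpr std::size_t kPageSize = 4096;  // assumed page size for the sketch
constexpr std::size_t kAlign = alignof(std::max_align_t);

// A subheap owns its own pages and bump-allocates from the newest one.
class GCSubheap {
  public:
    void* allocate(std::size_t bytes) {
        assert(bytes > 0 && "zero-sized allocations are not modeled");
        bytes = (bytes + kAlign - 1) & ~(kAlign - 1);  // round up to alignment
        if (bytes > kPageSize) return nullptr;          // large things out of scope
        if (pages_.empty() || used_ + bytes > kPageSize) {
            pages_.push_back(std::make_unique<unsigned char[]>(kPageSize));
            used_ = 0;
        }
        void* p = pages_.back().get() + used_;
        used_ += bytes;
        return p;
    }

    // True if p points into one of this subheap's pages.
    bool owns(const void* p) const {
        auto addr = reinterpret_cast<std::uintptr_t>(p);
        for (const auto& page : pages_) {
            auto base = reinterpret_cast<std::uintptr_t>(page.get());
            if (addr >= base && addr < base + kPageSize) return true;
        }
        return false;
    }

    std::size_t pageCount() const { return pages_.size(); }

  private:
    std::vector<std::unique_ptr<unsigned char[]>> pages_;
    std::size_t used_ = 0;  // bytes used in the newest page
};

// The heap hands out one subheap per compartment.
class GCHeap {
  public:
    GCSubheap* newSubheap() {
        subheaps_.push_back(std::make_unique<GCSubheap>());
        return subheaps_.back().get();
    }
    std::size_t subheapCount() const { return subheaps_.size(); }

  private:
    std::vector<std::unique_ptr<GCSubheap>> subheaps_;
};
```

Because pages are never shared between subheaps, the page an address falls in identifies the compartment a GC thing belongs to.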

MT wrappers — Implement an automatically-spreading membrane of proxy objects that allow one thread to access objects from another compartment that is running in another thread. There is some risk here because it's unclear how this should work regarding objects on the scope chain (global objects, Call and Block objects). See bug 558866 comments 1-4.
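The "automatically-spreading" part can be illustrated with a single-threaded sketch: anything read through a wrapper is itself wrapped before it is handed back, so the membrane grows on demand. All names here (`Object`, `Compartment::wrap`, `Compartment::get`, `wrapperCache`) are hypothetical, and cross-thread dispatch and the scope-chain question are deliberately left out.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Compartment;

// Hypothetical object model for the sketch.
struct Object {
    Compartment* home = nullptr;
    Object* target = nullptr;               // non-null only for wrappers
    std::map<std::string, Object*> props;   // same-compartment references
};

struct Compartment {
    std::vector<std::unique_ptr<Object>> heap;
    std::map<Object*, Object*> wrapperCache;  // foreign target -> local wrapper

    Object* newObject() {
        heap.push_back(std::make_unique<Object>());
        heap.back()->home = this;
        return heap.back().get();
    }

    // Bring a reference into this compartment, wrapping it if it is foreign.
    Object* wrap(Object* obj) {
        if (obj == nullptr || obj->home == this) return obj;
        if (obj->target != nullptr) return wrap(obj->target);  // never wrap a wrapper
        auto it = wrapperCache.find(obj);
        if (it != wrapperCache.end()) return it->second;       // one wrapper per target
        Object* w = newObject();
        w->target = obj;
        wrapperCache[obj] = w;
        return w;
    }

    // Property read as seen from this compartment; results cross the membrane.
    Object* get(Object* obj, const std::string& name) {
        assert(obj != nullptr && obj->home == this);
        Object* real = obj->target ? obj->target : obj;
        auto it = real->props.find(name);
        return it == real->props.end() ? nullptr : wrap(it->second);
    }
};
```

Wrapping a wrapper whose target already lives in the requesting compartment returns the original object, which keeps identity stable when a reference makes a round trip.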

Lock-free allocation and slot access — Remove locking from allocation paths. Remove scope locking everywhere. (bug 558866)

Compartmental GC — Support collecting garbage in one GCSubheap without walking the rest of the graph (bug 558861) and without stopping other threads.
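A minimal mark-and-sweep sketch of collecting one compartment in isolation follows. It assumes, as a design choice of this sketch rather than something the plan states, that each compartment keeps a list of its cells that are held by wrappers elsewhere and treats them conservatively as roots. `Cell`, `Compartment`, `incoming`, and `collect` are hypothetical names. Running concurrently with other threads is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical GC cell; edges stay inside one compartment by invariant.
struct Cell {
    bool marked = false;
    std::vector<Cell*> edges;
};

struct Compartment {
    std::vector<std::unique_ptr<Cell>> cells;
    std::vector<Cell*> roots;     // local roots (stack, globals)
    std::vector<Cell*> incoming;  // local cells held by wrappers in other compartments

    Cell* alloc() {
        cells.push_back(std::make_unique<Cell>());
        return cells.back().get();
    }

    // Collects only this compartment and returns the number of cells freed.
    std::size_t collect() {
        // Mark: start from local roots plus conservatively-live incoming cells.
        std::vector<Cell*> work(roots);
        work.insert(work.end(), incoming.begin(), incoming.end());
        while (!work.empty()) {
            Cell* c = work.back();
            work.pop_back();
            assert(c != nullptr);
            if (c->marked) continue;
            c->marked = true;
            for (Cell* e : c->edges) work.push_back(e);  // never leaves this compartment
        }

        // Sweep: drop unmarked cells, then clear mark bits for the next cycle.
        std::size_t before = cells.size();
        cells.erase(std::remove_if(cells.begin(), cells.end(),
                                   [](const std::unique_ptr<Cell>& c) { return !c->marked; }),
                    cells.end());
        for (auto& c : cells) c->marked = false;
        return before - cells.size();
    }
};
```

The marking loop never touches another compartment's cells, which is what the no-direct-cross-compartment-reference rule buys; the cost is that garbage kept alive only by a dead foreign wrapper survives until that wrapper's own compartment is collected.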