JavaScript:ActionMonkey:Stage 0 Whiteboard
Stage 0: Replace SpiderMonkey's GC (jsgc) with Tamarin's GC (MMgc).
bug 387012 – ActionMonkey stage 0 tracking bug
The plan is to hollow out jsgc.c and replace it with a new implementation based on MMgc. The API in jsgc.h will be almost entirely preserved; but see "Features we will lose" below.
We'll use MMgc in non-incremental mode to start with. (It's easier. Avoid premature optimization.)
We'll build SpiderMonkey as C++, and some key types (JSObject, JSString) will become classes. This is unavoidable because MMgc's API for finalization is to subclass MMgc::GCFinalizedObject.
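To make the finalization point concrete, here is a minimal sketch of how a finalizable object type could look once it is a C++ class. It does not use the real MMgc headers: `sketch::GCFinalizedObject`, `JSObjectSketch`, `FinalizeHook`, `SweepAll`, and `CountFinalize` are all hypothetical stand-ins, and the only assumption carried over from MMgc is that the collector finalizes an object by running its virtual destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for MMgc::GCFinalizedObject (NOT the real header). Assumption:
// the collector finalizes an object by running its virtual destructor.
namespace sketch {
class GCFinalizedObject {
public:
    virtual ~GCFinalizedObject() {}
};
}

// Hypothetical stand-in for a class-level finalize hook.
typedef void (*FinalizeHook)(void *priv);

// A JSObject-like class: its destructor forwards to the finalize hook.
class JSObjectSketch : public sketch::GCFinalizedObject {
public:
    JSObjectSketch(FinalizeHook hook, void *priv) : hook_(hook), priv_(priv) {}
    virtual ~JSObjectSketch() {
        if (hook_)
            hook_(priv_);
    }
private:
    FinalizeHook hook_;
    void *priv_;
};

// Toy "sweep": destroy every object the collector found unreachable.
inline void SweepAll(std::vector<sketch::GCFinalizedObject *> &dead) {
    for (size_t i = 0; i < dead.size(); ++i)
        delete dead[i];
    dead.clear();
}

// Example hook: counts how many objects were finalized.
static void CountFinalize(void *priv) {
    ++*static_cast<int *>(priv);
}
```

The point of the sketch is only that the finalizer becomes a destructor, which is why the affected types have to be classes rather than plain C structs.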
Features we will lose
JS_THREADSAFE - We will lose the threading model where a single JSRuntime has multiple threads, each with its own JSContext. (Firefox doesn't use this.)
ad hoc marking in JSGC_MARK_END - SpiderMonkey embedders can ("for API compatibility", according to a comment in jsgc.c) call JS_CallTracer from a JSGCCallback when it receives the JSGC_MARK_END event. I think the idea is something similar to JS_SetExtraGCRoots(). Firefox does not use this; we will not try to preserve it. It is likely to break.
.plan
- bug 387938 – Modify MMgc to support ActionMonkey stage 0.
- bug 387935 – Always build JS as C++. (done)
- bug 387034 – Get the build system to build Tamarin and link JS against MMgc.
- Get at least sketchy answers to most of the questions below.
- Do a sprint with Mardak and Brendan (if he's available) to reimplement jsgc.cpp using MMgc.
- Debug and grow wise.
General issues
If there's anywhere a GCThing has pointers to non-GC-managed memory that contains pointers back into GC-managed memory, we have a problem. Exact GC lets you do that (you know how to traverse the non-GC-managed memory). MMgc doesn't.
I don't know that SpiderMonkey does this anywhere. If it does, one fix is to change those data structures to live in GC-managed memory. So the fix itself is easy: finding the offenders is the interesting bit.
XPConnect garbage-collects XPCWrappedNative objects and its own data structures. This needs some more research. (Brendan thinks this won't affect stage 0 work.)
Specific jsgc.h APIs
What follows is a dump of everything exposed through jsgc.h. Each item is rated * (one star, an easy exercise); ** (two stars, fun little puzzle); *** (three stars, hmmmm, that's interesting). The ratings are jorendorff's guesses though. The wildest, least-educated guesses are marked with ?.
GCX_OBJECT...GCF_MUTABLE, js_GetGCThingFlags - **
- Where do we put these flags? They are no longer needed for the GC itself, but non-GC-related functionality has been piggybacked on these flags, so we can't just get rid of them.
- GCF_MARK is not used outside the garbage collector, but GCF_MUTABLE, GCF_LOCK, and GCF_SYSTEM (and maybe the type bits) are used (and not just read access--there's code outside jsgc.c that actually twiddles these bits).
- Try putting them in the revised JSObject, JSString, etc. data structures that subclass the MMgc GCObject types. /be
- We can overallocate by a byte and put the flags bits alongside the object itself. That single byte costs us 8 bytes... but actually I'm leaning toward doing this, for now, silly as it is. It lets us leave the existing API alone, macros and all. We'll worry about all sorts of speed/space concerns in the next round.
- Right, essentially the same as my suggestion above. /be
- The GCF_LOCK flag is used to pin NaN, +-inf, and the "" empty string, so perhaps create a GCRoot to point to these items. See js_LockGCThing below.
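The overallocation idea can be sketched as follows. This is an illustration under stated assumptions, not the planned implementation: `std::malloc` stands in for the MMgc allocator, the flags byte is placed in an 8-byte header in front of the object (the notes do not say whether it goes before or after), and `NewThingWithFlags`, `GetThingFlags`, `FreeThing`, and the `SKETCH_GCF_*` values are hypothetical names and numbers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// The header is padded to 8 bytes so the object behind it stays 8-byte
// aligned; this is why one flags byte "costs us 8 bytes".
const size_t kFlagHeaderSize = 8;

// Illustrative values only, not the real jsgc.h constants.
const uint8_t SKETCH_GCF_LOCK = 0x01;
const uint8_t SKETCH_GCF_SYSTEM = 0x02;

// Stand-in for js_NewGCThing: malloc substitutes for the GC allocator.
inline void *NewThingWithFlags(size_t nbytes, uint8_t flags) {
    uint8_t *raw = static_cast<uint8_t *>(std::malloc(kFlagHeaderSize + nbytes));
    if (!raw)
        return nullptr;
    raw[0] = flags;                 // flags live in the first header byte
    return raw + kFlagHeaderSize;   // callers only ever see the object
}

// Stand-in for js_GetGCThingFlags: returns a pointer so that code outside
// the GC can keep reading and twiddling the bits.
inline uint8_t *GetThingFlags(void *thing) {
    return static_cast<uint8_t *>(thing) - kFlagHeaderSize;
}

// Releases a thing allocated by NewThingWithFlags.
inline void FreeThing(void *thing) {
    std::free(static_cast<uint8_t *>(thing) - kFlagHeaderSize);
}
```

Because the accessor still hands back a pointer to a flags byte, existing macros that set or clear bits through it would not need to change.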
js_GetGCStringRuntime - *
- GC::GetGC(const void *), then (a) decrement by some offset; or (b) just give the GC a pointer back to the runtime (that is, make a subclass of MMgc::GC that contains a pointer back to the JSRuntime, and use that).
- There's only one JSRuntime in Firefox and other Gecko apps (used by XPConnect), so just make a singleton pointer associated with the GC instance. /be
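Option (b) might look roughly like this. The sketch does not include MMgc: `sketch::GC` is a stand-in whose `GetGC` simply returns one global instance (in the spirit of the singleton suggestion), and `JSGCSketch`, `JSRuntimeSketch`, and `GetGCStringRuntime` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-in for JSRuntime.
struct JSRuntimeSketch {
    int id;
};

namespace sketch {
// Stand-in for MMgc::GC (NOT the real class). The real GC::GetGC maps a
// GC-allocated pointer to its owning GC; here there is just one instance.
class GC {
public:
    virtual ~GC() {}
    static GC *GetGC(const void *) { return instance; }
    static GC *instance;
};
GC *GC::instance = nullptr;
}

// Option (b): a GC subclass that carries a pointer back to the runtime.
class JSGCSketch : public sketch::GC {
public:
    explicit JSGCSketch(JSRuntimeSketch *rt) : runtime(rt) {}
    JSRuntimeSketch *runtime;
};

// Stand-in for js_GetGCStringRuntime: string -> GC -> runtime.
inline JSRuntimeSketch *GetGCStringRuntime(const void *str) {
    return static_cast<JSGCSketch *>(sketch::GC::GetGC(str))->runtime;
}
```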
GC_POKE - *
- No-op for now. Its effect is read only by js_GC currently, so since that function is gutted you can gut this use of rt->gcPoke. /be
- The MMgc equivalent is DWB() and its ilk. In incremental mode, these are required. In non-incremental mode, they're only necessary if a finalizer might "resurrect" an object (that is, cause an unmarked object to become reachable). jsgc has never allowed this, so it's OK. -jto
js_ChangeExternalStringFinalizer - *
- External strings can be implemented by a JSExternalString class with a destructor that consults the table of string finalizers.
- Emulate this on top of MMgc using a virtual method on JSString, which inherits from GCFinalizedObject. /be
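A rough sketch of the destructor-consults-the-table idea follows. Everything here is an assumption made for illustration: the table size, the `StringFinalizeOp` signature, the `FinalizedBase` stand-in for GCFinalizedObject, and the names `ChangeExternalStringFinalizer`, `JSExternalStringSketch`, `CountingFinalizer`, and `gFinalizedCount`. The replace-old-with-new lookup is modeled only loosely on the existing function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical finalizer signature; the real one takes more context.
typedef void (*StringFinalizeOp)(const char *chars);

const int kFinalizerSlots = 8;  // illustrative table size
static StringFinalizeOp gFinalizers[kFinalizerSlots] = {};

// Replace oldop with newop in the table. Passing nullptr as oldop claims
// the first free slot. Returns the slot index, or -1 if oldop is absent.
inline int ChangeExternalStringFinalizer(StringFinalizeOp oldop,
                                         StringFinalizeOp newop) {
    for (int i = 0; i < kFinalizerSlots; ++i) {
        if (gFinalizers[i] == oldop) {
            gFinalizers[i] = newop;
            return i;
        }
    }
    return -1;
}

// Stand-in for MMgc::GCFinalizedObject (NOT the real header).
class FinalizedBase {
public:
    virtual ~FinalizedBase() {}
};

// External string: remembers its slot and consults the table when the
// collector destroys it.
class JSExternalStringSketch : public FinalizedBase {
public:
    JSExternalStringSketch(const char *chars, int type)
        : chars_(chars), type_(type) {}
    virtual ~JSExternalStringSketch() {
        StringFinalizeOp op = gFinalizers[type_];
        if (op)
            op(chars_);
    }
private:
    const char *chars_;
    int type_;
};

// Example finalizer for experimentation: counts how often it ran.
static int gFinalizedCount = 0;
static void CountingFinalizer(const char *) { ++gFinalizedCount; }
```

Looking the finalizer up at destruction time, rather than caching it in the string, means a later table change affects strings that already exist.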
js_InitGC...js_MapGCRoots - *
- Core features of MMgc. There's no API in MMgc for enumerating roots, but we can either cheat (via #define private protected) or keep our own list outside of MMgc. Anyway--not hard.
- Defer for now. /be
JSPtrTable, js_RegisterCloseableIterator, JSGCCloseState, js_RegisterGenerator, js_RunCloseHooks - *
- These will become no-ops. All this has to do with iterator and generator cleanup, but these hooks are going away. See bug 380469. (Related bug: bug 349272.)
- No-op'ing should not cause leaks since (AFAIK) no chrome JS uses generators. /be
JSGCThing - **
- This can probably go away. It's mentioned outside jsgc.c in two places: (a) in the context of weakRoots.newborn, which I assume we'll keep, since the newborn guarantee is JSAPI-visible; and (b) in the declaration of JSContext, where they won't be needed anymore.
- Both JSGCThing and cx->weakRoots should be removed. The latter should not be necessary given conservative stack scanning. /be
GC_NBYTES_MAX, GC_NUM_FREELISTS, GC_FREELIST_NBYTES, GC_FREELIST_INDEX - *
- These can just go away.
js_NewGCThing - *
- The only thing to worry about is the flags.
js_LockGCThing, js_LockGCThingRT, js_UnlockGCThingRT - **
- This is the pinning API. Can be reimplemented on top of MMgc rooting. /be
- "make one big root for everything you can keep track of - one GCRoot for each runtime." Reuse the "constants/pinning" GCRoot for anything else that needs locking?
- See also: JS_LockGCThing API doc.
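One way the "one big root per runtime" idea could back the pinning API is a lock-count table owned by that root. The sketch below is a guess at the shape only: `RuntimeRootSketch` is a hypothetical stand-in for a per-runtime MMgc::GCRoot, and it assumes lock calls nest (a thing locked twice must be unlocked twice), which should be checked against the JS_LockGCThing documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-in for the single per-runtime GCRoot. It records
// how many times each thing has been locked.
class RuntimeRootSketch {
public:
    // Stand-in for js_LockGCThingRT: pin the thing (null is ignored).
    void Lock(void *thing) {
        if (thing)
            ++locks_[thing];
    }

    // Stand-in for js_UnlockGCThingRT: drop one lock; unpin at zero.
    void Unlock(void *thing) {
        std::map<void *, size_t>::iterator it = locks_.find(thing);
        if (it == locks_.end())
            return;                 // not locked: nothing to do
        if (--it->second == 0)
            locks_.erase(it);
    }

    // True while at least one lock is outstanding.
    bool IsPinned(void *thing) const {
        return locks_.count(thing) != 0;
    }

private:
    std::map<void *, size_t> locks_;
};
```

A caveat tied to the general issue noted earlier: a `std::map` keeps its nodes in ordinary heap memory, so a real version would need the table to live in memory that the root actually covers, or the collector would never see the pinned pointers.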
js_IsAboutToBeFinalized - *
- We'll have to lay the hack on pretty thick to get this without modifying MMgc source code, but it can be done.
- Or modify MMgc -- that can be done too in this stage 0 work, I think. /be
IS_GC_MARKING_TRACER - *
- Unchanged. This is an undocumented feature of the trace API. js_GC will do an exact trace followed by a call to MMgc::GC::Collect(). During that trace, this macro returns true.
JSTRACE_FUNCTION...JSTRACE_XML - *
- Unaffected. The tracing API uses these.
JS_IS_VALID_TRACE_KIND - *
- Unaffected.
js_CallValueTracerIfGCThing - *
- Likely unaffected.
js_TraceStackFrame, js_TraceRuntime, js_TraceContext - *
- Probably unaffected.
JSGCInvocationKind, GC_NORMAL, GC_LAST_CONTEXT, GC_LAST_DITCH, js_GC - **
- The GC "invocation kinds" need to be maintained somehow. That will require some study.
- If MMgc runs only a global mark and sweep in this stage 0 of ActionMonkey, then we can run out of memory (perhaps only after paging to death), and we do need to GC everything on last context destruction. So these should be kept as arguments to js_GC, and possibly even used in its new MMgc-based implementation. /be
- Mode GC_LAST_DITCH is part of a mechanism to lock other threads out while collection is happening (possibly repeatedly); since we are losing the threading model, I don't think this does anything anymore.
- The existing js_GC() can restart GC for any of three reasons:
  - (easy) In GC_LAST_CONTEXT mode, js_GC() simply collects repeatedly until no more garbage is collected. We will retain this behavior.
  - (subtler) js_GC takes the hint if a finalizer or GCCallback calls js_GC recursively. The recursive call just sets a flag, because GC is already underway; but after finalization, js_GC checks the flag and, if it's set, restarts GC almost from the beginning. We can probably easily retain this.
  - (subtler) js_GC also restarts in the same way if the gcPoke flag is set. (This flag indicates that a finalizer or callback released a root, unpinned an object, or hit any of several other gcPoke triggers.) This makes JSGC a little more aggressive all the time, especially by seeking out the extra memory released by finalizers. We will drop this feature for now. If/when we go to incremental MMgc, we can reimplement it, using DWB to give us information equivalent to gcPoke.
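The two restart behaviors we intend to keep (repeat in GC_LAST_CONTEXT mode until nothing more is freed, and restart after a recursive request) can be sketched as one loop. This is a toy model, not the real js_GC: `CollectorSketch`, `RunGC`, `GCKindSketch`, and the `collectOnce` callback (one full collection, returning how many things it freed) are hypothetical. The gcPoke-driven restart is deliberately left out, matching the plan to drop it for now.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative subset of the invocation kinds.
enum GCKindSketch { SK_GC_NORMAL, SK_GC_LAST_CONTEXT };

struct CollectorSketch;
// Hypothetical hook: run one full collection, return the number freed.
typedef size_t (*CollectOnceFn)(CollectorSketch *);

struct CollectorSketch {
    explicit CollectorSketch(CollectOnceFn fn)
        : running(false), restartRequested(false), passes(0),
          collectOnce(fn) {}
    bool running;            // a collection is underway
    bool restartRequested;   // set by a recursive RunGC call
    int passes;              // number of collections performed
    CollectOnceFn collectOnce;
};

// Stand-in for js_GC's restart logic.
inline void RunGC(CollectorSketch *gc, GCKindSketch kind) {
    if (gc->running) {
        // Recursive call from a finalizer or callback: just leave a hint.
        gc->restartRequested = true;
        return;
    }
    gc->running = true;
    bool again;
    do {
        gc->restartRequested = false;
        size_t freed = gc->collectOnce(gc);
        ++gc->passes;
        // Restart if asked to, or if the last-context sweep still found garbage.
        again = gc->restartRequested ||
                (kind == SK_GC_LAST_CONTEXT && freed > 0);
    } while (again);
    gc->running = false;
}
```

For example, a `collectOnce` that frees something on its first two passes and nothing on the third makes `RunGC(&gc, SK_GC_LAST_CONTEXT)` perform three passes.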
js_UpdateMallocCounter - *
- Unaffected.
JS_GCMETER, JSGCStats, js_DumpGCStats - *
- Gone.
JSGCArenaList - *
- Gone.
JSWeakRoots, JS_CLEAR_WEAK_ROOTS - *
- These should be removed. /be
Other JSAPI support
These JSAPI functions involve GC features that aren't encapsulated behind jsgc.h.
JS_SetGCCallback(), JS_SetGCThingCallback() - * - MMgc support was added, bug 388011.
JS_SetExtraGCRoots(), JS_MarkGCThing() - ** - MMgc support was added, bug 388970. These APIs are no longer documented.
JS_CallTracer(), JS_TraceChildren, JS_TraceRuntime - * - These will continue to use the exact mark() methods that are built into every JSClass. However, when the tracer IS_GC_MARKING_TRACER, this will now use MMgc::GC::SetMark() rather than the old GCF_MARK bit to mark the object.