JavaScript:ActionMonkey:Stage 0 Whiteboard

Up: JavaScript:ActionMonkey.

Stage 0: Replace SpiderMonkey's GC (jsgc) with Tamarin's GC (MMgc).

bug 387012 – ActionMonkey stage 0 tracking bug

The plan is to hollow out jsgc.c and replace it with a new implementation based on MMgc. The API in jsgc.h will be almost entirely preserved; but see "Features we will lose" below.

We'll use MMgc in non-incremental mode to start with. (It's easier. Avoid premature optimization.)

We'll build SpiderMonkey as C++, and some key types (JSObject, JSString) will become classes. This is unavoidable because MMgc's API for finalization is to subclass MMgc::GCFinalizedObject.
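
As a rough illustration of why the C++ build is needed, here is a minimal sketch of finalization by subclassing. It does not use the real MMgc headers: `sketch::GCFinalizedObject`, `JSObjectSketch`, `finalizeLog` and `sweep` are hypothetical stand-ins, and the only assumption carried over from MMgc is that the collector reaches a finalizer through the virtual destructor of a common base class.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for MMgc::GCFinalizedObject: the collector destroys
// unreachable objects through this base class.
class GCFinalizedObject {
public:
    virtual ~GCFinalizedObject() {}
};

// Records finalizer activity so the effect is observable.
inline std::vector<std::string>& finalizeLog() {
    static std::vector<std::string> log;
    return log;
}

// With SpiderMonkey built as C++, a type such as JSObject can become a class
// whose destructor does the work of the old C finalizer hook.
class JSObjectSketch : public GCFinalizedObject {
public:
    explicit JSObjectSketch(const std::string& name) : name_(name) {}
    ~JSObjectSketch() override { finalizeLog().push_back("finalize " + name_); }
private:
    std::string name_;
};

// Stand-in for the sweep phase: the collector only knows the base class and
// relies on virtual dispatch to run the right finalizer.
inline void sweep(GCFinalizedObject* dead) {
    assert(dead != nullptr);
    delete dead;
}

}  // namespace sketch
```

Calling `sketch::sweep(new sketch::JSObjectSketch("obj"))` runs the derived destructor even though `sweep` sees only the base type, which is the property a plain C struct cannot provide.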


Features we will lose

JS_THREADSAFE - We will lose the threading model where a single JSRuntime has multiple threads, each with its own JSContext. (Firefox doesn't use this.)

ad hoc marking in JSGC_MARK_END - SpiderMonkey embedders can ("for API compatibility", according to a comment in jsgc.c) call JS_CallTracer from a JSGCCallback when it receives the JSGC_MARK_END event. I think the idea is something similar to JS_SetExtraGCRoots(). Firefox does not use this, so we will not try to preserve it. It is likely to break.


.plan

  • bug 387938 – Modify MMgc to support ActionMonkey stage 0.
  • bug 387034 - Get the build system to build Tamarin and link JS against MMgc.
  • Get at least sketchy answers to most of the questions below.
  • Do a sprint with Mardak and Brendan (if he's available) to reimplement jsgc.cpp using MMgc.
  • Debug and grow wise.

General issues

If there's anywhere a GCThing has pointers to non-GC-managed memory that contains pointers back into GC-managed memory, we have a problem. Exact GC lets you do that (you know how to traverse the non-GC-managed memory). MMgc doesn't.

I don't know that SpiderMonkey does this anywhere. If it does, one fix is to change those data structures to live in GC-managed memory. So the fix itself is easy: finding the offenders is the interesting bit.
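
The problem and the proposed fix can be modeled with a toy marker. This is only a sketch: `Cell`, `MallocedTable` and `mark` are hypothetical names, and the one assumption is that the collector traverses GC-managed blocks but never looks inside malloc'd memory.

```cpp
#include <cassert>
#include <set>
#include <vector>

namespace sketch {

struct Cell;

// Lives in malloc'd memory, so a collector that only scans GC-managed blocks
// never looks inside it.
struct MallocedTable {
    std::vector<Cell*> entries;
};

// A GC-managed block. `children` models the pointers the collector scans;
// `side` models a pointer out to non-GC-managed memory.
struct Cell {
    std::vector<Cell*> children;
    MallocedTable* side = nullptr;
};

// Mark everything reachable through GC-managed memory only.
inline void mark(Cell* cell, std::set<Cell*>& marked) {
    if (!cell || !marked.insert(cell).second)
        return;  // null, or already marked
    for (Cell* child : cell->children)
        mark(child, marked);
    // cell->side is deliberately not traversed: the collector has no idea
    // how to walk it, so anything reachable only through it stays unmarked.
}

}  // namespace sketch
```

In this model, a cell referenced only from a `MallocedTable` is never marked and would be collected while still in use. Moving the table's contents into a `Cell` (that is, into GC-managed memory) makes the same reference visible to `mark`.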

XPConnect garbage-collects XPCWrappedNative objects and its own data structures. This needs some more research. (Brendan thinks this won't affect stage 0 work.)


Specific jsgc.h APIs

What follows is a dump of everything exposed through jsgc.h. Each item is rated * (one star, an easy exercise); ** (two stars, fun little puzzle); or *** (three stars, hmmmm, that's interesting). The ratings are only jorendorff's guesses, though. The wildest, least-educated guesses are marked with ?.

  • GCX_OBJECT ... GCF_MUTABLE, js_GetGCThingFlags - **
    • Where do we put these flags? They are no longer needed for the GC itself, but non-GC-related functionality has been piggybacked on these flags, so we can't just get rid of them. GCF_MARK is not used outside the garbage collector, but GCF_MUTABLE, GCF_LOCK, and GCF_SYSTEM (and maybe the type bits) are used (and not just read access--there's code outside jsgc.c that actually twiddles these bits).
    • Try putting them in the revised JSObject, JSString, etc. data structures that subclass the MMgc GCObject types. /be
    • We can overallocate by a byte and put the flags bits alongside the object itself. That single byte costs us 8 bytes... but actually I'm leaning toward doing this, for now, silly as it is. It lets us leave the existing API alone, macros and all. We'll worry about all sorts of speed/space concerns in the next round.
    • Right, essentially the same as my suggestion above. /be
    • GCF_LOCK flag is used to pin NaN, +-inf, and "" empty string, so perhaps create a GCRoot to point to these items. See js_LockGCThing below.
  • js_GetGCStringRuntime - *
    • GC::GetGC(const void *), then (a) decrement by some offset; or (b) just give the GC a pointer back to the runtime (that is, make a subclass of MMgc::GC that contains a pointer back to the JSRuntime, and use that).
    • There's only one JSRuntime in Firefox and other Gecko apps (used by XPConnect), so just make a singleton pointer associated with the GC instance. /be
  • GC_POKE - *
    • No-op for now. Its effect is currently read only by js_GC, so once that function is gutted, this use of rt->gcPoke can be gutted too. /be
    • The MMgc equivalent is DWB() and its ilk. In incremental mode, these are required. In non-incremental mode, they're only necessary if a finalizer might "resurrect" an object (that is, cause an unmarked object to become reachable). jsgc has never allowed this, so it's OK. -jto
  • js_ChangeExternalStringFinalizer - *
    • External strings can be implemented by a JSExternalString class with a destructor that consults the table of string finalizers.
    • Emulate this on top of MMgc using a virtual method on JSString, which inherits from GCFinalizedObject. /be
  • js_InitGC ... js_MapGCRoots - *
    • Core features of MMgc. There's no API in MMgc for enumerating roots, but we can either cheat (via #define private protected) or keep our own list outside of MMgc. Anyway--not hard.
    • Defer for now. /be
  • JSPtrTable, js_RegisterCloseableIterator, JSGCCloseState, js_RegisterGenerator, js_RunCloseHooks - *
    • These will become no-ops. All this has to do with iterator and generator cleanup, but these hooks are going away. See bug 380469. (Related bug: bug 349272.)
    • No-op'ing should not cause leaks since (AFAIK) no chrome JS uses generators. /be
  • JSGCThing - **
    • This can probably go away. It's mentioned outside jsgc.c in two places: (a) in the context of weakRoots.newborn, which I assume we'll keep, since the newborn guarantee is JSAPI-visible; and (b) in the declaration of JSContext, where they won't be needed anymore.
    • Both JSGCThing and cx->weakRoots should be removed. The latter should not be necessary given conservative stack scanning. /be
  • GC_NBYTES_MAX, GC_NUM_FREELISTS, GC_FREELIST_NBYTES, GC_FREELIST_INDEX - *
    • These can just go away.
  • js_NewGCThing - *
    • The only thing to worry about is the flags.
  • js_LockGCThing, js_LockGCThingRT, js_UnlockGCThingRT - **
    • This is the pinning API. Can be reimplemented on top of MMgc rooting. /be
    • "make one big root for everything you can keep track of - one GCRoot for each runtime." Reuse the "constants/pinning" GCRoot for anything else that needs locking?
    • See also: JS_LockGCThing API doc.
  • js_IsAboutToBeFinalized - *
    • We'll have to lay the hack on pretty thick to get this without modifying MMgc source code, but it can be done.
    • or modify MMgc -- that can be done too in this stage 0 work I think. /be
  • IS_GC_MARKING_TRACER - *
    • Unchanged. This is an undocumented feature of the trace API. js_GC will do an exact trace followed by a call to MMgc::GC::Collect(). During that trace, this macro returns true.
  • JSTRACE_FUNCTION ... JSTRACE_XML - *
    • Unaffected. The tracing API uses these.
  • JS_IS_VALID_TRACE_KIND - *
    • Unaffected.
  • js_CallValueTracerIfGCThing - *
    • Likely unaffected.
  • js_TraceStackFrame, js_TraceRuntime, js_TraceContext - *
    • Probably unaffected.
  • JSGCInvocationKind, GC_NORMAL, GC_LAST_CONTEXT, GC_LAST_DITCH, js_GC - **
    • The GC "invocation kinds" need to be maintained somehow. That will require some study.
    • If MMgc runs only a global mark and sweep in this stage 0 of ActionMonkey, then we can run out of memory (perhaps only after paging to death), and we do need to GC everything on last context destruction. So these should be kept as arguments to js_GC, and possibly even used in its new MMgc-based implementation. /be
    • Mode GC_LAST_DITCH is part of a mechanism to lock other threads out while collection is happening (possibly repeatedly); since we are losing the threading model, I don't think this does anything anymore.
    • The existing js_GC() can restart GC for any of three reasons:
      • (easy) In GC_LAST_CONTEXT mode, js_GC() simply collects repeatedly until no more garbage is collected. We will retain this behavior.
      • (subtler) js_GC takes the hint if a finalizer or GCCallback calls js_GC recursively. The recursive call just sets a flag, because GC is already underway; but after finalization, js_GC checks the flag and, if it's set, restarts GC almost from the beginning. We can probably easily retain this.
      • (subtler) js_GC also restarts in the same way if the gcPoke flag is set. (This flag indicates that a finalizer or callback released a root, unpinned an object, or hit any of several other gcPoke triggers.) This makes JSGC a little more aggressive all the time, especially by seeking out the extra memory released by finalizers. We will drop this feature for now. If/when we go to incremental MMgc, we can reimplement it, using DWB to give us information equivalent to gcPoke.
  • js_UpdateMallocCounter - *
    • Unaffected.
  • JS_GCMETER, JSGCStats, js_DumpGCStats - *
    • Gone.
  • JSGCArenaList - *
    • Gone.
  • JSWeakRoots, JS_CLEAR_WEAK_ROOTS - *
    • These should be removed. /be
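
The two js_GC restart behaviors that the list above says will be retained (repeating in GC_LAST_CONTEXT mode until nothing more is collected, and restarting when a finalizer or callback requests GC recursively) could be sketched roughly as follows. The names `Runtime`, `gcRunning`, `gcAgain`, `collectOnce` and `collectGarbage` are hypothetical, `collectOnce` stands in for one full collection pass, and the gcPoke-triggered restart is intentionally left out, as planned.

```cpp
#include <cassert>
#include <functional>

namespace sketch {

enum InvocationKind { GC_NORMAL, GC_LAST_CONTEXT, GC_LAST_DITCH };

struct Runtime {
    bool gcRunning = false;  // true while a collection is underway
    bool gcAgain = false;    // set when GC is requested during a collection
    int collections = 0;     // number of full passes performed
    // Hypothetical stand-in for one full collection pass. Returns how many
    // objects were freed; finalizers it runs may call collectGarbage() again.
    std::function<int(Runtime&)> collectOnce;
};

inline void collectGarbage(Runtime& rt, InvocationKind kind) {
    if (rt.gcRunning) {
        // Recursive request: GC is already underway, so just leave a note
        // for the outer invocation and return.
        rt.gcAgain = true;
        return;
    }
    rt.gcRunning = true;
    bool again;
    do {
        rt.gcAgain = false;
        int freed = rt.collectOnce(rt);
        ++rt.collections;
        // Restart if someone asked for GC during this pass, or if we are
        // tearing down the last context and the pass still found garbage.
        again = rt.gcAgain || (kind == GC_LAST_CONTEXT && freed > 0);
    } while (again);
    rt.gcRunning = false;
}

}  // namespace sketch
```

With a `collectOnce` that frees 3, then 1, then 0 objects, a GC_LAST_CONTEXT call performs three passes. A GC_NORMAL call whose first pass triggers a recursive request performs two.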


Other JSAPI support

These JSAPI functions involve GC features that aren't encapsulated behind jsgc.h.

  • JS_SetGCCallback(), JS_SetGCThingCallback() - * - MMgc support was added, bug 388011.
  • JS_SetExtraGCRoots(), JS_MarkGCThing() - ** - MMgc support was added, bug 388970. These APIs are no longer documented.
  • JS_CallTracer(), JS_TraceChildren(), JS_TraceRuntime() - * - These will continue to use the exact mark() methods that are built into every JSClass. However, when IS_GC_MARKING_TRACER is true for the tracer, marking will now use MMgc::GC::SetMark() rather than the old GCF_MARK bit to mark the object.
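
To make the last item concrete, here is a minimal sketch of a tracer dispatch that marks through the collector instead of a per-thing flag bit. Everything in it is a stand-in: `MockGC` is not the MMgc API, `callTracer` is not JS_CallTracer, and treating a null callback as "this is the GC's marking tracer" is an assumption modeled on how jsgc distinguishes the two cases. Tracing of children is omitted.

```cpp
#include <cassert>
#include <set>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the collector's mark interface.
class MockGC {
public:
    void SetMark(const void* thing) { marked_.insert(thing); }
    bool IsMarked(const void* thing) const { return marked_.count(thing) != 0; }
private:
    std::set<const void*> marked_;
};

struct Tracer;
typedef void (*TraceCallback)(Tracer* trc, void* thing);

struct Tracer {
    MockGC* gc = nullptr;
    TraceCallback callback = nullptr;  // null: the GC's own marking tracer
    std::vector<void*> visited;        // used by non-marking tracers
};

inline bool isGCMarkingTracer(const Tracer* trc) {
    return trc->callback == nullptr;
}

// Marking tracers set the mark in the collector; every other tracer just
// gets its callback invoked, as before.
inline void callTracer(Tracer* trc, void* thing) {
    assert(trc != nullptr && thing != nullptr);
    if (isGCMarkingTracer(trc))
        trc->gc->SetMark(thing);
    else
        trc->callback(trc, thing);
}

// Example of a non-marking tracer callback.
inline void recordVisit(Tracer* trc, void* thing) {
    trc->visited.push_back(thing);
}

}  // namespace sketch
```

The point of the design is that embedders' tracers keep working unchanged, while only the marking path needs to know about the new collector.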