GC API: Difference between revisions

11,924 bytes added ,  23 April 2008
callbacks need closures, duh
(note meeting)
(callbacks need closures, duh)
 
(37 intermediate revisions by 3 users not shown)
Line 5: Line 5:
—[[User:Jorend|jorendorff]] 11:11, 28 March 2008 (PDT)
—[[User:Jorend|jorendorff]] 11:11, 28 March 2008 (PDT)


If you want to listen in on some design discussions, lurk in the [irc://irc.mozilla.org/memory #memory] channel on irc.mozilla.org.  Next meeting: Monday, March 31, 2008, 11:00 AM Pacific time.
If you want to listen in on some design discussions, lurk in the [irc://irc.mozilla.org/memory #memory] channel on irc.mozilla.org.  Next meeting: Friday, April 11, 2008, 9:00 AM Pacific time.


=Decisions=
=Decisions=
Line 15: Line 15:
We will support C++ objects somehow, as well as something like spam's "onion" layout.
We will support C++ objects somehow, as well as something like spam's "onion" layout.


We won't bend over backwards to make the API support [[mdc:MMgc|MMgc]] in the fastest or most natural way.  For example, the fastest possible MMgc write barrier requires the caller to know the beginning-of-allocation pointers for both the object containing the field being written and the pointer being stored there.  This is a pain for the user, and a card-marking GC doesn't need those pointers anyway.  So our API will *not* provide this information; the MMgc wrapper will have to calculate it each time the write barrier hits.  This will make MMgc look slow, but we can live with that.
We won't bend over backwards to make the API support [[mdc:MMgc|MMgc]] in the fastest or most natural way.  For example, the fastest possible MMgc write barrier requires the caller to know the beginning-of-allocation pointers for both the object containing the field being written and the pointer being stored there.  This is a pain for the user, and a card-marking GC doesn't need those pointers anyway.  So our API will ''not'' provide this information; the MMgc wrapper will have to calculate it each time the write barrier hits.  This will make MMgc look slow, but we can live with that.
 
The purpose of this API is to support switching to a different GC implementation ''at compile time'', not at run time.  (Each implementation will have its own header file, with its own inline functions or function-like macros.  Maybe we'll produce a GC API header file that has no inline anything, just showing the signatures the implementation must provide, for reference.)
 
We will not support any kind of reference counting until we hit the use case that compels us to implement it.  (Jason Evans in particular wants to try other optimizations first.)
 
'''Open issue:''' API style.  (C or C++?  Use templates?  What naming conventions?  I think it's too early to ask these questions.  So please consider all the function signatures below to be pseudocode.)
 
'''Open issue:''' We should detect dumb implementation mismatches at link time and bomb out.  I don't know the trick, but I bet bsmedberg does.
 
'''Open issue:''' The JSAPI allows JSObjects to contain pointers to non-GC-allocated data which may contain pointers back to GC-allocated stuff (objects, strings, numbers).  See [[mdc:JSClass.mark]].  This existing API seems impossible to reimplement on top of the GC API as it stands.  First, the GC API doesn't offer a per-object custom marking hook.  Second, the GC API insists on a write barrier (the JSAPI doesn't).
 
'''Open issue:''' Need to document which operations require a request.


=API areas=
=API areas=


== Heap setup/teardown ==
== Heaps ==
Requirement: Support multiple heaps.  Each heap is a disjoint universe of objects.  We don't need to trace pointers across heaps.
We support multiple heaps; each heap is a disjoint universe of objects and can be collected separatelyImplementations are expected not to trace pointers across heaps.


==Allocation==
  typedef ... '''GCHeap''';
 
  void * '''gc_alloc_no_pointers'''(gcheap_t heap, size_t size);
   
   
  void * '''gc_alloc_contains_pointers'''(gcheap_t heap, size_t size, gclayout_t layout);
  GCHeap '''gc_heap_create'''();
   
   
  void * '''gc_alloc_conservative'''(gcheap_t heap, size_t size);
  void '''gc_destroy_heap'''(GCHeap heap);


==Allocation==
(BTW we haven't established naming conventions; this is just a sketch)
(BTW we haven't established naming conventions; this is just a sketch)


The idea here is to support allocations of different types:  <code>no_pointers</code> allocations are leaf objects and never contain pointers at all; <code>contains_pointers</code> allocations have a known layout that governs which fields are pointers; and <code>conservative</code> allocations might contain pointers at any pointer-aligned offset and must be conservatively scanned.
The idea here is to support allocations of different types:  <code>no_pointers</code> allocations are leaf objects and never contain pointers at all; <code>with_layout</code> and <code>array_with_layout</code> allocations have a known layout that governs which fields are pointers; and <code>conservative</code> allocations might contain pointers at any pointer-aligned offset and must be conservatively scanned.
 
(Where the layout information isn't a speed win, the implementation can of course just discard it.  A hacky implementation can just delegate <code>gc_alloc_with_layout</code> and <code>gc_alloc_array_with_layout</code> to <code>gc_alloc_conservative</code>.  Sloppy, but fine by me.)
 
XXXbsmedberg: I think this incorrect. At least if layout information specifies that a word is not a GC pointer, we should reliably not trace that word.
 
All these functions return a pointer to a newly allocated region of memory that is subject to GC (that is, the GC may collect it when it becomes unreachable), or <code>NULL</code> on failure. XXXbsmedberg: the OOM API probably requires either that allocation functions never fail, or that there is a variant of these functions that never fail.
 
All allocations are <code>malloc</code>-aligned (that is, alignment is such that the pointer can be cast to any reasonable C/C++ type and used). They must be at least 8-byte aligned, so that three bits of tag are available.
 
typedef enum GCAllocFlags {
    '''GC_HAS_FINALIZER'''
} '''GCAllocFlags''';
 
The ''flags'' parameter to the allocation functions is a bitwise OR of <code>GCAllocFlags</code> constants.  Currently we only have one: <code>GC_HAS_FINALIZER</code> indicates that the allocation starts with a C++ object of a type that has a virtual destructor.  The GC will call the destructor at some point after the object becomes unreachable and before its memory is reused or the heap is destroyed.  This is the only API for per-object finalizers.
 
void * '''gc_alloc_no_pointers'''(
    GCHeap heap, size_t size, int flags);
 
Allocate ''size'' bytes of memory, not necessarily zeroed.  The GC will assume the memory never contains pointers to other GC allocations.
 
void * '''gc_alloc_conservative'''(
    GCHeap heap, size_t size, int flags);
 
Allocate and zero out ''size'' bytes of memory.  The GC will conservatively scan the memory for pointers.
 
void * '''gc_alloc_onion'''(
    GCHeap heap, size_t rawSize, size_t ptrsSize, int flags);
 
Allocate ''rawSize'' + ''ptrsSize'' bytes of memory.  Both sizes must be pointer-size-aligned.  The first ''rawSize'' bytes are not necessarily zeroed.  They GC will assume that they never contain pointers.  The next ''ptrsSize'' bytes are treated as an array of "possible pointers" (<code>jsval</code>, really; see below).
 
''flags'' must not contain <code>GC_HAS_FINALIZER</code>.
 
Returns a pointer to the first byte of the pointers region, so the non-pointer-containing words are at negative offsets and the pointer-containing words are at positive offsets.
 
void * '''gc_alloc_with_layout'''(
    GCHeap heap, GCLayout layout, int flags);
 
Allocate and zero out memory for an object.  The ''layout'' specifies the size of the allocation and which words of it may contain pointers. Layout is described in more detail below.
 
void * '''gc_alloc_array_with_layout'''(
    GCHeap heap,
    GCLayout headerLayout,
    GCLayout elementLayout,
    size_t count,
    int flags);
 
Allocate and zero out memory for an array.  The layout of the array consists of an optional header described by ''headerLayout'', followed by ''count'' array elements, each described by ''elementLayout''.  Returns a pointer to the beginning of the allocation (the header).
 
'''Open issue:''' Weak references.  (At the moment I'm leaning toward just providing an end-of-mark-phase GC callback and a IsAboutToBeCollected API and letting users build weakrefs on top of this.)
 
'''Open issue:''' Weak caches.  For example, there's a cache in Mozilla where we have Key and Value objects, both of which will become GC-allocated, and the map needs to have this behavior:
 
# Entries do not keep Values alive.
# As long as the Value is reachable, keep the entry (keeping the Key alive).
# Whenever a Value is collected, remove all corresponding entries (the Key can then be collected, preferably in the same GC cycle).
# The map must not keep cyclic garbage alive.
 
We could do it without special GC API support if we're willing to add a pointer field on ''every'' XPCOM object ''AND'' we have something resembling weak references.  Even that might be slow.
 
Another case has both keys and values keeping each other (and the relevant entries) alive.
 
'''Figure 1:  nonworking straw man solution to this problem'''
[[Image: weakcachegraph1.png]]
 
This has properties 1-3 but not property 4; if there is a cycle between Key and Value (which can easily happen, given XPCOM and XPConnect) then the map keeps it alive.
 
'''Figure 2:  working straw man solution'''
[[Image:weakcachegraph2.png]]
 
This has all four properties but costs at least one pointer field per Value.
 
== Explicit deallocation ==


(Where the layout information isn't a speed win, the implementation can of course just discard it.  A hacky implementation can just delegate <code>gc_alloc_contains_pointers</code> to <code>gc_alloc_conservative</code>.  Sloppy, but fine by me.)
void '''gc_free'''(GCHeap heap, void *ptr);


'''Open issue:''' Do we need <code>gc_alloc_conservative</code>?
Frees the allocation at ''ptr'', which must be a pointer returned by an allocation function.  The region of memory at ''ptr'' must be live, but the caller must also ensure that no live, exactly-scanned words point to it.  (Conservatively scanned words, including C/C++ temporaries and local variables, may point to it.)


'''Open issue:''' We don't know exactly what the layout information is yet.  Speculation is, for C++ objects it's a bitmap with one or two bits per word of the object, and for onions it gives the number of contains-pointers words in the onion.  We can, using some kind of static header-preprocessing magic ([[Dehydra]]), autogenerate layout information for C++ classes.
This may be a no-op.


'''Open issue:''' How to handle types like <code>[[mdc:jsval|jsval]]</code>, which contains tag bits indicating whether it's a pointer or not.  (And in tamarin-tracing, there's a 64-bit type in which bits in the high 32 bits tell you whether the low 32 bits are a pointer or not!)
== Layout ==


'''Open issue:''' Reference counting (or DRC) and/or other nasty tricks to keep memory usage down.
typedef ... '''GCLayout''';
GCLayout '''gc_create_layout'''(
    GCHeap heap, size_t size, unsigned char *bitmap);


'''Open issue:''' Finalizers.
Create a layout object which represents the layout of pointers within an object type.


'''Open issue:''' Weak references.
''bitmap'' is an array of bytes. Each byte represents one word of the target object (there must be <code>''size''/sizeof(void *)</code> entries)
 
:0 - the word is not a pointer
:1 - the word is a pointer, perhaps an interior pointer, and perhaps a pointer to an rcobject
:2 - the word is a jsval
:3 - the word is the possibly-pointer-containing half of a ttbox
 
3 only makes sense on 32-bit platforms for now.
 
'''Open issue:''' GCLayout lifetime - gc'd? or lifetime of heap?


== Write barrier ==
== Write barrier ==
Line 62: Line 157:


== GC ==
== GC ==
gc / maybe_gc / do_incremental_gc


'''Open issue:''' Hooks into the GC cycle.
void '''gc_collect'''();
 
Unconditionally collect garbage now. The current thread must be in a request.
 
void '''gc_maybe_collect'''(int msecs);
 
Suggest to the Garbage collector API that now might be a good time to collect garbage. The GC may decide to begin or continue incremental garbage collection during this callback. <var>msecs</var> is an application hint to the garbage collector indicating how many milliseconds incremental marking should be allowed to consume. There is no guarantee about the actual time consumed by the function.
 
typedef enum gc_GCStatus {
  GC_ROOTING,
  GC_LAST_ROOTING,
  GC_PRE_SWEEP,
  GC_POST_SWEEP
} gc_GCStatus; 
 
; GC_ROOTING
: The callback function may programmatically "root" objects by explicitly marking objects (via <tt>gc_mark_object</tt>). Note that application code may re-enter after this callback, if incremental GC is being performed.
; GC_LAST_ROOTING
: Like the GC_ROOTING callback, the callback function  may programmatically "root" objects, but client code will not run before sweeping.
; GC_PRE_SWEEP
: At this point all marking has occurred. The callback function may synchronize external data structures by checking <tt>gc_get_markstate</tt>
; GC_POST_SWEEP
: At this point all sweeping has occurred, and the program is about to be resumed. Threads other than the main thread have not yet been restarted.
; GC_FINISHED
: At this point garbage collection is finished and threads have been resumed. Garbage collection will not occur again until this callback is complete. ''See {{bug|430290}} for rationale.''
 
typedef void (*gc_callback)(
  gc_GCStatus state, void *closure);
 
void '''gc_add_callback'''(gc_callback callback, void *closure);
 
Register a callback function. If '''gc_set_thread_affinity''' has been called, the callback will occur on the specified thread.
 
'''Open issue:''' Need to document which callbacks may call which GC API functions.
 
The next two issues can only be resolved by taking a good hard look at SpiderMonkey internals.
 
'''Open issue:''' We may need to add callbacks for entering and leaving stop-the-world mode (what the MMGC_THREADSAFE comments call "exclusiveGC").  These are distinct from the pre-sweep and post-sweep callbacks, which only fire when a GC cycle ends; incremental marking stops the world too, but shouldn't fire those.
 
'''Open issue:''' We may need to expose the GC lock somehow.  SpiderMonkey currently uses it two ways: uses it as a general-purpose mutex (dubious); and creates condition variables protected by it (not quite as dubious, but still).


== Rooting ==
== Rooting ==
add root / remove root
The rooting API provides a simple way to treat a particular GC object as a root. More complex rooting scenarios can be accomplished with a precollect hook.
 
typedef struct GCRoot GCRoot; /* opaque */
GCRoot* '''gc_root_object'''(
  void *gcobject);
 
Treat gcobject as a root. <var>gcobject</var> must have been allocated with a GC allocation function.
 
void '''gc_remove_root'''(
  GCRoot *root);
 
== Multithreading ==
Each thread must indicate when it enters/leaves a region of code that touches GC-managed memory (and therefore needs GC to happen only when it's at a safe point) and when it enters/leaves a region of code that doesn't touch GC-managed memory at all (basically one long safe point, where the thread doesn't care if GC happens or not).
 
For a single-threaded program with only one <code>GCHeap</code>, this just means calling <code>gc_begin_request(heap)</code> at startup and <code>gc_end_request(heap)</code> at shutdown.


== Synchronization/concurrency ==
Features:
The request model, or something.


'''Open issue:''' This whole area.
void '''gc_begin_request'''(GCHeap heap);


('''jorendorff note:''' SpiderMonkey has some pretty awesome hacks in the gc synchronization code, requiring equally awesome hacking in ActionMonkey's branch of MMgcMaybe we could better divide the responsibilitiesDiscuss.)
Enter a request.
 
The calling thread must not be in any active requests on any heap.
 
void '''gc_end_request'''(GCHeap heap);
 
Leave the current request.
 
The calling thread must be in an active request on <code>heap</code>.
 
void '''gc_suspend_request'''(GCHeap heap);
 
Suspend the current request.
 
The calling thread must be in an active request on <code>heap</code>.  That request becomes inactive.
 
The calling thread must later call <code>gc_resume_request</code>.
 
Allocations pointed to by C/C++ local variables in the caller or any of its callers at the time of the call to <code>gc_suspend_request</code> will remain reachable until the matching <code>gc_resume_request</code> call.  (That is, they are temporarily rooted.)
 
void '''gc_resume_request'''(GCHeap heap);
 
Resume a suspended request.
 
The calling thread must not be in an active request on any <code>GCHeap</code>.
 
The most recently suspended inactive request that the calling thread is in on <code>heap</code> becomes active.
 
#define '''GC_FAST_SUSPEND_REQUEST'''(heap) ...
   
#define '''GC_FAST_RESUME_REQUEST'''(heap) ...
 
These are macros such that this code:
<pre style="border: none; padding: none; background-color: transparent">GC_FAST_SUSPEND_REQUEST(expr);
&lt;statements&gt;
GC_FAST_RESUME_REQUEST(expr);</pre>
 
expands to a C/C++ statement that behaves like this one:
<pre style="border: none; padding: none; background-color: transparent">{
    gc_suspend_request(heap);
    &lt;statements&gt;
    gc_resume_request(heap);
}</pre>
 
except that:
* in C++, <code>gc_resume_request</code> must be called whenever control exits the block, even if it exits via an exception, <code>return</code>, <code>break</code>, <code>continue</code>, or <code>goto</code>; and
* the behavior is undefined if the ''&lt;statements&gt;'' contain any identifier starting with <code>_gc_</code>.
 
If either macro is used any other way, the result is undefined.
 
  void '''gc_yield_request'''(GCHeap heap);
 
Equivalent to <code>{gc_suspend_request(heap); gc_resume_request(heap);}</code>.
 
void '''gc_set_thread_affinity'''();
 
Inform the GC that all finalizers and callback functions should be called on the current thread.


== Tracing ==
== Tracing ==
Some SpiderMonkey 1.8 JSAPI features, yet to be documented, need this.  See [[mdc:JSAPI Reference#Memory_management]].
Some SpiderMonkey 1.8 JSAPI features, yet to be documented, offer generic object graph walking.  See [[mdc:JSAPI Reference#Memory_management]].
 
bsmedberg says: most of the spidermonkey tracing APIs were designed for the cycle collector. I would dearly love to flush them down the drain once XPCOMGC lands.
 
'''Open issue:''' Whether we can discard the SpiderMonkey 1.8 tracing API.
 
= The GC Internals API =
 
Mozilla's implementation of the GC API will consist of an allocator and a garbage collector.  The allocator will implement many of the above features ...plus a mostly-private API for the use of the garbage collector.
 
This mostly-private API is being designed at [[GC Internals API]].
Confirmed users, Bureaucrats and Sysops emeriti
1,217

edits