GC API: Difference between revisions

Revision as of 13:35, 23 April 2008

6,212 bytes added, 23 April 2008

callbacks need closures, duh

Benjamin Smedberg

@@ Line 24: / Line 24: @@
 '''Open issue:''' We should detect dumb implementation mismatches at link time and bomb out.  I don't know the trick, but I bet bsmedberg does.
+'''Open issue:''' The JSAPI allows JSObjects to contain pointers to non-GC-allocated data which may contain pointers back to GC-allocated stuff (objects, strings, numbers).  See [[mdc:JSClass.mark]].  This existing API seems impossible to reimplement on top of the GC API as it stands.  First, the GC API doesn't offer a per-object custom marking hook.  Second, the GC API insists on a write barrier (the JSAPI doesn't).
+'''Open issue:''' Need to document which operations require a request.
 =API areas=
@@ Line 37: / Line 41: @@
 ==Allocation==
 (BTW we haven't established naming conventions; this is just a sketch)
@@ Line 44: / Line 47: @@
 (Where the layout information isn't a speed win, the implementation can of course just discard it.  A hacky implementation can just delegate <code>gc_alloc_with_layout</code> and <code>gc_alloc_array_with_layout</code> to <code>gc_alloc_conservative</code>.  Sloppy, but fine by me.)
+XXXbsmedberg: I think this is incorrect. At the very least, if the layout information specifies that a word is not a GC pointer, we should reliably not trace that word.
-All these functions return a pointer to a newly allocated region of memory that is subject to GC (that is, the GC may collect it when it becomes unreachable), or <code>NULL</code> on failure.
+All these functions return a pointer to a newly allocated region of memory that is subject to GC (that is, the GC may collect it when it becomes unreachable), or <code>NULL</code> on failure. XXXbsmedberg: the OOM API probably requires either that allocation functions never fail, or that there is a variant of each of these functions that never fails.
-All allocations are <code>malloc</code>-aligned (that is, alignment is such that the pointer can be cast to any reasonable C/C++ type and used).
+All allocations are <code>malloc</code>-aligned (that is, alignment is such that the pointer can be cast to any reasonable C/C++ type and used). They must be at least 8-byte aligned, so that three bits of tag are available.
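To illustrate why 8-byte alignment yields three tag bits, here is a minimal sketch in C. The helper names (<code>tag_pointer</code>, <code>untag_pointer</code>, <code>tag_of</code>, <code>demo_alloc_aligned8</code>) are hypothetical and not part of the GC API; <code>aligned_alloc</code> from C11 stands in for a GC allocation function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TAG_BITS 3u
#define TAG_MASK ((uintptr_t)((1u << TAG_BITS) - 1u)) /* low three bits: 0x7 */

/* Pack a small tag into the low bits of an 8-byte-aligned pointer. */
static uintptr_t tag_pointer(void *p, unsigned tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0); /* low bits are free only if aligned */
    assert(tag <= TAG_MASK);
    return (uintptr_t)p | (uintptr_t)tag;
}

/* Recover the original pointer by clearing the tag bits. */
static void *untag_pointer(uintptr_t word)
{
    return (void *)(word & ~TAG_MASK);
}

/* Extract the tag stored in the low three bits. */
static unsigned tag_of(uintptr_t word)
{
    return (unsigned)(word & TAG_MASK);
}

/* Hypothetical stand-in for a GC allocator: plain C11 aligned_alloc.
   aligned_alloc expects the size to be a multiple of the alignment,
   so the request is rounded up to the next multiple of 8. */
static void *demo_alloc_aligned8(size_t size)
{
    size_t rounded = (size + 7u) & ~(size_t)7u;
    return aligned_alloc(8, rounded);
}
```

A caller would tag a freshly allocated pointer, store the resulting word, and strip the tag with <code>untag_pointer</code> before dereferencing.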
   typedef enum GCAllocFlags {
@@ Line 153: / Line 157: @@
 == GC ==
-gc / maybe_gc / do_incremental_gc
-'''Open issue:''' Hooks into the GC cycle.
+ void '''gc_collect'''();
+Unconditionally collect garbage now. The current thread must be in a request.
+ void '''gc_maybe_collect'''(int msecs);
+Suggest to the garbage collector that now might be a good time to collect garbage. The GC may decide to begin or continue incremental garbage collection during this call. <var>msecs</var> is an application hint to the garbage collector indicating how many milliseconds incremental marking should be allowed to consume. There is no guarantee about the actual time consumed by the function.
+ typedef enum gc_GCStatus {
+   GC_ROOTING,
+   GC_LAST_ROOTING,
+   GC_PRE_SWEEP,
+   GC_POST_SWEEP,
+   GC_FINISHED
+ } gc_GCStatus;
+; GC_ROOTING
+: The callback function may programmatically "root" objects by explicitly marking objects (via <tt>gc_mark_object</tt>). Note that application code may re-enter after this callback, if incremental GC is being performed.
+; GC_LAST_ROOTING
+: As with GC_ROOTING, the callback function may programmatically "root" objects, but client code will not run again before sweeping.
+; GC_PRE_SWEEP
+: At this point all marking has occurred. The callback function may synchronize external data structures by checking <tt>gc_get_markstate</tt>.
+; GC_POST_SWEEP
+: At this point all sweeping has occurred, and the program is about to be resumed. Threads other than the main thread have not yet been restarted.
+; GC_FINISHED
+: At this point garbage collection is finished and threads have been resumed. Garbage collection will not occur again until this callback is complete. ''See {{bug|430290}} for rationale.''
+ typedef void (*gc_callback)(
+   gc_GCStatus state, void *closure);
+ void '''gc_add_callback'''(gc_callback callback, void *closure);
+Register a callback function. If '''gc_set_thread_affinity''' has been called, the callback will occur on the specified thread.
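The closure parameter lets a callback reach its own state without globals. The sketch below shows one way a client might use it. Everything prefixed <code>mock_</code>, along with <code>CycleStats</code> and <code>count_phases</code>, is hypothetical; the registry is a toy stand-in for a real implementation, and the enum is written out with all five states described above.

```c
#include <assert.h>
#include <stddef.h>

typedef enum gc_GCStatus {
    GC_ROOTING,
    GC_LAST_ROOTING,
    GC_PRE_SWEEP,
    GC_POST_SWEEP,
    GC_FINISHED
} gc_GCStatus;

typedef void (*gc_callback)(gc_GCStatus state, void *closure);

/* Mock registry (assumption): a fixed-size table of callback/closure pairs. */
#define MOCK_MAX_CALLBACKS 8
static struct {
    gc_callback fn;
    void *closure;
} mock_callbacks[MOCK_MAX_CALLBACKS];
static size_t mock_callback_count;

void gc_add_callback(gc_callback callback, void *closure)
{
    assert(mock_callback_count < MOCK_MAX_CALLBACKS);
    mock_callbacks[mock_callback_count].fn = callback;
    mock_callbacks[mock_callback_count].closure = closure;
    mock_callback_count++;
}

/* Hypothetical driver: fire every phase once, in order, as a
   non-incremental collection might. */
static void mock_run_gc_cycle(void)
{
    static const gc_GCStatus phases[] = {
        GC_ROOTING, GC_LAST_ROOTING, GC_PRE_SWEEP, GC_POST_SWEEP, GC_FINISHED
    };
    for (size_t p = 0; p < sizeof phases / sizeof phases[0]; p++)
        for (size_t i = 0; i < mock_callback_count; i++)
            mock_callbacks[i].fn(phases[p], mock_callbacks[i].closure);
}

/* Client side: per-registration state carried through the closure. */
typedef struct CycleStats {
    int rooting_calls;
    int cycles_finished;
} CycleStats;

static void count_phases(gc_GCStatus state, void *closure)
{
    CycleStats *stats = closure;
    if (state == GC_ROOTING || state == GC_LAST_ROOTING)
        stats->rooting_calls++;
    else if (state == GC_FINISHED)
        stats->cycles_finished++;
}
```

Registering <code>count_phases</code> twice with two different <code>CycleStats</code> objects would keep two independent tallies, which is the case the closure argument exists for.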
+'''Open issue:''' Need to document which callbacks may call which GC API functions.
+The next two issues can only be resolved by taking a good hard look at SpiderMonkey internals.
+'''Open issue:''' We may need to add callbacks for entering and leaving stop-the-world mode (what the MMGC_THREADSAFE comments call "exclusiveGC").  These are distinct from the pre-sweep and post-sweep callbacks, which only fire when a GC cycle ends; incremental marking stops the world too, but shouldn't fire those.
+'''Open issue:''' We may need to expose the GC lock somehow.  SpiderMonkey currently uses it in two ways: as a general-purpose mutex (dubious), and to protect condition variables that it creates (not quite as dubious, but still).
 == Rooting ==
-add root / remove root
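+The rooting API provides a simple way to treat a particular GC object as a root. More complex rooting scenarios can be accomplished with a precollect hook, that is, a callback registered with <tt>gc_add_callback</tt> that marks objects during <tt>GC_ROOTING</tt> or <tt>GC_LAST_ROOTING</tt>.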
+ typedef struct GCRoot GCRoot; /* opaque */
+ GCRoot* '''gc_root_object'''(
+   void *gcobject);
+Treat <var>gcobject</var> as a root. <var>gcobject</var> must have been allocated with a GC allocation function.
+ void '''gc_remove_root'''(
+   GCRoot *root);
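As a rough illustration of the root/unroot pairing, the sketch below backs the two functions with a toy linked list. The list layout, the <code>NULL</code>-on-failure behavior, and <code>mock_is_rooted</code> are assumptions made for the example; a real implementation keeps <code>GCRoot</code> opaque and may store roots quite differently.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct GCRoot GCRoot;

/* Mock layout (assumption): clients see GCRoot only as an opaque handle. */
struct GCRoot {
    void *object;
    GCRoot *next;
};

static GCRoot *mock_roots; /* head of the mock root list */

GCRoot *gc_root_object(void *gcobject)
{
    GCRoot *root = malloc(sizeof *root);
    if (root == NULL)
        return NULL; /* assumption: failure is reported as NULL */
    root->object = gcobject;
    root->next = mock_roots;
    mock_roots = root;
    return root;
}

void gc_remove_root(GCRoot *root)
{
    /* Walk the links so the matching node can be unlinked in place. */
    for (GCRoot **link = &mock_roots; *link != NULL; link = &(*link)->next) {
        if (*link == root) {
            *link = root->next;
            free(root);
            return;
        }
    }
}

/* Hypothetical query used only to observe the mock's state. */
static int mock_is_rooted(const void *object)
{
    for (const GCRoot *r = mock_roots; r != NULL; r = r->next)
        if (r->object == object)
            return 1;
    return 0;
}
```

A client would keep the returned <code>GCRoot*</code> for as long as the object must stay alive and pass it to <code>gc_remove_root</code> exactly once afterwards.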
+== Multithreading ==
+Each thread must indicate when it enters/leaves a region of code that touches GC-managed memory (and therefore needs GC to happen only when it's at a safe point) and when it enters/leaves a region of code that doesn't touch GC-managed memory at all (basically one long safe point, where the thread doesn't care if GC happens or not).
+For a single-threaded program with only one <code>GCHeap</code>, this just means calling <code>gc_begin_request(heap)</code> at startup and <code>gc_end_request(heap)</code> at shutdown.
+Features:
+ void '''gc_begin_request'''(GCHeap heap);
+Enter a request.
+The calling thread must not be in an active request on any heap.
+ void '''gc_end_request'''(GCHeap heap);
+Leave the current request.
+The calling thread must be in an active request on <code>heap</code>.
+ void '''gc_suspend_request'''(GCHeap heap);
+Suspend the current request.
+The calling thread must be in an active request on <code>heap</code>.  That request becomes inactive.
+The calling thread must later call <code>gc_resume_request</code>.
+Allocations pointed to by C/C++ local variables in the caller or any of its callers at the time of the call to <code>gc_suspend_request</code> will remain reachable until the matching <code>gc_resume_request</code> call.  (That is, they are temporarily rooted.)
+ void '''gc_resume_request'''(GCHeap heap);
+Resume a suspended request.
+The calling thread must not be in an active request on any <code>GCHeap</code>.
+The most recently suspended inactive request that the calling thread is in on <code>heap</code> becomes active.
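The preconditions above amount to a small state machine. The sketch below models it for one thread and one heap, with assertions standing in for the stated preconditions. <code>MockHeap</code>, the pointer-typed <code>GCHeap</code>, and <code>run_blocking_section</code> are assumptions for illustration; the mock does no collection and no cross-thread synchronization.

```c
#include <assert.h>

/* Assumption: GCHeap is an opaque handle; here it points at mock state
   describing the single calling thread. */
typedef struct MockHeap {
    int active;    /* 1 while the thread is in an active request */
    int suspended; /* number of suspended (inactive) requests */
} MockHeap;
typedef MockHeap *GCHeap;

void gc_begin_request(GCHeap heap)
{
    assert(!heap->active); /* must not already be in an active request */
    heap->active = 1;
}

void gc_end_request(GCHeap heap)
{
    assert(heap->active); /* must be in an active request on heap */
    heap->active = 0;
}

void gc_suspend_request(GCHeap heap)
{
    assert(heap->active);
    heap->active = 0; /* the request becomes inactive */
    heap->suspended++;
}

void gc_resume_request(GCHeap heap)
{
    assert(!heap->active);
    assert(heap->suspended > 0); /* there must be a request to resume */
    heap->suspended--;
    heap->active = 1; /* most recently suspended request is active again */
}

/* Hypothetical client helper: leave the request around a section that does
   not touch GC-managed memory (for example, blocking I/O). Returns 1 if the
   request was inactive during that section. */
int run_blocking_section(GCHeap heap)
{
    int was_inactive;
    gc_suspend_request(heap);
    was_inactive = !heap->active; /* stand-in for the blocking work */
    gc_resume_request(heap);
    return was_inactive;
}
```

Because a suspended request is inactive, a nested <code>gc_begin_request</code> and <code>gc_end_request</code> pair is permitted between the suspend and the matching resume.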
+ #define '''GC_FAST_SUSPEND_REQUEST'''(heap) ...
+ #define '''GC_FAST_RESUME_REQUEST'''(heap) ...
+These are macros such that this code:
+<pre style="border: none; padding: none; background-color: transparent">GC_FAST_SUSPEND_REQUEST(expr);
+&lt;statements&gt;
+GC_FAST_RESUME_REQUEST(expr);</pre>
+expands to a C/C++ statement that behaves like this one:
+<pre style="border: none; padding: none; background-color: transparent">{
+    gc_suspend_request(heap);
+    &lt;statements&gt;
+    gc_resume_request(heap);
+}</pre>
+except that:
+* in C++, <code>gc_resume_request</code> is guaranteed to be called whenever control exits the block, even if it exits via an exception, <code>return</code>, <code>break</code>, <code>continue</code>, or <code>goto</code>; and
+* the behavior is undefined if the ''&lt;statements&gt;'' contain any identifier starting with <code>_gc_</code>.
+If either macro is used any other way, the result is undefined.
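One plausible way to get this shape in plain C is for the first macro to open a block and the second to close it, sharing a hidden local. The sketch below is an assumption, not the intended implementation: the name <code>_gc_saved_heap</code> and the mock request functions are invented for the example, and nothing here is actually "fast". It does show why identifiers starting with <code>_gc_</code> are reserved inside the statements.

```c
#include <assert.h>

/* Mock request state (assumption), as seen by a single thread. */
typedef struct MockHeap {
    int active;
    int suspended;
} MockHeap;
typedef MockHeap *GCHeap;

void gc_suspend_request(GCHeap heap)
{
    assert(heap->active);
    heap->active = 0;
    heap->suspended++;
}

void gc_resume_request(GCHeap heap)
{
    assert(!heap->active && heap->suspended > 0);
    heap->suspended--;
    heap->active = 1;
}

/* The suspend macro opens two blocks and evaluates its argument once into
   a hidden local; the resume macro closes the inner block, resumes using
   that local, and closes the outer block. The resume macro's own argument
   is unused in this sketch. */
#define GC_FAST_SUSPEND_REQUEST(heap) \
    { GCHeap _gc_saved_heap = (heap); gc_suspend_request(_gc_saved_heap); {
#define GC_FAST_RESUME_REQUEST(heap) \
    } gc_resume_request(_gc_saved_heap); }

/* Hypothetical demo: returns 1 if the request was inactive in between. */
int demo_fast_suspend(GCHeap heap)
{
    int was_inactive = 0;
    GC_FAST_SUSPEND_REQUEST(heap);
    was_inactive = !heap->active;
    GC_FAST_RESUME_REQUEST(heap);
    return was_inactive;
}
```

In C this form does not resume on an early <code>return</code>, <code>break</code>, or <code>goto</code> out of the block. A C++ variant would presumably put the resume call in the destructor of a local object to meet the exit guarantee listed above.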
+ void '''gc_yield_request'''(GCHeap heap);
-== Synchronization/concurrency ==
+Equivalent to <code>{gc_suspend_request(heap); gc_resume_request(heap);}</code>.
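A typical use would be a long-running loop that stays inside a request but periodically offers the GC a safe point. The sketch below assumes a mock <code>gc_yield_request</code> that only counts calls; <code>MockHeap</code>, <code>YIELD_INTERVAL</code>, and <code>sum_with_yields</code> are hypothetical names.

```c
#include <assert.h>

/* Mock heap state (assumption): tracks whether the thread is in an active
   request and how many times it has yielded. */
typedef struct MockHeap {
    int active;
    int yields;
} MockHeap;
typedef MockHeap *GCHeap;

void gc_yield_request(GCHeap heap)
{
    assert(heap->active); /* same precondition as gc_suspend_request */
    heap->yields++;       /* a real GC could run a collection here */
}

#define YIELD_INTERVAL 1000 /* hypothetical tuning knob */

/* Sum an array, yielding to the GC after every YIELD_INTERVAL elements. */
long sum_with_yields(GCHeap heap, const int *values, long count)
{
    long total = 0;
    for (long i = 0; i < count; i++) {
        total += values[i];
        if ((i + 1) % YIELD_INTERVAL == 0)
            gc_yield_request(heap);
    }
    return total;
}
```

The interval trades responsiveness to GC requests from other threads against the per-yield overhead; the value used here is arbitrary.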
-The request model, or something.
-'''Open issue:''' This whole area.
+ void '''gc_set_thread_affinity'''();
-('''jorendorff note:''' SpiderMonkey has some pretty awesome hacks in the gc synchronization code, requiring equally awesome hacking in ActionMonkey's branch of MMgc.  Maybe we could better divide the responsibilities.  Discuss.)
+Inform the GC that all finalizers and callback functions should be called on the current thread.
 == Tracing ==