XLGC

XLgc stands for Cross Language garbage collection. It is an exact incremental GC framework with potential thread-safety. It is implemented in C and can be used to create a single pool of objects from different programming languages like JavaScript, Java and Python within one OS process.

MMgc is currently using colored states of managed object to determine , whether they are ready for collection. The states are only valid within one cycle, and all objects need to be accessed again to clear the state, before the next cycle can start. The idea is to replace ever changing colored states with an minimum distance to a root, which is permanently kept in sync with dependency list.

The framework uses a relational table to store links between objects. The table has two unique indexes: (source, target) and (target, source), and implemented as a digital search tree (DST) for each index. The first index allows to traverse the object graph at O (ln N) speed, the second one enables O (ln N) cost of determining minimum distance to a root of any given object.

XLgc uses a linked list for input queue. Linked list operations are a single point, where synchronization is required, if multiple threads are to generate GC events.

Concept comparison

Let's compare XPCOM refcount (XPCOM), the present MMgc idea (MMgc), and the proposed idea (XLgc):

Parameter \ Version	XPCOM	MMgc	XPgc
Inter-object Relations
how stored	ignored	saved in place	saved in a table
when changed	--	in a write barrier	in a write barrier
Object GC-status
mechanism	immediate refcount	white/grey/black	minimum distance to a root
when updated	in a write barrier	during mark stage	during mark stage
when applied	in a write barrier	during sweep stage	during mark stage

Discussion

MMgc doesn't store inter-object relations in a table, though. MMgc's DWB write barrier just queues the referent (the object being pointed at) for marking, if necessary. The queue doesn't track relations--it's just a bag of objects that need to be marked. --jorendorff 06:08, 15 October 2007 (PDT)

Now, it looks like MMgc scans the entire memory in search of collectable objects. --yanychar 09:14, 15 October 2007 (PDT)

It only scans reachable objects. --jorendorff 06:33, 16 October 2007 (PDT)

Details

The main idea of XLgc is to manage links, not objects.

Each link is as far from a root as its source is. Each object is as far from a root as a link with minimum distance for which it is a target. If there is no link to an object, the object is treated as root. It is supposed to be a stack (or global) object. When a link sees no root, link target is told to clean up.

Adding links

If a new link's source is root, the link gets an index of 1. Otherwise, the source index is increased by one.

If the link target index was greater than the link index, indexes of all links stemming from the target are reduced to the new value.

The last step is recursively repeated for every link, which index is reduced (Reset).

Dropping links

The link is removed from table (both indexes).

If the new index of the target becomes greater, all links stemming from the target and down the graph are recursively marked with MAXINT (Mar).

The last step doesn't calculate intermediate target indexes to avoid reference cycle traps.

The recursive part of the insertion process is run for the first target with an initial index of MAXINT/2. All marred links will be revisited, and every intermediate target, which still sees a root, will clear its subgraph of marred status.

Finally, the first target index is checked once again, and if its index is 0 or bigger than MAXINT/2, the target is told to clean up.

Multiple Inheritance

Object nodes should be identified by the beginning of their address space. This is guaranteed if we cast object to (void *) from one of its methods (Visitor design pattern). This must be done two times: for the parent and for the child.

It can be done like that:

nsISupports is changed

   -unsigned int AddRef();
   +void AddRef(void *parent, void *place);
   -unsigned int Release();
   +void Release(void *parent);
   +static void Free(void *self);

GC has something like this

   void xlgc_add_link(void *parent, void *child, void *place);
   void xlgc_drop_link(void *parent, void *child,
       void (*releaser)(void*), void *param);

parent calls:

   child->AddRef(this, &child);
   child->Release(this);

child calls:

 void
 nsFoo::AddRef(void *parent, void *place)
 {
   xlgc_add_link(parent, this, place);
 }
 void
 nsFoo::Release(void *parent)
 {
   xlgc_add_link(parent, this, &nsFoo::Free, this);
 }
 static void
 nsFoo::Free(void *self)
 {
   delete (nsFoo*) self;
 }

Breaking cycles

When a smart pointer is destructed, the underlying resource is released. In XPCOM, that implies a virtual function call on a stored pointer. When we want to break a reference cycle, the cycle last object destructor must be prevented from re-releasing the cycle first object. I found no smarter way than to set the value of the pointer to the first object in the last object to zero. This is similar to nsWeakReference mechanism.

Functional Comparison

Let's again compare XPCOM refcount (XPCOM), the present MMgc idea (MMgc), and the proposed idea (XPgc - Cross Platform gc):

Parameter \ Version	XPCOM	MMgc	XPgc
write barrier cost (add)	O(1)	O(1)	O(1)
write barrier cost (remove)	O(M), M = (changes)	O(1)	O(1)
mark stage cost	--	O(X), X = (bytes in RAM)	O(R ln Q), see discussion
sweep stage cost	--	O(X), X = (bytes in RAM)	--
sweep misses	leaks on cycle	conservative leaks	--

This table shows that three solutions are not worse or better, they are different. Refcount is great for design which is free of cycles, MMgc and XLgc allows to postpone processing, and XLgc is considerably better when only a subset changes states between collection cycles.

In addition, XLgc allows to do collection incrementally without additional cost.

Discussion

The chart may be misleading, because everything depends on the constant factor of the tasks that are O(1). Write barriers are executed O(N)-ish times per cycle, so the total performance of your scheme is really O(N), not O(M log N)... unless M is some large fraction of N, in which case it's O(N log N), which would be really bad. Let me know what I'm missing. :) --jorendorff 06:22, 15 October 2007 (PDT)

Write barrier code can in the order of 20 machine instructions: adding a link to the queue using pool of pre-allocated link objects. So it is really a O(1). If R objects pass write barrier, this part of overhead is O(R). M is meant to be the number of links, that change their state during collection, and N is the total number of links in all managed objects, X is the size of RAM. Obviously, R <= M <= N <= X, and it is likely that R <= M << N <= X (F.e., according to generational hypothesis).

O(M ln N) is a worst case estimate of the number of instructions of M searches in an efficient data structure with N elements. It is 'worst case, because we will be searching for ranges in the index, so N can be expected as O(sqrt N). But that doesn't change O(M ln N), anyway. --yanychar 00:43, 16 October 2007 (PDT)

Further optimization may allow to avoid collecting young object that come and go between cycles for the additional O(R ln Q), Q = (queue) collection costs by indexing collector queue. This brings N << M (where << is .02-.06 according to generational hypothesis). Since we retain complete control of when collection happens, we can keep the queue large enough to ensure <<. This lets me claim that probably O(R) is the total cost of collection. --yanychar 01:31, 16 October 2007 (PDT)

Updated with R = (required objects), made stronger claim about total cost. --yanychar 06:58, 16 October 2007 (PDT)

Comments

The framework is currently implemented at a proof-of-concept level. It correctly handles reference graphs with cycles.

I spent some time trying to wire it up to XPCOM and SpiderMonkey, but with little success. Instead, I used a test with mozilla-like classes and smart pointers.