XLGC

From MozillaWiki
Jump to: navigation, search

XLgc stands for Cross Language garbage collection. It is an exact incremental GC framework with potential thread-safety. It is implemented in C and can be used to create a single pool of objects from different programming languages like JavaScript, Java and Python within one OS process.

MMgc is currently using colored states of managed object to determine , whether they are ready for collection. The states are only valid within one cycle, and all objects need to be accessed again to clear the state, before the next cycle can start. The idea is to replace ever changing colored states with an minimum distance to a root, which is permanently kept in sync with dependency list.

The framework uses a relational table to store links between objects. The table has two unique indexes: (source, target) and (target, source), and implemented as a digital search tree (DST) for each index. The first index allows to traverse the object graph at O (ln N) speed, the second one enables O (ln N) cost of determining minimum distance to a root of any given object.

XLgc uses a linked list for input queue. Linked list operations are a single point, where synchronization is required, if multiple threads are to generate GC events.

Its exact nature and decent performance also make XLgc a candidate for the single place to store frames/DOM/boxes hierarchies of objects.

Concept comparison

Let's compare XPCOM refcount (XPCOM), the present MMgc idea (MMgc), and the proposed idea (XLgc):

Parameter \ VersionXPCOMMMgcXLgc
Inter-object Relations
how stored ignored saved in place saved in a table
when changed -- in a write barrier in a write barrier
Object GC-status
mechanism immediate refcount white/grey/black minimum distance to a root
when updated in a write barrier during mark stage during mark stage
when applied in a write barrier during sweep stage during mark stage

Details

The main idea of XLgc is to manage links, not objects.

Each link is as far from a root as its source is. Each object is as far from a root as a link with minimum distance for which it is a target. If there is no link to an object, the object is treated as root. It is supposed to be a stack (or global) object. When a link sees no root, link target is told to clean up.

Adding links

If a new link's source is root, the link gets an index of 1. Otherwise, the source index is increased by one.

If the link target index was greater than the link index, indexes of all links stemming from the target are reduced to the new value.

The last step is recursively repeated for every link, which index is reduced (Reset).

Dropping links

The link is removed from table (both indexes).

If the new index of the target becomes greater, all links stemming from the target and down the graph are recursively marked with MAXINT (Mar).

The last step doesn't calculate intermediate target indexes to avoid reference cycle traps.

The recursive part of the insertion process is run for the first target with an initial index of MAXINT/2. All marred links will be revisited, and every intermediate target, which still sees a root, will clear its subgraph of marred status.

Finally, the first target index is checked once again, and if its index is 0 or bigger than MAXINT/2, the target is told to clean up.

Multiple Inheritance

Object nodes should be identified by the beginning of their address space. This is guaranteed if we cast object to (void *) from one of its methods (Visitor design pattern). This must be done two times: for the parent and for the child.

It can be done like that:

  • nsISupports is changed
   -unsigned int AddRef();
   +void AddRef(void *parent, void *place);
   -unsigned int Release();
   +void Release(void *parent);
   +static void Free(void *self);
  • GC has something like this
   void xlgc_add_link(void *parent, void *child, void *place);
   void xlgc_drop_link(void *parent, void *child,
       void (*releaser)(void*), void *param);
  • parent calls:
   child->AddRef(this, &child);
   child->Release(this);
  • child calls:
 void
 nsFoo::AddRef(void *parent, void *place)
 {
   xlgc_add_link(parent, this, place);
 }
 void
 nsFoo::Release(void *parent)
 {
   xlgc_add_link(parent, this, &nsFoo::Free, this);
 }
 static void
 nsFoo::Free(void *self)
 {
   delete (nsFoo*) self;
 }

Breaking cycles

When a smart pointer is destructed, the underlying resource is released. In XPCOM, that implies a virtual function call on a stored pointer. When we want to break a reference cycle, the cycle last object destructor must be prevented from re-releasing the cycle first object. I found no smarter way than to set the value of the pointer to the first object in the last object to zero. This is similar to nsWeakReference mechanism.

Functional Comparison

Let's again compare XPCOM refcount (XPCOM), the present MMgc idea (MMgc), and the proposed idea (XPgc - Cross Platform gc):

Parameter \ VersionXPCOMMMgcXPgc
total write barrier cost (add) O(R), R = (hit write barriers) O(R) O(R)
total write barrier cost (remove) O(M), M = (changes) O(R) O(R)
mark stage cost -- O(X), X = (managed RAM) O(M ln N), N = (total links)
sweep stage cost -- O(X) --
sweep misses leaks on cycle conservative leaks --

This table shows that three solutions are not worse or better, they are different. Refcount is great for design which is free of cycles, MMgc and XLgc allows to postpone processing, and XLgc is considerably better when only a subset changes states between collection cycles. In addition, XLgc allows to do collection incrementally without additional cost.

Further optimization

Write barrier speed

XLgc write barriers doesn't contain conditional operators, but allocates a structure on the heap. If this turns out to be a performance handicap, allocation/deallocation can be replaced with insertion/removal from a linked list of pre-allocated objects. And pre-allocation can be moved from write barrier to mark stage.

Input queue optimization

According to generational hypothesis, figures in the performance chart have the following relations: R <= M << N <= X, where << is in 0.02-0.06 range. XLgc can bank in on this relation, if it manages to coordinate GC runs with stack frames, so that stack smart pointers release the objects they hold between GC runs.

Long-living objects with a deep hierarchy of children are best stored in some global variables, so that assigning/releasing from stack pointers doesn't trigger wide-scale depth recalculations.

Reducing memory usage

As it is now, each active link in a smart pointer costs 40 bytes on 32bit systems: 2x4 in a pointer, and 8x4 in the framework. 8 (4+4) bytes serve weak-reference functionality, and may probably be optimized out by using information from allocator to determine beginnings of source objects.

In addition, if we may assume that all pointers are 4 (or even 8) byte aligned, there will be 6x2 (or even 6x3) unused bits per structure. This space may be used to store type of data, to which the target points. Knowledge of data type opens a possibility of using GC as a primary storage of links between elements of frames/DOM/boxes hierarchies, thus reducing overhead to zero for this types of data.

Comments

The framework is currently implemented at a proof-of-concept level. It correctly handles reference graphs with cycles.

I spent some time trying to wire it up to XPCOM and SpiderMonkey, but with little success. Instead, I used a test with mozilla-like classes and smart pointers.