XPCOMGC

From MozillaWiki

XPCOMGC is the Mozilla 2 project to convert the XPCOM object model from reference counting to garbage collection, and to unify its memory management with that of the JS engine.

General Info

The tracking bug is XPCOMGC.

Rationale

We need to be able to free cycles of XPCOM objects. Mozilla 1.9 introduced a cycle collector to do just that. We've already removed hacks to avoid creating reference cycles, so we're now dependent on this.
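The cycle problem can be reproduced outside XPCOM. The following is a minimal sketch, assuming std::shared_ptr as a stand-in for XPCOM reference counting; Node and liveAfterRelease are hypothetical names used only for illustration and are not part of any Mozilla API.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a reference-counted XPCOM object.
struct Node {
    explicit Node(int& liveCount) : live(liveCount) { ++live; }
    ~Node() { --live; }
    std::shared_ptr<Node> peer;  // strong (owning) reference to another node
    int& live;                   // number of nodes not yet destroyed
};

// Creates two nodes, optionally links them into a cycle, drops the local
// references, and returns how many nodes are still alive at that point.
int liveAfterRelease(bool makeCycle) {
    int live = 0;
    Node* rawA = nullptr;
    {
        auto a = std::make_shared<Node>(live);
        auto b = std::make_shared<Node>(live);
        a->peer = b;
        if (makeCycle) {
            b->peer = a;  // a -> b -> a: neither count can reach zero
        }
        rawA = a.get();
    }
    const int stillAlive = live;
    if (stillAlive > 0) {
        // Break the cycle by hand so that the demo itself does not leak.
        std::shared_ptr<Node> released = std::move(rawA->peer);
    }
    return stillAlive;
}
```

Without the cycle, both nodes are destroyed as soon as the local references go away. With the cycle, both survive until something breaks the loop explicitly, which is the job the cycle collector (or a tracing GC) takes over.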

The cycle collector should be replaced with a real GC. The cycle collector is complex. It requires cooperation from the objects that might appear in cycles. It interacts with the JS garbage collector in a way that verges on black magic.

A single true GC covering both XPCOM objects and JS is a much more direct approach. It will not make it much easier to debug memory leaks or crashes, but the GC itself will be faster and easier to maintain, and client code will be simpler.

The big advantage of the cycle collector was a software engineering win: it did not touch any of the reference counting (the AddRef and Release calls) in XPCOM client code. Now that we have better static analysis and rewriting tools, we can consider true GC.

We can't use a Java-style copying GC. The fastest GCs are generational copying GCs. But a fully copying GC is unsuitable for C++. When GC happens, if any pointers are in registers or stack locations that the GC doesn't know about, the objects they point to must not be moved. The only solutions are:

  • Get information from the compiler about stack locations. (This is what Java does; the information is supplied by the JIT. It's basically impossible to do this for C++.)
  • Conservatively scan the stack for pointers to GC-managed memory and don't move those objects.
  • Don't use a GC that moves stuff around in memory. (This seems like the best approach.)

The speed of a copying, generational GC depends on its being able to move all objects out of a generation. That whole region of memory is then available for new allocations, which allows an extremely fast allocation routine (a few instructions in some cases). This kind of design is not viable for Mozilla, given our dependence on C++.
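To illustrate why an emptied region makes allocation so cheap, here is a minimal sketch of a bump-pointer allocator. BumpRegion and its members are hypothetical names, the 8-byte alignment is an assumption, and the caller is assumed to pass suitably aligned storage; this is not code from any Mozilla or JVM collector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical "young generation": a fixed region handed out front to back.
class BumpRegion {
public:
    BumpRegion(std::uint8_t* start, std::size_t size)
        : begin_(start), next_(start), end_(start + size) {}

    // Fast path: round up, one bounds check, one pointer addition.
    // Returns nullptr when the region is full; a copying collector would
    // evacuate the survivors at that point and then call reset().
    void* allocate(std::size_t bytes) {
        bytes = (bytes + kAlign - 1) & ~(kAlign - 1);
        if (bytes > static_cast<std::size_t>(end_ - next_)) {
            return nullptr;
        }
        void* result = next_;
        next_ += bytes;
        return result;
    }

    // Only safe once every live object has been moved out of the region.
    // An object pinned by an unknown C++ stack pointer makes this impossible.
    void reset() { next_ = begin_; }

private:
    static constexpr std::size_t kAlign = 8;  // assumed alignment
    std::uint8_t* begin_;
    std::uint8_t* next_;
    std::uint8_t* end_;
};
```

The reset() precondition is the crux: a single object that cannot be moved prevents the region from being reused wholesale, and the allocator has to fall back to something slower.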

The GC should have conservative stack scanning. This makes life a lot easier for developers.
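As a rough illustration of what conservative scanning means, the sketch below treats every word in a buffer as a possible pointer and marks any managed block it falls inside. Block and conservativeMark are hypothetical names; a real collector would read the actual thread stacks and registers and use a much faster address lookup than this linear search.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record describing one GC-managed block.
struct Block {
    std::uintptr_t start;
    std::size_t size;
    bool marked;
};

// Scans `count` words and marks every block that some word points into.
// Interior pointers count as references. Returns the number of blocks
// newly marked by this call.
std::size_t conservativeMark(const std::uintptr_t* words, std::size_t count,
                             std::vector<Block>& heap) {
    std::size_t newlyMarked = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const std::uintptr_t word = words[i];
        for (Block& block : heap) {
            const bool inside =
                word >= block.start && word - block.start < block.size;
            if (inside && !block.marked) {
                block.marked = true;
                ++newlyMarked;
            }
        }
    }
    return newlyMarked;
}
```

Because the scanner cannot tell a pointer from an integer that merely looks like one, it may keep some dead blocks alive, and it can never move a block it has marked this way. In exchange, developers do not have to register stack pointers with the collector by hand.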

The GC has to support existing uses of threads in Mozilla. This pushes us toward using the JS request model.

The GC needs incremental or generational collection to avoid long pauses, which would regress perceived responsiveness.

Interoperability with languages other than C/C++ and JavaScript is not a priority. Python will have to catch up later, if anyone is willing and able to carry it forward.

Building XPCOMGC

XPCOMGC work is currently taking place in the following Mercurial repository: http://hg.mozilla.org/users/bsmedberg_mozilla.com/gcmonkey

It is being maintained as a linear set of changes on top of mozilla-central, rebased relatively frequently. You will therefore see old heads in the repository and can ignore them (they are dead heads, though Mercurial doesn't have a way to mark them as such).

Current Status

  • The build only works on Linux
  • The Boehm collector is inserted as a replacement for the C allocator (malloc/free)
  • malloc/free allocations are treated as "uncollectable". That is, they are scanned for pointers but are not subject to being freed by the collector
  • XPCOM string buffers (nsStringBuffer) have been made collectable. They are no longer refcounted, but instead are made immutable-on-share
  • The build runs and seems to perform "ok"... still trying to get quantified performance numbers
  • Memory usage is 50-100% higher than with the jemalloc allocator
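One plausible reading of "immutable-on-share" is sketched below: without a reference count, a string cannot know when it has become the sole owner again, so a buffer that has ever been shared is never written to, and writers copy first. This is an assumption about the approach, not the actual nsStringBuffer code. ToyHeap, Buffer and ToyString are hypothetical names, and ToyHeap merely stands in for the collector by keeping every buffer alive.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical string buffer: no reference count, only a one-way flag.
struct Buffer {
    std::string chars;
    bool shared;  // once true, never reset, so the buffer is immutable
};

// Stand-in for the GC heap: owns every buffer until it is destroyed.
class ToyHeap {
public:
    Buffer* allocate(const std::string& chars) {
        blocks_.push_back(std::unique_ptr<Buffer>(new Buffer{chars, false}));
        return blocks_.back().get();
    }

private:
    std::vector<std::unique_ptr<Buffer>> blocks_;
};

class ToyString {
public:
    ToyString(ToyHeap& heap, const std::string& chars)
        : heap_(&heap), buf_(heap.allocate(chars)) {}

    // Copying shares the buffer and freezes it for every holder.
    ToyString(const ToyString& other) : heap_(other.heap_), buf_(other.buf_) {
        buf_->shared = true;
    }
    ToyString& operator=(const ToyString&) = delete;

    // Writes go to a private copy if the buffer has ever been shared.
    void append(const std::string& more) {
        if (buf_->shared) {
            buf_ = heap_->allocate(buf_->chars);
        }
        buf_->chars += more;
    }

    const std::string& value() const { return buf_->chars; }
    const Buffer* buffer() const { return buf_; }

private:
    ToyHeap* heap_;
    Buffer* buf_;
};
```

The cost of this design is an occasional extra copy: the original string also copies on its next write, even if the other holder has since gone away, because nothing tracks how many holders remain.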

Old Information

A previous attempt at this project was made using the MMgc conservative collector. Because this collector requires programmatic write barriers, and for other reasons, this attempt was abandoned (though we learned a lot!).

TODO: collect/format information from the newsgroup discussion.