Performance:Leak Tools

From MozillaWiki

Strategy for finding leaks

When trying to make a particular testcase not leak, I recommend focusing first on the largest object graphs (since these entrain many smaller objects), then on smaller reference-counted object graphs, and then on any remaining individual objects or small object graphs that don't entrain other objects.

It's much better to go after the large graphs of leaks first, for two reasons: (1) large graphs of leaked objects tend to include some objects pointed to by global variables that confuse GC-based leak detectors, which can make leaks look smaller (as in bug 99180) or hide them completely, and (2) large graphs of leaked objects tend to hide smaller ones.

A good general pattern for finding and fixing leaks is to start with a task that you want not to leak (for example, reading email). Start finding and fixing leaks by running part of the task under nsTraceRefcnt logging, gradually building up from as little as possible to the complete task, and fixing most of the leaks in the first steps before adding additional steps. (By most of the leaks, I mean the leaks of large numbers of different types of objects or leaks of objects that are known to entrain many non-logged objects such as JS objects. Seeing a leaked GlobalWindowImpl, nsXULPDGlobalObject, nsXBLDocGlobalObject, or nsXPCWrappedJS is a sign that there could be significant numbers of JS objects leaked.)

For example, start with bringing up the mail window and closing the window without doing anything. Then go on to selecting a folder, then selecting a message, and then other activities one does while reading mail.

Once you've done this, and it doesn't leak much, then try the action under trace-malloc or Purify or Valgrind to find the leaks of smaller graphs of objects. (When I refer to the size of a graph of objects, I'm referring to the number of objects, not the size in bytes. Leaking many copies of a string could be a very large leak, but the object graphs are small and easy to identify using GC-based leak detection.)

What leak tools do we have?

Tool | Finds | Platforms | Requires

Leak tools for large object graphs
Leak Gauge | Windows, documents, and docshells only | All platforms | Any build
Leak Monitor | Common chrome JavaScript leaks | All platforms | Any build
Cycle Collector debugging | Windows, documents, content nodes, ... | ? | Build with DEBUG_CC
JavaScript heap dump | ? | All platforms | Debug build
Cycle collector heap dump (includes JS heap dump) | JS objects, DOM objects, many other kinds of objects | All platforms | Any build

Leak tools for medium-size object graphs
Trace-refcnt | Objects that implement nsISupports or use MOZ_COUNT_CTOR | All tier 1 platforms | Debug build (or build opt with --enable-logrefcnt)
Leaksoup | All objects? (or allocations?) | All tier 1 platforms | Build with --enable-trace-malloc

Leak tools for simple objects and summary statistics
Trace-malloc | All allocations | All tier 1 platforms | Build with --enable-trace-malloc
Purify | All allocations | Linux, Solaris, Windows | Any build
Valgrind | All allocations | Mac, Linux | Any build, but --enable-valgrind is recommended
Apple tools | ? | Mac | Any build

Leak tools for debugging memory growth that is cleaned up on shutdown
diffbloatdump | All allocations | Linux only? | Build with --enable-trace-malloc

Leak tools for large object graphs

leak-gauge

Leak gauge is a script (available as a perl script or as HTML and JavaScript that must be run from a local file) that post-processes a log taken by setting environment variables in a release build. It is designed to help detect which leaks of large object graphs occur during normal browsing activity. The logging can be run (as described in the script) during normal browsing without significant overhead. Then the script can be run to provide information about the documents, window objects, and docshells that leaked. See Jesse Ruderman's blog entry for more details including a list of other links.

You can also find how-to documentation at QA:Home_Page:Firefox_3.0_TestPlan:Leaks:LeakTesting-How-To.

leak-monitor

Leak Monitor is an extension that detects (when closing windows) the primary ways in which JavaScript code can cause leaks that are not bugs in the native code core. An example of such a bug would be registering (in each window) a JavaScript-implemented observer with the global observer service and never removing the observers, leaving JavaScript code running in the context of each window registered as an observer until the user quits the browser.

When a leak is detected, the extension presents the user (or application/extension developer!) with a dialog with information about the leaked objects. (The alerts can also be triggered by bugs in the core code. See the extension's homepage for more details.)

JavaScript heap dump

Setting the XPC_SHUTDOWN_HEAP_DUMP environment variable to a file name will cause XPConnect to, at shutdown, dump a log to that file explaining why all remaining JS objects are still alive. This may be sufficient to debug some JS-related leaks without the full rebuild required by DEBUG_CC. It also shows more information about reachability, since it shows property names for all connections.

Cycle collector heap dump

The cycle collector heap dump is useful for figuring out why the cycle collector is keeping an object alive. Dumps can be generated either manually or automatically, in both debug and non-debug builds.

To manually generate a CC dump on Firefox 29 and up, go to about:memory and use the buttons under "Save GC & CC logs." "Save concise" will generate a smaller CC log, "Save verbose" will provide a more detailed CC log.

To manually generate a CC dump prior to Firefox 29, enable the Error Console by going to about:config and setting devtools.errorconsole.enabled to true. Then open a new window, go to Tools, Web Developer, Web Console.

Then evaluate this expression:

window.QueryInterface(Components.interfaces.nsIInterfaceRequestor).
  getInterface(Components.interfaces.nsIDOMWindowUtils).
  cycleCollect(Components.classes["@mozilla.org/cycle-collector-logger;1"]
    .createInstance(Components.interfaces.nsICycleCollectorListener))

By default, the cycle collector will only log the objects it normally looks at. Sometimes it can be useful to disable the optimizations the cycle collector does in order to get more detailed information. In this case, the method allTraces() can be used:

window.QueryInterface(Components.interfaces.nsIInterfaceRequestor).
  getInterface(Components.interfaces.nsIDOMWindowUtils).
  garbageCollect(Components.classes["@mozilla.org/cycle-collector-logger;1"]
    .createInstance(Components.interfaces.nsICycleCollectorListener).allTraces())

This creates a file named cc-edges-NNNN.log and writes to it a dump of the heap known to the cycle collector, which includes JS objects and also native C++ objects that participate in cycle collection. It will also log the contents of the JavaScript heap to a file named gc-edges-NNNN.log.

By default, the log files are created in a temp directory, and the path to each file is printed to the Error Console. One can override the default location by setting the MOZ_CC_LOG_DIRECTORY environment variable. http://people.mozilla.com/~mleibovic/cc-dump.xpi is an addon for Android Firefox which uses that variable to save files to /sdcard. The code for this addon is available on Github.

To log every cycle collection, set the MOZ_CC_LOG_ALL environment variable (XPCOM_CC_LOG_ALL in 29 and earlier). To log only shutdown collections, set MOZ_CC_LOG_SHUTDOWN (XPCOM_CC_LOG_SHUTDOWN in 29 and earlier). To make all shutdown CCs use allTraces(), set MOZ_CC_ALL_TRACES_AT_SHUTDOWN (XPCOM_CC_ALL_TRACES_AT_SHUTDOWN in 29 and earlier). The latter two are useful for debugging shutdown leaks.

In 30 and later, set the environment variable MOZ_CC_LOG_THREAD to main to only log main thread CCs, or to worker to only log worker CCs. The default value is all, which will log all CCs.
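The variables above can be collected in one place before launching the browser. The following is a minimal Python sketch, not part of any Mozilla tool: the helper name cc_logging_env and its parameters are hypothetical, it uses the Firefox 30+ variable names only, and the value "1" is an assumption (the text only says the variables must be set).

```python
import os


def cc_logging_env(log_dir, log_all=False, log_shutdown=True,
                   all_traces_at_shutdown=False, thread="all", base_env=None):
    """Return a copy of the environment with cycle collector logging enabled.

    Hypothetical helper: it only assembles the variables described in the text.
    """
    if thread not in ("all", "main", "worker"):
        raise ValueError("thread must be 'all', 'main', or 'worker'")
    env = dict(os.environ if base_env is None else base_env)
    # Where cc-edges / gc-edges logs are written instead of the temp directory.
    env["MOZ_CC_LOG_DIRECTORY"] = log_dir
    # Assumption: "1" is used as the value for the on/off variables.
    if log_all:
        env["MOZ_CC_LOG_ALL"] = "1"
    if log_shutdown:
        env["MOZ_CC_LOG_SHUTDOWN"] = "1"
    if all_traces_at_shutdown:
        env["MOZ_CC_ALL_TRACES_AT_SHUTDOWN"] = "1"
    # "main", "worker", or "all" (the default described in the text).
    env["MOZ_CC_LOG_THREAD"] = thread
    return env
```

The resulting dictionary can be passed as the env argument of subprocess.run when starting the browser, for example with a command list such as ["./firefox", "-no-remote"].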

To analyze a cycle collector dump, you need the scripts from Github. The relevant scripts are find_roots.py and parse_cc_graph.py (which is called by find_roots). Calling find_roots on a CC file with a specific object or kind of object will produce paths from rooting objects to the specified objects. Most big leaks include an nsGlobalWindow, so that's a good class to try if you don't have any better idea.
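To make "paths from rooting objects to the specified objects" concrete, here is a small Python sketch of the idea using a breadth-first search. It is not the find_roots.py implementation and does not read the cc-edges log format; the graph, the node names, and the function path_from_root are hypothetical.

```python
from collections import deque


def path_from_root(edges, roots, target):
    """Breadth-first search for a shortest path from any root to target.

    edges maps each node to the nodes it holds references to.
    Returns the path as a list of nodes, or None if target is unreachable.
    """
    parent = {root: None for root in roots}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the root, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for child in edges.get(node, ()):
            if child not in parent:
                parent[child] = node
                queue.append(child)
    return None


# Hypothetical graph: an observer list keeps a JS-implemented listener alive,
# and the listener's function keeps a window alive.
edges = {
    "nsObserverList": ["nsXPCWrappedJS"],
    "nsXPCWrappedJS": ["JS Object (Function)"],
    "JS Object (Function)": ["nsGlobalWindow"],
    "nsGlobalWindow": ["nsDocument"],
}
print(path_from_root(edges, ["nsObserverList"], "nsGlobalWindow"))
```

Reading such a path from left to right shows which reference would have to be dropped for the target to be freed.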

To fix a leak, the next step is to figure out why the rooting object is alive. For a C++ object, you need to figure out where the missing references are from. For a JS object, you need to figure out why the JS object is reachable from a JS root. For the latter, you can use the corresponding find_roots for JS on the gc-edges file generated by the CC dump.

Addons for creating and analyzing cycle collector graphs

Leak tools for medium-size object graphs

trace-refcnt and the refcount balancer

The refcount balancer consists of the nsTraceRefcnt code in xpcom/base and the perl scripts in tools/rb/. It works by instrumenting refcounting, so every AddRef and Release method created using NS_IMPL_ISUPPORTSn and variants, NS_IMPL_ADDREF, or NS_IMPL_RELEASE is automatically instrumented. It also logs information for non-refcounted objects instrumented using the MOZ_COUNT_CTOR and MOZ_COUNT_DTOR macros.

Because it is based on instrumentation, it is not reliable for gathering aggregate statistics. (It is nevertheless used for the "RLk" leak stats on tinderbox; the trace-malloc-based "Lk" numbers are more meaningful.) However, it is by far the best tool we have for debugging leaks of reference counted objects, which are the leaks in Mozilla that can entrain the largest object graphs.

These tools work on Windows, Mac (PPC and Intel), and Linux (x86), although nsCOMPtr logging doesn't work on Windows (?) and Mac and Linux stack traces require some post-processing (see below). They can be used on any standard --enable-debug build, or on a --disable-debug build with --enable-logrefcnt.

Getting leak statistics. To enable summary statistics, simply set XPCOM_MEM_LEAK_LOG to 1 (stdout), 2 (stderr), or a filename. In this mode, trace-refcnt will tell you what types of objects leaked, but won't tell you why they leaked.

Using the refcount balancer. See the tutorial on finding leaks of XPCOM objects for information on debugging leaks using the refcount balancer.

Leaksoup

Leaksoup is a trace-malloc tool that analyzes the log from trace-malloc's secondary feature, the ability to dump allocations. (See below for more information on using trace-malloc.) trace-malloc's main log contains information about all allocations and the stacks at which they were allocated and freed, but it can also dump, at a given time (including shutdown), all the currently live allocations, the stacks at which they were allocated, and the contents of the memory. This last feature allows leaksoup to analyze the graph of live objects and determine which allocations are roots (within that graph, of course -- stack allocations and global variables don't count). Leaksoup also finds sets of objects that are rooted by a cycle (i.e., a set of reference counted objects that own references to each other in a cycle). It cannot distinguish between owning and non-owning pointers, however, which means that non-owning pointers that are nulled out by destructors may show up in leaksoup as cycles. Despite that, it is probably the easiest way to determine what leak roots are present.

You need a build with --enable-trace-malloc. Without this option, you can't create a dump and leaksoup isn't built.

Run using both the --trace-malloc and --shutdown-leaks options, for example "./mozilla -P default --trace-malloc=malloc.log --shutdown-leaks=sdleak.log". Ignore the malloc.log file (unless you're interested in other trace-malloc tools, such as SpaceTrace).

Then run leaksoup over the memory dump (which is a dump of all allocations still live at shutdown) with a command such as ./run-mozilla.sh ./leaksoup sdleak.log > sdleak.html. This generates a large HTML file as output.

The output of leaksoup begins with all the leak roots, and then lists all the non-root allocations. The roots are either listed as single objects or as strongly connected components (maximal sets of nodes in the graph in which any node is reachable from all other nodes). (A strongly connected component with only one node is listed as a single object.) Any single object listed as a root is really a leak root, and any component listed as a root either (a) contains an object that is a root or (b) contains objects that form an ownership cycle that is a root.
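The notion of a root component can be sketched in a few lines of Python. This is an illustration of the graph concept only, assuming a graph given as a plain dictionary; it is not leaksoup's code, and the node names and both function names are hypothetical. It uses Tarjan's algorithm to find strongly connected components and then keeps those that nothing outside the component points into.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps node -> list of nodes it points to."""
    index = {}
    lowlink = {}
    stack = []
    on_stack = set()
    components = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the first-visited member of its component: pop it off.
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(frozenset(component))

    for node in graph:
        if node not in index:
            visit(node)
    return components


def root_components(graph):
    """Components that no node outside the component points into."""
    components = strongly_connected_components(graph)
    owner = {node: comp for comp in components for node in comp}
    pointed_to = set()
    for node, succs in graph.items():
        for succ in succs:
            if owner[succ] != owner[node]:
                pointed_to.add(owner[succ])
    return [comp for comp in components if comp not in pointed_to]


# A and B own each other (a cycle); C is owned by both A and D; E is alone.
graph = {"A": ["B", "C"], "B": ["A"], "C": [], "D": ["C"], "E": []}
print(sorted(sorted(c) for c in root_components(graph)))
# prints [['A', 'B'], ['D'], ['E']]
```

In this toy graph D and E would be listed as single-object roots, the pair A and B as a component that is a root, and C as a non-root allocation.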

Leak tools for simple objects and summary statistics

Trace-malloc

The trace-malloc code consists of the nsTraceMalloc code in mozilla/tools/trace-malloc/lib/ and the perl scripts and trace-malloc readers in mozilla/tools/trace-malloc/. It works by overriding / hooking into malloc and free (and related functions) and logging all calls, with stacks. It is probably the best tool we currently have for gathering aggregate memory usage statistics. It is generally more useful for bloat measurement than leak measurement, but it can be used to find leaks, including leaks of objects not instrumented for trace-refcnt. It works on Windows, Linux (x86), and Mac (PPC and Intel). It can be built by checking out mozilla/tools/trace-malloc/ and building with --enable-trace-malloc.

Trace malloc slows down the browser a bit (maybe 3x?) at runtime on Linux and Mac, but by a lot more on Windows. However, the stacks on Linux and Mac require post-processing.

Generate a trace-malloc dump by building with ac_add_options --enable-trace-malloc and then passing --trace-malloc FILENAME when you start Firefox.


Purify

Purify is a commercial tool that detects a number of types of problems, including leaks. I think it works internally rather like the Boehm GC. That is, it reports a subset of real leaks during the running of the app (while hiding many real leaks because they are rooted through globals), and then reports all leaks at shutdown. Things it reports as potential leaks (PLKs) should be ignored -- they just mean that an object is being held only by a pointer to an offset within it (which happens when an XPCOM object implements multiple interfaces (that don't inherit from each other) and is held only by a pointer to an interface other than the first). It is available on Windows and Solaris.

Valgrind

Valgrind is a free tool for Linux and Mac that (like Purify) detects a number of program errors at runtime. Like Purify, its leak reporting is only useful for the simplest types of leaks (those not involving reference counting or large object graphs).

Apple tools

Apple provides some tools for Mac OS X that probably report similar problems to those reported by Purify and Valgrind. The "leaks" tool is not recommended for use with SpiderMonkey or Firefox, because it gets confused by tagged pointers and thinks objects have leaked when they have not (see bug 390944).

Leak tools for debugging memory growth that is cleaned up on shutdown

It is also possible to have a leak that is visible to the user (but not to many of our leak detection tools) by holding objects longer than one should. (For example, one could store an owning reference to every document ever loaded in an nsISupportsArray owned by a service that is destroyed at shutdown, causing every document to stay around until shutdown.) It is sometimes worth testing for this type of leak, especially if there are known leak problems that are visible to the user.

trace-malloc with diffbloatdump

The best way I know to do this is with trace-malloc. In a build with trace-malloc enabled you can dump the existing allocations to a file by calling the function TraceMallocDumpAllocations from JavaScript (which is equivalent to calling NS_TraceMallocDumpAllocations from C). The following web page will allow dumping of allocations:

<script type="text/javascript">
    var filename = window.prompt("Filename to log: ");
    if (filename)
      TraceMallocDumpAllocations(filename);
</script>

One can then use the script mozilla/tools/trace-malloc/diffbloatdump.pl to compare trace-malloc dumps before and after doing an action that might leak. If there are significant differences, it might be worth examining the call stacks for the destructors of the objects in question to see what is extending their lifetime. (This can be done by examining the unprocessed output of an XPCOM_MEM_REFCNT_LOG, one of the nsTraceRefcnt logs.) You should use the --use-address argument to diffbloatdump.pl, and then the diff tree can be run through fix-linux-stack / fix-macosx-stack as appropriate.
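The before/after comparison can be pictured with a toy Python sketch. It does not parse real trace-malloc dumps and is not how diffbloatdump.pl is implemented; the snapshots are hypothetical in-memory lists of (stack, size) pairs, and growth_by_stack is an illustrative name.

```python
from collections import Counter


def growth_by_stack(before, after):
    """Bytes gained per allocation stack between two snapshots.

    before and after are iterables of (stack, size) pairs, where stack is a
    tuple of frame names. Only stacks whose live bytes grew are returned.
    """
    totals = Counter()
    for stack, size in after:
        totals[stack] += size
    for stack, size in before:
        totals[stack] -= size
    return {stack: delta for stack, delta in totals.items() if delta > 0}


# Hypothetical snapshots taken before and after loading one more document.
before = [(("main", "LoadDocument"), 100)]
after = [
    (("main", "LoadDocument"), 100),
    (("main", "LoadDocument"), 100),
    (("main", "CacheEntry"), 40),
]
print(growth_by_stack(before, after))
```

Stacks that keep growing each time the action is repeated are the ones worth examining further.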

Common leak patterns

When trying to find a leak of reference-counted objects, there are a number of patterns that could cause the leak:

  1. Ownership cycles. The most common source of hard-to-fix leaks is ownership cycles. If you can avoid creating cycles in the first place, please do, since it's often hard to be sure to break the cycle in every last case. Sometimes these cycles extend through JS objects (discussed further below), and since JS is garbage-collected, every pointer acts like an owning pointer and the potential for fan-out is larger. See bug 106860 and bug 84136 for examples. (Is this advice still accurate now that we have a cycle collector? --Jesse)
  2. Dropping a reference on the floor by:
    1. Forgetting to release (because you weren't using nsCOMPtr when you should have been): See bug 99180 or bug 93087 for an example or bug 28555 for a slightly more interesting one. This is also a frequent problem around early returns when not using nsCOMPtr.
    2. Double-AddRef: This happens most often when assigning the result of a function that returns an AddRefed pointer (bad!) into an nsCOMPtr without using dont_AddRef(). See bug 76091 or bug 49648 for an example.
    3. [Obscure] Double-assignment into the same variable: If you release a member variable and then assign into it by calling another function that does the same thing, you can leak the object assigned into the variable by the inner function. (This can happen equally with or without nsCOMPtr.) See bug 38586 and bug 287847 for examples.
  3. Dropping a non-refcounted object on the floor (especially one that owns references to reference counted objects). See bug 109671 for an example.
  4. Destructors that should have been virtual: If you expect to override an object's destructor (which includes giving a derived class of it an nsCOMPtr member variable) and delete that object through a pointer to the base class using delete, its destructor better be virtual. (But we have many virtual destructors in the codebase that don't need to be -- don't do that.)
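Pattern 1 (ownership cycles) can be illustrated with a toy model of reference counting in Python. This is only a sketch of the bookkeeping, not XPCOM: the class RefCounted and its methods are hypothetical stand-ins for AddRef and Release, and Python's own garbage collector plays no part in the model.

```python
class RefCounted:
    """Toy reference-counted object; it mimics AddRef/Release bookkeeping."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 0
        self.destroyed = False
        self.owned = []  # owning references held by this object

    def add_ref(self):
        self.refcnt += 1

    def release(self):
        self.refcnt -= 1
        if self.refcnt == 0:
            self.destroyed = True
            # Destruction releases every owning reference this object held.
            owned, self.owned = self.owned, []
            for other in owned:
                other.release()

    def own(self, other):
        other.add_ref()
        self.owned.append(other)

    def disown(self, other):
        self.owned.remove(other)
        other.release()


def build_cycle():
    window = RefCounted("window")
    listener = RefCounted("listener")
    window.add_ref()      # reference held by an external owner
    window.own(listener)  # window owns its listener
    listener.own(window)  # listener owns the window back: a cycle
    return window, listener


# Case 1: the external owner lets go, but the cycle keeps both objects alive.
window, listener = build_cycle()
window.release()
print(window.destroyed, listener.destroyed)    # prints False False

# Case 2: the cycle is broken first, so releasing the window frees both.
window2, listener2 = build_cycle()
listener2.disown(window2)
window2.release()
print(window2.destroyed, listener2.destroyed)  # prints True True
```

The second case only works if the cycle is broken on every path, which is why avoiding the cycle in the first place is the safer choice.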

Debugging leaks that go through XPConnect

Many large object graphs that leak go through XPConnect. This can mean there will be XPConnect wrapper objects showing up as owning the leaked objects, but it doesn't mean it's XPConnect's fault (although that has been known to happen, it's rare). Debugging leaks that go through XPConnect requires a basic understanding of what XPConnect does. XPConnect allows an XPCOM object to be exposed to JavaScript, and it allows certain JavaScript objects to be exposed to C++ code as normal XPCOM objects.

When a C++ object is exposed to JavaScript (the more common of the two), an XPCWrappedNative object is created. This wrapper owns a reference to the native object until the corresponding JavaScript object is garbage-collected. This means that if there are leaked GC roots from which the wrapper is reachable, the wrapper will never release its reference on the native object. While this can be debugged in detail, the quickest way to solve these problems is often to simply debug the leaked JS roots. These roots are printed on shutdown in DEBUG builds, and the name of the root should give the type of object it is associated with.

One of the most common ways one could leak a JS root is by leaking an nsXPCWrappedJS object. This is the wrapper object in the reverse direction -- when a JS object is used to implement an XPCOM interface and be used transparently by native code. The nsXPCWrappedJS object creates a GC root that exists as long as the wrapper does. The wrapper itself is just a normal reference-counted object, so a leaked nsXPCWrappedJS can be debugged using the normal refcount-balancer tools.

If you really need to debug leaks that involve JS objects closely, you can get detailed printouts of the paths JS uses to mark objects when it is determining the set of live objects by using the functions added in bug 378261 and bug 378255. (More documentation of this replacement for GC_MARK_DEBUG, the old way of doing it, would be useful. It may just involve setting the XPC_SHUTDOWN_HEAP_DUMP environment variable to a file name, but I haven't tested that.)

Post-processing of stack traces

On Mac and Linux, the stack traces generated by our internal debugging tools don't have very good symbol information (since they just show the results of dladdr). The stacks can be significantly improved (better symbols, and file name / line number information) by post-processing. Stacks can be piped through the scripts mozilla/tools/rb/fix-linux-stack.pl or mozilla/tools/rb/fix-macosx-stack.py to do this. These scripts are designed to be run on balance trees in addition to raw stacks; since they are rather slow, it is often much faster to generate balance trees (e.g., using make-tree.pl for the refcount balancer or diffbloatdump.pl --use-address for trace-malloc) and then run the balance trees (which are much smaller) through the post-processing.

Getting symbol information for system libraries

Windows

Set the environment variable _NT_SYMBOL_PATH to something like symsrv*symsrv.dll*f:\localsymbols*http://msdl.microsoft.com/download/symbols as described in Microsoft's article. This needs to be done when running, since we do the address to symbol mapping at runtime.

Linux

Many Linux distros provide packages containing external debugging symbols for system libraries. fix-linux-stack.pl uses this debugging information (although it does not verify that it matches the library versions on the system).

For example, on Fedora, these are in *-debuginfo RPMs (which are available in yum repositories that are disabled by default, but easily enabled by editing the system configuration).

Leak statistics on tinderbox

Reading the old-style leak stats

The RLk (nsTraceRefcnt-based) leak stats look like this:

L C
RLk:700B

These statistics are collected using nsTraceRefcnt, which as I said above is not very good for aggregate statistics. The action tested is loading of a browser window and a run through the bloat URLs (bloaturls.txt). The RLk (leak) number is the number of bytes of leaks of objects that are logged by nsTraceRefcnt. This is just a subset of objects -- it includes only those objects that use NS_IMPL_ISUPPORTSn and friends or MOZ_COUNT_CTOR and MOZ_COUNT_DTOR. Therefore it doesn't include many of the largest objects, such as string buffers, and it accounts for the size of some other objects incorrectly.

Running the old-style leak tests

The old-style tests can be run on any standard --enable-debug build, or on any --disable-debug build with --enable-logrefcnt.

Mozilla Suite

  1. set the environment variable XPCOM_MEM_LEAK_LOG to leak.log (or XPCOM_MEM_BLOAT_LOG to bloat.log)
  2. ./mozilla -f bloaturls.txt
  3. Look at the top line of bloat.log or leak.log for the aggregate statistics (under headers Bytes/Leaked (for leaks) and Objects/Total (for "bloat")), and look at the other lines for the summary of objects.

Firefox

  1. load resource:///res/bloatcycle.html and tell the popup blocker to allow popups from it
  2. Edit the preferences file or use about:config to set the pref "dom.allow_scripts_to_close_windows" to true
  3. set the environment variable XPCOM_MEM_LEAK_LOG to leak.log (or XPCOM_MEM_BLOAT_LOG to bloat.log)
  4. ./firefox -no-remote resource:///res/bloatcycle.html
  5. Look at the top line of bloat.log or leak.log for the aggregate statistics (under headers Bytes/Leaked (for leaks) and Objects/Total (for "bloat")), and look at the other lines for the summary of objects.

Reading the new-style leak stats

The new-style (nsTraceMalloc-based) leak stats are displayed in the bottom-right panel after clicking on a debug build's "B" and look like this:

These statistics are generated using trace-malloc. They therefore give accurate aggregate statistics for all heap allocations during the test. Like the old-style leak statistics, the action tested is loading of a browser window and a run through the bloat URLs (bloaturls.txt). The Lk (leak) number is the total number of bytes (not counting any overhead in the allocator) allocated on the heap and not freed over the entire run. Like the number in the old-style leak statistics, it includes shutdown leaks (leaks that happen only once per run of the browser), but there are more of them here, since most shutdown leaks are not of objects logged by nsTraceRefcnt. The MH (max heap) number is the number of bytes allocated on the heap at the point during the run when the heap was at its maximum size (again, excluding overhead). The A (allocations) number is the total number of allocations over the run, and is an indicator of a subset of performance rather than an indicator of memory use, although high allocation churn could contribute to fragmentation.

Running the new-style leak tests

The new-style (trace-malloc) leak stats require a build with trace-malloc enabled.

Build with trace-malloc enabled (--enable-trace-malloc in your mozconfig file). Then, once you have a build, use the same steps as the old-style leak tests, except don't set the environment variables, and instead add the command line options when invoking mozilla or firefox: "--trace-malloc=malloc.log" (or, if you want the shutdown leaks report, also add "--shutdown-leaks=sdleak.log"). Then, to process the log, run "./run-mozilla.sh ./leakstats malloc.log". (Omit the "./run-mozilla.sh" on Windows.)

This will produce a report like the following:

Leaks: 382739 bytes, 3465 allocations
Maximum Heap Size: 7751799 bytes
62095212 bytes were allocated in 391091 allocations.
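When the numbers need to be tracked across runs, a report in the layout shown above can be turned into a dictionary. The following Python sketch assumes exactly that three-line layout; the function name parse_leakstats and the key names are hypothetical. The leaked bytes, maximum heap size, and total allocation count correspond to the Lk, MH, and A numbers described earlier.

```python
import re

# Matches the three-line report layout shown in the text.
REPORT_PATTERN = re.compile(
    r"Leaks: (?P<leaked_bytes>\d+) bytes, (?P<leaked_allocations>\d+) allocations\s+"
    r"Maximum Heap Size: (?P<max_heap_bytes>\d+) bytes\s+"
    r"(?P<total_bytes>\d+) bytes were allocated in "
    r"(?P<total_allocations>\d+) allocations\."
)


def parse_leakstats(report):
    """Return the numbers from a leakstats report as a dict of ints."""
    match = REPORT_PATTERN.search(report)
    if match is None:
        raise ValueError("not a leakstats report in the expected layout")
    return {name: int(value) for name, value in match.groupdict().items()}


sample = """\
Leaks: 382739 bytes, 3465 allocations
Maximum Heap Size: 7751799 bytes
62095212 bytes were allocated in 391091 allocations.
"""
stats = parse_leakstats(sample)
print(stats["leaked_bytes"], stats["max_heap_bytes"], stats["total_allocations"])
# prints 382739 7751799 391091
```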

Tips

Disabling Arena Allocation

With many lower-level leak tools (particularly trace-malloc based ones, like leaksoup) it can be helpful to disable arena allocation of objects that you're interested in, when possible, so that each object is allocated with a separate call to malloc. Some places you can do this are:

layout engine
    Define DEBUG_TRACEMALLOC_FRAMEARENA where it is commented out in layout/base/nsPresShell.cpp
glib
    Set the environment variable G_SLICE=always-malloc

Other References