Performance/Optimizing JavaScript with DTrace
From MozillaWiki
Requirements
- an operating system that includes DTrace, like Mac OS X Leopard (10.5) or Solaris Express Developer Edition
- a build of Firefox that has had DTrace enabled with the --enable-dtrace configure option, for example by adding this line to your mozconfig file:
ac_add_options --enable-dtrace
Recommendations
- the DTrace Toolkit, which contains a bunch of useful JavaScript DTrace scripts so you don't have to roll your own
- Standalone Talos, a testing framework you can use to generate consistent test runs so you can compare profiles before and after making a change
- names for all suspect methods, including getters (but see bug 403379), so you can find their entries in the profiler output, for example:
var MyObject = {
  foo: function MyObject_foo() {...},
  get bar MyObject_get_bar() {...}
}
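The named-getter syntax above is specific to Mozilla's JavaScript engine. As a rough sketch of the same idea in standard JavaScript, a getter can be given an explicit name by passing a named function expression to Object.defineProperty. The property values here are placeholders, and whether the DTrace probes report this name for getters is an assumption (see bug 403379).

```javascript
// A method named with a named function expression, as recommended above.
var MyObject = {
  foo: function MyObject_foo() {
    return "foo";
  }
};

// Standard-JavaScript way to attach a getter that carries its own name.
Object.defineProperty(MyObject, "bar", {
  get: function MyObject_get_bar() {
    return "bar";
  },
  enumerable: true,
  configurable: true
});

// The names travel with the function objects themselves.
var barGetter = Object.getOwnPropertyDescriptor(MyObject, "bar").get;
console.log(MyObject.foo.name);
console.log(barGetter.name);
```

Checking the name property this way is a quick sanity test that a function will not show up as anonymous before you start a profiling run.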
Example Profiling Session
- As root, run the DTrace Toolkit's js_calltime.d script (which profiles JavaScript elapsed time by function) and redirect output to a log file:
js_calltime.d > calltime.log
- Start the browser and do something that invokes the suspect functions, either manually or by kicking off a Talos test run.
- Shut down the browser, stop the js_calltime.d script by pressing Ctrl-c in its terminal, and grep the log file for the output headers (Count, Elapsed Times, Exclusive function elapsed times, Inclusive function elapsed times) as well as the suspect function names.
- For example, to look at functions involved in the site-specific full zoom setting, you might use the command grep 'Count\|Elapsed\|clusive\|FullZoom\|ContentPref\|ZoomManager' calltime.log and get output like the following:
Count,
...
nsContentPrefService obj-new XPCWrappedNative_NoHelper 320
nsContentPrefService obj-new Object 1120
nsContentPrefService obj-new Function 2520

Elapsed times (us),
...
nsContentPrefService obj-new XPCWrappedNative_NoHelper 2269
nsContentPrefService obj-new Object 7690
nsContentPrefService obj-new Function 17583

Exclusive function elapsed times (us),
...
nsContentPrefService func ContentPrefService__dbCreateStatement 24923
nsContentPrefService func ContentPrefService__dbInit 33603
nsContentPrefService func getService 48392

Inclusive function elapsed times (us),
...
nsContentPrefService func ContentPrefService__init 187897
browser.js func FullZoom_onLocationChange 468748
browser.js func FullZoom_init 1445655
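If grep becomes unwieldy, the log can also be post-processed with a small script. The following is a minimal sketch, not part of the DTrace Toolkit: parseCalltimeLog is a hypothetical helper, and it assumes that each data row has the form "FILE TYPE NAME VALUE" separated by whitespace and that each section begins with one of the four headers shown above.

```javascript
// Hypothetical helper: split js_calltime.d-style output into its four sections.
// Assumes rows look like "FILE TYPE NAME VALUE"; anything else is skipped.
function parseCalltimeLog(text) {
  var headers = [
    ["Count", "count"],
    ["Elapsed times", "elapsed"],
    ["Exclusive function elapsed times", "exclusive"],
    ["Inclusive function elapsed times", "inclusive"]
  ];
  var sections = { count: [], elapsed: [], exclusive: [], inclusive: [] };
  var current = null;

  text.split("\n").forEach(function (line) {
    var trimmed = line.trim();
    if (!trimmed) return;

    // A header line switches the section that following rows belong to.
    for (var i = 0; i < headers.length; i++) {
      if (trimmed.indexOf(headers[i][0]) === 0) {
        current = headers[i][1];
        return;
      }
    }

    // Skip elisions, column headings, and rows without a numeric last field.
    var fields = trimmed.split(/\s+/);
    var value = Number(fields[fields.length - 1]);
    if (current === null || fields.length < 4 || isNaN(value)) return;

    sections[current].push({
      file: fields[0],
      type: fields[1],
      name: fields.slice(2, -1).join(" "),
      value: value
    });
  });
  return sections;
}

// Usage with a few rows copied from the sample output above.
var sample = [
  "Exclusive function elapsed times (us),",
  "...",
  "nsContentPrefService func getService 48392",
  "Inclusive function elapsed times (us),",
  "...",
  "browser.js func FullZoom_init 1445655"
].join("\n");

var parsed = parseCalltimeLog(sample);
console.log(parsed.exclusive[0].name, parsed.exclusive[0].value);
console.log(parsed.inclusive[0].name, parsed.inclusive[0].value);
```

Keeping the sections separate matters because the same function appears with different numbers in the exclusive and inclusive tables.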
Tips
- Run profiles with various parts of a function commented out to figure out which parts are primarily responsible for the cost of the function.
- Keep a log, a diff file, and some notes each time you make a change so you can retrace your steps if your changes don't work out.
- Question and test your assumptions. For example, I thought it would be faster to check the value of the full zoom setting when loading a new page and set it to the new page's pref value only if it differed from the old page's value. Testing showed that it was faster to always set the setting to the new page's value, even when the two values were the same (i.e. the cost of always getting the value outweighed the savings of not always setting it).
- Distinguish between critical functionality and functionality which is merely useful, and push the latter to extensionland, where it will only cause a performance hit for users who want it.
- To improve startup performance, delay invoking code as long as possible, until delayedStartup or even later.
- Create XPCOM services lazily, and cache frequently accessed ones using memoizing getters, either by replacing the getter with a property:
o = {
  get _observerSvc() {
    let svc = Cc["@mozilla.org/observer-service;1"].
              getService(Ci.nsIObserverService);
    delete this._observerSvc;
    this._observerSvc = svc;
    return this._observerSvc;
  }
};
Or, for properties whose memoizing getter is in a prototype, by shadowing the prototype getter with an instance getter:
function O() {}
O.prototype = {
  get _observerSvc() {
    let svc = Cc["@mozilla.org/observer-service;1"].
              getService(Ci.nsIObserverService);
    // Shadow the prototype getter with an instance getter that returns the cached service.
    this.__defineGetter__("_observerSvc", function() { return svc; });
    return this._observerSvc;
  }
}
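Both memoizing patterns can be tried outside the browser. The following is a sketch under stated assumptions: expensiveLookup is a hypothetical stand-in for the Cc[...].getService(...) call, and Object.defineProperty is swapped in for the delete-and-assign and __defineGetter__ steps used above. The counter shows that each object pays for the lookup only once.

```javascript
// Hypothetical stand-in for an expensive XPCOM getService() call.
var lookups = 0;
function expensiveLookup() {
  lookups++;
  return { notifyObservers: function () {} };
}

// Variant 1: the getter replaces itself with a plain data property.
var o = {
  get _observerSvc() {
    var svc = expensiveLookup();
    Object.defineProperty(this, "_observerSvc", {
      value: svc,
      configurable: true,
      enumerable: true
    });
    return svc;
  }
};

// Variant 2: the getter lives on the prototype; each instance shadows it
// with its own data property on first access.
function O() {}
O.prototype = {
  get _observerSvc() {
    var svc = expensiveLookup();
    Object.defineProperty(this, "_observerSvc", {
      value: svc,
      configurable: true
    });
    return svc;
  }
};

// Repeated reads on the same object do not trigger further lookups.
o._observerSvc;
o._observerSvc;
var a = new O();
var b = new O();
a._observerSvc;
a._observerSvc;
b._observerSvc;
console.log(lookups);
```

Note that in the prototype variant the cache is per instance, so every new instance performs one lookup of its own while the prototype getter stays in place.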
- Inline logic that gets called frequently to eliminate the cost of function invocation.
- Dig into costly XBL and native code, since it might be doing unnecessary work. For example, when optimizing the site-specific full zoom setting, I found that browser.xml wasn't memoizing its frequently-accessed _docShell property, even though the value of that property doesn't change over the lifetime of a browser widget, and I also found that nsPresContext::SetFullZoom was doing some unnecessary work when the new value was the same as the old one.
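When following the tips above about commenting out code and keeping a log per change, it helps to compare two runs function by function. This is a minimal sketch: compareProfiles is a hypothetical helper, the inputs are plain objects mapping function names to elapsed microseconds, and the function names and numbers in the usage example are made up purely for illustration.

```javascript
// Hypothetical helper: compare elapsed times (us) between two profiling runs.
// Only functions present in both runs are reported, biggest savings first.
function compareProfiles(before, after) {
  var shared = Object.keys(before).filter(function (name) {
    return Object.prototype.hasOwnProperty.call(after, name);
  });
  return shared.map(function (name) {
    var delta = after[name] - before[name];
    return {
      name: name,
      before: before[name],
      after: after[name],
      delta: delta,
      // Percentage change relative to the first run; null if it was zero.
      percent: before[name] === 0 ? null : (delta / before[name]) * 100
    };
  }).sort(function (x, y) {
    return x.delta - y.delta;
  });
}

// Illustrative numbers only, not measurements.
var beforeRun = { Example_init: 1000, Example_onChange: 500 };
var afterRun = { Example_init: 750, Example_onChange: 500 };

var report = compareProfiles(beforeRun, afterRun);
report.forEach(function (row) {
  console.log(row.name, row.before, row.after, row.delta, row.percent);
});
```

Because single runs are noisy, a comparison like this is more trustworthy when both logs come from consistent test runs such as those generated by Talos.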
Bugs
- bug 403379 - DTrace js_calltime.d output omits getters
- bug 403348 - DTrace js_calltime.d output omits interface methods
- bug 403345 - DTrace js_calltime.d output includes functions not defined in JS file
- bug 403132 - DTrace function probes are double-counting invocations