Firefox/Projects/Startup Time Improvements Notes

From MozillaWiki


Emails From Vlad on Startup

Importance of Cold Start

Hey guys,

Here are the details on getting a cold start (no FS cache) for perf

MacOS X:

Linux:

  echo 3 > /proc/sys/vm/drop_caches

Windows:

  Much trickier.

  First, if you're on Vista/7, you need to delete the preload cache
data.  In your windows dir, ... argh, can't find it atm, but there's a
precache or preload or something, and there should be a firefox.exe-like
dir inside it that you need to delete.  Otherwise vista/7 will start
doing its app preload acceleration stuff which will screw over the data
you're trying to collect.

  Then, grab something like flushmem.  There's another
tool that's more flexible and might be faster, but that one should get
the job done.

  Then run the app.

Gotta do these steps before each start to simulate cold start.

    - Vlad 

The cache-purging command on Linux, echo 3 > /proc/sys/vm/drop_caches, requires root privileges. [ddahl via adw]
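
The Linux step can also be scripted. The following is a minimal Python sketch, not something from Vlad's email: the function name and its parameters are made up for illustration, and the path is a parameter only so the function can be tried against a scratch file without root. It calls sync first because the kernel can only drop clean pages.

```python
import os


def drop_linux_caches(path="/proc/sys/vm/drop_caches", level=3):
    """Flush dirty pages to disk, then ask the kernel to drop its caches.

    level 1 drops the page cache, 2 drops dentries and inodes, 3 drops both.
    Writing to the real /proc path requires root privileges.
    """
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2, or 3")
    os.sync()  # dirty pages cannot be dropped, so write them out first
    with open(path, "w") as f:
        f.write(f"{level}\n")
```

Run as root, calling drop_linux_caches() before each launch should have the same effect as the echo command above.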

Startup post on dev-apps-firefox

Shortly before our office move, we kicked off an effort to take a hard
look at our startup time, to both understand what we all do, and to
figure out how to improve it. zpao (Paul O'Shannessy), ddahl (David
Dahl), and I have been working towards a few goals:

- Document how to reproducibly get a cold and warm startup on
  Windows (XP/Vista/7), MacOS X, and Linux

- Create tools to capture both JS execution during startup, as well as
  file IO

- Add instrumentation to firefox to identify "big blocks" of startup
  for timing

- Create tools to visualize the captured data in a way that's easy to

One thing that's fairly obvious with playing with startup is that
"warm" startup is significantly faster than "cold" startup; that is,
when you've launched Firefox before, the OS caches a bunch of the data
off the disk, and it doesn't have to hit the disk again. This
directly points to IO being a major component of our startup time,
which is why IO is part of the capture above.  This is a pretty big
problem even on desktop systems; on my fairly beefy Windows 7 box, a
cold startup takes upwards of 12 seconds (!); warm startup is also
fairly slow if the system is under load.
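
A rough way to see the cache effect described above is to time the same file read twice. This is only an illustrative Python sketch (timed_read is a hypothetical helper, not one of the tools mentioned in this post); the second read is typically served from the OS cache unless the cache was purged in between.

```python
import time


def timed_read(path, block_size=1 << 20):
    """Read a file from start to end; return (bytes_read, elapsed_seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            total += len(block)
    return total, time.perf_counter() - start
```

Calling it on a large file such as a shared library right after purging the cache, and then again immediately afterwards, gives a feel for how much of the difference is disk IO.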

We've fixed some bugs in our dtrace javascript provider along the way
(bug 403345), so dtrace will actually give correct (and sane) data now.
Also, I've
been doing a lot of work with Microsoft's xperf (part of the Windows
Performance Toolkit), which can capture much the same data. (In
theory we should be able to create JS providers for xperf as well, but
that's out of scope for this particular project.)

One example of the type of data we're capturing and tools that we're
building: a quick IO capture with xperf, with the data dumped into
a Timeline widget from the SIMILE project. (The time scales are a bit
off; the raw data is in microseconds, but SIMILE only handles
milliseconds... so all times need to be divided by 1000, which becomes
a problem when you go over 60 seconds -- which is actually just 60 ms!
Something that we'll fix.)
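
The unit fix mentioned above amounts to dividing every timestamp by 1000 before handing it to the timeline. A minimal Python sketch follows; the event tuple layout and the output field names are assumptions made for illustration, not the actual xperf dump or SIMILE event format.

```python
def us_to_ms(value_us):
    """Convert a microsecond value to milliseconds."""
    return value_us / 1000.0


def to_timeline_events(raw_events):
    """Convert (label, start_us, duration_us) tuples to millisecond records."""
    return [
        {
            "label": label,
            "start_ms": us_to_ms(start_us),
            "duration_ms": us_to_ms(duration_us),
        }
        for label, start_us, duration_us in raw_events
    ]
```

For example, an event that starts at 60,000 microseconds comes out at 60 ms, rather than being plotted at 60 seconds.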

Another example is the result of a startup trace; zpao is still
working on the visualization and data capture, but an early version
is already available. In it, the "Exclusive function elapsed times"
view will provide the most accurate
data, basically telling you "how long did we spend in a given
function, ignoring all descendants". In this view, the "null"
filename dominates, generally indicating native code. And within
that, calls to "getService" also dominate, which indicates that much
of the time is spent within getService, presumably initializing
whatever the requested service is.
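
To make the "exclusive" notion concrete: a function's exclusive time is its inclusive time minus the inclusive time of its direct callees. The Python sketch below computes that over a call tree; the node layout (name, inclusive, children) is a hypothetical structure chosen for illustration, not the format produced by the dtrace or xperf captures.

```python
from collections import defaultdict


def exclusive_times(node, totals=None):
    """Sum exclusive time per function name over a call tree.

    Each node is a dict: {"name": str, "inclusive": number, "children": [...]}.
    """
    if totals is None:
        totals = defaultdict(float)
    children = node.get("children", [])
    # Time spent in callees is not charged to this function.
    child_time = sum(child["inclusive"] for child in children)
    totals[node["name"]] += node["inclusive"] - child_time
    for child in children:
        exclusive_times(child, totals)
    return totals
```

Aggregating by name is what makes a frequently called function such as getService stand out even when each individual call is short.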

In the future, we hope to have hierarchy correctly represented in the
inclusive view, as well as adding IO operations as part of that
hierarchy. Also, these tools aren't really limited to analyzing startup;
they will hopefully form the basis of a set of javascript performance
analysis tools that we can apply to any browser operation.

Besides IO and JS, Taras Glek found in earlier examinations of startup
that loading CSS/XBL/etc. was taking a significant amount of time.
We're working on instrumenting those parts of the code as well, so that
we can capture it along with the raw js/io/etc. portions.

Is there any other data that we should be capturing?  Let us know, and
we'll see if we can figure out how to add it in.  I'll keep posting
updated data as we have it, and will probably create a web page to
collect it all -- at that point it'll be open season on any issues that
can be identified.

    - Vlad 

Rob Arnold notes on simulated cold startup on Windows

[14:00]	<robarnold>	for the disk, you open the disk and flush it...
[14:00]	<robarnold>	for the cache, there's a sysinternals utility
[14:01]	<robarnold>	taras: see
[14:02]	<robarnold>	there's also which someone on vlad's blog found
[14:05]	<robarnold>	ok. your results will probably be tainted by the windows feature that predicts file io for a process based on past runs (don't remember the name)
[14:07]	<robarnold>	ah, I think it's in \Windows\Prefetch (at least it seems to be so on windows 7)
[14:07]	<robarnold>	note that you might not have access to that folder since the systems, not the administrator, owns it
[14:07]	<robarnold>	*system
[14:17]	<sid0>	taras, robarnold: suggest you disable the prefetcher altogether
[14:17]	<sid0>	HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\Prefetcher
[14:18]	<sid0>	set EnablePrefetcher to 0 and (on Vista and above) EnableSuperfetch to 0
[14:18]	<sid0>	this might also need a reboot + clearing out of windows\prefetch

On XP at least the final fragment of the regkey mentioned by sid0 is slightly different. Not sure whether it's different from Vista/7 or he was just remembering wrong: [adw]

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\PrefetchParameters

(sorry, I remembered wrong. -- sid0)

The default value of the EnablePrefetcher key is 3 (on XP at least). [adw]

(it's the same on Vista/7 -- sid0)
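
One way to apply the settings above is to generate a .reg file and import it. The Python sketch below only builds the file text, using the PrefetchParameters key and the value names from the notes; the function name and its arguments are made up for illustration, and it has not been checked against every Windows version.

```python
PREFETCH_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"
    r"\Session Manager\Memory Management\PrefetchParameters"
)


def prefetch_reg_text(enable_prefetcher=0, enable_superfetch=None):
    """Build .reg file text that sets the prefetch-related DWORD values.

    Pass enable_superfetch only on Vista and above; None leaves it out.
    """
    lines = ["Windows Registry Editor Version 5.00", "", f"[{PREFETCH_KEY}]"]
    lines.append(f'"EnablePrefetcher"=dword:{enable_prefetcher:08x}')
    if enable_superfetch is not None:
        lines.append(f'"EnableSuperfetch"=dword:{enable_superfetch:08x}')
    return "\r\n".join(lines) + "\r\n"
```

Saving prefetch_reg_text(0, 0) to a file and importing it from an administrator account should disable both features; prefetch_reg_text(3) restores the default mentioned above. As sid0 notes, a reboot and clearing out windows\prefetch may still be needed.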

adw's Windows XP experience

The notes above point to three tools for purging disk cache on Windows:

  • CacheSet from Microsoft
    • Uses a system call to request that the working set of the system's cache be cleared.
  • purge.exe from Silver
    • This appears to be equivalent to CacheSet.
  • flushmem.exe from Chad Austin
    • Allocates memory in 64 KiB chunks until it can't anymore, and then writes to each page, forcing older pages out to the page file.
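
The flushmem approach described in the last bullet can be sketched in Python as follows. This is not flushmem's source: it is a bounded illustration with a hypothetical limit_bytes cap (the real tool keeps allocating until allocation fails), and the 4 KiB page size is an assumption.

```python
CHUNK_SIZE = 64 * 1024  # 64 KiB, the chunk size quoted above
PAGE_SIZE = 4096        # assumed page size


def flush_memory(limit_bytes):
    """Allocate 64 KiB chunks up to limit_bytes, then write to every page.

    Returns (bytes_allocated, pages_touched).
    """
    chunks = []
    allocated = 0
    try:
        while allocated + CHUNK_SIZE <= limit_bytes:
            chunks.append(bytearray(CHUNK_SIZE))
            allocated += CHUNK_SIZE
    except MemoryError:
        pass  # stop at whatever could be allocated
    touched = 0
    for chunk in chunks:
        # Writing one byte per page forces the page to be resident,
        # which pushes older pages (including cached file data) out.
        for offset in range(0, CHUNK_SIZE, PAGE_SIZE):
            chunk[offset] = 1
            touched += 1
    return allocated, touched
```

With a limit close to the amount of physical RAM this would be expected to evict most cached file data, at the cost of the heavy paging described below.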

I noticed no difference between starting Firefox warm and starting it after using both CacheSet and purge.exe. Whatever they may do, even combined with disabled prefetch they are not sufficient to simulate cold startup.

After using flushmem.exe, Firefox starts up in about the same time it takes to start up cold, but flushmem.exe ground my system to a halt for nearly ten minutes.

These were Vlad's experiences as well:

[12:06pm] dietrich: vlad: what's the recommended way to force cold-start on windows?
[12:08pm] vlad: all the stuff that people suggested doesn't work
[12:08pm] vlad: with cacheset/purge/etc.
[12:08pm] vlad: so I have no idea, other than reboot 
[12:09pm] vlad: reboot and turning off all the prefetching
[12:09pm] adw: vlad: is not correct?
[12:13pm] vlad: -that- sadly is
[12:13pm] vlad: flushmem would work
[12:13pm] vlad: but it's faster to reboot
[12:13pm] vlad: since flushmem causes your system to grind to a halt for a few minutes, even more if you have lots of memory

Quoting a Microsoft software tester on this MSDN forum thread:

This is actually a very complicated thing to do, and to do it correctly, these are some of the things you need to worry about:

  • Invalidating the CPU caches
  • Invalidating the cache on the storage media
  • Invalidating the OS's read cache (note that this is not the same as the write cache which can be flushed with Sync.exe via FlushFileBuffers)
  • Removing items from the OS's KnownDLL cache (Larry, KB)
  • Removing items from the CLR JIT compilation cache (.NET apps only)