Emails From Vlad on Startup
- Importance of Cold Start
Hey guys,
Here are the details on getting a cold start (no FS cache) for perf
analysis:
MacOS X:
sync
purge
Linux:
sync
echo 3 > /proc/sys/vm/drop_caches   (as root)
Windows:
Much trickier.
First, if you're on Vista/7, you need to delete the preload cache
data. In your Windows dir, ... argh, can't find it atm, but there's a
precache or preload or something, and there should be a firefox.exe-like
dir inside it that you need to delete. Otherwise Vista/7 will start
doing its app preload acceleration stuff, which will screw up the data
you're trying to collect.
Then, grab something like flushmem:
http://aegisknight.org/2009/04/flushing-disk-cache/ There's another
tool that's more flexible and might be faster, but that one should get
the job done.
Then run the app.
Gotta do these steps before each start to simulate cold start.
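If you're doing this a lot, the Mac/Linux steps fit in a tiny wrapper
script, roughly like this (just a sketch -- FIREFOX_BIN is a placeholder
for whatever build you're actually testing):

  #!/bin/sh
  # Flush the FS cache, then launch the build we want to profile.
  FIREFOX_BIN=./firefox

  sync
  case "$(uname)" in
    Darwin) purge ;;                                          # MacOS X
    Linux)  sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' ;; # needs root
  esac

  "$FIREFOX_BIN"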
- Vlad
- Startup post on dev-apps-firefox
Shortly before our office move, we kicked off an effort to take a hard
look at our startup time, both to understand what we actually do during
startup and to figure out how to improve it. zpao (Paul O'Shannessy),
ddahl (David Dahl), and I have been working towards a few goals:
- Document how to reproducibly get a cold and warm startup on
Windows (XP/Vista/7), MacOS X, and Linux
- Create tools to capture both JS execution during startup and file IO
- Add instrumentation to Firefox to identify "big blocks" of startup
for timing
- Create tools to visualize the captured data in a way that's easy to
analyze
One thing that's fairly obvious when playing with startup is that
"warm" startup is significantly faster than "cold" startup; that is,
when you've launched Firefox before, the OS caches a bunch of the data
off the disk, and it doesn't have to hit the disk again. This
directly points to IO being a major component of our startup time,
which is why IO is part of the capture above. This is a pretty big
problem even on desktop systems; on my fairly beefy Windows 7 box, a
cold startup takes upwards of 12 seconds (!); warm startup is also
fairly slow if the system is under load.
We've fixed some bugs in our dtrace javascript provider along the way
(bug 403345),
so dtrace will actually give correct (and sane) data now. Also, I've
been doing a lot of work with Microsoft's xperf (part of the Windows
Performance Toolkit), which can capture much the same data. (In
theory we should be able to create JS providers for xperf as well, but
that's out of scope for this particular project.)
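If you want to poke at the dtrace side yourself, a basic capture is
just a one-liner along these lines (a sketch -- it assumes a
dtrace-enabled build, ./firefox-bin is a placeholder path, and I'm
treating arg0 as the script filename; check the probe docs for the
exact argument layout):

  # count JS function entries per script file while firefox starts up
  sudo dtrace -n 'javascript*:::function-entry { @[copyinstr(arg0)] = count(); }' -c ./firefox-bin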
One example of the type of data we're capturing and tools that we're
building is
http://people.mozilla.com/~vladimir/misc/startviz/startviz.html --
this is just a quick io capture with xperf, with the data dumped into
a Timeline widget from the SIMILE project. (The time scales are a bit
off: the raw data is in microseconds, but SIMILE only handles
milliseconds, so every displayed time has to be divided by 1000 --
which gets confusing once the display rolls past 60 seconds, since
that's really only 60 ms. Something we'll fix.)
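In the meantime, one workaround is to pre-scale the raw dump before
handing it to the widget; e.g., assuming a whitespace-separated dump
with the time value in the first column (the file names here are made
up):

  awk '{ $1 = $1 / 1000; print }' startup_io_us.txt > startup_io_ms.txt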
Another example is the result of a startup trace; zpao is still
working on the visualization and data capture, but you can see an
early version at http://playground.zpao.com/dtrace_treemaps2/ -- the
"Exclusive function elapsed times" view will provide the most accurate
data, basically telling you "how long did we spend in a given
function, ignoring all descendants". In this view, the "null"
filename dominates, generally indicating native code. And within
that, calls to "getService" also dominate, which indicates that much
of the time is spent within getService, presumably initializing
whatever the requested service is.
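To make the "exclusive" idea concrete: given a flat log of function
entry/exit events, exclusive time is just each function's elapsed time
minus the time spent in its callees. A throwaway awk sketch like the
one below computes it (the ENTER/EXIT log format, the microsecond
timestamps, and the trace filename are all made up for illustration):

  awk '
    $1 == "ENTER" { depth++; name[depth] = $2; start[depth] = $3; child[depth] = 0 }
    $1 == "EXIT"  {
      total = $3 - start[depth]
      excl[name[depth]] += total - child[depth]   # elapsed time minus descendants
      depth--
      if (depth > 0) child[depth] += total        # charge the total to the caller
    }
    END { for (f in excl) printf "%12d us  %s\n", excl[f], f }
  ' startup_trace.txt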
In the future, we hope to have the hierarchy correctly represented in
the inclusive view, as well as to add IO operations as part of that
hierarchy. Also, these tools aren't really limited to analyzing startup;
they will hopefully form the basis of a set of javascript performance
analysis tools that we can apply to any browser operation.
Besides IO and JS, Taras Glek found in earlier examinations of startup
that loading CSS/XBL/etc. was taking a significant amount of time.
We're working on instrumenting those parts of the code as well, so that
we can capture it along with the raw js/io/etc. portions.
Is there any other data that we should be capturing? Let us know, and
we'll see if we can figure out how to add it in. I'll keep posting
updated data as we have it, and will probably create a web page to
collect it all -- at that point it'll be open season on any issues that
can be identified.
- Vlad