Security/Fuzzing/Peach


Peach lets you define the format of the data to be generated, as well as how and when the fuzzed data should be generated. It's a fuzzing platform/framework, not a fuzzer itself. It provides an XML + Python way of quickly creating a fuzzer for a wide variety of data formats and situations.

Peach is moderately complex and somewhat poorly documented. The documentation tends to lack non-trivial examples, and the code and provided tools are sometimes broken. On the other hand, the mailing list is active and the author appears to be responsive.

Here we describe one specific usage of Peach for fuzzing Firefox. The information here is liable to be wrong due to Peach changing or due to lack of experience and understanding of Peach. Please correct mistakes or incorrect information presented here. The goal of describing this usage of Peach is to help others save some time learning things that aren't well documented and see an end-to-end example of browser fuzzing.

Though incomplete, the documentation on the Peach site is very useful. This tutorial is not a replacement for other tutorials or the Peach documentation.

Current Peach version as of writing: 2.3.6

Overview

Fuzzing is an approach to finding bugs in software by generating a variety of invalid input and passing it to the program. Blind fuzzing, the generation of completely random input, is infrequently useful. If the test input always follows the same code path (e.g. due to the data quickly being seen as invalid), then the testing is not valuable. This is where fuzzing frameworks like Peach come in. Peach allows us to define what the valid data should look like. Peach then uses this definition, often along with a sample valid file we provide, in order to generate many interesting variants of invalid data. Before diving into details, here is a high-level view of what we're going to do:

  • Install Peach and a few of the dependencies on Linux (Ubuntu 10.04).
  • Create a definition of the data format we want to test.
  • Refine our data definition based on how well Peach can understand a valid file based on our definition.
  • Figure out how we're going to fuzz Firefox. That is, how we repeatedly get Firefox to run hundreds of thousands of fuzzed data sets.
  • Ensure that if the fuzzer does trigger a bug in Firefox, we find out about it and get the information we need to find and fix the bug.
  • Let the fuzzing commence.

Installation on Linux

The Peach documentation and mailing list seem to indicate that Windows is the first-class citizen of Peach. This mostly comes through in better support for attached debuggers and GUI tools. On Linux, the problems range from Peach not having implemented all of its own code for Linux debugging monitors (e.g. missing functions) to bugs in the vdb/vtrace modules it uses. So, we're not going to base this tutorial on attempts to hack those together. Doing so would seem like a good way to make sure that somebody trying to follow this guide in the future has to first sort out Peach/vdb bugs.

Note: It is recommended to do everything in this tutorial in a VM.

Minimally setting up Peach on Ubuntu 10.04

First, download the Peach source.

Extract the source archive.

unzip Peach-2.3.6.zip
cd Peach-2.3.6

Peach has a handful of dependencies that it ships with. These are in the dependencies/src/ folder. These dependencies provided with Peach are all out-of-date. We'll use a virtualenv for any python modules we can't get from our distro (rather than installing system-wide) so that we don't end up with the aging peach dependencies installed system-wide.

sudo apt-get install build-essential

# python-setuptools for easy_install: you want this installed before
# creating your virtualenv below
sudo apt-get install python-virtualenv python-setuptools python-dev

# Use these rather than the ones Peach provides.
sudo apt-get install python-4suite-xml python-twisted-web

virtualenv ~/peachenv

cd dependencies/src/cDeepCopy
~/peachenv/bin/python setup.py install
cd -

cd dependencies/src/cPeach
~/peachenv/bin/python setup.py install
cd -

~/peachenv/bin/easy_install multiprocessing

Now you should be able to run peach (this should give you a help message).

~/peachenv/bin/python peach.py --help

Note that you can ignore the "Warning: Unix debugger failed to load" message as we aren't using the debugger and so haven't installed it.

Developing Peach XML Files

Peach is a framework that can be used in multiple ways: for generating fuzzed files, for generating fuzzed network traffic, and for fuzzing shared libraries. Examples of these are given in the Peach Quickstart. We're going to do something a little different than what is described in the Peach documentation. What we want is for Firefox to have some of the data it requests be generated by Peach. It may be possible to do this nicely in the standard Peach-style, but trying to keep everything Peach-y seems like a recipe for unmaintainability due to unnecessary complexity. So, instead, we're going to use Peach in the simplest manner possible to generate fuzz data and hack some simple python on top to do the rest.

Peach uses an XML format for describing the fuzzing. In this XML file, you describe the format of the fuzzed data to generate, how and when the data should be generated (i.e. how Peach interacts with outside systems during the fuzzing process and what it does with the fuzzed data), and how to monitor for errors triggered by fuzzing. We're not going to have Peach try to monitor for errors (e.g. by attaching a debugger). A bit of experience with Peach on Linux shows the debugger support to be incomplete and bug-ridden.

Creating a Data Model

The DataModel describes the format of the fuzzed data you want to generate. The more precise the DataModel is, the better the result of fuzzing. Peach will use the DataModel we provide along with an example file we provide in order to generate intelligently-fuzzed data. Key to this is that Peach must be able to interpret the file we provide based on the DataModel we provide. If it can't see how the DataModel describes the file, then the fuzzed data it generates will be very simple and not very good for fuzzing purposes.

The way to develop the DataModel with minimal frustration is to start simple, check how well Peach can interpret it against a simple example file, and then slowly increase the complexity of the model while continuously testing the model. The process of interpreting valid data according to a provided DataModel is what Peach calls "cracking" data. Here we describe trying to model the WOFF file format using Peach.

We start with the following XML file (woff.xml), which in Peach lingo is called a "pit file":

<?xml version="1.0" encoding="UTF-8"?>
<Peach xmlns="http://phed.org/2008/Peach"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="peach.xsd">
    <Include ns="default" src="file:defaults.xml" />

    <DataModel name="WOFF">
        <Blob name="WOFFHeader" length="44" />
        <Blob name="TableDirectoryEntry" length="20" maxOccurs="100" />
        <Blob name="FontTables" />
    </DataModel>

    <StateModel name="State" initialState="Initial">
        <State name="Initial">
            <Action type="output">
                <DataModel ref="WOFF" />
                <!--
                    Peach will use this file as a starting point for generating fuzzed
                    files. It will therefore need to figure out how this file can be
                    interpreted based on the DataModel specified above (ref="WOFF").
                -->
                <Data fileName="/tmp/SomeRealFont.woff" />
            </Action>
            <Action type="close" />
        </State>
    </StateModel>

    <Test name="TheTest">
        <StateModel ref="State" />
        <Publisher class="file.FileWriter">
            <!-- Peach will write each generated fuzzed file to this path. -->
            <Param name="fileName" value="/tmp/fuzzfont.woff" />
        </Publisher>
    </Test>

    <Run name="DefaultRun">
        <Test ref="TheTest" />
        <!-- Configure a logger to store collected information -->
        <Logger class="logger.Filesystem">
            <Param name="path" value="/tmp/peach.log" />
        </Logger>
    </Run>
</Peach>

The DataModel defined above is very simple. We'll expand it in a moment. The details of the other parts of the file (StateModel, Action, Test, Run, Publisher, along with others we aren't using in this example) are described in the Peach pit file documentation.

To test/debug our DataModel defined above (that is, to determine how well Peach can understand the file /tmp/SomeRealFont.woff listed in the Action based on the DataModel), we run the following command:

~/peachenv/bin/python peach.py -1 --debug woff.xml

You'll see a lot of output scroll by:

[*] Performing single iteration
[*] Optmizing DataModel for cracking: 'WOFF'
[*] Cracking data from /tmp/SomeRealFont.woff into WOFF
_handleNode(WOFF): template pos(0) >>Enter
_handleNode: Did not find offset relation
---> WOFF (0)
_handleNode(WOFFHeader): blob pos(0) >>Enter
_handleNode: Did not find offset relation
---> WOFFHeader (0)
_handleBlob: No relation found
_handleBlob: Has length
<--- WOFFHeader (2, 0-44)
---] pos = 44

...

*** That worked out!
@@@ Looping, occurs=100, rating=2
@@@ Exiting While Loop
@@@ Returning a rating=2, curpos=2044, pos=2044, newCurPos=2044, occuurs=100
_handleNode(TableDirectoryEntry-0): type=blob, realpos=2044, pos=2044, rating=2
<<EXIT
_handleBlock(WOFF): Rating: (2) ['OS/2\x00\x00\x01\x90\x00\x00\]:
TableDirectoryEntry-0 = [None]
_handleNode(FontTables): blob pos(2044) >>Enter
_handleNode: Did not find offset relation
---> FontTables (2044)
_handleBlob: No relation found
_handleBlob: No length found
<--- FontTables (1, 2044-4704)
---] pos = 4704
_handleNode(FontTables): type=blob, realpos=4704, pos=4704, rating=1 <<EXIT
_handleBlock(WOFF): Rating: (1) ['\xf92\x0b\xc4gF\x1d\x18\xbd\x]: FontTables =
[None]
BLOCK RATING: 2
<--- WOFF (0)
_handleNode(WOFF): type=template, realpos=4704, pos=4704, rating=2 <<EXIT
RATING: 2 - POS: 4704 - LEN(DATA): 4704
Done cracking stuff
[*] Total time to crack data: 1.41
[*] Building relation cache
[*] Starting run "DefaultRun"
[-] Test: "TheTest" (None)
[1:?:?] Element: N/A
        Mutator: N/A

StateEngine.run: State
StateEngine._runState: Initial

StateEngine._runAction: Named_30
Actiong output sending 4704 bytes

StateEngine._runAction: Named_32
-- Completed our iteration range, exiting
[-] Test "TheTest" completed
[*] Run "DefaultRun" completed

A few important points:

  • Our command had the arguments "-1 --debug". That's the number one. This tells Peach to generate one fuzzed output file and then exit. The --debug argument tells Peach to show debugging information, which is useful for figuring out whether Peach liked our DataModel and, if not, what it didn't like.
  • Near the end we see "RATING: 2 - POS: 4704 - LEN(DATA): 4704". This is good. As far as I can tell, a block rating of 2 means success and a block rating of 4 means failure.
  • The time required to crack was greater than a few milliseconds ("Total time to crack data: 1.41"). When the time is <0.02 or so, it often is an indication that cracking failed. That is, that Peach gave up pretty early in the cracking process. However, though a short time is indicative of a problem, a non-short time does not mean success.

If the cracking does fail, you will see lines similar to these near the end:

_handleNode(FontTables): type=blob, realpos=0, pos=0, rating=4 <<EXIT
_handleBlock(WOFF): Rating: (4) [None]: FontTables = [None]
_handleBlock(WOFF): Child rating sucks, exiting
BLOCK RATING: 4
<--- WOFF (0)
_handleNode(WOFF): type=template, realpos=2044, pos=2044, rating=4 <<EXIT
RATING: 4 - POS: 2044 - LEN(DATA): 4704
WARNING: Did not consume all data!!!
Done cracking stuff

Note the rating of 4 and the "WARNING: Did not consume all data!!!". The "Child rating sucks, exiting" isn't necessarily bad, though if it's one of the last things in the output then it's likely you'll see the rating of 4 and the failure to consume all data (and thus failure).
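When iterating on a DataModel, it can be handy to check the crack result automatically rather than reading the debug output by eye. The following is a minimal sketch, not part of Peach: it assumes the final "RATING: ... - POS: ... - LEN(DATA): ..." line looks exactly like the ones shown above, and it follows this page's tentative reading that a rating of 4 means failure. The function names are hypothetical.

```python
import re

# Matches the summary line, e.g. "RATING: 2 - POS: 4704 - LEN(DATA): 4704".
# "BLOCK RATING: 2" lines are deliberately not matched.
RATING_RE = re.compile(r"^RATING: (\d+) - POS: (\d+) - LEN\(DATA\): (\d+)$")


def crack_summary(debug_output):
    """Return (rating, pos, data_len) from the last RATING line, or None."""
    summary = None
    for line in debug_output.splitlines():
        match = RATING_RE.match(line.strip())
        if match:
            summary = tuple(int(group) for group in match.groups())
    return summary


def crack_succeeded(debug_output):
    """Heuristic: all data consumed, no warning, and rating is not 4."""
    summary = crack_summary(debug_output)
    if summary is None:
        return False
    rating, pos, data_len = summary
    if "Did not consume all data" in debug_output:
        return False
    # Assumption taken from the text above: rating 4 indicates failure.
    return rating != 4 and pos == data_len
```

One way to use this is to redirect the output of the "-1 --debug" run to a file and pass the file contents to crack_succeeded().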

We could now actually use this simple DataModel if we wanted. Let's see what it looks like when we don't tell Peach to stop after one iteration (and not showing debug output):

$ ~/peachenv/bin/python peach.py woff.xml

[*] Optmizing DataModel for cracking: 'WOFF'
[*] Cracking data from /tmp/SomeRealFont.woff into WOFF
[*] Total time to crack data: 1.47
[*] Building relation cache
[*] Starting run "DefaultRun"
[-] Test: "TheTest" (None)
[1:?:?] Element: N/A
        Mutator: N/A

[2:45521:?] Element: N/A
            Mutator: BitFlipperMutator

[3:45521:?] Element: N/A
            Mutator: DataTreeRemoveMutator

[4:45521:?] Element: N/A
            Mutator: BlobMutator

[5:45521:?] Element: N/A
            Mutator: BitFlipperMutator

[6:45521:?] Element: N/A
            Mutator: DataTreeSwapNearNodesMutator

[7:45521:?] Element: N/A
            Mutator: DWORDSliderMutator

[8:45521:?] Element: N/A
            Mutator: BlobMutator
...

Each record in the output starts with a prefix such as "[8:45521:?]". The first number is the iteration number Peach is currently on. The second number is the total number of iterations Peach is going to do based on our DataModel and the input file we provided. Note that as the DataModel becomes more detailed, the number of iterations Peach will be able to do goes up. This is because Peach can be more intelligent with manipulating specific parts of the data, and thus the total number of permutations grows. The last part, where the "?" is shown, will become a total time estimate once Peach has an idea of how long it will take to complete all of the iterations.
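Since the iteration number is what identifies a particular fuzzed file, it is worth being able to pull it out of a saved log. Here is a small sketch that assumes the record prefix has exactly the "[iteration:total:estimate]" shape shown above; the helper names are hypothetical and nothing here is provided by Peach.

```python
import re

# "[8:45521:?] Element: N/A" -> iteration 8 of 45521, no time estimate yet.
PROGRESS_RE = re.compile(r"^\[(\d+):(\d+|\?):([^\]]*)\]")


def parse_progress(line):
    """Return (iteration, total_or_None) for a record line, else None."""
    match = PROGRESS_RE.match(line.strip())
    if match is None:
        return None
    iteration = int(match.group(1))
    total = None if match.group(2) == "?" else int(match.group(2))
    return iteration, total


def last_iteration(log_text):
    """Return the (iteration, total) of the last record found in a log."""
    last = None
    for line in log_text.splitlines():
        parsed = parse_progress(line)
        if parsed is not None:
            last = parsed
    return last
```

Lines such as "[*] Starting run" or "[-] Test" are ignored because they do not start with a numeric iteration.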

Let's get back to improving our DataModel. For the time being, we won't show the entire XML file. We'll only show the DataModel portions.

The very simple DataModel we started with was just this:

<DataModel name="WOFF">
    <Blob name="WOFFHeader" length="44" />
    <Blob name="TableDirectoryEntry" length="20" maxOccurs="100" />
    <Blob name="FontTables" />
</DataModel>

That is just saying that there's a 44-byte header blob (a blob is binary data for which Peach doesn't know whether to expect a Number, a String, etc.). Next is a 20-byte TableDirectoryEntry that is repeated up to 100 times. Next is all of the FontTables. This doesn't give Peach a very good idea of how to crack a file. That is, it has no way to determine how many table directory entries there really are when it cracks the file we provide. Is it zero? Is it 56? This is clearly something we'll want to improve.

If you looked at the WOFF spec, you'd see we actually left out two parts from the end of the file: additional metadata and private data. Depending on how Peach decides to crack the file based on our simple DataModel, these will either end up as part of the FontTables or even partially or fully interpreted as TableDirectoryEntry blobs.

One word of caution: there are two different representations of amount of data: size and length. The attribute "size" means bits. The attribute "length" means bytes.

Let's refine our model so that Peach can more accurately understand the WOFFHeader. This brings up the idea of DataModel templates. When you use the attribute "name" in any of these XML records, it tells Peach it's a real thing. When you use the attribute "ref", it tells Peach that you are using another DataModel as a starting point. An example helps, so here's our DataModel with expanded WOFFHeader:

<DataModel name="WOFFHeaderTemplate">
    <!-- 0x774F4646 == 'wOFF' -->
    <Number name="signature" size="32" signed="false" endian="big"
        value="0x774F4646" />
    <Choice>
        <!-- True Type fonts: 0x00010000 -->
        <Number name="flavor" size="32" signed="false" endian="big"
            value="0x00010000" />
        <!-- CFF fonts: 0x4F54544F == 'OTTO' -->
        <Number name="flavor" size="32" signed="false" endian="big"
            value="0x4F54544F" />
    </Choice>
    <Number name="length" size="32" signed="false" endian="big" />

    <Number name="numTables" size="16" signed="false" endian="big" />

    <Number name="reserved" size="16" signed="false" endian="big"
        value="0" />
    <Number name="totalSfntSize" size="32" signed="false" endian="big" />
    <Number name="majorVersion" size="16" signed="false" endian="big" />
    <Number name="minorVersion" size="16" signed="false" endian="big" />
    <Number name="metaOffset" size="32" signed="false" endian="big" />
    <Number name="metaLength" size="32" signed="false" endian="big" />
    <Number name="metaOrigLength" size="32" signed="false" endian="big" />
    <Number name="privOffset" size="32" signed="false" endian="big" />
    <Number name="privLength" size="32" signed="false" endian="big" />
</DataModel>

<DataModel name="WOFF">
    <Block name="WOFFHeader" ref="WOFFHeaderTemplate" length="44" />
    <Blob name="TableDirectoryEntry" length="20" maxOccurs="100" />
    <Blob name="FontTables" />
</DataModel>

All we've done here is give Peach a more detailed understanding of the 44-byte WOFF header. There's nothing fancy going on here yet, other than that we did this using a "ref" attribute so that we could keep the main data model we called WOFF easy to read. Along with this change, we also changed WOFFHeader to be a Block rather than a Blob. If you are using "ref" in this way, you should generally be doing so from Block elements.

The above more-detailed WOFFHeader is fairly self-explanatory. For more info, see the Peach DataModel documentation.
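As a sanity check on the template, the field sizes can be added up outside of Peach. The sketch below uses Python's struct module with the field names and bit sizes copied from WOFFHeaderTemplate; it is an independent illustration (written for a current Python, not run inside Peach), and the function names are hypothetical. It shows that the "size" values in bits sum to the 44 bytes given as "length" on the WOFFHeader block.

```python
import struct

# Field order and sizes (in bits) copied from WOFFHeaderTemplate.
HEADER_FIELDS = [
    ("signature", 32), ("flavor", 32), ("length", 32),
    ("numTables", 16), ("reserved", 16), ("totalSfntSize", 32),
    ("majorVersion", 16), ("minorVersion", 16),
    ("metaOffset", 32), ("metaLength", 32), ("metaOrigLength", 32),
    ("privOffset", 32), ("privLength", 32),
]

# ">" means big-endian; "I" is an unsigned 32-bit and "H" an unsigned
# 16-bit integer, matching signed="false" endian="big" in the template.
HEADER_FORMAT = ">" + "".join(
    "I" if bits == 32 else "H" for _, bits in HEADER_FIELDS)


def header_length_bytes():
    """Sum the 'size' attributes (bits) and convert to 'length' (bytes)."""
    return sum(bits for _, bits in HEADER_FIELDS) // 8


def parse_woff_header(data):
    """Unpack the first 44 bytes of a WOFF file into a dict of fields."""
    size = struct.calcsize(HEADER_FORMAT)
    values = struct.unpack(HEADER_FORMAT, data[:size])
    return dict(zip((name for name, _ in HEADER_FIELDS), values))
```

Running parse_woff_header() on the sample font gives the values Peach should arrive at when it cracks the header, which helps when a debug run reports an unexpected position.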

Now that we've made this change, we want to test it again to make sure Peach still can crack our original file. To do so, run peach on the file with the arguments "-1 --debug" again and make sure the output ends by indicating that the file was successfully cracked.

Caution: When developing these DataModels, it helps to think about what Peach might try to do with the information you've provided. Keep in mind that Peach is not a perfect tool. For example, it turns out (as discovered by time-wasted trial-and-error) that Peach isn't always smart enough to determine the size of a block even if there is no ambiguity in the definition and the sizes of each element in the block are defined. Thus, even though Peach should be able to figure out that the WOFFHeader is 44 bytes, it's a good idea to just go ahead and say length="44" and save yourself some possible frustration.

Our next improvement to the DataModel will be to tell Peach that there's a correlation between one of the fields in the WOFFHeader and the number of TableDirectoryEntry blocks it should expect to see. What we do is replace the existing

<Number name="numTables" size="16" signed="false" endian="big" />

with the following:

<Number name="numTables" size="16" signed="false" endian="big">
    <Relation type="count" of="TableDirectoryEntry" />
</Number>

The Relation tag with type="count" tells Peach that this number will equal the number of TableDirectoryEntry blocks that occur.

After this change, we re-test our DataModel with "-1 --debug". It still looks good.

Now we want to provide more detail about the TableDirectoryEntry blocks (which we still have listed as just 20-byte blobs at the moment). For brevity, the following leaves out the WOFFHeaderTemplate, as we aren't changing that here.

<DataModel name="TableDirectoryEntryTemplate">
    <String name="tag" size="32" signed="false" endian="big" />
    <Number name="offset" size="32" signed="false" endian="big" />
    <Number name="compLength" size="32" signed="false" endian="big" />
    <Number name="origLength" size="32" signed="false" endian="big" />
    <Number name="origChecksum" size="32" signed="false" endian="big" />
</DataModel>

<DataModel name="WOFF">
    <Block name="WOFFHeader" ref="WOFFHeaderTemplate" />
    <Block name="TableDirectoryEntry" ref="TableDirectoryEntryTemplate"
        length="20" maxOccurs="1000" />
    <Blob name="FontTables" />
</DataModel>

This is a very good time to re-check how well Peach can crack a file.
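To see what the count relation and the entry template describe, here is a rough Python equivalent of what cracking the table directory amounts to: read numTables from the header, then read that many 20-byte entries. This is only a sketch for checking a sample file by hand. It treats the tag as 4 raw bytes, and the function name is hypothetical.

```python
import struct

HEADER_LENGTH = 44
NUM_TABLES_OFFSET = 12  # after signature, flavor and length (4 bytes each)

# tag (4 bytes) followed by offset, compLength, origLength, origChecksum,
# each an unsigned 32-bit big-endian number.
ENTRY_FORMAT = ">4sIIII"
ENTRY_LENGTH = struct.calcsize(ENTRY_FORMAT)  # 20 bytes


def read_table_directory(data):
    """Return a list of dicts, one per table directory entry."""
    (num_tables,) = struct.unpack_from(">H", data, NUM_TABLES_OFFSET)
    entries = []
    for index in range(num_tables):
        start = HEADER_LENGTH + index * ENTRY_LENGTH
        tag, offset, comp_length, orig_length, checksum = struct.unpack_from(
            ENTRY_FORMAT, data, start)
        entries.append({
            "tag": tag,
            "offset": offset,
            "compLength": comp_length,
            "origLength": orig_length,
            "origChecksum": checksum,
        })
    return entries
```

If the number of entries or their tags look wrong for a font that browsers render correctly, the DataModel (or this sketch) is misreading the layout.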

So, now we have Peach to the point where it understands where the individual fields are in the WOFF header, it understands that one of the fields corresponds to the number of table directory entries, and it understands where the individual fields are in those table directory entries. The question is, can we do better?

A great area to improve would be to have Peach understand the offset and compLength fields. Going further, Peach could be taught that the font data each table directory entry refers to is (or can be) compressed, so that it could decompress the data and understand the checksum and origLength. However, all of this quickly became frustrating with Peach. Some of these may seem easier than others, but even compLength is awkward: it is the length without padding, and each entry is 4-byte padded. At this point, you have to decide whether what you have so far will result in worthwhile fuzzing and, if so, how much additional value you think you'll get from a more detailed model.
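The padding and compression rules that make these fields awkward to model are simple to state in code. The sketch below is written from my reading of the WOFF 1.0 format rather than from anything in Peach: table data is padded to a 4-byte boundary, and a table is zlib-compressed when compLength is smaller than origLength. The function names are hypothetical.

```python
import zlib


def padded_length(comp_length):
    """Round a table's compLength up to the next multiple of 4 bytes."""
    return (comp_length + 3) & ~3


def table_bytes(data, offset, comp_length, orig_length):
    """Return one table's original bytes from a WOFF file's data."""
    raw = data[offset:offset + comp_length]
    # Assumption from the WOFF format: equal lengths mean the table is
    # stored uncompressed; a smaller compLength means zlib compression.
    if comp_length < orig_length:
        raw = zlib.decompress(raw)
    return raw
```

A model that captured these two rules would let a fuzzer mutate the decompressed table contents instead of the compressed stream, which is the additional value being weighed above.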

For this file format, we stopped here for now. Let's move on to getting the fuzzing up and running.

Custom Peach Publisher

In Peach, publishers do the work of performing I/O, whether that's writing fuzzer-generated files to disk or interacting with a network service. There are multiple built-in publishers. We're going to use our own, simple publisher rather than those that are provided.

The publisher we want is going to be an HTTP server that will serve requests, some of which will be responded to with fuzzed data. For our situation of WOFF file fuzzing, we want Firefox to load an HTML page provided by the fuzzer which tells Firefox to load a font file that is also provided by the fuzzer.

The following file, httpserver.py, is our custom publisher. This just takes one Peach-generated WOFF file at a time and waits for a request for a WOFF file before grabbing another from Peach.

'''
Simple HTTP server publisher for the Peach fuzzing framework.

@author: Mozilla
@contributor: Justin Samuel <js@justinsamuel.com>
@see: https://wiki.mozilla.org/Security/Fuzzing/Peach
@see: http://peachfuzzer.com/CustomPublisher
'''

import BaseHTTPServer
import Queue
import sys

from Peach.publisher import Publisher


# The address and port that the webserver we run listens on.
SERVER_ADDRESS = 'localhost'
SERVER_PORT = 8111

# After each fuzzed file is served, a copy of it will be stored in this file.
# Thus, if Firefox crashes after requesting the file and no further fuzzed
# files are requested, this will be the file that caused the crash. Note that
# if this is not saved for some reason, it can be regenerated by knowing which
# test number Peach was on when the crash happened. A new round of testing can
# be resumed at that number as long as the xml file passed to peach is the same
# in addition to any files referenced by that xml file being the same (i.e.
# the original file that is being modified to create each fuzz file).
SAVE_LAST_FUZZ_FILENAME = "/tmp/last_fuzzfont.woff"

# The static index file to serve when requests for '/' are received.
INDEX_FILE_DATA = open('/tmp/webroot/index.html').read()

http_server = None

fuzzq = Queue.Queue()


class FuzzHttpServer(BaseHTTPServer.HTTPServer):
    allow_reuse_address = True


class FuzzRequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        # Use self.path to respond differently based on the requested path.
        # Use self.server to get the server object.
        if self.path == "/":
            self._serveData(INDEX_FILE_DATA)
        elif self.path.startswith("/fuzzfont.woff"):
            self._fuzzFont()
        else:
            self.send_error(404)

    def _serveData(self, data):
        self.send_response(200)
        self.send_header("Content-Length", len(data))
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(data)

    def _fuzzFont(self):
        fuzzdata = fuzzq.get()
        self.send_response(200)
        self.send_header("Content-Length", len(fuzzdata))
        self.send_header("Content-Type", "text/plain")
        # Access control header useful if serving some of the other files
        # from apache, for example, and thus from a different port.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.end_headers()
        self.wfile.write(fuzzdata)
        fp = open(SAVE_LAST_FUZZ_FILENAME, 'wb')
        fp.write(fuzzdata)
        fp.close()


class HttpServerPublisher(Publisher):
    '''
    Each round of generation will result in the following calls:
    start
    connect
    send
    close
    stop
    '''

    def __init__(self):
        global http_server
        server_address = (SERVER_ADDRESS, SERVER_PORT)
        print "Starting server listening at %s:%s" % server_address
        sys.stdout.flush()
        http_server = FuzzHttpServer(server_address, FuzzRequestHandler)
        # Defining withNode prevents some peach error.
        self.withNode = False

    def start(self):
        pass

    def connect(self):
        pass

    def send(self, data):
        '''Peach calls this to provide us the fuzzer-generated data.'''
        fuzzq.put(data)
        print "waiting for next request"
        sys.stdout.flush()
        http_server.handle_request()

    def close(self):
        pass

    def stop(self):
        pass

    def property(self, property, value = None):
        pass

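To install this custom publisher, place the file in the Peach/Publishers/ directory and edit the __all__ list in Peach/Publishers/__init__.py to include "httpserver". The Publisher element in woff.xml then needs to reference the new class instead of file.FileWriter (presumably as class="httpserver.HttpServerPublisher", following the module.class pattern of file.FileWriter). Note that this is not the recommended way of using custom publishers.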

The address and port our custom publisher's HTTP server will listen on are specified in httpserver.py (SERVER_ADDRESS and SERVER_PORT).

This httpserver.py file is also where the path to the static index.html file is specified. For our WOFF fuzzing, the index.html file contains the following:

<html>
<head>
<title>Web Font Sample</title>
<style type="text/css" media="screen, print">
@font-face {
    font-family: "Bitstream Vera Serif Bold";
    src: url("http://localhost:8111/fuzzfont.woff");
}
body {
    font-family: "Bitstream Vera Serif Bold", serif;
}
</style>
<script type="text/javascript">
function load() {
    window.setTimeout('window.location.reload(true);', 200);
}
window.onload = load;
</script>
</head>
<body>
This is text displayed with a fuzzer-generated WOFF font.
</body>
</html>

That is, it will reload itself (avoiding cache) 200ms after the page is loaded. Each reload will end up receiving a different fuzzed WOFF file from our Peach publisher. This fuzzed WOFF file is served from http://localhost:8111/fuzzfont.woff in this example.
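The hand-off at the heart of this setup (Peach pushes one fuzzed blob, the server blocks until the browser asks for it) can be tried in isolation, without Peach or Firefox. The following is a standalone sketch for a current Python 3, not a port of the publisher: the names serve_one and demo are hypothetical, and a plain HTTP client stands in for the browser.

```python
import http.client
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

fuzzq = queue.Queue()


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hand the next queued blob to whoever asked for it.
        data = fuzzq.get()
        self.send_response(200)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


def serve_one(server, data):
    """What the publisher's send() does: queue data, serve one request."""
    fuzzq.put(data)
    server.handle_request()


def demo():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), Handler)
    server.timeout = 10  # do not block forever if the client fails
    port = server.server_address[1]
    result = {}

    def client():
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
        conn.request("GET", "/fuzzfont.woff")
        result["body"] = conn.getresponse().read()
        conn.close()

    thread = threading.Thread(target=client)
    thread.daemon = True
    thread.start()
    serve_one(server, b"wOFF-fuzzed-bytes")
    thread.join(10)
    server.server_close()
    return result["body"]
```

Because handle_request() serves exactly one request per call, each generated file is consumed by exactly one page load, which is what keeps Peach's iteration counter in step with the browser's reloads.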

Prepare Firefox for Fuzzing

We want to ensure that core dumps will be useful and that failed assertions cause crashes. We're going to use the fact that Firefox crashes as our indication of a discovered bug.

We'll also run Firefox through Valgrind. Note that the setup we're showing below will not give an immediate indication of an error from Valgrind. Valgrind doesn't have an option to crash on all errors. One could have Valgrind drop into gdb, which might be a better option than letting it continue in some cases. Note that running through Valgrind does make this quite slow, and if one really needed to fuzz in a way that required Firefox restarts between each test, then Valgrind probably wouldn't be an option due to slowness.

Here's an example .mozconfig to use:

ac_add_options --enable-application=browser

ac_add_options --enable-debug-symbols
# --enable-debug will result in debug code running and more debug output,
# but it will also cause a 300 second sleep at crash which, if one doesn't
# let it finish and just kills the process, will result in no core dump.
# For our current purposes, the extra debug code and output isn't needed.
# At least, I don't think using it will help find more bugs during fuzzing,
# but I may be wrong.
#ac_add_options --enable-debug

ac_add_options --enable-crash-on-assert
ac_add_options --disable-crashreporter

# https://developer.mozilla.org/en/Debugging_Mozilla_with_valgrind
ac_add_options --enable-valgrind
ac_add_options --disable-jemalloc
ac_add_options --disable-optimize
# This level of optimization is unlikely to impact usability of core
# dumps and may significantly improve speed under valgrind. Faster means
# more fuzzing if processing time is the bottleneck.
#ac_add_options --enable-optimize="-O -freorder-blocks"

While getting your fuzzer set up, you'll want a build of Firefox that has bugs your fuzzer will quickly find. So, you'll want to introduce a bug in something your fuzzer is testing. You can do this now or later, but this will be essential at some point. If your fuzzer doesn't properly record/notify about interesting things it finds, then you might miss the fact that it discovered bugs or not have enough data to reproduce or diagnose the issue.

Create a Firefox profile to be used for fuzzing. We'll call it fuzz. Next, run this profile and change a few things in about:config.

browser.shell.checkDefaultBrowser = false
browser.sessionstore.resume_from_crash = false
browser.startup.homepage = http://localhost:8111/

This assumes that we've set up our httpserver publisher in Peach to serve requests on http://localhost:8111/.
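If the fuzz profile has to be recreated often, the same three preferences could be written to a user.js file in the profile directory instead of being set by hand. This is an optional alternative to the about:config steps, sketched under the assumption that the profile picks up user.js at startup; the helper name is hypothetical.

```python
import json

# The preferences listed above, as Python values.
FUZZ_PREFS = {
    "browser.shell.checkDefaultBrowser": False,
    "browser.sessionstore.resume_from_crash": False,
    "browser.startup.homepage": "http://localhost:8111/",
}


def render_user_js(prefs):
    """Render preferences as user_pref(...) lines for a user.js file."""
    lines = []
    for name in sorted(prefs):
        # json.dumps gives the right literals here: false/true and
        # double-quoted strings.
        lines.append("user_pref(%s, %s);"
                     % (json.dumps(name), json.dumps(prefs[name])))
    return "\n".join(lines) + "\n"
```

The returned text would be saved as user.js inside the fuzz profile's directory.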

Running the Fuzzer

Now we are ready to actually do our fuzzing.

A good way to do this is to use screen, with a couple of terminals running in the screen session. In one of the terminals we'll start Peach (which will be serving our fuzzed files), in another we'll watch the Peach logs, and in a third we'll start Firefox.

Here's a script to start Peach (and thus our fuzz data webserver):

#!/bin/bash

FUZZ_ROOT=/home/fuzzuser/peach
PYTHON=$FUZZ_ROOT/virtenv/bin/python
PEACH_PY=$FUZZ_ROOT/Peach-2.3.6/peach.py
PEACH_ARGS=""
#PEACH_ARGS="-1 --debug"
PEACH_XML=$FUZZ_ROOT/woff.xml
# This is different from the Peach log directory defined in the
# PEACH_XML file.
PEACH_OUTPUT_LOG=$FUZZ_ROOT/peach.output.log

if [ "`ps -ef | grep firefox-bin | grep -v grep`" == "" ]; then
    echo "Warning: firefox doesn't appear to be running."
    echo "Make sure you run firefox_fuzz_start.sh"
fi

echo "Logging peach.py output to $PEACH_OUTPUT_LOG"
echo "Running command: $PYTHON $PEACH_PY $PEACH_ARGS $PEACH_XML >$PEACH_OUTPUT_LOG 2>&1"
$PYTHON $PEACH_PY $PEACH_ARGS $PEACH_XML >$PEACH_OUTPUT_LOG 2>&1

Here's a script to start Firefox:

#!/bin/bash

# Using the command "xvfb-run" will run firefox with a virtual framebuffer so
# that you don't have to use X forwarding.
XVFB=""
#XVFB="xvfb-run"

# Unfortunately there isn't any way to ask Valgrind to exit immediately
# when it detects a memory error. So we enable timestamp logging and
# hope we can correlate it back to which fuzz file is being used based
# on fuzzer logs. Another option would be to have 
#VALGRIND=""
VALGRIND="valgrind --trace-children=yes --time-stamp=yes"
VALGRIND="$VALGRIND --quiet"
VALGRIND="$VALGRIND --log-file=valgrind.log"
VALGRIND="$VALGRIND --error-exitcode=123"

# Have a core dump generated in this script's directory.
cd `dirname $0`
ulimit -c unlimited

FX_BIN_DIR=/home/fuzzuser/moz/mozilla-central/dist/bin
FX_BIN=firefox-bin
FX_ARGS="-no-remote -P fuzz"

export LIBRARY_PATH=$FX_BIN_DIR:$FX_BIN_DIR/components
export LD_LIBRARY_PATH=$FX_BIN_DIR:$FX_BIN_DIR/plugins

export NSPR_LOG_FILE=/home/fuzzuser/moz/fxlog.txt
export NSPR_LOG_MODULES=userfonts:5

$XVFB $VALGRIND $FX_BIN_DIR/$FX_BIN $FX_ARGS

If you're running remotely, then you'll either need to have ssh'd in with X forwarding (ssh -X) or use Xvfb to have a "fake" X server. If you use Xvfb, you'll at least want to run with X forwarding a little first to make sure everything is working like you expect.

To run with Xvfb:

sudo apt-get install xvfb

And set XVFB to "xvfb-run" in the Firefox start script.

For reference, this example fuzzing run took 8 hours when not run through Valgrind and is estimated to take 185 hours when running through valgrind.

Additional Thoughts

The above instructions have shown a very simple usage of Peach to fuzz Firefox. Other approaches could involve using Peach's ability to start a new process for every test to have it start a new Firefox process for each test, fixing the Linux debugger support or trying Peach on Windows/Mac which may have better debugger integration, generating static files that get fed to Firefox rather than the publisher running an HTTP server, and fuzzing against shared libraries rather than a running instance of Firefox.

This example also lacks notification of when the fuzzer has finished or encountered a "successful" fuzz case (found a bug). If fuzzing is long-lived, then one will probably want monitoring and notification if fuzzing stops. Possibly the easiest solution would be to have a separate script running that just monitors the peach and firefox processes and sends an email to notify of either one not running anymore.
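A monitoring script along those lines could start from something like the sketch below. It only covers the decision logic and the message: it takes the text of a process listing, reports which of the watched processes are absent, and builds a notification email. The process patterns and the email addresses are placeholders, and the function names are hypothetical.

```python
from email.message import EmailMessage

# Substrings expected in the command lines of the processes to watch.
WATCHED = ["peach.py", "firefox-bin"]


def missing_processes(ps_output, patterns=WATCHED):
    """Return the patterns that match no line of a process listing."""
    lines = ps_output.splitlines()
    return [pattern for pattern in patterns
            if not any(pattern in line for line in lines)]


def build_alert(missing, sender, recipient):
    """Build (but do not send) an email about the missing processes."""
    message = EmailMessage()
    message["Subject"] = "Fuzzing stopped: %s not running" % ", ".join(missing)
    message["From"] = sender
    message["To"] = recipient
    message.set_content(
        "The following processes were not found: %s\n"
        "Check for a core dump and the last saved fuzz file.\n"
        % ", ".join(missing))
    return message
```

In a real script, the listing could come from subprocess.check_output(["ps", "-eo", "args"]) run every minute or so, and the message could be handed to smtplib.SMTP("localhost").send_message(), assuming a local mail server is available.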