Manually Parsing Breakpad Symbol Files

From MozillaWiki

Sometimes you wind up with a non-symbolic stack trace (for example, from Apple Crash Reporter) and would like to know the real stack trace. If the crash is from a nightly or release build, you are in luck: the information you need is contained in the Breakpad symbol files. However, you have to parse these files manually. Hopefully someone will turn this awful process into a script at some point.

  • Figure out what build you have. You'll need to know the product version and full build id so you can figure out what symbols to use.
  • Load the symbol index file from the symbol server. The URL looks like this: ${product}/${product}-${version}-${OS}-${BuildID}-symbols.txt

Where ${OS} is one of WINNT, Darwin, or Linux, and the other parameters are fairly obvious. You should get a URL similar to the following:
  • Load the index file. This file contains a list of the actual symbol files for this build.
  • Determine what module you're in. For example, if your stack says libmozjs.dylib, you'll need to load libmozjs.dylib.sym. Note that for OS X Universal builds there are two symbol files per module, one per CPU architecture. (Hint: PPC is usually listed first.)
  • Load that symbol file. The URL will look similar to the following:
  • This file is formatted like so:
    • a MODULE line (which you can ignore)
    • a bunch of FILE lines
    • (maybe) some PUBLIC lines
    • a bunch of FUNC lines, with some gibberish-looking stuff in between them. These are what you really care about.
  • Determine what relative virtual address you're at. If you have an Apple Crash Reporter stack, each line will give you an absolute memory address. You'll need to subtract out the base address of the loaded module. For example, given this line in a Crash Reporter stack:
 1   libmozjs.dylib                	0x00117524 JS_FloorLog2 + 24090\

The absolute address here is 0x00117524, but if you look down further in the crash report, you'll see the "Binary Images" list, which will show you the base address of each module. In this crash report it looks like so:

  0xcc000 -   0x184ffc +libmozjs.dylib ??? (???) <0e69a4396c35b15d18a2bac4cbc4fdf3> /Applications/Internet/Browsers/\

This indicates that the base address of libmozjs.dylib is 0xcc000. Subtract that from the address above (0x00117524 - 0xcc000), and you get 0x4B524, the relative virtual address where the instruction pointer was in this frame.
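
The subtraction can be checked with a few lines of Python. This is only a sketch of the arithmetic described here; the helper name compute_rva is made up for illustration, and the two values are the ones from the example crash report.

```python
def compute_rva(frame_address: str, module_base: str) -> int:
    """Subtract the module's base address from an absolute frame address.

    Both arguments are hex strings as they appear in the crash report,
    with or without a leading "0x".
    """
    return int(frame_address, 16) - int(module_base, 16)


# Absolute address from the stack frame and base address from "Binary Images".
rva = compute_rva("0x00117524", "0xcc000")
print(hex(rva))
```

The result is an integer, so it can be compared directly against the hex fields in the symbol file once those are converted with int(text, 16).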

  • Now, look that RVA up in the Breakpad symbol file. Those FUNC lines I mentioned before? Each one is formatted like so:
 FUNC rva length parameter_size function_name

and the gibberish between them is line number info, formatted like so:

 rva length line_number file_number
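
As a rough illustration of these two record formats, the following Python sketch splits each kind of line into its fields. The class and function names are invented for this example. It assumes rva and length are hexadecimal and that the line and file numbers are decimal, which matches the worked example on this page; the field between the length and the function name is ignored.

```python
from typing import NamedTuple


class FuncRecord(NamedTuple):
    rva: int
    length: int
    name: str


class LineRecord(NamedTuple):
    rva: int
    length: int
    line_number: int
    file_number: int


def parse_func(text: str) -> FuncRecord:
    # The function name may contain spaces, so split at most four times
    # and keep the remainder as the name.
    _tag, rva, length, _ignored, name = text.strip().split(None, 4)
    return FuncRecord(int(rva, 16), int(length, 16), name)


def parse_line_record(text: str) -> LineRecord:
    rva, length, line_number, file_number = text.split()
    # Assumption: addresses are hex, line and file numbers are decimal.
    return LineRecord(int(rva, 16), int(length, 16),
                      int(line_number), int(file_number))


print(parse_func("FUNC 4b2e0 2ca 0 MarkSharpObjects(JSContext*, JSObject*, JSIdArray**)"))
print(parse_line_record("4b50b 21 418 23"))
```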

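What this means is that given an RVA, you need to find the line record whose address range contains it: the record's rva is at or just below yours, and rva + length is above it. Both fields are hexadecimal. You may find grep ^xxx to be useful here, where xxx are the first three hex digits of your RVA (if nothing matches, the record may start under the previous prefix). In our example, we find this line in the symbol file: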

 4b50b 21 418 23

This line contains our RVA: 0x4b50b + 0x21 == 0x4b52c, and 0x4b524 falls between 0x4b50b and 0x4b52c. Scrolling up in the file, we see that the last FUNC line before this line is:

 FUNC 4b2e0 2ca 0 MarkSharpObjects(JSContext*, JSObject*, JSIdArray**)

so this address is in MarkSharpObjects. Going one step further, we can map the line number info to get the exact line where the instruction was. The line info we hit ended with "418 23", which indicates line 418 in file 23. At the top of the file, you can easily find the line starting with "FILE 23". In our case this is:

 FILE 23
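
The whole lookup (FILE table, enclosing FUNC, matching line record) could be scripted roughly as follows. This is a minimal sketch, not an official tool: the function name lookup and its return shape are made up for this example, it assumes hex addresses and decimal line and file numbers, and it reads the file in a single pass because the FILE lines come before the FUNC lines.

```python
from typing import Iterable, Optional, Tuple


def lookup(lines: Iterable[str], rva: int) -> Optional[Tuple[str, int, int, str]]:
    """Return (function_name, line_number, file_number, file_entry) for rva.

    Returns None when no line record inside a matching FUNC covers rva.
    """
    files = {}        # file number -> remainder of the FILE line
    func_name = None  # set while we are inside the FUNC that covers rva
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue
        if fields[0] == "FILE" and len(fields) >= 2:
            files[int(fields[1])] = " ".join(fields[2:])
        elif fields[0] == "FUNC" and len(fields) >= 5:
            start, length = int(fields[1], 16), int(fields[2], 16)
            inside = start <= rva < start + length
            func_name = raw.split(None, 4)[4].strip() if inside else None
        elif func_name is not None and len(fields) == 4:
            try:
                start, length = int(fields[0], 16), int(fields[1], 16)
                line_number, file_number = int(fields[2]), int(fields[3])
            except ValueError:
                continue  # some other record type, such as PUBLIC
            if start <= rva < start + length:
                return func_name, line_number, file_number, files.get(file_number, "")
    return None


# Lines copied from the example on this page; the FILE entry is shown
# without its path, exactly as it appears above.
sample = [
    "FILE 23",
    "FUNC 4b2e0 2ca 0 MarkSharpObjects(JSContext*, JSObject*, JSIdArray**)",
    "4b50b 21 418 23",
]
print(lookup(sample, 0x4B524))
```

With a real symbol file, the same function can be fed an open file object, since iterating over a file yields its lines.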

Given the file path from the FILE line and the line number, you can form the hgweb URL to the source line: