Crash reporting overhaul: Difference between revisions

Moved the sections that were improvements rather than rewrites to their own page
(Added information about the crash monitor project)
(Moved the sections that were improvements rather than rewrites to their own page)
Line 63: Line 63:
# We will enable the crash monitor to launch the crash reporter client in case of main process crashes.
# Finally, we will add a mechanism to launch the crash monitor as early as possible during startup. This should happen before any exception handlers are registered, or possibly lazily by the exception handlers themselves.
== Minidump storage for crash annotations ==
Status: not started<br>
Developer(s):<br>
Source code:<br>
Original source code:<br>
* https://hg.mozilla.org/mozilla-central/file/6f0a8dddad51/toolkit/crashreporter/
Bugs:<br>
* {{bug|1759682}}
=== Description ===
Crash annotations are a set of pieces of information that accompany a
minidump to form a complete crash report. Crash annotations contain critical
information such as the Firefox version and build ID but also ancillary
information such as how much memory a process was using, or a user-provided
string associated with a failed assertion that crashed the process.
Currently crash annotations are stored in a JSON file (with an .extra suffix)
that is sent along with the minidump to Socorro. Depending on the type of
crash this file is either written out by the exception handler (if the main
process crashed) or the contents of the annotations are forwarded to the main
process which then writes them out (in the case of a child process crash).
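The current two-file layout can be sketched as follows. This is only an illustration: the helper names <code>write_extra_file</code> and <code>load_crash_report</code> are hypothetical, the annotation keys are examples, and the real logic lives in the crash reporter sources rather than in Python.

```python
import json
from pathlib import Path


def write_extra_file(minidump_path: Path, annotations: dict) -> Path:
    """Write the annotations as JSON next to the minidump, with an .extra suffix."""
    extra_path = minidump_path.with_suffix(".extra")
    extra_path.write_text(json.dumps(annotations), encoding="utf-8")
    return extra_path


def load_crash_report(minidump_path: Path) -> dict:
    """Load a crash report; both files must exist for the report to be complete."""
    extra_path = minidump_path.with_suffix(".extra")
    if not minidump_path.exists() or not extra_path.exists():
        raise FileNotFoundError("incomplete crash report: minidump or .extra file missing")
    return {
        "minidump": minidump_path.read_bytes(),
        "annotations": json.loads(extra_path.read_text(encoding="utf-8")),
    }
```

Because a report is only usable when both files are present, every consumer has to handle the case where one of them is missing.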
=== Rationale ===
There are several issues with the current system:
* Having a separate file adds significant complexity both when submitting and when processing crash reports, and introduces additional failure modes (such as only one of the files being present in the report)
* The file needs to be written out after the minidump has been written out, adding complexity to the exception handler
* For child processes an extra IPC channel is needed to send the annotations
* Setting annotations is a relatively expensive process
* Some annotations are synthesized at crash time and dealt with by ad-hoc code; there is no unified mechanism to handle them together with the others
Given the above, storing the annotations within the minidump would simplify the crash reporting flow, eliminate an additional IPC channel and greatly streamline how user code stores annotations.
=== Plan ===
Annotations should be stored within the minidump and read directly from the
crashed process. This requires several steps:
* The crash annotations interface in Gecko needs to be modified so that a process can flag where its annotations are stored
* The crash-time annotations need to be removed and replaced with regular ones
* We need to add a mechanism to distinguish between a process's own annotations and global ones that must be included in every crash
* Minidump writers need to be modified to identify where the annotations are stored in a process's memory, read them and write them out within the minidump
* Finally, the stackwalker tool needs to be taught to look for the annotations in the minidump and print them out
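The steps above can be illustrated with a toy model. It merges global and per-process annotations, embeds them in the same byte stream as the dump, and reads them back the way a stackwalker-like tool might. Everything here is an assumption made for illustration: the <code>ANNO</code> trailer, the function names and the rule that per-process values override global ones are hypothetical, and the layout is not the minidump format. A real writer would more likely add a dedicated stream to the minidump.

```python
import json
import struct

MAGIC = b"ANNO"  # hypothetical trailer marker, not part of the minidump format


def merge_annotations(global_annotations: dict, process_annotations: dict) -> dict:
    """Combine global and per-process annotations (assumption: process values win)."""
    merged = dict(global_annotations)
    merged.update(process_annotations)
    return merged


def append_annotations(dump_bytes: bytes, annotations: dict) -> bytes:
    """Append the annotations to the dump as: payload, 4-byte length, marker."""
    payload = json.dumps(annotations, sort_keys=True).encode("utf-8")
    return dump_bytes + payload + struct.pack("<I", len(payload)) + MAGIC


def extract_annotations(blob: bytes) -> dict:
    """Read the annotations back from the trailer written by append_annotations."""
    trailer_size = len(MAGIC) + 4
    if len(blob) < trailer_size or not blob.endswith(MAGIC):
        raise ValueError("no annotation block found")
    payload_end = len(blob) - trailer_size
    (length,) = struct.unpack_from("<I", blob, payload_end)
    if length > payload_end:
        raise ValueError("annotation block is truncated")
    return json.loads(blob[payload_end - length:payload_end].decode("utf-8"))
```

With this layout a single file carries both the dump and its annotations, so the "only one of the files is present" failure mode cannot occur.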
Additionally, some changes will be required to Socorro on the ingestion side. Socorro currently relies on the contents of the .extra file for filtering. For example, the annotations containing the product version are used to decide whether a crash comes from a very old version of Firefox and should thus be dropped. If we store the annotations within the minidump, we need to provide a way for Socorro to extract them without processing the full minidump, so that it can still apply its filtering rules. To this end we need to write a streamlined minidump pre-processor that extracts only this information and provides it in JSON format. This might prove useful for other types of filtering we don't currently do (such as rejecting reports caused by hardware faults or unconditionally accepting those that might indicate security-sensitive issues). The rust-minidump crate provides all the necessary functionality to write this tool.
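As a rough sketch of the filtering step, assume the pre-processor emits the annotations as a JSON object and that the product version is stored under a <code>Version</code> key. The cutoff value and the function names below are hypothetical and do not reflect Socorro's actual rules.

```python
import json

MIN_SUPPORTED_MAJOR = 100  # hypothetical cutoff, chosen only for illustration


def major_version(version: str) -> int:
    """Return the leading numeric component of a version such as "115.0.2"."""
    return int(version.split(".", 1)[0])


def should_accept(annotations_json: str, min_major: int = MIN_SUPPORTED_MAJOR) -> bool:
    """Decide from the extracted annotations alone whether to keep a report."""
    annotations = json.loads(annotations_json)
    version = annotations.get("Version")
    if not isinstance(version, str):
        return False  # assumption: reports without a usable version are dropped
    try:
        return major_version(version) >= min_major
    except ValueError:
        return False
```

The decision only needs the small JSON document, so the full minidump does not have to be processed before a report is accepted or rejected.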


== Crash reporter client ==
Line 421: Line 374:
[https://blog.mozilla.org/nnethercote/2020/04/15/better-stack-fixing-for-firefox/]
describing his approach and results.
== Telemetry-based dashboards ==
=== Overview ===
Status: not started<br>
Developer(s):<br>
Source code:<br>
Original source code:<br>
* https://mathies.com/mozilla/crashes/
=== Description ===
=== Rationale ===
=== Plan ===