User:Nnethercote

From MozillaWiki

Docs on Bugzilla tags, and examples:


Tasks

XXX: this is a backup of a section from the Uptime wiki, prior to some changes. It'll be removed at some point.

  • Tasks are divided into numerous categories. Each category has one or both of (a) a meta-bug and a table showing the dependent bugs, and (b) a list.
  • Bugs indicate well-defined tasks.
  • List elements indicate (a) tasks that do not yet have a bug filed, (b) tasks that are not well specified or whose benefit is unclear (these often have a trailing '?'), and (c) broad topics best represented by a link to another page (such as a project tracking page).
  • The goal is for every category to end up with a meta-bug and table, and as few list elements as possible.
  • Not every bug that could be included in these tables should be. This is about tracking sizeable pieces of work, rather than enumerating every single related thing that has been done.
  • Some bugs are included in the tables in more than one section, because they are relevant to more than one section.
  • bug 1289677 is a top-level tracking bug that is blocked by all the category-level meta-bugs.

Crash rate tracking

This category is about the tracking of crash rates, using both crash report and telemetry data.

  • Switch from the per-100-ADI metric to the per-1000-hours metric
  • Better dashboards?
  • Track other metrics that may provide insight into how we can improve things?
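The change of denominator in the first item can be illustrated with a small sketch. The function names and the sample figures are hypothetical, and ADI is assumed here to mean active daily installs; this is not the computation used by any existing dashboard.

```python
def crashes_per_100_adi(crashes: int, adi: int) -> float:
    """Crash rate normalized by the number of active daily installs (ADI)."""
    if adi <= 0:
        raise ValueError("adi must be positive")
    return 100.0 * crashes / adi


def crashes_per_1000_hours(crashes: int, usage_hours: float) -> float:
    """Crash rate normalized by total usage time, in hours."""
    if usage_hours <= 0:
        raise ValueError("usage_hours must be positive")
    return 1000.0 * crashes / usage_hours


# Hypothetical populations: same crash count and ADI, different usage time.
light_use = crashes_per_1000_hours(crashes=50, usage_hours=25_000)
heavy_use = crashes_per_1000_hours(crashes=50, usage_hours=100_000)
per_adi = crashes_per_100_adi(crashes=50, adi=10_000)
```

In this made-up example the per-100-ADI figure is identical for both populations, while the per-1000-hours figure separates them because it accounts for how long the browser was actually in use.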

Pre-release coverage

This category is about the usage and coverage (code, feature, site, hardware, and configuration coverage) of users on pre-release channels, especially Nightly.

  • bug 1280394 [meta] Increase the number of Nightly users [pascal, marcia]
  • Provide test exercises for QA people and keen Nightly users to run regularly [marcia]
  • Improve earlier channel populations to be more representative of the Release population [lonnen?]

Crash report creation

This category is about the creation and contents of raw crash reports as collected on the client.

  • Bugs blocking bug 1289663: [Uptime] Crash report creation
| ID | Priority | Summary | Keywords | Assigned to |
| 1250687 | -- | Consider MiniDumpWithPrivateWriteCopyMemory for Windows minidumps | | |
| 1251395 | -- | Include non-JIT, executable, private pages in crash dumps | | |
| 1277448 | -- | unify MozCrashReason and AbortMessage crash annotations (and use __func__ in them and for assertions) | | David Baron :dbaron: ⌚️UTC-7 |
| 1280469 | P2 | [meta] Client-side stack walking | meta | Gabriele Svelto [:gsvelto] |
| 1286802 | -- | Add heap regions of the crash context to minidump (Windows) | | Cervantes Yu [:cyu] [:cervantes] |
| 1295918 | -- | Include JS stacks in crash report | | |
| 1295934 | -- | Add a crash report annotation when we hit DoneStartingUp | | Nicholas Nethercote [:njn] |
| 1309573 | -- | Many places (including all of SpiderMonkey) cannot set the crash reason in crash reports | | Emanuel Hoogeveen [:ehoogeveen] |
| 1334027 | -- | Add unloaded modules and process/thread data to minidumps | | Ting-Yu Chou [:ting] (parental leave ~ 12/31) |
| 1337688 | -- | Remove NIGHTLY_BUILD wrapping if the increased size from adding unloaded modules and process/thread data to minidumps is acceptable | | |
| 1351277 | -- | Add heap regions of the crash context to minidump (MacOS and Linux) | | |

11 Total; 6 Open (54.55%); 5 Resolved (45.45%); 0 Verified (0%);

  • Include JS execution info (stacks? recent JS file/functions we called into? unresolved promises?)
  • Include more URL data (URLs of all open tabs? recent URLs in crashing tab? non-anonymized memory reports?); might require policy/legal approval
  • Don't omit frame pointers?
  • Better identify threads on all platforms, including threadpools

Crash report submission

This category is about increasing the crash report submission rate.

  • Bugs blocking bug 1289671: [Uptime] Crash report submission
| ID | Priority | Summary | Keywords | Assigned to |
| 1269998 | P1 | Prompt users with pending crash reports to submit them | | Brad Lassey [:blassey] (use needinfo?) |
| 1270553 | -- | Allow users to opt-in to auto submitting crash data | | |
| 1280469 | P2 | [meta] Client-side stack walking | meta | Gabriele Svelto [:gsvelto] |
| 1287178 | -- | Refactor unsubmitted crash report handling and allow users to always send backlogged crash reports | dev-doc-needed | Mike Conley (:mconley) (:⚙️) - Backlogged on reviews and needinfos |
| 1333125 | P4 | Improve HTTP proxy support in the crashreporter client and pingsender | | |

5 Total; 3 Open (60%); 2 Resolved (40%); 0 Verified (0%);

  • Always submit non-sensitive data, and make only the sensitive data part (e.g. minidumps) optional

Crash report handling

This category is about how crash reports are processed, clustered, analyzed, and triaged once they are received by Socorro.

  • Bugs blocking bug 1289676: [Uptime] Crash report handling
| ID | Priority | Summary | Keywords | Assigned to |
| 974420 | -- | Addresses >128TB are displayed as 0xffffffffffffffff on crash-stats | | Ted Mielczarek [:ted.mielczarek] |
| 977778 | -- | Allow users to request get-minidump-instructions report on-demand | | |
| 1268029 | -- | Use jit classifier to change signature for jit crashes | | Adrian Gaudebert [:adrian] |
| 1273657 | -- | [tracker] Publish public crash stats to the data platform | | Peter Bengtsson [:peterbe] |
| 1274345 | -- | Add support for skipping a dll in the signature | | Adrian Gaudebert [:adrian] |
| 1274428 | -- | Mark crashes that happen at invalid instruction pointers? | | |
| 1274628 | -- | Annotate crashes when the code in memory around the crashing instruction differs from the code in the shipped binary | | |
| 1277337 | -- | Use hg.mozilla.org to map crashes to bug components by way of source files when possible | | (Catching up emails) Kan-Ru Chen [:kanru] (UTC+8) |
| 1291173 | -- | Show important info from memory reports in crash-stats | | Nicholas Nethercote [:njn] |
| 1297966 | -- | Show the new "StartupCrash" annotation in the crash report page | | Marco Castelluccio [:marco] |
| 1305888 | -- | Add the new CPU microcode annotation to SuperSearch | | Adrian Gaudebert [:adrian] |
| 1306891 | -- | Integrate correlation results from https://mozilla.github.io/stab-crashes/correlations.html on crash-stats | | Marco Castelluccio [:marco] |
| 1308474 | -- | Add the new StartupCrash annotation to SuperSearch | | Adrian Gaudebert [:adrian] |
| 1308476 | -- | Replace the current heuristic for startup crashes by using the new StartupCrash annotation | | Adrian Gaudebert [:adrian] |

14 Total; 5 Open (35.71%); 9 Resolved (64.29%); 0 Verified (0%);

  • Clouseau: Automatically identify changesets that cause regressions [calixte]
  • crash-correlations: Identify correlations for crash signatures [mcastellucio]
  • Provide ability to run custom analysis jobs on crash reports (similar to telemetry analysis jobs)

Note: a cross-variate analysis of FHR data, by Brendan Colloran, may contain useful techniques.

Crash cluster ranking

This category is about how each cluster of crash reports (e.g. those with the same signature) is prioritized, whether by frequency or other means.

  • Take into account crash severity as well as frequency
  • Use Crystal Ball (or other means) to identify how minor crashes on early release channels might become major crashes in later channels
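As a rough illustration of the first item, a ranking that weights frequency by severity could look like the sketch below. The `Cluster` type, the `severity` weights, and the multiplicative score are all assumptions made for this example; they do not describe how crash-stats currently ranks signatures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    signature: str
    crash_count: int
    # Hypothetical weight: higher means worse for the user
    # (e.g. a startup crash might weigh more than a plugin crash).
    severity: float


def rank_clusters(clusters: List[Cluster]) -> List[Cluster]:
    """Order clusters by crash_count * severity, highest score first."""
    return sorted(clusters, key=lambda c: c.crash_count * c.severity, reverse=True)


# Made-up data: the less frequent cluster ranks first once severity is applied.
ranked = rank_clusters([
    Cluster("frequent_minor", crash_count=100, severity=1.0),
    Cluster("rare_severe", crash_count=40, severity=3.0),
])
```

Ranking purely by frequency would put `frequent_minor` first; with the assumed weights, `rare_severe` scores 120 against 100 and moves to the top.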

Crash report comprehensibility

This category is about making the contents of crash reports easier to understand, via better presentation and documentation.

  • Bugs blocking bug 1289675: [Uptime] Crash report comprehensibility
| ID | Priority | Summary | Keywords | Assigned to |
| 1275799 | -- | Add descriptions to crash report fields | | Adrian Gaudebert [:adrian] |
| 1288309 | -- | Improve documentation about individual crash reports | | Nicholas Nethercote [:njn] |
| 1288310 | -- | Improve documentation about analyzing clusters of crash reports | | Nicholas Nethercote [:njn] |

3 Total; 0 Open (0%); 3 Resolved (100%); 0 Verified (0%);

Fuzzing

This category is about fuzzing and similar automatic test exploration tools such as BugHunter.

  • bug 828452 Generate consistent signatures across crash-stats/fuzzing/BugHunter/automated tests (Socorro API for generating signatures?)
  • bug 1289194 Add LibFuzzer support for testing xul code
  • Better isolate components so they can be fuzzed more easily (like the JS shell)
  • Increase hardware available for fuzzing, for both greater throughput and better hardware configuration coverage
  • Increase fuzzing coverage of non-default options
  • Improve gtests, which are a good starting point for fuzzing
  • Run BugHunter with common antivirus software [tomcat]

Dynamic analysis

This category is about the use of dynamic analysis tools.

| ID | Priority | Summary | Keywords | Assigned to |
| 929478 | -- | Make TSan (ThreadSanitizer) usable with Firefox | sec-want | Julian Seward [:jseward] (pto Oct 16-20) |
| 1030826 | -- | Support AddressSanitizer builds on Windows with clang-cl | sec-want | Away for a while |
| 1280637 | -- | Add a TSan-enabled TaskCluster JS shell job | | Steve Fink [:sfink] [:s:] |
| 1284975 | -- | [meta] Make SpiderMonkey clean on UBSan | | Terrence Cole [:terrence] |
| 1288596 | -- | Make SM-tc(msan) a first class test job | | Steve Fink [:sfink] [:s:] |
| 1288993 | -- | Run valgrind-mochitest twice a day as a Tier 2 job | | Joel Maher ( :jmaher) (UTC-5) |
| 1289994 | -- | Use Application Verifier | meta | Cervantes Yu [:cyu] [:cervantes] |
| 1291954 | P3 | Make SM(tsan) a tier 1 build | leave-open, triage-deferred | Steve Fink [:sfink] [:s:] |

8 Total; 5 Open (62.5%); 3 Resolved (37.5%); 0 Verified (0%);

Static analysis

This category is about the use of static analysis tools.

| ID | Priority | Summary | Keywords | Assigned to |
| 1230156 | -- | [meta] Coverity Static Analysis fixes | coverity, meta | |
| 1272513 | -- | Enable -Wshadow warnings | leave-open | Chris Peterson [:cpeterson] |

2 Total; 2 Open (100%); 0 Resolved (0%); 0 Verified (0%);

  • Add more checks to the clang static analysis job [sledru, etc?]

Low-level defect prevention and detection

This category covers low-level changes we can make to the code to prevent entire classes of defects, such as using smart pointers and compiler annotations, and also changes we can make to detect defects, such as adding assertions and internal consistency checks.

  • Bugs blocking bug 1289662: [Uptime] Low-level defect prevention and detection
| ID | Priority | Summary | Keywords | Assigned to |
| 1268766 | -- | Use MOZ_MUST_USE everywhere | | |
| 1272203 | -- | Add mozilla::NotNull to MFBT | | Nicholas Nethercote [:njn] |
| 1276097 | -- | Add a bytecode sanity check | | |
| 1277368 | P3 | [meta] Use mozilla::Result<T, E> for fallible return values in the JS engine | leave-open, meta, triage-deferred | Jan de Mooij [:jandem] |

4 Total; 3 Open (75%); 1 Resolved (25%); 0 Verified (0%);

  • Something like v8's --debug-heap, which checks the GC heap

High-level defect prevention

This category covers high-level changes we can make to the code to avoid entire classes of defects, such as architectural or language changes.

Defective software

This category is about actively tolerating or responding to defective software (OS, drivers).

  • Disable hardware acceleration in the presence of buggy gfx drivers
  • Handle gfx driver resets

Defective hardware

This category is about detecting and tolerating defective hardware: CPUs, memory, disks, etc.

  • Bugs blocking bug 1289666: [Uptime] Defective hardware
| ID | Priority | Summary | Keywords | Assigned to |
| 995652 | -- | Run memtest from the crash reporter | | |
| 1270554 | -- | Run memtest continuously on the live browser | | |
| 1274428 | -- | Mark crashes that happen at invalid instruction pointers? | | |
| 1274628 | -- | Annotate crashes when the code in memory around the crashing instruction differs from the code in the shipped binary | | |
| 1281759 | P3 | Work around mysterious AMD JIT crashes | | Jan de Mooij [:jandem] |
| 1293188 | P3 | Crash in EnterBaseline (can be defective hardware [Mem or VGA]) | crash, triage-deferred | |
| 1293996 | P1 | Crash in adapt_probs | crash, regression | Jean-Yves Avenard [:jya] |
| 1317253 | P5 | Best-effort detection of faulty memory at page-request time | | |

8 Total; 6 Open (75%); 2 Resolved (25%); 0 Verified (0%);

  • Detect if Firefox is mis-installed (e.g. perform checksums on files) and ask the user to reinstall.
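The checksum idea in the item above could be sketched as follows. This is only an illustration under stated assumptions: the manifest format (relative path mapped to a SHA-256 hex digest) and the function names are hypothetical, and nothing here reflects an existing Firefox installer or updater feature.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatched_files(install_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """Return the relative paths that are missing or whose digest differs.

    `manifest` is a hypothetical mapping of relative path -> expected
    SHA-256 hex digest, assumed to be shipped alongside the build.
    """
    mismatched = []
    for relative_path, expected in manifest.items():
        candidate = install_dir / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            mismatched.append(relative_path)
    return sorted(mismatched)
```

A non-empty result would be the signal to ask the user to reinstall; how and when such a check would run is left open here.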

Malware, etc.

This category is about preventing malware, anti-virus, and other third-party code from interfering with Firefox.

  • Windows 10 has better blocking, at least for content processes? (e.g. a DLL whitelist)
  • Reduce export space of xpcom symbols

OOMs

This category is about avoiding and tolerating OOM crashes.

| ID | Priority | Summary | Keywords | Assigned to |
| 1291068 | -- | Large-scale analysis of OOM crash reports with ContainsMemoryReport=1 | | |
| 1291173 | -- | Show important info from memory reports in crash-stats | | Nicholas Nethercote [:njn] |
| 1299747 | -- | Create a tool to track 64k unaligned virtual memory allocation | | Ting-Yu Chou [:ting] (parental leave ~ 12/31) |

3 Total; 0 Open (0%); 3 Resolved (100%); 0 Verified (0%);

  • Increase usage of 64-bit Firefox on Windows: Firefox/win64 [cpeterson, etc.]
  • Discuss common OOM cases with partners [harald?]
  • MemShrink