Firefox Core Engineering: Difference between revisions

updated team, status of current activities
m (adam's nick)
(updated team, status of current activities)
Line 6: Line 6:
The purpose of this team is to address needs that fall between Toolkit and Platform, with an emphasis (currently) on improving stability, quality, and performance – supported by empirical data. As such, we overlap a bit with everyone from Platform through Toolkit, Data, and more.
The purpose of this team is to address needs that fall between Toolkit and Platform, with an emphasis (currently) on improving stability, quality, and performance – supported by empirical data. As such, we overlap a bit with everyone from Platform through Toolkit, Data, and more.


This team grew, in part, out of the Performance Engineering team and owns that team's previous infrastructure – performance-related dashboards on telemetry.mozilla.org, the symbolication server, and more. It also includes the install & update team.
This team grew, in part, out of the Performance Engineering team and owns that team's previous infrastructure – performance-related dashboards on telemetry.mozilla.org, the symbolication server, and more. It also includes the installer & updater applications.


== Personnel ==
== Personnel ==
Line 17: Line 17:
* Robert Strong (:rstrong)
* Robert Strong (:rstrong)
* Gabriele Svelto (:gsvelto)
* Gabriele Svelto (:gsvelto)
* Doug Thayer (:dthayer)
* David Durst (:ddurst)
* David Durst (:ddurst)


Line 58: Line 59:
* '''ChromeHangs:''' jquery csv issue resolved, backfilled data from 3/07 to 6/05
* '''ChromeHangs:''' jquery csv issue resolved, backfilled data from 3/07 to 6/05
* '''Update Orphaning:''' functional
* '''Update Orphaning:''' functional
* '''Stability Dashboard:''' functional


=== symbolapi.mozilla.org ===
=== symbolapi.mozilla.org ===
This is the [[Snappy_Symbolication_Server|symbolication server]] (aka "Snappy Symbolication Server") used by platform developers and performance dashboards. It is '''not''' used for the analogous process on Socorro.
This is the [[Snappy_Symbolication_Server|symbolication server]] (aka "[https://github.com/mozilla/Snappy-Symbolication-Server Snappy Symbolication Server]") used by platform developers and performance dashboards. It is '''not''' used for the analogous process on Socorro.


== Historical knowledge areas ==
== Historical knowledge areas ==
Line 76: Line 78:
We need to reduce known blind spots and barriers to getting data. For this, our goals are to:
We need to reduce known blind spots and barriers to getting data. For this, our goals are to:
* enable client-side stackwalking and send basic stack traces with crash pings (beginning in Nightly/Aurora, see {{Bugzilla|1280469}})
* enable client-side stackwalking and send basic stack traces with crash pings (beginning in Nightly/Aurora, see {{Bugzilla|1280469}})
* enable content process crash reports ({{Bugzilla|1293656}})
* <strike>enable content process crash reports ({{Bugzilla|1293656}})</strike>
* differentiate between process types in crash pings ({{Bugzilla|1310664}})
* <strike>differentiate between process types in crash pings ({{Bugzilla|1310664}})</strike>
* process stack data in crash pings into a queryable result ({{Bugzilla|1310695}})
* process stack data in crash pings into a queryable result ({{Bugzilla|1310695}})
* create CrashSender to handle crash pings instead of Gecko ({{Bugzilla|1310703}})
* create CrashSender to handle crash pings instead of Gecko ({{Bugzilla|1310703}})
* enable client-side stackwalking and send basic stack traces with crash pings on Beta/GA
* enable client-side stackwalking and send basic stack traces with crash pings on Beta/GA
See the [[Firefox_Core_Engineering/Get_More_Data_Faster|roadmap here]].


==== Stability Dashboard for Relman ====
==== Stability Dashboard for Relman ====
Relman has been using arewestableyet.com and related graphs to understand stability by build and channel. This works, but it relies on ADI and crash-stats rather than telemetry, an approach that is known to be unreliable. For this, our goals are to:
Relman has been using arewestableyet.com and related graphs to understand stability by build and channel. This works, but it relies on ADI and crash-stats rather than telemetry, an approach that is known to be unreliable. For this, our goals are to:
* create a dashboard like arewestableyet.com, but based on telemetry ({{Bugzilla|1297146}})
* <strike>create a dashboard like arewestableyet.com, but based on telemetry ({{Bugzilla|1297146}})</strike> '''Stability dashboard''': https://telemetry.mozilla.org/crashes/
* establish confidence levels based on kilousagehours by comparing telemetry-based stability data with ADI-based stability data
* establish confidence levels based on kilousagehours by comparing telemetry-based stability data with ADI-based stability data


==== Stabilize symbolapi.m.o ====
==== Stabilize symbolapi.m.o ====
The symbolication API service is used by platform developers for debugging. It may also be used as part of the processing step for stacks received via crash pings. But there have historically been issues with its performance ({{Bugzilla|1244589}}). Stabilizing this means:
The symbolication API service is used by platform developers for debugging. It may also be used as part of the processing step for stacks received via crash pings. But there have historically been issues with its performance ({{Bugzilla|1244589}}). Stabilizing this means:
* rewrite symbolapi.m.o, adding tests and fixing caching (DONE)
* <strike>rewrite symbolapi.m.o, adding tests and fixing caching</strike>
* load test the rewrite to ensure it improves on current uptime and load handling (PENDING)
* load test the rewrite to ensure it improves on current uptime and load handling (PENDING)
* coordinate with Ops to set up regular deployment process and transfer ownership (PENDING)
* coordinate with Ops to set up regular deployment process and transfer ownership (PENDING)
Line 108: Line 112:
XUL is supposed to go away, but we don't appear to know what the performance implications could or will be. This work builds on Neil Deakin's 2015 experiment to shed some light on where we need to focus our optimization/change efforts.
XUL is supposed to go away, but we don't appear to know what the performance implications could or will be. This work builds on Neil Deakin's 2015 experiment to shed some light on where we need to focus our optimization/change efforts.


==== Orphan remediation ====
==== Updater and Orphan remediation ====
Remediation efforts have been tested on both system add-on capable and non-capable versions (44.x and 43.0.1, respectively). Analysis thus far confirms the reach but not the effectiveness or rate of conversion that we'd hoped for. This means:
Remediation efforts have been tested on both system add-on capable and non-capable versions (44.x and 43.0.1, respectively). Analysis thus far confirms the reach but not the effectiveness or rate of conversion that we'd hoped for. This means:
* continue the download instead of starting over after NS_ERROR_DOCUMENT_NOT_CACHED occurs (already fixed in Firefox 49) ({{Bugzilla|1272585}})
* continue the download instead of starting over after NS_ERROR_DOCUMENT_NOT_CACHED occurs (already fixed in Firefox 49) ({{Bugzilla|1272585}})
* continue the download instead of starting over after other networking errors occur ({{Bugzilla|1309124}})
* continue the download instead of starting over after other networking errors occur ({{Bugzilla|1309124}})
* download the update MAR file unthrottled (already landed) ({{Bugzilla|1309125}}, {{Bugzilla|1309668}})
* <strike>download the update MAR file unthrottled (already landed) ({{Bugzilla|1309125}}, {{Bugzilla|1309668}})</strike>
* serve a partial MAR file to Firefox 43.0.1 clients ({{Bugzilla|1309130}})
* <strike>serve a partial MAR file to Firefox 43.0.1 clients ({{Bugzilla|1309130}})</strike>
* push either a system or hotfix add-on that changes the download throttle preference to 0
* push either a system or hotfix add-on that changes the download throttle preference to 0
* run another method (non sysaddon, non SHIELD?, etc) to urge 43.0.1 users to upgrade
* change compression to LZMA for updates ({{Bugzilla|641212}})
* change compression to LZMA for updates ({{Bugzilla|641212}})
* run another method (non sysaddon, non SHIELD?, etc) to urge 43.0.1 users to upgrade


==== Install UI ====
==== Install UI ====
The install UI is outdated (and too big) and needs to be updated.  
* The install UI is outdated (and too big) and needs to be updated. ({{Bugzilla|893505}})
* most recent mockups: https://mozilla.invisionapp.com/share/Y776FIBWS#/screens
** most recent mockups: https://mozilla.invisionapp.com/share/Y776FIBWS#/screens


==== Windows 64 ====
==== Windows 64 ====
We want to start moving users to 64-bit when appropriate:
We want to start moving users to 64-bit when appropriate:
* stub installer should automatically select 32-bit or 64-bit ({{Bugzilla|797208}})
* <strike>stub installer should automatically select 32-bit or 64-bit ({{Bugzilla|797208}})</strike>




=== Current projects ===
=== Current projects ===
==== 2016 Q4 goals ====
==== 2016 Q4 goals ====
* landing of client-side stackwalking
* landing of client-side stackwalking (DONE)
* create separate content process crash pings
* create separate content process crash pings (DONE)
* start querying stacks received from crash pings
* start querying stacks received from crash pings (IN PROGRESS)
* relaunch of symbolapi.m.o -- now with tests and safe cache management
* relaunch of symbolapi.m.o -- now with tests and safe cache management (IN QA)
* completion of definition phase of Flash-blocking & UI project
* completion of definition phase of Flash-blocking & UI project (IN PROGRESS)
* LZMA compression for updates
* LZMA compression for updates (IN REVISION)
* updated Install UI
* updated Install UI (IN PROGRESS)
* standardize orphan remediation process with respect to GA release cycle
* standardize orphan remediation process with respect to GA release cycle (IN ANALYSIS)


== Potential future projects ==
== Potential future projects ==
This list should be considered a work in progress. Decisions will be reflected for a particular quarter.
This list should be considered a work in progress. Decisions will be reflected for a particular quarter.
* Profiling WebExtensions (via dev tools)
* Assisting with measuring (and addressing) jank and hang
* Assisting with measuring (and addressing) jank and hang

