Performance/2014-02-13

From MozillaWiki

Performance Monthly Review





Thursday, Feb 13, 2014 Agenda

 

A review of all in-process projects.



Incremental Cycle Collection (ICC) - mccr8 (bug 850065)

ETA: landed preffed off by end of year - Firefox 30+

Status:

  • It landed preffed off (dom.cycle_collector.incremental) by the end of 2013.

  • I'm working on getting it enabled in Q1. (bug 911246)

  • The main problem is avoiding running out of memory while running Mochitests.



Keep JS-accessible APIs from blocking the main thread - yoric

Note, this project has been broken up into sub-projects below.



Non-blocking screen capture for b2g, thumbnails, swipe history animation - yoric, layout team

ETA: [waiting for an ETA from roc]

Status: No progress, still waiting.



Session Restore Refactoring - yoric, ttaubert, smacleod, billm

Big story: we finally have improvements, waiting for additional Telemetry to decide in which direction we head now. In other news: fought big startup regression, need Talos.

Lots of work from last year has come together.

Completely e10s compatible.

Metrics found a regression due to workers. We switched approaches and are still investigating, but the thinking is that the current 20-30ms regression is much better than the previous one.



1. caching cookies (expected Q4)

ETA: Landed.

Status: Landed. Needs verification.

2. caching communication

ETA: Q1?

Status: [waiting for Telemetry to determine whether it's useful].

3. counter-measures against Facebook's misuse of history

ETA: Q1

Status: Done.

5. writing session restore less often while on battery

ETA: Q1

Status: Waiting.

6. removing rewrite upon startup

ETA: hopefully, Q4.

Status: Done.

8. collecting data piece-wise

ETA: Q4 for the main course, Q1 for the followup

Status: Done.

9. Clean up garbage from sessionstore.js

ETA: Q1 - Q2

Status: Started, first pieces landed.

10. Telemetry on sessionstore.js itself

ETA: Q4 - Q1

Status: Done.

11. Compress sessionstore.js

ETA: Q1

Status: Not started.

12. Investigating/counteracting large startup regression in FF25

ETA: Q1

Status: Actively working on it

13. Collecting less DOMSessionStorage data

ETA: Q1

Status: Landed.

14. Moving towards broadcasting data and e10s compatibility

ETA: Q1

Status: Landed.

15. Telemetry for Session Restore

ETA: Q1

Status: Prototype.



Making workers more useful - yoric

Big story: Chrome Workers are very slow to launch. Will need plenty of work.



1. lz4 [de]compression

ETA: Q4

Status: landed

2. sqlite access on workers

ETA: Q4

Status: landed

3. Various optimizations to make OS.File faster to launch

ETA: Q1

Status: Landed

4. Investigating Chrome Worker launch speed

ETA: Q1 

Status: Landed, depressing... but still in progress



Removing main thread I/O - yoric

1. NetUtil.asyncCopy / nsIAsyncStreamCopier

ETA: (first part landed, more to come, probably in 2014)

Status: Not very active atm.

2. Native implementation of OS.File.read

ETA: Q1

Status: Working prototype

3. Getting rid of Telemetry main thread I/O - this is most of what has been done recently on this topic

ETA: Q1

Status: landed

4. Convert Storage connections to mozIStorageAsyncConnection and Sqlite.jsm

ETA: tbd

Status: on hold

5. Journaled JSON

ETA: Q2

Status: drafting





Profiler Backend for Mobile - jseward

ETA: Q1 for minimum landable functionality (== CFI+EXIDX+stack-scan on x86_64-linux and arm-android)

Status: 

  • Main bug: 938157

  • CFI+EXIDX+stack-scan for x64-linux, x32-linux and arm-android work and are known to give good quality stack traces, through both our code and system libraries.

  • Performance on arm-android is at least a factor of 4 better than with the old breakpad unwinder.  It achieves 1KHz sampling at around 40% of an A9 core at 1GHz, for an average unwind length of around 30 frames -- back to XRE_Main.

  • Currently in review w/ njn and froydnj.
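
As a rough sanity check on the arm-android figures above, the cycle budget per sample and per frame can be worked out from the numbers quoted in the status. This is only back-of-the-envelope arithmetic: the helper name is hypothetical, and it assumes the quoted CPU share is spent entirely on unwinding and is spread evenly across frames.

```python
def unwind_budget(core_hz, core_fraction, sample_hz, frames_per_unwind):
    """Return (cycles per sample, cycles per frame) for a sampling unwinder.

    Hypothetical helper for illustration only; it assumes the quoted CPU
    share is spent entirely on unwinding and is spread evenly over frames.
    """
    cycles_per_sample = core_hz * core_fraction / sample_hz
    cycles_per_frame = cycles_per_sample / frames_per_unwind
    return cycles_per_sample, cycles_per_frame


# Figures quoted in the status: 1GHz A9 core, ~40% load, 1KHz sampling, ~30 frames.
per_sample, per_frame = unwind_budget(1_000_000_000, 0.40, 1_000, 30)
print(f"{per_sample:.0f} cycles per sample, {per_frame:.0f} cycles per frame")
```

Under those assumptions this comes to roughly 400,000 cycles per sample, or about 13,000 cycles per unwound frame.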



Browser responsiveness benchmark - Phase 1 - vladan, avih, jmaher

ETA: Q2

Status:

  • Related bug 938644: Google's web latency benchmark being added to talos

  • toolchain has been modified to be automated

  • working on issues with tools/setup on our test slaves

  • need to build benchmark in 32 bit mode

  • Needs review and assessment



Talos - Australis Customize animation test (CART) - avih, mconley

ETA: Q1 (awaits landing)

Status:

  • TART is now reasonably generalized to support different animations

  • Performance numbers are not good (e.g. 800ms instead of 150ms).

  • This is a test for the Australis customize mode (Hamburger -> Customize)



Talos: Examine deployment of low-end/common HW - avih, jmaher

ETA: Q2

Status:

  • Not started



OMTC windows perf (nrc/benwa/bas) - avih

- Help with talos results/interpretations.

- BenWa says OMTC on Windows currently regresses performance somewhat. He sees no low-hanging fruit and thinks tiling will solve the performance issue.



Perf mode removal - smaug (help: tn, mstange, avih):

ETA: Q1/2 (landed, backed out from the train)

- Remove a mode where Gecko events are handled instead of native events (like mouse or paint)

- probably impacts all products

- Bug 930793



Addon Manager - irving (additional add-on manager perf fixes)

ETA: Q1

Status:

  • Reduce impact of add-on compatibility check during Firefox updates: WIP on bug 772484, bug 760356



Network Cache Rewrite - jduell

ETA:  Q1 - Firefox 30

Status:  

  • HTTP Cache rewrite status:  we're planning to re-land the new cache (minus the in-memory index and eviction) on mozilla-inbound, for a test run. We're still planning to land the full cache within a week or two.  Knock on wood!



Network predictor (seer): jduell

ETA: Q1 - Firefox 30

  • Database that predicts subresources/domains needed to load HTML page

  • Big win when it's right, but database takes lots of MB and I/O.

  • Currently trying to find right amount of data to store (and sqlite optimization)



Complete asynchronous history API - Marco Bonardo, Asaf Romano

ETA: tbd (depending on new Firefox Desktop process)

Status:  

  • bug 937560 - introduce onDeletePages batch notification - Mano working on a final patch for review

  • Bug 891303 - Async-friendly transaction manager for Places - on hold

  • Bug 834545 - Add new async removePlaces API in mozIAsyncHistory - on hold



Reduce impact of bookmarks backups - Marco Bonardo, Raymond Lee

ETA: Firefox 30 (partial on 29)

Status:  

  • bug 824433 - faster backups are in Nightly, about to uplift to Aurora

  • bug 968177 - reuse backups code for html exports - patch pending review

  • bug 967192 - OS.File usage in all of the import/export code - patch ready, tests conversion in act

  • bug 818584 - don't store duplicate backups (compare hashes) - has an old patch, needs an unbitrot

  • bug 818587 - lz4 compress backups - experimental patch working

  • Telemetry shows 90% of users <100ms, still room for improvement (later)

  • converting to OS.File

  • none of this happens on shutdown anymore
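
The "don't store duplicate backups (compare hashes)" item above (bug 818584) can be sketched roughly as follows. This is a minimal illustration, not the Places implementation: the function names, the choice of SHA-256, the `.json` file pattern, and the assumption that backup file names sort chronologically are all hypothetical.

```python
import hashlib
from pathlib import Path


def content_hash(data: bytes) -> str:
    """Hash the serialized backup; SHA-256 is an arbitrary choice here."""
    return hashlib.sha256(data).hexdigest()


def write_backup_if_changed(backup_dir: Path, name: str, data: bytes) -> bool:
    """Write `data` as a new backup unless the newest backup has the same hash.

    Assumes date-stamped file names, so a lexical sort is also chronological.
    Returns True when a file was written, False when it was skipped.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    existing = sorted(backup_dir.glob("*.json"))
    if existing and content_hash(existing[-1].read_bytes()) == content_hash(data):
        return False  # identical to the latest backup, nothing to store
    (backup_dir / name).write_bytes(data)
    return True
```

A real implementation would more likely keep the hash alongside the backup (for example in its file name) so the previous backup does not have to be re-read each time; the sketch re-hashes the file only to stay short.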



Improve Text Performance - jet

ETA: 

Status:

Fixed...

* Bug 962440 - async font loader: landed the initial implementation

In Progress...

* Bug 752394 - eliminate need to enumerate fonts for localized and postscript names

* Bug 967292 - [Contacts] nsDisplayText overhead is too high



Note: Jet - let's list out the near term projects from your etherpad separately for review. https://etherpad.mozilla.org/textperfworkitems



Reduce impact of thumbnailing - Drew Willicox, Mark Hammond 

ETA: Done, except for https://bugzilla.mozilla.org/show_bug.cgi?id=809056, which has patches.

Status: 



Fennec: Canvas Perf, SkiaGL - James Willcox

ETA: 

Status:



Fennec: Canvas, Checkerboarding, Page Load, Startup Time - Geoff Brown

ETA:

Status:



Fennec: ANR, BHR - Jim Chen 

ANR (App Not Responding) Reporting

  • Status:

  • Just landed native stack unwinding



BHR (Background Hang Reporting)

  • ETA: (tentative) Telemetry backend and dashboard by Q1

  • Status:

  • Ongoing work to make the data useful for devs



Firefox OS - Mike Lee

Status:

  • Power

  • All requested harnesses created and delivered.

  • Automation has setup in place with Hamachi; we're working with them to define what to capture and where to report.

  • Initial baselines

  • Idle power usage is better than Android

  • Frame Uniformity

  • Working with LG to analyze and improve frame rate consistency.

  • Proposed using scrollgraph tool to analyze.

  • Memory

  • Converted all vorbis files to opus to save RAM; in some cases saving ~800KB.

  • Part of team in Taipei for mini-workweek focused on Tarako (128 MB device)

  • RIL worker using too much memory





Telemetry server - mreid, jonasfj

Status:

  • Histogram Dashboards @ telemetry.mozilla.org are currently stable

  • Dashboard has a twitter notification widget if/when there are noteworthy events

  • New Dashboards:

  • jchen's ANR+BHR above

  • Server infrastructure is stabilized (AWS CloudFormation)

  • We did not observe the expected "large increase" in the # of pings submitted per Bug 863872

  • Work underway to detect and alert on unusual telemetry submission patterns (per channel). This will automatically detect and notify about future cases where the submission rate drops in production (as in bug 962153) [:trink]

  • Basic infra for running scheduled telemetry jobs is in place (and running ANR, SlowSQL, Flash Version data exports)

  • Coordinator API v1 - list files, create analysis jobs

  • New preference for ignoring data points with small numbers of submissions

  • See instructions on rolling out your own custom dashboard and doing ad-hoc analyses at the bottom of this document
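
The submission-pattern alerting mentioned above (detecting a per-channel drop in submission rate) could look something like the sketch below. It is an assumption-laden illustration, not the telemetry server's actual detector: the function name, the median baseline, the 0.5 ratio, and the sample counts are all made up.

```python
from statistics import median


def submission_rate_dropped(recent_counts, todays_count, ratio=0.5):
    """Return True when today's ping count falls well below the recent median.

    `recent_counts` is a list of daily ping counts for one channel. The
    median baseline and the 0.5 ratio are illustrative assumptions.
    """
    if not recent_counts:
        return False  # no baseline yet, so nothing to compare against
    return todays_count < ratio * median(recent_counts)


# Made-up daily counts for a single channel, for illustration only.
history = [1000, 1100, 950, 1050]
print(submission_rate_dropped(history, 400))
print(submission_rate_dropped(history, 900))
```

A median baseline is used here because a single unusual day in the history shifts it less than it would shift a mean; any production detector would need tuning per channel.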



Diagnose and report desktop power usage - rvitillo

ETA: 29

Status:

  • I am nearly done tracking down the low-hanging fruit in Alexa’s top 100; in the meantime jmaher is setting up a cluster which will collect power usage on many more websites and upload the data to datazilla.

  • I recently extended the power benchmarks to collect more information like the number of cycles spent in the different c-states, GPU power usage and number of wake ups.

  • Once we have a reasonable picture of the power bottlenecks during idle, we will proceed to investigate the power profile of FF while scrolling on different websites vs different browsers.



Profiler Improvements - vstanchev, bgirard, aklotz

Status:

  • Several cleopatra performance improvements including port to flex layout and dealing better with large profiles, lots of markers and paint markers

  • New cleopatra features: Add comments, better sample highlighting

  • Cleopatra bug fixes: Stack highlights, wrong sample highlighting logic, fix sample layout when resolution is <1ms, normalize timeline heights

  • Profiler Addon: Changed hotkeys, tweak start/stop buttons to reduce confusion

  • Main thread I/O data is now being fed into Telemetry for everyone on Nightly

  • Underway: Don't sample sleeping threads, timer based markers, file name with IO reporting, correlating power harness with profile data, improve talos profiling



Project Proposals / Areas that need work

A review of new project ideas.



Project Area



Other Notable Fixes

A list of other notable perf related bug fixes that don't fit within any of the projects defined above.



Deploying your own custom dashboard for probes relevant to your team: http://jonasfj.dk/blog/2014/01/custom-telemetry-dashboards/

You can also do custom analyses of Telemetry data directly on our servers: http://jonasfj.dk/blog/2013/11/telemetry-rebooted-analysis-future/





THE ASYNC TEAM



* The Async Team

  David Reichenbach-Teller

  Marco Bonardo

  Felipe Gomes

  Tim Taubert

  Asaf Romano

  Paolo Amadini



* Async Team Goals

  1. Create and support tools that allow parallelism and background execution

     - Examples are OS.File and Promises

     - Work with other teams as necessary

  2. Convert existing code to leverage parallelism (with scoped projects)

     - Examples are Session Restore, Bookmarks, and Login Manager



* Async Whiteboard Tags

  - Put [Async] on bugs that you think can be in the team scope.

    We'll triage the list and update the tag to:

    - [Async:team] if we plan to work on the bug

    - [Async:ready] if we can provide support or mentorship

  - We'll also mark bugs with [Async:blocker] to indicate

    work from other teams that we're actively tracking



* Async Meetings

  - Every two or three weeks

    - Ensure that we have triaged all [Async] bugs

    - Collect progress reports and summarize the work done

    - Discuss feedback we received and general direction

    - Review which [Async:team] bugs we will work on next