Performance/2013-12-05

Performance Review - Thursday, Dec 5, 2013

Active Projects - https://wiki.mozilla.org/Performance#Active

A review of all in-process projects.

Incremental Cycle Collection (ICC) - mccr8 (bug 850065)

ETA: landed preffed off by end of year - Firefox 30+
Status:

  • Finished landing refactorings to prepare for ICC
  • I'm working on investigating some issues with legacy classes, getting reviews

Keep JS-accessible APIs from blocking the main thread - yoric

Note, this project has been broken up into sub-projects below.

Non-blocking screen capture for b2g, thumbnails, swipe history animation - yoric, layout team

ETA: [waiting for an ETA from roc]
Status: [waiting for an update from roc]
Does this overlap with the reduce impact of thumbnailing project (Drew and Mark): Yes
In terms of who's working on what, though, not really -- the reduce impact of thumbnailing project is basically a front-end project, and non-blocking capture sounds more like a back-end graphics project. But non-blocking screen captures will certainly benefit front-end callers.

Async error reporting - yoric

ETA: complete
Status:

Session Restore Refactoring - yoric, ttaubert, smacleod, billm

1. caching cookies (expected Q4)
ETA: Q4
Status: in progress
2. caching communication
ETA: Q4 Q1
Status: landing final blockers
3. counter-measures against Facebook's misuse of history
ETA: Q1
Status:

  • Facebook fixed on their side
  • We have designed a first set of countermeasures, which will be implemented during Q1
  • Found interesting bugs while investigating the issue, which contribute to the problem

4. removing synchronous fallback
ETA: Q4
Status: Done
5. writing session restore less often while on battery
ETA: Q4
Status: Waiting for contributor to wake up.
6. removing rewrite upon startup
ETA: hopefully, Q4
Status: Waiting for contributor to land blocker.
7. making collection e10s compliant
ETA: complete
Status:
8. collecting data piece-wise
ETA: Q4 for the main course, Q1 for the followup
Status: Q4 part has landed, Q1 part in progress
9. Clean up garbage from sessionstore.js
ETA: Q1 - Q2
Status: Designing, experimenting, gathering feedback.
10. Telemetry on sessionstore.js itself
ETA: Q4 - Q1
Status: First part landed, second part in progress.
11. Compress sessionstore.js
ETA: Q1
Status: Not started

Making workers more useful - yoric

1. lz4 [de]compression
ETA: Q4
Status: Weird oranges.
2. sqlite access on workers
ETA: Q4
Status: Waiting for (final?) review.
3. reducing memory usage of OS.File
ETA: complete
Status: Reduced the memory use of OS.File by 0.9MB on Desktop (non-scientific measurements) and by ~1.8MB on B2G.
4. a number of new OS.File APIs
ETA: complete
Status:

Removing main thread I/O - yoric

1. URL Classifier
ETA: complete
Status:
2. NetUtil.asyncCopy / nsIAsyncStreamCopier
ETA: (first part landed, more to come, probably in 2014)
Status:

Profiler Backend for Mobile - jseward

ETA: Q4 for minimum landable functionality (== CFI+EXIDX+stack-scan on x86_64-linux and arm-android)
Status: New fast CFI/stack-scan unwind library exists, for x86_64-linux and arm-android. x86_64-linux initial results show a 20x unwind performance speedup compared with Breakpad (circa 310 insns/frame vs 6600 insns/frame). Integration with SPS: x86_64-desktop works, ARM is in progress. Tracking bug 938157.
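
As a quick sanity check on the quoted figures, the ratio can be worked out directly. This is only a sketch: it assumes instructions per frame is a fair proxy for unwind cost, and the constant and function names are illustrative, not taken from the profiler code.

```python
# Instruction counts per unwound frame, as quoted in the status above.
BREAKPAD_INSNS_PER_FRAME = 6600
NEW_UNWINDER_INSNS_PER_FRAME = 310


def speedup(old_cost: float, new_cost: float) -> float:
    """Return how many times cheaper new_cost is compared with old_cost."""
    return old_cost / new_cost


ratio = speedup(BREAKPAD_INSNS_PER_FRAME, NEW_UNWINDER_INSNS_PER_FRAME)
print(f"{ratio:.1f}x")  # prints "21.3x", consistent with the ~20x claim
```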

Reducing time to first paint - Phase 1 - vladan

ETA: Q1
Status: No progress this month

Browser responsiveness benchmark - Phase 1 - vladan, avih, jmaher

(done Australis investigations, maintenance, talos, TART, OMTC)
ETA: Q1
Status:

Addon Manager - irving (additional add-on manager perf fixes)

ETA: End of 2013 - Firefox 29
Status:

  1. Recursive directory scan: Telemetry shows that a number of externally installed add-ons are modifying files that we cache in the StartupCache (mostly antivirus toolbars and search hijack "monetization" add-ons). We're looking for alternative solutions that reduce the startup overhead without risking broken profiles due to stale cached code for these add-ons.
  2. Analyze telemetry; results for addons that appear in more than ~0.5% of Windows Nightly profiles at https://docs.google.com/a/mozilla.com/spreadsheet/ccc?key=0AkQufLnofVhpdHJ4R0ZNZDJDUWtqREM2QjJ1SUh4Znc&usp=drive_web#gid=2
    • Add-on manager startup telemetry pointed at an issue on Android; landed bug 944006 and will watch telemetry to see if we get a noticeable improvement
    • A few notable add-ons are installed unpacked and have many files; approaching those authors to switch to packed add-ons may give us some improvement
    • Some bootstrap add-ons have particularly slow startup and shutdown; profiling and planning to contact developers to see if we can improve them
  3. Make it possible to cancel the add-on compatibility check (bug 772484): Supporting patch landed, main patch in progress (but not getting a lot of attention relative to other tasks)

Network Cache Rewrite - jduell

ETA: Q1 - Firefox 31
Status: planning to land in Q1, just after the Firefox 30 fork (so in Firefox 31)

  • writing a crash-recoverable in-RAM index is taking more time than expected.

Bonus networking bug: (just discovered, expect it will be resolved quickly): bug 945779 - seer thread may be taking upwards of 15% CPU during pageload

Complete asynchronous history API - Marco Bonardo, Asaf Romano

ETA:
Status:

Reduce impact of bookmarks backups - Marco Bonardo, Raymond Lee

ETA:
Status:

Reduce Storage connections main-thread operations - Marco Bonardo, David Teller

ETA:
Status:

Improve Text Performance - jet

Note: Jet - let's list out the near term projects from your etherpad separately for review. https://etherpad.mozilla.org/textperfworkitems
ETA:
Status:

  • bug 941470 - testing and review of harfbuzz update
    Used textbench to compare existing code to update. Need to work on being able to test harfbuzz better, the word cache introduces variability into the results.
  • bug 934770 - Still need to figure out the right approach for splitting text runs in a way that hooks into the existing reflow interruption mechanism.
  • Working on font loading related bugs before the break. Based on Vladan's data and the Telemetry data, it looks like the name loading routines often block (see bug 859558#c5 and bug 752394#c2). The Telemetry data indicates InitFaceNameLists is often hit. Looking over the code, there are some easy, small changes we can make:
    • bug 947025 - timelimit RunLoader passes to 100ms - Patch landed.
    • bug 947812 - use DirectWrite API to fetch postscript/fullname rather than reading tables directly. - Patch under review.
  • The only way to really avoid all chrome hangs is to use an approach that falls back in situations where font enumeration takes too long. I'm leaning towards a process of falling back, loading font data on a separate thread and then restyling after the load thread completes.
  • bug 752394 - async loading of fullname/postscript data when sync fetch takes too long. Sketched out what I want to do, working on patch is next up. Tricky part is to avoid having to make large chunks of gfx font code thread-safe.
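
The fallback idea in the last two items can be illustrated with a small sketch. It is not the gfx code or the planned patch: `load_face_names`, `lookup_with_fallback`, the time budget and the font names are all hypothetical. It also simplifies by always running the load on a worker thread and having the caller wait only up to a budget, whereas the notes describe a sync fetch that falls back when it takes too long.

```python
import concurrent.futures
import time

# Hypothetical fullname -> postscript name data, standing in for real font tables.
FAKE_NAMES = {"Example Sans Bold": "ExampleSans-Bold"}


def load_face_names(delay_s):
    """Stand-in for slow font name enumeration (hypothetical)."""
    time.sleep(delay_s)
    return dict(FAKE_NAMES)


def lookup_with_fallback(pool, delay_s, budget_s, on_loaded):
    """Wait up to budget_s for the names; otherwise fall back.

    Returns the names if they arrive within the budget. On timeout it
    returns None at once (the caller uses fallback fonts) and arranges for
    on_loaded(names) to run when the load finishes, e.g. to restyle.
    """
    future = pool.submit(load_face_names, delay_s)
    try:
        return future.result(timeout=budget_s)
    except concurrent.futures.TimeoutError:
        # Note: this callback runs on the worker thread; a real browser
        # would dispatch the restyle back to the main thread instead.
        future.add_done_callback(lambda done: on_loaded(done.result()))
        return None


restyled = []
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
    fast = lookup_with_fallback(pool, delay_s=0.0, budget_s=0.5,
                                on_loaded=restyled.append)
    slow = lookup_with_fallback(pool, delay_s=0.2, budget_s=0.01,
                                on_loaded=restyled.append)
```

In this sketch the fast lookup returns the names directly, while the slow one returns `None` immediately and delivers the names to the callback once the worker finishes. Keeping the shared state down to a single result handed over on completion is one way to sidestep making large parts of the font code thread-safe.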

Reduce impact of thumbnailing - Drew Willcoxon, Mark Hammond

ETA: Soon...
Status: Done, except for https://bugzilla.mozilla.org/show_bug.cgi?id=809056, which has patches.
impacts desktop

Fennec: Canvas Perf, SkiaGL - James Willcox

ETA: For Fx25, we introduced SkiaGL for NVIDIA
Status:

  • Here is a Fx25-specific issue which we have a fix for but haven't needed: bug 923912 - SkiaGL PDF.js demo rendering slow and buggy
  • For Fx26, we enabled it for all GPUs: bug 902462 - Enable SkiaGL on all Android GPUs, not just NVIDIA
  • Project Page: https://wiki.mozilla.org/Mobile/Projects/SkiaGL <== you will see some b2g work in backlog
  • Currently working on making SkiaGL work for all content rendering instead of just canvas

Fennec: Canvas, Checkerboarding, Page Load, Startup Time - Geoff Brown

ETA:
Status:

Fennec: ANR, BHR - Jim Chen

ANR (App Not Responding) Reporting
About:

  • Collect telemetry about hangs on the Fennec main thread
  • Used to identify slow code or deadlocks on user devices
  • bug 833990 - Android App Not Responding (ANR) Reporting

ETA: Improved dashboard by Q4
Status:

  • Reporting in production since Fx22
    • Enabled on Nightly and Aurora channels
    • bug 826053 - Detect and report ANRs through our own channel
  • Dashboard in progress
    • bug 834086 - Server backend for Android App Not Responding (ANR) reporting

BHR (Background Hang Reporting)
About:

  • Collect telemetry about hangs on background threads
  • Used to identify slow code and deadlocks on user devices
  • Verify responsiveness goals in actual usage

ETA: (tentative) Telemetry backend and dashboard by Q1
Status:

  • Reporting in production since Fx28
    • Enable for compositor thread
    • bug 909974 - Background thread hang monitoring
    • bug 932865 - Background thread hang reporting
    • bug 940737 - Monitor IPC thread hangs using BackgroundHangMonitor
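
The general idea behind hang monitoring can be sketched as a heartbeat check. This is a minimal illustration only and does not reflect the actual BackgroundHangMonitor API: the `HangMonitor` class, its method names and the 0.15s threshold are all assumptions made for the example.

```python
import time


class HangMonitor:
    """Tracks per-thread heartbeats and reports threads that went quiet."""

    def __init__(self, threshold_s, clock=time.monotonic):
        self.threshold_s = threshold_s
        self.clock = clock          # injectable so the demo is deterministic
        self.last_beat = {}         # thread name -> time of last heartbeat

    def notify_activity(self, thread_name):
        """Called by a monitored thread each time it makes progress."""
        self.last_beat[thread_name] = self.clock()

    def check(self):
        """Return names of threads silent for longer than the threshold."""
        now = self.clock()
        return sorted(name for name, beat in self.last_beat.items()
                      if now - beat > self.threshold_s)


# Deterministic demo driven by a fake clock instead of real threads.
fake_now = [0.0]
monitor = HangMonitor(threshold_s=0.15, clock=lambda: fake_now[0])
monitor.notify_activity("Compositor")
monitor.notify_activity("IPC")
fake_now[0] = 0.1
monitor.notify_activity("IPC")      # IPC keeps making progress
fake_now[0] = 0.2
hung = monitor.check()              # Compositor has been silent for 0.2s
print(hung)                         # prints "['Compositor']"
```

A real reporter would presumably also capture a stack for the hung thread and submit it through Telemetry; that part is left out here.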

Firefox OS - Mike Lee

Power Harness

  • Continued progress; expecting to announce updated monitoring tool next week.
  • Shipping harnesses to Automation team week after next.

will-animate CSS Property (1.3)

  • Keeps portions of gaia-apps in a scrollable layer. Showing 20~40ms perf improvement. Landing in a few key gaia apps.
  • Working with Benoit Girard on Graphics team.

Scrolling FPS

  • Improved Contacts scrolling in 1.2 by ~7fps via bugs 942397 & 942398

Launch Latency

Memory

  • Tarako (128MB RAM) device upcoming (1.3)
  • Supporting work
    • bug 945973: System-wide FxOS memory reporter tool from Nicholas Nethercote's Low-level System tools team.
    • bug 917717: Add some memory consumption tests on datazilla; collaboration between FxOS Perf and Automation Teams.

Telemetry server - mreid, jonasfj

ETA:
Status:

  • Dashboards @ telemetry.mozilla.org are currently broken. Jonas is working on it, and expects things to be fixed by the end of next week (13 Dec) at the latest.
  • bug 946401 In an attempt to communicate things like this to the user, we want to incorporate a notification stream into the dashboard (most likely as a twitter feed)
  • Server infrastructure is stabilizing (AWS CloudFormation)
  • We are expecting a possible large increase in the number of pings submitted, per bug 863872 - FHR reporting Telemetry enabled on more systems than for which we are receiving pings
  • Old Telemetry Hadoop cluster is now offline - exported 1 year of historical data

Diagnose and report desktop power usage - rvitillo

ETA: Firefox 29
Goals:

  1. write a suite of automated tests to collect desktop power usage on Windows 8, OSX 10.9 and Linux (Ubuntu 13.10),
  2. compare how Firefox behaves when idle w.r.t. Chrome, IE and Safari,
  3. reduce power usage during idle due to timers,
  4. repeat the exercise on major websites (e.g. www.techweekeurope.co.uk/news/microsoft-ie-browser-energy-power-saving-118619).

Project Proposals / Areas that need work

A review of new project ideas.

Project Area

FxOS Performance Wiki: https://wiki.mozilla.org/B2G/Performance

Other Notable Fixes

A list of other notable perf related bug fixes that don't fit within any of the projects defined above.