Performance/2013-09-12

From MozillaWiki

Performance Review - Thursday, Sep 12, 2013

Active Projects - https://wiki.mozilla.org/Performance#Active

A review of all in-progress projects.

Incremental Cycle Collection (ICC) - mccr8 (bug 850065)

ETA: Firefox 27
Status:

Keep JS-accessible APIs from blocking the main thread - yoric

ETA:
Status:

  • General Tooling
    • Mechanism for making asynchronous services shutdown-safe. ETA for v1: FF 26.
    • Making OS.File shutdown-safe, ETA for v1: FF 26.
    • Making Session Restore shutdown-safe, ETA for v1: FF 26.
    • Making Add-On Manager shutdown-safe, ETA for v1: FF 26 (handled by irving).
    • Improving asynchronous error-reporting, part 1, landed (with paolo).
    • Improving asynchronous error-reporting, part 2, ETA: FF 27 (with paolo, JSAPI team).
  • Session Restore Refactoring (with ttaubert, smacleod)
    • Caching results proved disappointing in most cases. We spent some time tracking down the cause and found the main bottleneck to be cookie collection. Patches are on the way. ETA: FF 27.
    • e10s refactoring is basically ready, we just want more tests before landing. ETA: FF 27.
    • Caching communications postponed until we have landed e10s. ETA: FF 27.
    • General clean-up in progress, including two long-standing cases of data loss.
  • Making workers more useful
    • sqlite for chrome workers (handled by yzen), ETA for v1: FF 27.
    • OS.File refactoring has been blocked by a host of Worker bugs. Deprioritized for the moment. No ETA.
  • Making Telemetry I/O non-blocking (low priority)
    • Refactoring in progress. ETA of next steps: FF 27.
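The notes do not describe how the shutdown-safety mechanism works, so the following is only a minimal sketch of one possible approach, assuming a "barrier" on which services register work that must finish before shutdown continues. The names ShutdownBarrier, add_blocker and FileWriter are hypothetical and are not taken from the Firefox code base.

```python
import asyncio


class ShutdownBarrier:
    """Hypothetical barrier: shutdown proceeds only once every blocker is done."""

    def __init__(self):
        # Maps a descriptive name to a zero-argument callable returning an awaitable.
        self._blockers = {}

    def add_blocker(self, name, make_awaitable):
        self._blockers[name] = make_awaitable

    async def wait(self):
        # Start all blockers concurrently and wait for every one to finish.
        names = list(self._blockers)
        await asyncio.gather(*(self._blockers[n]() for n in names))
        return names


class FileWriter:
    """Toy asynchronous service whose pending writes must be flushed before exit."""

    def __init__(self):
        self.pending = []
        self.flushed = []

    def write(self, data):
        self.pending.append(data)

    async def flush(self):
        await asyncio.sleep(0)  # stand-in for real asynchronous I/O
        self.flushed.extend(self.pending)
        self.pending.clear()


async def main():
    barrier = ShutdownBarrier()
    writer = FileWriter()
    barrier.add_blocker("FileWriter: flush pending writes", writer.flush)
    writer.write("session-state")
    # At shutdown, wait for all registered blockers instead of exiting right away.
    await barrier.wait()
    return writer


if __name__ == "__main__":
    print(asyncio.run(main()).flushed)
```

Running the blockers concurrently rather than one after another is a design choice in this sketch; it keeps a slow service from delaying the start of the others.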

Profiler Backend for Mobile - jseward

ETA: NewLib-standalone demo: end of October. Incorporated in M-C: end of year.
Status: Compared the performance of CFI unwinding in Breakpad vs Valgrind; found Valgrind to be approximately 30x faster. Even with my best hacks, Breakpad CFI/EXIDX unwinding comes in at around 6600 instructions per frame, which is terrible. Wrote two blog posts about it. Investigation shows it will be difficult to achieve the same performance in Breakpad without a major rearchitecture. Breakpad is also single-threaded and overly complex for our needs. I have therefore designed and started work on a new unwinder library tailored specifically to our needs, taking code from Breakpad where possible.

Reducing time to first paint - Phase 1 - vladan

ETA: Firefox 27
Status:

  • Worked on other tasks over the past weeks; I'll make this a priority again for September.

Replace Addon Manager SQLite with JSON file - irving

ETA: Firefox 26
Status:

  • All core patches landed; some follow-up bugs still in progress
  • Key issue is shutdown-time async I/O (bug 911621, bug 913899)

Addon Manager - irving (additional add-on manager perf fixes)

ETA: Incremental improvements landing over the next few cycles
Status:

Smooth Tab Animation - avih

ETA: No active projects beyond fx-team's tab animation work on UX.
Status: Helping fx-team analyze TART results.

TART - avih, jmaher

ETA: Firefox 26
Status:

  • TART landed.
  • Proving reliable and useful; it has already detected regressions, some of which have since been fixed (with the fixes reflected in TART).
  • Updated the talos wiki.

Replace talos tsvg, tscroll with the new X versions - avih, jmaher

ETA: Firefox 26
Status:

  • tsvgx and tscrollx are now in production and are already detecting regressions that the old tests missed.
  • Seems useful and reliable so far.
  • Some bimodal results on Windows 7, but most probably unrelated to tscrollx.
  • The old tests ran in parallel for a few weeks and have now been discontinued.
  • Updated the talos wiki.

Network Cache Rewrite - jduell

ETA: Firefox 29+ (Q4)
Status: The basic news is that the HTTP cache rewrite is now in review, and we should be landing it pref'd off fairly soon. Then we'll explore turning it on (in which trees and which products). There's still some code for the new cache that needs to be added (mostly for cache entry eviction) before we can turn it on for any real users. Once we land that and do a last round of performance benchmarking, I think we'll want to expose it to some audience (possibly just an internal set of volunteers at first), and then hopefully proceed quickly to wider A/B testing on Nightly.

I also expect there will be followup work for some perf improvements, notably reordering necko event delivery to use a priority model instead of FIFO, and that may be post-Q4.
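To illustrate the difference between FIFO and priority-based event delivery mentioned above, here is a minimal sketch using a binary heap. It is not necko code: PriorityEventQueue, the event names and the priority values are all hypothetical, and lower numbers are assumed to mean "deliver first".

```python
import heapq
import itertools


class PriorityEventQueue:
    """Hypothetical queue delivering events by priority instead of arrival order."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker, so events with the same
        # priority are still delivered in the order they were posted.
        self._counter = itertools.count()

    def post(self, event, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), event))

    def drain(self):
        """Pop every queued event and return them in delivery order."""
        order = []
        while self._heap:
            _, _, event = heapq.heappop(self._heap)
            order.append(event)
        return order


if __name__ == "__main__":
    queue = PriorityEventQueue()
    queue.post("image-data", priority=2)
    queue.post("html-data", priority=0)
    queue.post("css-data", priority=1)
    queue.post("html-done", priority=0)
    # A FIFO queue would deliver "image-data" first; here it comes last.
    print(queue.drain())
```

The tie-breaking counter keeps the queue FIFO within a single priority level, which avoids reordering events that belong to the same stream.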

Complete asynchronous history API - Marco Bonardo, Asaf Romano

ETA: Firefox 27
Status:

  • delayed due to issues with guids, but should still be on track for the 27 release
  • patches pending review for async Places transaction manager
  • working out some regressions due to recent changes

Reduce impact of bookmarks backups - Marco Bonardo, Raymond Lee

ETA: Firefox 27
Status:

  • final backend patches pending review
  • some frontend changes still to be done
  • have to verify the status of the project and check for any missing pieces

Reduce Storage connections main-thread operations - Marco Bonardo, David Teller

ETA: Firefox 26
Status:

  • We now have a completely off-main-thread implementation of asynchronous connections, including shutdown. Landed.
  • Just tracking a regression (developer-side, not user-facing).
  • Consumer conversion will happen iteratively over the coming months, but Sqlite.jsm consumers and asyncClose consumers are already getting perf wins.

Improve Text Performance - jet

ETA: Firefox 27
Status:

Downloads API rewrite - paolo

ETA: Firefox 26
Status:

  • Slow SQL "UPDATE moz_downloads" reduced to 6% of previous value
  • Plan to remove the preference that disables the new code in Firefox for Desktop
  • Active outreach to extension developers this week before the Aurora migration
  • Planned API changes, and final review before the Aurora migration
  • Feedback on known temporary regressions and also on positive additions
  • 9 of the 10 bugs tracking the release will probably land today or tomorrow
  • One minor regression in the Downloads Panel to investigate (count of downloads)
  • Conversation with Sam Foster of the Metro Firefox team to evaluate switching to the new API
  • More details in firefox-dev:

Reduce impact of thumbnailing - Drew Willcoxon, Mark Hammond

ETA: Firefox 29+ (Q1 2014)
Status:

  • Background thumbnail service enabled on Nightly, Aurora. It (re)captures thumbnails shown on about:newtab that are older than two days. Captures are triggered when you open about:newtab.
  • But we just landed a patch on fx-team to make it capture only when thumbnails are missing.
  • Still ironing out correctness bugs, intermittent test failures, e10s-related crashes.
  • Work ongoing to reduce the frequency with which thumbnails are captured in the foreground. Just landed a patch on fx-team to skip capture when a thumbnail is less than two days old. Next step is to stop capturing all open tabs and instead capture only those tabs shown by about:newtab and other thumbnail consumers.
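The capture rules described in the status items above can be sketched as a few small predicates. This is an illustration only, not the thumbnail service's code: the function names and the use of None for "no thumbnail yet" are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Age threshold taken from the notes: thumbnails younger than this are kept.
MAX_AGE = timedelta(days=2)


def should_capture_foreground(last_captured, now):
    """Foreground rule: skip capture when a thumbnail is less than two days old.

    last_captured is None when no thumbnail exists yet (an assumed convention).
    """
    if last_captured is None:
        return True
    return now - last_captured >= MAX_AGE


def should_capture_background(last_captured):
    """Background rule after the fx-team patch: capture only missing thumbnails."""
    return last_captured is None


def tabs_to_capture(open_tab_urls, consumer_urls):
    """Planned next step: capture only tabs whose URLs a thumbnail consumer shows."""
    wanted = set(consumer_urls)
    return [url for url in open_tab_urls if url in wanted]


if __name__ == "__main__":
    now = datetime(2013, 9, 12)
    print(should_capture_foreground(now - timedelta(days=1), now))
    print(tabs_to_capture(["a.example", "b.example"], ["b.example"]))
```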

Project Proposals / Areas that need work

A review of new project ideas.

(sewardj) Taras and I briefly discussed yesterday the idea of incorporating some kind of record/replay mechanism in the top-level event loop. This would make it possible to collect real user sessions, rerun them later on an instrumented/profiled build, and aggregate profiles across multiple runs, giving a more realistic picture of user-perceived slowness out there in the wild. Just my 2 Euro-cents, no idea if this is feasible/desirable, #include <disclaimer.h>, etc.

  • target Q4
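As a rough illustration of the record/replay idea, the sketch below logs every event passing through a toy event loop, then replays the saved log against an instrumented loop and aggregates a profile over several runs. Everything here is hypothetical (EventLoop, replay, the JSON log format), and the "profile" is just a count of dispatches per event name standing in for real profiling data.

```python
import json
from collections import Counter


class EventLoop:
    """Toy top-level event loop with optional recording and instrumentation."""

    def __init__(self, handlers, recorder=None, profile=None):
        self.handlers = handlers  # event name -> handler function
        self.recorder = recorder  # list collecting events, or None
        self.profile = profile    # Counter of dispatches per event name, or None

    def dispatch(self, name, payload):
        if self.recorder is not None:
            self.recorder.append({"name": name, "payload": payload})
        if self.profile is not None:
            self.profile[name] += 1  # stand-in for real profiling
        return self.handlers[name](payload)


def replay(log_json, handlers, profile):
    """Re-run a recorded session on an instrumented loop, updating profile."""
    loop = EventLoop(handlers, profile=profile)
    for entry in json.loads(log_json):
        loop.dispatch(entry["name"], entry["payload"])


if __name__ == "__main__":
    handlers = {"click": lambda p: p["x"] * 2, "key": lambda p: p["code"]}
    log = []
    live = EventLoop(handlers, recorder=log)  # the recording session
    live.dispatch("click", {"x": 2})
    live.dispatch("key", {"code": 13})
    saved = json.dumps(log)

    profile = Counter()
    for _ in range(3):  # aggregate across multiple replayed runs
        replay(saved, handlers, profile)
    print(profile)
```

A real mechanism would also need to capture non-deterministic inputs such as timers and network responses, which this sketch ignores.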

(Yoric) We have just been bitten by the lack of shutdown-safety for some of our asynchronous services. Working on this, I witnessed fishy behaviors in some parts of the code. I believe that reworking async services to make them shutdown-safe and increase the asynchronicity/concurrency of shutdown is very much needed. This involves rewrites of subsets of FHR, Places, ...

(Yoric) Slow JSON I/O reporting. Or at least Telemetry to tell us whether async I/O is causing jank. Sounds like something that can be entrusted to a (good) mentee.

  • Jonas working on I/O reporting, should incorporate this
  • Communication between threads should be incorporated into project async

(taras) We have hit all of the top 10 slow SQL issues in the past year. Nice work.

(taras) Telemetry cutting over to new Telemetry server Oct 1. Will be some pain. Bear with us.