Fennec/PerformanceReBoot

From MozillaWiki

Current Results

Goals From Toronto

  • Security: maintain parity with the currently shipping version of Firefox Mobile
  • Stability: achieve parity with the currently shipping version of Firefox Mobile. Ratio of number of ADUs to number of weeks to be specified before 10/31/11.
  • Performance: Raw Start should not exceed 200ms. Warm start with page load should stay within a certain % range of the built-in browser, and we should be faster than Opera. We will pinpoint the exact metric prior to 10/31/11.
  • Memory Usage: Will stay conservative so that the browser is kept alive and supports swapping/round tripping from browser to apps.
  • Responsiveness: Scrolling, Panning, Zooming, and Frame Rate should meet or exceed customer expectations when compared to other mobile browsers. Numeric metrics and hardware targets to be determined prior to 10/31/11.
  • Features and Design: Deliver all features with P1 and P2 priority and be opportunistic about improving UX and UI design, but not at the expense of performance and responsiveness.
  • Be fabulous: Although this effort is about re-jigging for a fresh platform, we are on the lookout for differentiators. Jay will continue this orthogonal effort so as to not put the baseline plan at risk. Browser ID or Reading mode is a good candidate to get started with.
  1. http://bit.ly/A01sAd <-- after the offsite we procured manual benchmarks from Softvision; here are their results
  2. http://vimeo.com/user9209060/review/32178771/0f04087905 <-- Initial video benchmark: Fennec Native vs. stock, Opera, Dolphin
  3. https://docs.google.com/spreadsheet/ccc?key=0Arku3jleCA0UdFVQS211Z29sV202MFdMdE5ldHA1NkE&hl=en_US#gid=3 <-- Naoki's weekly S0, S1, S2. Bug 711515 also affects the semi-automated way of getting S0, S1, S2; the automated values are invalid
  4. http://10.250.2.223:8100/#/xbrowserstartup <-- Brass Tacks; currently blocked by bug 711515
  5. Tpan and Tzoom <-- refresh on where we are (Joel)
  6. Where we are today: Are we measuring what we need to? Are we performant enough to go beta? How do we know?
  7. Two different audiences: Jay and engineering

Jay's Requests

We can use numbers: see the above output from Toronto. At the end of the day, though, the request is often a side-by-side comparison via stopwatch. Good to have video; frequency should be determined.

  1. The intention with video is to give meaning to the startup and rendering/painting fps numbers, so that we can make changes that help improve user perception.
  2. Stopwatch measuring is not accurate, just a ballpark; video is needed to help bridge the gap.

Engineering's Requests

S0, S1, S2 <-- these don't need video today; easier to just use automation and gather numbers (?)

  1. We need to catch regressions quickly enough so that engineering has a rapid response time; proposal is to obtain them once a day
  2. We need these numbers to understand where we need to optimize
  3. We need to make sure that these numbers are useful for engineering: https://docs.google.com/spreadsheet/ccc?key=0Arku3jleCA0UdFVQS211Z29sV202MFdMdE5ldHA1NkE&hl=en_US#gid=3

Frequency

  1. Currently once a week;
  2. Once a day for S0, S1, S2 tests - Brad owns the client-side bits. We need to instrument all three and get them reported. We need to scope these tasks: a couple of scripts. Someone has to be on point to watch tree management.

Start logcat, execute these strings. Clint's team would use the same harness. You will no longer have data for 3rd party browsers. Auto-file bugs; automate the email. We need to make sure the bugs are filed along with a regression range. We want the results to look like Talos, but we need to use actual phones. What is the threshold? At what point does it become statistically significant for us to care? Do we have to run the tests multiple times? Do we need this intermediate solution, or why isn't this just a Talos test? Running every nightly on the bank of phones. Make it a Talos test as long as we commit to this in the short timeframe. Retool by the end of the week (by Friday). Ask Jay about whether we need the time between onload firing, throbber stopping, and page painting.
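The open questions above about thresholds and repeated runs can be made concrete with a small sketch. It assumes each test is run several times per build and compares medians against a percentage threshold. The 5% default, the function name, and the sample timings are all placeholders, since the notes leave the actual threshold undecided, and a median comparison is only a simple stand-in for a real statistical significance test.

```python
from statistics import median


def is_regression(baseline_ms, current_ms, threshold_pct=5.0):
    """Return True if the median of the current runs is slower than the
    median of the baseline runs by more than threshold_pct percent.

    threshold_pct is a placeholder; the notes do not fix a value.
    """
    if not baseline_ms or not current_ms:
        raise ValueError("need at least one sample on each side")
    base = median(baseline_ms)
    cur = median(current_ms)
    change_pct = (cur - base) / base * 100.0
    return change_pct > threshold_pct


# Made-up startup timings in milliseconds, five runs per build.
yesterday = [812, 798, 805, 820, 801]
today = [870, 865, 881, 859, 874]
print(is_regression(yesterday, today))
```

Using the median of several runs rather than a single run keeps one noisy launch from triggering an auto-filed bug.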

  1. If we need videos, what is the frequency?

Task Breakdown

Automation

One-time tasks, not camera specific

  1. Phones rooted, A-Team's android agent installed
  2. Phones integrated with a test server (in this case it was my system, which had to be set up)

Repeated Tasks

  1. .ini files need to be updated for parameters such as IP address, SD card presence, etc. before the tests run (this would need to happen prior to each camera run if we shoot deployed phones *other* than the hardware ctalbert has already set up in haxxor). This is not difficult or time consuming but worth stating.
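The .ini update described above could be scripted rather than done by hand. The following is a minimal sketch using Python's standard `configparser`; the section name `device` and the keys `ip` and `sdcard` are hypothetical, since the notes do not show the actual file layout.

```python
import configparser
import io


def update_ini(text, section, updates):
    """Return the .ini text with the given keys set in the given section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(section):
        parser.add_section(section)
    for key, value in updates.items():
        parser.set(section, key, str(value))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical file contents; the real section and key names may differ.
sample = "[device]\nip = 192.168.1.10\nsdcard = false\n"
updated = update_ini(sample, "device", {"ip": "192.168.1.42", "sdcard": "true"})
print(updated)
```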

Camera Capture

HW Scope: Droid Pro, Nexus S, Galaxy S2

SW Scope: Fennec Native Nightly weekly + Stock, Opera, Dolphin periodically

The above mirrors the exact matrix of things being measured by Brass Tacks.

  1. Turn camera on, make sure SD card and battery are seated properly in camera, and ID best placement for phones and camera
  2. Double check network settings
  3. Double check settings on phone to ensure raw or warm start
  4. Place phone 1 in front of camera, push record, run through tests on nightly, and stop recording
  5. Place phone 2 in front of camera, push record, run through tests on nightly, and stop recording
  6. Place phone 3 in front of camera, push record, run through tests on nightly, and stop recording
  7. Turn camera off and remove SD Card
  8. Place SD card into reader or laptop; import clips
  9. Launch Google Spreadsheet and create new entry
  10. Scrub through and analyze footage to determine and record: onload firing, throbber stop, page paint for raw start and for warm start
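When scrubbing footage, the times for each event follow from frame counts and the camera's capture rate. The sketch below shows that arithmetic under the assumption that the analyst notes the frame at which the launch tap occurs and the frame for each event. The function names, the frame numbers, and the 60 fps rate are illustrative only.

```python
def frame_to_ms(frame_index, fps):
    """Convert a frame count to elapsed milliseconds at the given capture rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_index * 1000.0 / fps


def event_times_ms(start_frame, event_frames, fps):
    """Map each event name to milliseconds elapsed since start_frame."""
    return {name: frame_to_ms(frame - start_frame, fps)
            for name, frame in event_frames.items()}


# Made-up frame numbers read off a clip recorded at an assumed 60 fps.
frames = {"onload": 45, "throbber_stop": 72, "page_paint": 90}
times = event_times_ms(start_frame=15, event_frames=frames, fps=60)
print(times)
```

Each reading is only as precise as one frame, which is 1000 / fps milliseconds (about 16.7 ms at 60 fps), so a higher capture rate tightens the measurement.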

Visual Support

All of these are complete and ready to view at meeting; need to upload to people.com

  1. screencast of why 60fps?
  2. raw start: nightly vs. stock; no cache clear
  3. raw start: nightly vs. stock, cache cleared
  4. warm start: click URL from Gmail and see page load (stock vs. nightly and nightly vs. Opera)
  5. Checkerboarding: iPhone vs. nightly, nightly vs. stock

Round Table

  • 120 FPS is possible; onload time can be pushed out through logging, and the other values can be captured through the video.