Auto-tools/Projects/Signal From Noise/Meetings/2012-01-12
Kickoff meeting for the Signal from Noise project
- Take Notes on etherpad
- Quick around the table introductions - who you are, what you do (one sentence), favorite band
- Joel Maher - Ateam (jmaher)
- Jeff Hammel - Ateam (jhammel)
- Rob Helmer - Webdev (rhelmer)
- Christina Choi - Metrics (cchoi)
- Saptarshi Guha - Metrics (joy)
- Chris Cooper - Releng (coop)
- Armen Zambrino Gasparin - Releng (armenzg)
- Jonathan Eads - Ateam (jeads)
- Clint Talbert - Ateam (ctalbert)
- Chris Manchester - Ateam (chmanchester)
- Decide on IRC backchannel for project - use an existing or #signalfromnoise?
- Use #ateam
This is a project to...
- Make talos numbers more reliable and less noisy
- Make talos numbers more sensitive, so that they detect performance regressions
- Create better tools for tracking and analyzing talos data and performance regressions
- Make talos easier and less error prone to deploy
- Extend the research that Slewchuk started to other talos performance suites
- Make the changes he and the metrics team recommend so that we achieve the above goals
- Big Picture: The quintessential number from talos is the talos Tp number - it tracks our rendering/layout/drawing performance on a set of 100 pages representative of the web. Our goal (to be realized in Q2) is to make this metric more reliable, less noisy, more sensitive and provide developers the tools they need to track down regressions.
- Summarize Lewchuk's Research (Saptarshi)
- Stephen looked at all the raw data of talos
- Tried to understand number of data points we needed to detect small regressions in talos
- Tried to understand how to minimize noise in the tests for each page.
- He found that aggregating the numbers was hiding a lot of effects, so changing the aggregation method would make the talos system a more reliable and effective tool.
- Pageloader background (jmaher will need to double check this, but I think it's right)
- page loader loads each page in the set
- each loadtime is recorded
- pageloader drops the highest and lowest values, then takes the median of the remaining load times for the page (each page loaded x times)
- medians uploaded to graph server
- graph server compresses this data to one number per test run by dropping the highest page median and averaging the rest.
- graph server plots this number.
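A minimal sketch of the aggregation described in the pageloader background above, assuming the steps are as noted (still to be double checked by jmaher). The function names `page_median` and `run_summary` and the sample load times are hypothetical, not taken from talos or graphserver code.

```python
from statistics import mean, median


def page_median(load_times):
    """Drop the highest and lowest load time, then take the median of the rest."""
    if len(load_times) < 3:
        raise ValueError("need at least 3 load times per page")
    trimmed = sorted(load_times)[1:-1]
    return median(trimmed)


def run_summary(page_medians):
    """Drop the highest page median and average the remaining ones."""
    if len(page_medians) < 2:
        raise ValueError("need at least 2 pages")
    kept = sorted(page_medians)[:-1]
    return mean(kept)


# Hypothetical load times (ms) for three pages, each loaded five times.
raw = {
    "page_a": [120, 100, 105, 110, 300],
    "page_b": [200, 210, 190, 205, 195],
    "page_c": [50, 55, 500, 52, 51],
}

# What pageloader would upload: one median per page.
medians = {page: page_median(times) for page, times in raw.items()}

# What graph server would plot: one number for the whole run.
summary = run_summary(list(medians.values()))
print(medians, summary)
```

Note how the slowest page (page_b here) never contributes to the plotted number, which is one way an aggregate can hide effects on individual pages.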
- Summarize Milestones (Joel)
- We want to take the work that stephen did and replicate it across the other tests that we have.
- Stephen started with tdhtml suite.
- When the numbers change, they need to be staged side-by-side
Milestones: Milestone 1: https://wiki.mozilla.org/Auto-tools/Projects/Signal_From_Noise#Milestone_1
- First milestone goal is to get the changes rolled out for tdhtml - this gives us a dry run to get the tools changed, get the side-by-side staging run figured out, and show the new data in the graphserver
- also need to get started with applying the same analysis that stephen did to other tests (Tp)
- will be working with metrics to help them run experiments on their pine branch
- stephen has a set of tools that he used to do this analysis, we can extend them as needed
- Analyze a new graph server backend?
- We should be able to come up with metrics of how much less noisy the results are for SxS stage (e.g. std deviation)
- Rolling tdhtml out side-by-side on other branches
- Figure out parameters for other tests - tsvg/ta11y, and get them ready for side-by-side rollout
- Should have tools and processes ready to perform the rollout of a new Tp test in Q2.
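One way the noise metric mentioned above for the side-by-side stage (e.g. std deviation) might be computed is sketched below. The run values are made up for illustration, and the helper name `noise` and the use of the coefficient of variation alongside the standard deviation are assumptions, not something decided in the meeting.

```python
from statistics import mean, stdev


def noise(values):
    """Return (sample standard deviation, coefficient of variation)."""
    m = mean(values)
    s = stdev(values)
    return s, s / m


# Hypothetical per-run results for the same test under the old and new
# calculation, collected during a side-by-side staging run.
old_runs = [410.0, 432.0, 398.0, 441.0, 405.0]
new_runs = [415.0, 418.0, 412.0, 417.0, 414.0]

old_sd, old_cv = noise(old_runs)
new_sd, new_cv = noise(new_runs)

# Fraction by which the standard deviation shrank under the new calculation.
reduction = 1 - new_sd / old_sd
print(f"old sd={old_sd:.2f} cv={old_cv:.4f}")
print(f"new sd={new_sd:.2f} cv={new_cv:.4f}")
print(f"std deviation reduced by {reduction:.0%}")
```

The coefficient of variation is included because the old and new calculations may produce numbers on different scales, in which case raw standard deviations are not directly comparable.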
- update graphserver: it is a PITA changing the schema :/
- Graphserver backend is too much trouble to keep updating; it would be better to transition it into a big bucket to store values. Might go with non-relational data for the backend.
- Do less in graphserver with the data wrt SQL calls; see e.g. http://hg.mozilla.org/graphs/file/1237d38a299b/server/pyfomatic/collect.py#l208
- redis?
- Also want to get some new mocks and ideas together for ways to analyze the new data
- We want to make processing flexible: should be able to swap out filtering functions for first iteration (more complicated analysis would desire e.g. chaining them)
- Need to define a format that defines a filter; currently filters do way too much (conversion from strings, parsing, etc); filters should only have to do with numbers
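A rough sketch of what swappable, chainable filters that only deal with numbers could look like. This is one possible shape under stated assumptions, not the format the group agreed on: the names `drop_first`, `drop_max`, `chain` and `summarize` are hypothetical, and all parsing and string conversion is assumed to have happened before the filters are called.

```python
from functools import reduce
from statistics import mean


def drop_first(values):
    """Discard the first measurement."""
    return values[1:]


def drop_max(values):
    """Discard one occurrence of the largest measurement."""
    values = list(values)
    values.remove(max(values))
    return values


def chain(*filters):
    """Combine several filters into one, applied left to right."""
    def apply(values):
        return reduce(lambda acc, f: f(acc), filters, list(values))
    return apply


def summarize(values, filt, reducer=mean):
    """Filter a list of numbers, then reduce it to a single number."""
    return reducer(filt(values))


# Hypothetical load times (ms) for one page, already parsed into floats.
samples = [150.0, 100.0, 110.0, 300.0, 105.0, 102.0]

by_first = summarize(samples, drop_first)
by_max = summarize(samples, drop_max)
by_both = summarize(samples, chain(drop_first, drop_max))
print(by_first, by_max, by_both)
```

Because each filter takes a list of numbers and returns a list of numbers, swapping one for another is a one-argument change, and chaining needs no special support in the filters themselves.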
- What to do about compare-talos?
- We will fold the compare-talos use cases and functionality into the graphserver. compare-talos, while useful, is unowned and is not kept in sync with any changes we may be making to how the numbers are calculated, how the suites are defined, etc. So it makes sense for it to be part of the graphserver itself.
- Review the thesis the guy from New Zealand did: http://majutsushi.net/stuff/thesis.pdf
- Here is each of the *jobs* we run:
- Each of these *jobs* has several measurements, e.g.:
- tp5_paint: 416.62 (details)
- tp5_%cpu_paint: 70.17 (details)
- tp5_pbytes_paint: 119.3MB (details)
- tp5_memset_paint: 124.2MB (details)
- tp5_shutdown_paint: 883.0 (details)
- tp5_responsiveness_paint: 1768.0
- Each of these numbers is posted to graphs.mozilla.org
- Each of these numbers is an aggregation
- Jeads will be sending mockups to the group about directions we might take with graphserver
- Jeads and rhelmer will meet up w.r.t. graphserver changes after the mockups are done
- Christina/Saptarshi will follow up with Joel about talos questions for their metrics research
- Joel to set up a meeting at 11am Pacific time on Thursdays.
- Joel/Jhammel to make changes for dropping the first value rather than the highest value from tdhtml
- Armenzg to help with SxS staging of that ^