Kickoff meeting for the Signal from Noise project

Agenda

Decide on IRC backchannel for project - use an existing channel or #signalfromnoise?
- Use #ateam

Motivation

This is a project to...

Make talos numbers more reliable and less noisy
Make talos numbers more sensitive, so they detect smaller performance regressions
Create better tools for tracking and analyzing talos data and performance regressions
Make talos easier and less error prone to deploy
Extend the research that Slewchuk started to other talos performance suites
Make the changes he and the metrics team recommend so that we achieve the above goals
Big Picture: The quintessential number from talos is the talos Tp number - it tracks our rendering/layout/drawing performance on a set of 100 pages representative of the web. Our goal (to be realized in Q2) is to make this metric more reliable, less noisy, and more sensitive, and to give developers the tools they need to track down regressions.

Summarize Lewchuk's Research (Saptarshi)
- Stephen looked at all of the raw talos data
- Tried to understand the number of data points we need to detect small regressions in talos
- Tried to understand how to minimize noise in the tests for each page.
- He found that aggregating the numbers was hiding a lot of effects, so changing the aggregation method would make the talos system a more reliable and effective tool.
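
A minimal sketch of how aggregation can hide a per-page effect. The page names and timings are made up for illustration, and a plain mean across pages is an assumption here, not necessarily how talos aggregates today:

```python
from statistics import mean

# Made-up per-page timings in ms (hypothetical page names), before and after a change.
before = {"page_a": 100.0, "page_b": 400.0, "page_c": 1500.0}
after = {"page_a": 120.0, "page_b": 400.0, "page_c": 1500.0}  # only page_a regresses


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


# One aggregated number for the whole suite (assumed here: mean across pages).
aggregate_change = percent_change(mean(before.values()), mean(after.values()))

# The same data looked at page by page.
per_page_change = {page: percent_change(before[page], after[page]) for page in before}

print("aggregate: %.1f%%" % aggregate_change)
for page, change in sorted(per_page_change.items()):
    print("%s: %.1f%%" % (page, change))
```

With these numbers the 20% regression on one fast page moves the aggregate by only about 1%, which is easy to lose in run-to-run noise.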

Summarize Milestones (Joel)
- We want to take the work that Stephen did and replicate it across the other tests that we have.
- Stephen started with the tdhtml suite.
- When the numbers change, they need to be staged side-by-side

First milestone goal is to get the changes rolled out for tdhtml - this gives us a dry run to get the tools changed, get the side-by-side staging run figured out, and show the new data in the graphserver
Also need to get started applying the same analysis that Stephen did to other tests (Tp)
- will be working with metrics to help them run experiments on their pine branch
- Stephen has a set of tools that he used to do this analysis, we can extend them as needed
- Analyze a new graph server backend?
- We should be able to come up with metrics of how much less noisy the results are for SxS stage (e.g. std deviation)
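
One possible way to put a number on "how much less noisy", sketched under assumptions: the run values below are invented, and the function name is hypothetical. It reports the sample standard deviation and also the coefficient of variation (std deviation divided by mean), since a new calculation method may shift the mean as well:

```python
from statistics import mean, stdev


def noise_summary(values):
    """Return (mean, sample std deviation, coefficient of variation) for a set of runs."""
    m = mean(values)
    s = stdev(values)
    return m, s, s / m


# Invented results for the same test on the same build, old vs. new calculation.
old_runs = [310.0, 295.0, 330.0, 301.0, 342.0]
new_runs = [305.0, 303.0, 308.0, 304.0, 306.0]

_, old_sd, old_cv = noise_summary(old_runs)
_, new_sd, new_cv = noise_summary(new_runs)

# Fraction by which the std deviation shrank in the side-by-side run.
noise_reduction = 1.0 - new_sd / old_sd

print("old: sd=%.2f cv=%.4f" % (old_sd, old_cv))
print("new: sd=%.2f cv=%.4f" % (new_sd, new_cv))
print("std deviation reduced by %.0f%%" % (100.0 * noise_reduction))
```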

- Rolling tdhtml out side-by-side on other branches
- Figure out parameters for other tests - tsvg/ta11y, and get them ready for side-by-side rollout

- Should have tools and processes ready to perform the rollout of a new Tp test in Q2.

update graphserver: it is a PITA changing the schema :/
Graphserver backend is too much trouble to keep updating; it would be better to transition it into a big bucket that just stores values. Might go with a non-relational data store for the backend.
Do less in graphserver with the data wrt SQL calls; see e.g. http://hg.mozilla.org/graphs/file/1237d38a299b/server/pyfomatic/collect.py#l208
- formats:
  - redis?
Also want to get some new mocks and ideas together for ways to analyze the new data
We want to make processing flexible: should be able to swap out filtering functions for the first iteration (more complicated analysis would want, e.g., to chain them)
- Need to define a format that defines a filter; currently filters do way too much (conversion from strings, parsing, etc); filters should only deal with numbers
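
A rough sketch of what swappable, chainable filters could look like. All names here (parse_raw, drop_first, drop_highest, chain) are hypothetical and not the current talos or graphserver code. Parsing from strings happens once, up front, so the filters themselves only ever see numbers:

```python
from functools import reduce
from statistics import mean


def parse_raw(raw):
    """Convert a comma-separated string of results to floats (done once, outside any filter)."""
    return [float(x) for x in raw.split(",")]


def drop_first(values):
    """Discard the first run."""
    return values[1:]


def drop_highest(values):
    """Discard one occurrence of the largest value."""
    if not values:
        return values
    i = values.index(max(values))
    return values[:i] + values[i + 1:]


def chain(*filters):
    """Compose filters left to right into a single callable."""
    def run(values):
        return reduce(lambda acc, f: f(acc), filters, list(values))
    return run


values = parse_raw("350,305,410,298,300")  # invented sample data

by_first = chain(drop_first, mean)(values)
by_highest = chain(drop_highest, mean)(values)
by_both = chain(drop_first, drop_highest, mean)(values)

print(by_first, by_highest, by_both)
```

Because each filter is a plain function from a list of numbers to a list of numbers (or to a single summary value at the end of the chain), swapping one filter for another is a one-line change.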
What to do about compare-talos?
- We will fold the compare-talos use cases and functionality into the graphserver. compare-talos is useful, but it is unowned and disconnected from any changes we may make to how the numbers are calculated, how the suites are defined, etc., so it makes sense for it to be part of the graphserver itself.
Review the thesis the guy from New Zealand did: http://majutsushi.net/stuff/thesis.pdf

Jeads will be sending mockups to the group about directions we might take with graphserver
- Jeads and rhelmer will meet up w.r.t. graphserver changes after the mockups are done
Christina/Saptarshi will follow up with Joel about talos questions for their metrics research
Joel to set up a meeting at 11am Pacific time on Thursdays.
Joel/Jhammel to make changes for dropping the first value rather than the highest value from tdhtml
Armenzg to help with SxS staging of that ^