Perfomatic:SendingData

From MozillaWiki

How Talos sends data to the graph server

Basics

  • Talos sends data using HTTP POST
  • collect.cgi - adds a single data point to the db
  • bulk.cgi - adds multiple data points to the db

Variable Names

  • type - either 'discrete' or 'continuous'
  • data - any data associated with this distinct point (e.g., the page being loaded)
  • tbox - name of the machine that sent the result (e.g., qm-pmac-trunk06)
  • testname - name of the test in question (e.g., tp_loadtime)
  • branch - branch number (1.9, 1.9.0, 1.8, ...)
  • branchid - a misnomer that stuck; this is the buildid associated with this distinct point
  • date - the timestamp indicating when this test was run
  • time - the interval for this distinct data point
  • value - the reported value for this test result at this time
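
As a rough illustration, the variables above could be packed into a form-encoded POST body like this. It is a sketch only: it assumes the variables travel as ordinary form fields, and the helper name build_collect_body, the choice of required fields, and all values are made up for the example; the exact encoding collect.cgi expects is not documented on this page.

```python
from urllib.parse import urlencode, parse_qs

def build_collect_body(fields):
    """Encode one data point as an application/x-www-form-urlencoded body.

    Assumption: collect.cgi reads the variables as plain form fields.
    The 'required' tuple below is also an assumption for this sketch.
    """
    required = ("type", "tbox", "testname", "branch", "branchid", "date", "value")
    missing = [name for name in required if name not in fields]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return urlencode(fields).encode("ascii")

# Hypothetical continuous data point; every value here is illustrative.
point = {
    "type": "continuous",
    "tbox": "qm-pmac-trunk06",
    "testname": "tp_loadtime",
    "branch": "1.9",
    "branchid": "2008121617",
    "date": 1229477017,
    "value": 2.0,
}
body = build_collect_body(point)
```

The resulting bytes could be handed to any HTTP client as the body of the POST.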

date vs. time

This really comes down to discrete vs. continuous. Continuous graphs are composed of timestamp/data pairs. Discrete graphs are bar graphs associating intervals with data (e.g., (0, 10), (1, 20), (2, 30), ...). In the current setup of the graph server, each discrete set of data is averaged and becomes a single point on a continuous graph: once a full set of discrete data has been collected, the cgi calculates an average and adds it to the associated continuous graph, stamped with 'date'. The 'time' variable is the interval within the discrete set.
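
The relationship between a discrete set and its continuous point can be sketched as follows. The function name collapse_discrete is hypothetical, and the plain mean used as the default is only a placeholder for whatever averaging rule the cgi actually applies.

```python
from statistics import mean

def collapse_discrete(points, date, average=mean):
    """Reduce one discrete set to a single (date, value) continuous point.

    points  -- list of (interval, value) pairs from one test run
    date    -- timestamp of the run, used to stamp the continuous point
    average -- averaging rule; the plain mean is a placeholder, not
               necessarily the rule the cgi applies
    """
    values = [value for _interval, value in points]
    if not values:
        raise ValueError("a discrete set needs at least one point")
    return (date, average(values))

# The discrete example from the text: intervals 0, 1, 2.
discrete = [(0, 10), (1, 20), (2, 30)]
continuous_point = collapse_discrete(discrete, date=1229477017)
```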

Mapping Old Variable Names to New

Entries with an empty right-hand side have no current variable; their values are either to be calculated by the collect/bulk scripts or passed from Talos.

  • Tables
    • test_runs
      • date_run <=> date
    • builds
      • ref_build_id <=> branchid
      • ref_changeset <=>
      • date_added <=>
    • branches
      • name <=> branch
    • os_list
      • name <=>
    • machines
      • cpu_speed <=>
      • is_throttling <=>
      • name <=> tbox
      • is_active <=>
      • date_added <=>
    • test_run_values
      • interval_id <=> time
      • value <=> value
    • tests
      • name <=> testname
      • pretty_name <=>
      • is_chrome <=>
      • is_active <=>
      • pageset_id <=>
    • pagesets
      • name <=>
    • pages
      • name <=> data
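
For reference, the populated rows of the mapping above can be written down as a lookup table. This is only a sketch: OLD_TO_NEW and group_by_table are hypothetical names, and the column for 'time' is assumed to be spelled interval_id.

```python
# Old Talos variable name -> (table, column) in the new schema.
# Only rows with both sides filled in above are listed.
OLD_TO_NEW = {
    "date": ("test_runs", "date_run"),
    "branchid": ("builds", "ref_build_id"),
    "branch": ("branches", "name"),
    "tbox": ("machines", "name"),
    "time": ("test_run_values", "interval_id"),  # spelling assumed
    "value": ("test_run_values", "value"),
    "testname": ("tests", "name"),
    "data": ("pages", "name"),
}

def group_by_table(old_fields):
    """Regroup a dict of old-style variables by destination table."""
    grouped = {}
    for old_name, value in old_fields.items():
        if old_name not in OLD_TO_NEW:
            continue  # e.g. 'type' has no column in the list above
        table, column = OLD_TO_NEW[old_name]
        grouped.setdefault(table, {})[column] = value
    return grouped
```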

Return Values

For an AVERAGE data send (tab delimited):

RETURN	testname	avg_result	graph.html#tests=[{"test":TESTID,"branch":BRANCHID,"machine":MACHINEID}]

For a VALUES data send (tab delimited):

RETURN	testname	graph.html#type=series&tests=[{"test":TESTID,"branch":BRANCHID,"machine":MACHINEID,"testrun":TESTRUNID}]
RETURN	testname	avg_result	graph.html#tests=[{"test":TESTID,"branch":BRANCHID,"machine":MACHINEID}]

'tests' in the URL hash is a JSON array in which each element is a JSON object containing:

  • Test id
  • Branch id
  • Machine id
  • Test run id (series links only)
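
A client could pull these pieces back out of a RETURN line roughly as follows. This is a sketch under stated assumptions: parse_return_line is a hypothetical helper, the ids in the sample line are invented, and 'tests=' is assumed to be the last parameter in the URL hash.

```python
import json

def parse_return_line(line):
    """Split one tab-delimited RETURN line into its parts.

    Returns a dict with 'testname', 'average' (None for the series
    link), 'url', and 'tests' (the decoded JSON array from the hash).
    """
    fields = line.rstrip("\r\n").split("\t")
    if fields[0] != "RETURN" or len(fields) not in (3, 4):
        raise ValueError("not a RETURN line: %r" % line)
    testname, url = fields[1], fields[-1]
    # Four fields means an average is present; three means a series link.
    average = float(fields[2]) if len(fields) == 4 else None
    # Assumption: everything after 'tests=' in the hash is the JSON array.
    _prefix, marker, tests_json = url.partition("tests=")
    if not marker:
        raise ValueError("no tests= parameter in %r" % url)
    return {
        "testname": testname,
        "average": average,
        "url": url,
        "tests": json.loads(tests_json),
    }

line = 'RETURN\ttp_loadtime\t2.00\tgraph.html#tests=[{"test":45,"branch":3455,"machine":234}]'
parsed = parse_return_line(line)
```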

Rewrite For New Schema

  • push data into the new schema; this will require having Talos send more data, and possibly renaming the variables for the data currently sent (an opportunity to redo the naming in a saner manner)
  • the scripts must still correctly calculate and return the avgresult for discrete sets (bulk.cgi/collect.cgi show how this is calculated: it is not a simple average but the average of the data points excluding the max)
  • the scripts must still provide a link to the results added to the db
  • it would be nice to have Talos send a start/end of data notification; right now Talos sends data and hopes that it has been added correctly, with no re-sends or any other error handling. With a start/end of data notification we could do smarter data processing and also re-send on failure
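
The averaging rule mentioned above (the average of the data points excluding the max) might look like this. It is a sketch, not the code from bulk.cgi/collect.cgi: dropping exactly one occurrence of the maximum and returning a lone value unchanged are both assumptions.

```python
def average_excluding_max(values):
    """Mean of the values after dropping one occurrence of the maximum.

    Assumption: a set with a single value is returned unchanged, since
    dropping its maximum would leave nothing to average.
    """
    if not values:
        raise ValueError("no values to average")
    if len(values) == 1:
        return float(values[0])
    return (sum(values) - max(values)) / (len(values) - 1)

avg = average_excluding_max([10.0, 20.0, 90.0])  # (10 + 20) / 2
```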

Sample Send Data

Two different types of data to be sent:

  1. A single value to be stored as the 'average' in the test_runs table
  2. A set of (interval, value) pairs to be stored in the test_run_values table; the 'average' is calculated by the collector script

The first type is called 'AVERAGE' and the second 'VALUES'. All data is formatted as comma-separated values.

  • date_run - seconds since the epoch (Unix timestamp)
  • page_name - unique to a page when combined with the pageset_id from the tests table

  • for sending interval, value pairs
START
VALUES
machine_name,test_name,branch_name,ref_changeset,ref_build_id,date_run
interval0,value0,page_name0
interval1,value1,page_name1
...
intervalEND,valueEND,page_nameEND
END
  • for sending a single value
START
AVERAGE
machine_name,test_name,branch_name,ref_changeset,ref_build_id,date_run
value0
END
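
On the Talos side, producing either kind of send could look like the sketch below. The function name format_send is hypothetical, and it assumes newline-terminated lines and field values that contain no commas (no escaping is attempted).

```python
def format_send(kind, header, body_rows):
    """Build the text of one send.

    kind      -- 'VALUES' or 'AVERAGE'
    header    -- (machine_name, test_name, branch_name, ref_changeset,
                 ref_build_id, date_run)
    body_rows -- for VALUES: (interval, value, page_name) tuples;
                 for AVERAGE: a one-element list holding the value
    """
    if kind not in ("VALUES", "AVERAGE"):
        raise ValueError("unknown send type: %r" % kind)
    if len(header) != 6:
        raise ValueError("header needs six fields")
    lines = ["START", kind, ",".join(str(f) for f in header)]
    if kind == "VALUES":
        lines += [",".join(str(f) for f in row) for row in body_rows]
    else:
        (value,) = body_rows  # exactly one value is allowed
        lines.append(str(value))
    lines.append("END")
    return "\n".join(lines) + "\n"

header = ("machine_1", "test_1", "branch_1", "changeset_1", 13, 1229477017)
values_text = format_send("VALUES", header, [(1, 1.0, "page_01"), (2, 2.0, "page_02")])
average_text = format_send("AVERAGE", header, [2.0])
```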

Examples

values input:

START
VALUES
machine_1, test_1, branch_1, changeset_1, 13, 1229477017
1, 1.0, page_01
2, 2.0,page_02
3, 3.0,page_03
4, 1.0,page_04
5, 2.0,page_05
6, 3.0,page_06
7, 1.0,page_07
8, 2.0,page_08
9, 3.0,page_09
10,1.0,page_10
11,2.0,page_11
12,3.0,page_12
END

response:

Content-type: text/plain 

RETURN\ttest_1\tgraph.html#type=series&tests=[{"test":45,"branch":3455,"machine":234,"testrun":6667}]
RETURN\ttest_1\t2.00\tgraph.html#tests=[{"test":45,"branch":3455,"machine":234}]

average input:

START
AVERAGE
machine_1, test_1, branch_1, changeset_1, 13, 1229477017
2.0
END

response:

Content-type: text/plain

RETURN\ttest_1\t2.00\tgraph.html#tests=[{"test":45,"branch":3455,"machine":234}]
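
On the collector side, reading a send like the inputs above could be sketched as follows. parse_send is a hypothetical helper; stripping whitespace around fields (so that both 'a, b' and 'a,b' are accepted) and leaving ref_build_id as a string are assumptions, not documented behavior.

```python
def parse_send(text):
    """Parse one START ... END block into (kind, header, payload).

    payload is a float for AVERAGE and a list of
    (interval, value, page_name) tuples for VALUES.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 5 or lines[0] != "START" or lines[-1] != "END":
        raise ValueError("send must be wrapped in START ... END")
    kind = lines[1]
    names = ("machine_name", "test_name", "branch_name",
             "ref_changeset", "ref_build_id", "date_run")
    parts = [p.strip() for p in lines[2].split(",")]
    if len(parts) != len(names):
        raise ValueError("header needs %d fields" % len(names))
    header = dict(zip(names, parts))
    header["date_run"] = int(header["date_run"])  # seconds since epoch
    body = lines[3:-1]
    if kind == "AVERAGE":
        if len(body) != 1:
            raise ValueError("AVERAGE carries exactly one value")
        return kind, header, float(body[0])
    if kind == "VALUES":
        rows = []
        for ln in body:
            interval, value, page = [p.strip() for p in ln.split(",")]
            rows.append((int(interval), float(value), page))
        return kind, header, rows
    raise ValueError("unknown send type: %r" % kind)

sample = """START
VALUES
machine_1, test_1, branch_1, changeset_1, 13, 1229477017
1, 1.0, page_01
2, 2.0,page_02
END
"""
kind, header, rows = parse_send(sample)
```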