Auto-tools/Projects/Autolog
Goal
The Autolog project seeks to implement a TBPL-like system for viewing test results produced by the a-team's various tools, at least those which aren't already hooked up to TBPL. Such projects potentially include mobile automation, Crossweave, profile manager, etc.
Project Phases
Phase 1 (Q2 2011)
- front-end
  - essentially replicate TBPL, minus the tree-management functions
- back-end
  - a python server that queries ElasticSearch and returns data in JSON
  - a bugzilla cache to make bugzilla queries fast
  - an hg cache to make hg queries fast
  - a documented REST API that test tools can use to submit test results (see the sketch after this list)
  - a python library for python test tools to make result submission easier
  - storage for test log files on brasstacks
- integration
  - hook up at least one test tool (TPS) to autolog
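To make this concrete, here is a minimal sketch of what result submission from a python test tool might look like. The endpoint (borrowed from the dev server's /testgroups route described at the bottom of this page), the field values, and the use of urllib2 are all illustrative assumptions; the real interface is whatever the documented REST API ends up specifying.

import json
import urllib2

# Hypothetical submission endpoint -- the real route will be defined
# by the documented REST API deliverable above.
AUTOLOG_URL = 'http://localhost:8080/testgroups'

# A minimal testgroup, using fields from the data structure described
# under Implementation below. Values are placeholders.
testgroup = {
    'harness': 'crossweave',
    'testgroup': 'crossweave',
    'machine': 'qa-horus',
    'starttime': 1297879654,
    'os': 'fedora12',
    'platform': 'linux',
    'tree': 'fx-sync',
    'revision': '553f7b1974a3',
    'testsuites': [],
}

req = urllib2.Request(AUTOLOG_URL, json.dumps(testgroup),
                      {'Content-Type': 'application/json'})
print urllib2.urlopen(req).read()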
Future Phases
- front-end
  - intelligent display of stack traces, reftest images, and other extended data
  - ability to edit and delete orange comments
  - display test results by product, instead of just by tree
  - better display of TBPL data, including builds and talos runs
  - display cumulative stats for tests, possibly by way of OrangeFactor
- back-end
  - a cache to reduce load on ElasticSearch and make the ES queries faster
  - storage for test log files on a metrics server
  - a multi-threaded server to handle requests more efficiently
Implementation
Back end
Data is stored in ElasticSearch, in the same instance that's used for OrangeFactor; Autolog data is segregated from WOO data in separate indices. Data is published to the Autolog ES indices using the mozautolog python library.
Data Structure
There are two types of data structures we are concerned with. One is the structure of data that a test suite will have to provide in order to insert test results into ElasticSearch. The second is the structure of data inside ElasticSearch itself.
For the former:
{
  // testgroup definition
  'harness': 'tinderbox',
  'testgroup': 'mochitest-other',
  'machine': 'talos-r3-fed64-044',
  'testsuite_count': 1, // supplied by python lib
  'starttime': 1297879654,
  'date': '2011-02-16', // supplied by python lib
  'logurl': '...', // optional
  'os': 'fedora12',
  'platform': 'linux',

  // 'testrun' is an implementation-specific identifier which
  // is unique among a set of related testgroups; it is used to
  // differentiate multiple sets of testgroups which may be run
  // against the same changeset
  'testrun': '...',

  // Base product definition: the primary product under test. For
  // Crossweave, this is the fx-sync code; for Android it is
  // the mobile-browser code. In a TBPL-like display, this
  // product's rev would be displayed in the "commit" column.
  'productname': 'sync',
  'tree': 'fx-sync',
  'branch': '1.7', // optional?
  'revision': '553f7b1974a3',
  'buildtype': 'xpi',
  'buildid': '20110210030206', // optional
  'version': '1.7.pre', // optional
  'buildurl': '...', // optional

  // Secondary product definitions: additional products involved
  // in the test. For Crossweave or Android, this might be
  // 'mozilla-central', etc. There can be as many secondary
  // products as needed.
  'tree2': 'mozilla-central',
  'branch2': 'default', // optional?
  'revision2': '553f7b1974a3', // optional for secondary products
  'buildtype2': 'opt',
  'buildid2': '20110210030206', // optional
  'version2': '4.0b13pre', // optional
  'buildurl2': '...', // optional

  // Testsuite definition. This is an array (to support cases like
  // mochitest-other); only one member is shown in the example below.
  'testsuites': [
    {
      // for cases other than mochitest-other, this is probably the
      // same as 'testgroup' above
      'suitename': 'mochitest-a11y',
      'cmdline': '...', // optional
      'testfailure_count': 1, // provided by python lib
      'elapsedtime': 665, // in seconds
      'passed': 85152,
      'failed': 1,
      'todo': 124,

      // These are failures that occur during specific test cases.
      'testfailures': [
        {
          // 'test' is null in cases where the error cannot be assigned
          // to a specific test case, e.g., crashes that occur after
          // all tests have finished
          'test': 'xpcshell/tests/toolkit/components/places/tests/autocomplete/test_download_embed_bookmarks.js',
          'logurl': '...', // per-test logs, optional

          // Like testsuite errors, each member of 'failures' can contain
          // additional metadata depending on failure type.
          'failures': [
            {
              'status': 'TEST-UNEXPECTED-FAIL',
              'text': "Acceleration enabled on Windows XP or newer - didn't expect 0, but got it"
            }
          ]
        }
      ]
    }
  ]
}
For the structure in ElasticSearch, the data will be separated into three document types (by the python library if that's used; the test suite will have to do this if it's posting to ES directly via HTTP), similar to the way that tinderbox logs are spread across three document types at present:
- a testgroup document (corresponding to tinderbox build documents)
- one or more testsuite documents (corresponding to tinderbox testrun documents)
- one or more testfailure documents (corresponding to tinderbox testfailure documents)
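As a rough illustration of how the submission blob above maps onto these three document types, here's how the split might be done with pyes (the library listed under the development prerequisites below). The index and document-type names are assumptions, and in practice each child document would also need an identifier tying it back to its testgroup.

from pyes import ES

conn = ES('localhost:9200')

def index_testgroup(data):
    # Peel the nested testsuites off the submission blob and index
    # the remainder as a testgroup document.
    testsuites = data.pop('testsuites', [])
    conn.index(data, 'autolog', 'testgroups')
    for suite in testsuites:
        # Likewise peel the testfailures off each suite before
        # indexing it as a testsuite document.
        testfailures = suite.pop('testfailures', [])
        conn.index(suite, 'autolog', 'testsuites')
        for failure in testfailures:
            conn.index(failure, 'autolog', 'testfailures')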
Q: Why do we separate the data into three document types, why not just use one big document?
A: Because searches in ElasticSearch are much faster and easier with basic data types; searching inside complex nested JSON is slower and the query syntax is much more complex.
Q: Can't the python library automatically provide 'os' and 'platform'?
A: It would be nice, wouldn't it? Unfortunately, there are lots of things which can confuse the issue; e.g., if you're using mozilla-build on Windows, it will see your 64-bit version of Windows as win32, regardless of what you're testing. Similarly, we sometimes test 32-bit Mac stuff on macosx64. It seems safest to have the test tools provide this data instead of trying to guess.
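As a concrete example of the problem, a 32-bit python (like the one mozilla-build ships) running on 64-bit Windows reports a 32-bit machine:

import platform

# Under a 32-bit python on a 64-bit version of Windows, this prints
# 'x86' rather than 'AMD64', because the interpreter runs inside WOW64.
print platform.machine()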
Q: Why do we have both testgroup and testsuite?
A: It's entirely to support mochitest-other. :( In most cases, each testgroup will have 1 testsuite.
Q: Where are the test runs in this structure?
A: We've been using the term 'testrun' to mean different things in different places. In this structure, I imagine 'testrun' to mean the same thing as it does in OrangeFactor: that is, a collection of testgroups that are run against the same primary changeset.
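For example, fetching every testgroup in a given testrun would be a single term query against ES; the index/type names are assumptions as above, and the testrun id is a placeholder.

import json
import urllib2

query = {'query': {'term': {'testrun': 'some-testrun-id'}}}
url = 'http://localhost:9200/autolog/testgroups/_search'
req = urllib2.Request(url, json.dumps(query))
results = json.loads(urllib2.urlopen(req).read())
print results['hits']['total']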
Q: Is this really the best way to include data about multiple products, or code from multiple repos?
A: I'm not sure. I suggested this structure because it's easy to use when searching ES. Other structures are possible. For instance, we could create a 'product' document type, store all the products there, and then just include references to those documents in the 'testgroup' document. The downside is that getting certain data out of ES would require multiple queries.
Open Issues
Log Storage
Some tools will need to store logs somewhere from which autolog can serve them on request; we can't store these in ES, so where should they go? Options include stage, brasstacks, or some alternate solution provided by metrics. In order to engage other teams, we'll likely need estimates of total storage needed per month and some idea of the retention policy.
TBPL Data Structure
The basic unit of data in TBPL is the push. A push according to TBPL looks like this:
"b853c6efa929": {
"id": 19218,
"pusher": "dougt@mozilla.com",
"date": "2011-03-17T20:50:37.000Z",
"toprev": "b853c6efa929",
"defaultTip": "b853c6efa929",
"patches": [
{
"rev": "b853c6efa929",
"author": "Doug Turner",
"desc": "Bug 642291 - crash [@ nsBufferedInputStream::Write] demos.mozilla.org motovational poster. ipc
serialization does not work here, removing it. r=bent a=blocking-fennec",
"tags": {
"length": 0,
"prevObject": {
"length": 0
}
}
}
]
},
Additionally, each push can have a 'results' key, which contains all the results associated with that push. If the 'results' key exists, it looks like this:
'results': {
  'linux': {
    'opt': {
      'Reftest': [ an array of machineResults ],
      'Mochitest': [ an array of machineResults ],
      etc.
    },
    'debug': {}
  },
  'linux64': {}, etc.
}
Each 'machineResult' looks like this:
"1300280775.1300281487.29409.gz": {
"tree": "Firefox",
"machine": {
"name": "Rev3 WINNT 5.1 mozilla-central opt test mochitests-2/5",
"os": "windowsxp",
"type": "Mochitest",
"debug": false,
"latestFinishedRun": (a reference to the last finished run for this machine),
"runs": 0,
"runtime": 0,
"averageCycleTime": 0
},
"slave": "talos-r3-xp-039",
"runID": "1300280775.1300281487.29409.gz",
"state": "success",
"startTime": "2011-03-16T13:06:15.000Z",
"endTime": "2011-03-16T13:19:01.000Z",
"briefLogURL": "http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1300280775.1300281487.29409.gz",
"fullLogURL": "http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1300280775.1300281487.29409.gz&fulltext=1",
"summaryURL": "php/getSummary.php?tree=Firefox&id=1300280775.1300281487.29409.gz",
"revs": {
"mozilla-central": "ee18eff42c2e"
},
"notes": [],
"errorParser": "unittest",
"_scrape": [
" s: talos-r3-xp-039",
"<a href=http://hg.mozilla.org/mozilla-central/rev/ee18eff42c2e title=\"Built from revision e18eff42c2e\">rev:ee18eff42c2e</a>",
" mochitest-plain-2
11855/0/292"
]
'push': (reference to the push this belongs to),
'getTestResults': a function,
'getScrapeResults': a function,
'getUnitTestResults': a function,
'getTalosResults': a function,
},
All of this gets fed into UserInterface.js, in the handleUpdatedPush() function.
UI
There are three main aspects of TBPL that we want to present in Autolog:
- a list of on-going test runs
- a colour-coded list of (abbreviated) names of on-going test suites for each test run
- popups with detailed test data
The first two are dynamically updated, with newly started runs inserted at the top and test suites changing colour as they change state.
Additional useful information:
- list of currently failing tests
- filters
- dropdowns for metadata (abbreviations, tree info, etc.)
Since TBPL's UI is pretty clean and crisp, we should try to reuse TBPL's HTML and CSS. JavaScript might be more problematic, depending on how closely tied it is to the underlying layers.
Client/Server
TBPL puts all the work into the client. The client-side JavaScript is responsible for querying tinderbox directly.
Autolog will have a relatively thin server side with a caching layer to limit the number of queries going to the ES database. Since the UI will only display a subset of the total data for each test run, this smaller set can be cached and refreshed periodically. Requests for test-run details will still go directly to the ES database.
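A minimal sketch of that caching layer, assuming an in-memory cache with a fixed refresh interval (the names and the 60-second TTL are illustrative, not settled design):

import time

class TestgroupCache(object):
    def __init__(self, fetch_func, ttl=60):
        self.fetch_func = fetch_func  # callable that queries ES for the summary data
        self.ttl = ttl                # seconds before the cached copy goes stale
        self.data = None
        self.fetched_at = 0

    def get(self):
        # Refresh from ES only when the cached copy is missing or stale;
        # all other requests are served from memory.
        if self.data is None or time.time() - self.fetched_at > self.ttl:
            self.data = self.fetch_func()
            self.fetched_at = time.time()
        return self.data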
Setting up a Development Environment
Pre-requisites:
- pyes 0.14 (easy_install http://pypi.python.org/packages/source/p/pyes/pyes-0.14.0.tar.gz)
  - easy_install requires the python-dev package on linux
- mozautoeslib (http://hg.mozilla.org/automation/mozautoeslib/)
- autolog (http://hg.mozilla.org/automation/autolog/)
  - requires webob and paste, available in the python-webob and python-paste packages in Ubuntu
Steps:
- Set up a local instance of ElasticSearch for development purposes and populate it with test data; see README-testdata.txt. By default, this will operate on http://localhost:9200/
- Start the autolog server in the autolog repo, using python autolog_server.py . (yes, include the dot at the end)
- Host the autolog repo using a webserver; I use Apache, but presumably nginx or anything else would work equally well.
- Navigate to index.html in the autolog repo; depending on how you've configured your webserver this might look something like http://localhost/autolog/
Notes:
- The test data is inserted into the local ES using relative dates, i.e., "5 minutes ago". If you are testing code the day after you added the test data, you might want to reset the test data so that it appears with more recent dates. To do so, use these commands (from the autolog repo):
python testdata.py --wipe
python testdata.py
- You can view the raw data returned by the autolog server using this URL: http://localhost:8080/testgroups
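The same URL is handy for scripted access when debugging, e.g.:

import json
import urllib2

testgroups = json.loads(urllib2.urlopen('http://localhost:8080/testgroups').read())
print json.dumps(testgroups, indent=2)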