Auto-tools/Projects/AutoPhone

From MozillaWiki

Team

  • mcote
  • bc
  • ctalbert (original implementation)

Problem

AutoPhone is a platform for running automated tests on physical mobile devices (phones and tablets). It is responsible for tracking, verifying, and recovering those devices.

Goals & Considerations

  • Support at least two hundred phones, potentially split amongst several host machines.
  • Verify that a phone is working correctly: the SD card is writable and not full, the SUT agent is running, etc.
  • Attempt to recover a phone that reports errors, rerunning the current test/test framework.
  • Provide at least a high-level status for all phones: whether they are idle, running a test, or disabled/broken.

Non-Goals

  • Provide a particular test framework. AutoPhone should be generic enough to run any framework: mochitest, talos, reftest, etc.
  • Record and present test results. Frameworks are responsible for gathering and reporting results. In the long run this should be DataZilla where possible.

Design and Approach

AutoPhone is a multithreaded, multiprocess system. The main process is the controller. It runs three threads: a TCP listener for user commands and phone registrations, a pulse listener, and a listener for worker messages. It also spawns and controls one worker subprocess for each phone.

Implementation

Source code is at https://github.com/mozilla/autophone/

Command Thread

This thread listens for TCP connections (default port 28001). There are two expected sources of connections: phones and users. Phones use the connection only to send registration messages from the SUT agent when it first starts. AutoPhone maintains a cache of known phones in a JSON file; the --restarting option is required to preserve this cache across AutoPhone restarts. Any phone that registers over the command port is added to the list of known phones.

There are also a number of user commands that can be issued over the command port:

  • status: describes the status of all known phones.
  • disable <serial num>: disables a phone. Any currently executing test run will complete, but no more will be started.
  • reenable <serial num>: attempts to reenable a phone after AutoPhone has disabled it. Use this if a phone required manual maintenance and is being added back to the pool.
  • trigger <file or path>: starts a test run against the given build, which can be a path to a local file or a URL.
  • stop: stops AutoPhone.
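
As a rough illustration of issuing one of these commands, the sketch below opens a TCP connection to the command port and sends a single line of text. The wire format is an assumption, not something this page specifies: it assumes newline-terminated UTF-8 commands and reads the reply until the server closes the connection or the timeout expires. send_command is a hypothetical helper, not part of AutoPhone.

```python
import socket


def send_command(command, host="localhost", port=28001, timeout=5.0):
    """Send one command line and return whatever text the server sends back.

    Assumption: commands are newline-terminated UTF-8 text. The reply is
    read until the server closes the connection or the timeout expires.
    Check the AutoPhone source for the real protocol.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall((command.strip() + "\n").encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that no more data will be sent
        while True:
            try:
                data = sock.recv(4096)
            except socket.timeout:
                break  # server kept the connection open; return what we have
            if not data:
                break  # server closed the connection
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Under those assumptions, send_command("status") would return the status text, and send_command("disable <serial num>") would disable the named phone.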

Pulse Listener Thread

A dedicated thread listens to the pulse server via the pulsebuildmonitor Python package. When a new build is detected, each worker either starts a new test run, if one is not already running, or queues the request.
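
The start-or-queue decision described above can be sketched as follows. This is only an illustration of the logic for a single worker; the BuildQueue class and its method names are hypothetical and are not taken from the AutoPhone source, and the pulsebuildmonitor callback that would feed it is not shown.

```python
from collections import deque


class BuildQueue:
    """Tracks the build under test and any builds waiting behind it (hypothetical)."""

    def __init__(self):
        self.current = None     # build currently being tested, or None if idle
        self.pending = deque()  # builds that arrived while a run was in progress

    def on_new_build(self, build_url):
        """Start the build now if idle, otherwise queue it."""
        if self.current is None:
            self.current = build_url
            return "started"
        self.pending.append(build_url)
        return "queued"

    def on_run_finished(self):
        """Mark the current run done and promote the next queued build, if any."""
        self.current = self.pending.popleft() if self.pending else None
        return self.current
```

A FIFO queue is assumed here so that builds are tested in the order they were detected.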

Worker Status Thread

The worker status thread simply listens for updates from the worker subprocesses and makes them available to the status command.
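
A minimal sketch of this pattern is shown below: a background thread drains a queue of (phone ID, status) updates and keeps the latest status per phone for a status command to read. The StatusCollector class is hypothetical. It uses queue.Queue for simplicity; since AutoPhone's workers are subprocesses, a cross-process channel such as multiprocessing.Queue would be needed in practice, and the actual mechanism AutoPhone uses is not described on this page.

```python
import queue
import threading


class StatusCollector:
    """Keeps the most recent status reported for each phone (hypothetical)."""

    def __init__(self):
        self.updates = queue.Queue()  # workers put (phone_id, status) tuples here
        self._latest = {}
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self.updates.put(None)  # sentinel: tells the thread to exit
        self._thread.join()

    def _run(self):
        while True:
            item = self.updates.get()
            if item is None:
                break
            phone_id, status = item
            with self._lock:
                self._latest[phone_id] = status

    def snapshot(self):
        """Return a copy of the latest statuses, safe to read from another thread."""
        with self._lock:
            return dict(self._latest)
```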

Worker Subprocesses

The main command thread spawns a subprocess for each phone. These worker subprocesses are responsible for

  • starting test runs, after receiving an appropriate command from the main process
  • between tests, pinging the phone to ensure that it is still reachable and that the SUT agent is still running
  • recovering phones after detecting errors
  • logging status messages and forwarding them on to the main process

If a worker has a problem communicating with a phone, it attempts to recover the phone by rebooting it. If it cannot fix the problem after 3 tries, the phone is disabled. An email is sent out with the ID of the phone and the reason for disabling it. The phone can be reenabled via the "reenable" command, e.g. after manually fixing it.
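
The recovery policy above can be sketched as a bounded retry loop. The function and its four callables (reboot, is_healthy, disable, notify) are hypothetical placeholders for whatever the worker actually uses to reboot a phone, check it, disable it, and send the email; only the limit of three tries comes from the description above.

```python
MAX_RECOVERY_ATTEMPTS = 3  # from the text: the phone is disabled after 3 tries


def recover_phone(phone_id, reboot, is_healthy, disable, notify):
    """Try to recover a phone by rebooting it; disable it if that keeps failing.

    All four callables are hypothetical hooks supplied by the caller.
    Returns True if the phone recovered, False if it was disabled.
    """
    for _ in range(MAX_RECOVERY_ATTEMPTS):
        reboot(phone_id)
        if is_healthy(phone_id):
            return True
    reason = "still failing after %d reboot attempts" % MAX_RECOVERY_ATTEMPTS
    disable(phone_id)
    notify(phone_id, reason)  # e.g. send the email with the phone ID and reason
    return False
```

Passing the device operations in as callables keeps the policy separate from the device-specific code, which also makes the retry logic easy to exercise without real hardware.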

PhoneDash

At the moment, AutoPhone only runs s1/s2 start-up tests. These are reported to a small web app called phonedash (https://github.com/markrcote/phonedash/), deployed to http://phonedash.mozilla.org/.