ReleaseEngineering/Tooltool
One fairly common problem faced by teams at Mozilla is getting their tools plugged into the Release Engineering continuous testing infrastructure. In the past, the workflow has been for the team to file a bug that required a Release Engineer to figure out how to build, package and deploy a new set of tools. This is not the most scalable approach. During our recent Release Engineering workweek, we brainstormed on how to improve scaling and agility by empowering developers to help build and deploy new tools for our CI machines.
Tooltool basics
Tooltool is a client-side program written in Python that uses a file manifest in concert with HTTP servers to materialize large binary payloads for use in a job. The manifests are JSON files that list details of individual files. Each file is represented in the JSON by a dictionary with the keys “filename”, “digest”, “size” and “algorithm”. An example is located here. The current JSONEncoder and JSONDecoder derived classes only understand how to work with these keys. Making the JSON encoder and decoder work with an extensible version of the manifests should not be difficult.
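To make the manifest format concrete, here is a minimal sketch of how one record with the four keys could be produced. It is not Tooltool's own code: the function name manifest_entry, the chunked read, and the assumption that the manifest's top level is a JSON list of these dictionaries are all illustrative.

```python
import hashlib
import json
import os
import tempfile


def manifest_entry(path, algorithm="sha512"):
    """Build one record with the keys filename, digest, size and algorithm.

    Hypothetical helper for illustration; not part of Tooltool.
    """
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in chunks so large binary payloads need not fit in memory.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            hasher.update(chunk)
    return {
        "filename": os.path.basename(path),
        "digest": hasher.hexdigest(),
        "size": os.path.getsize(path),
        "algorithm": algorithm,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        with open(path, "wb") as f:
            f.write(b"hello tooltool\n")
        # Assumption: a manifest is a JSON list of such dictionaries.
        print(json.dumps([manifest_entry(path)], indent=2))
```

The digest is stored as a hexadecimal string, which matches the way the hash value is described in the fetch URL scheme.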
Tooltool fetch API
When Tooltool needs to download a file from the file server, it does so by constructing a URL. In the current implementation, a single base URL is used to construct the URLs to be fetched. A good change would be to convert this single URL into a list of possible base URLs. The address of the file to fetch is generated by concatenating the base URL, a slash, the hashing algorithm’s name, another slash and the full hash value represented in hexadecimal. Given a base URL of “http://files.r.us:8080/tooltool” and a file that has a SHA512 hash value of 0123456789abcdef, Tooltool will try to fetch “http://files.r.us:8080/tooltool/sha512/0123456789abcdef”.
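The URL scheme described above can be sketched in a few lines. The function names fetch_url and candidate_urls are hypothetical, and stripping a trailing slash from the base URL is an added convenience rather than something the text specifies. The second function illustrates the suggested list-of-base-URLs change, which the current implementation does not have.

```python
def fetch_url(base_url, algorithm, digest):
    """Join base URL, algorithm name and hex digest with single slashes.

    Hypothetical helper; the trailing-slash handling is an assumption.
    """
    return "/".join([base_url.rstrip("/"), algorithm, digest])


def candidate_urls(base_urls, algorithm, digest):
    """Sketch of the proposed extension: one candidate URL per base URL."""
    return [fetch_url(base, algorithm, digest) for base in base_urls]


print(fetch_url("http://files.r.us:8080/tooltool", "sha512", "0123456789abcdef"))
# prints http://files.r.us:8080/tooltool/sha512/0123456789abcdef
```

Because the path depends only on the algorithm and the digest, the same file content always maps to the same address regardless of its local filename.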
There is no server component for Tooltool yet. As a result, the file server is currently implemented as a simple directory on an HTTP host that has the correct directory structure for responding to requests. I've started working on a server side component but it isn't finished. The server side component would allow for easy uploads by developers, easy listing of contents on the server and a way to store files in a nicer way on the server's file system.
Using Tooltool
Tooltool presently has four commands: list, validate, add and fetch. There are global options and command arguments. All terminal interaction after the option parser finishes is done through the Python logging API. The default is to print logging.INFO and higher messages. Currently, the following global options exist:
-q/--quiet
tells Tooltool to print only logging.ERROR and higher messages
-v/--verbose
specifies to print logging.INFO and higher
-m/--manifest <file>
instructs Tooltool to reference a manifest file located at the specified path
-d/--algorithm <algorithm>
instructs Tooltool to use the specified algorithm
-o/--overwrite
tells Tooltool to overwrite a local file if the filename matches the manifest but the hash value is different to the manifest
--url
specifies the base url to be used for remote operations
Listing and Validating
The two most basic commands list a manifest and validate the local files against the manifest. The list command lists all of the files in the manifest and reports whether each is present and/or valid. The return code from listing is zero unless there was an error in listing the files. Absent or invalid files will still result in an exit code of zero if there was no error in the listing process. The validate command is used to check that all the files in the manifest are present and valid. The exit code for validating is zero if all files in the manifest are present and their hashes match the manifest. It is non-zero if any file is missing locally or has a hash that differs from the manifest.
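The difference between the two exit-code rules can be illustrated with a short sketch. It is not Tooltool's implementation: the function names, the state strings, the early size comparison and the choice of 1 as the non-zero exit code are all assumptions made for the example.

```python
import hashlib
import os


def file_state(record, directory="."):
    """Return 'absent', 'invalid' or 'valid' for one manifest record."""
    path = os.path.join(directory, record["filename"])
    if not os.path.isfile(path):
        return "absent"
    # Assumption: a size mismatch is treated as invalid before hashing.
    if os.path.getsize(path) != record["size"]:
        return "invalid"
    hasher = hashlib.new(record["algorithm"])
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            hasher.update(chunk)
    return "valid" if hasher.hexdigest() == record["digest"] else "invalid"


def list_exit_code(records, directory="."):
    """Listing reports every state but returns 0 when listing itself works."""
    for record in records:
        print(record["filename"], file_state(record, directory))
    return 0


def validate_exit_code(records, directory="."):
    """Validation returns 0 only if every file is present and matches."""
    all_valid = all(file_state(r, directory) == "valid" for r in records)
    return 0 if all_valid else 1
```

In a job script this means list is only informational, while validate is the command whose exit code can gate later steps.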
Fetching
The fetch command is used to materialize files locally. Before fetching a file from a remote host, Tooltool will validate local files which match the filename specified in the manifest. The default behaviour when a local file matches the filename but not the hash value is to exit with a non-zero exit code. If -o or --overwrite is specified on the command line, Tooltool will overwrite the local file with the remote file without confirmation. The local file will be truncated as soon as Tooltool attempts to start writing the remote file locally.
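The decision made for each file before downloading can be summarised as a small function. This is a sketch only: the name fetch_action, the state strings and the returned action labels are hypothetical, and skipping the download when a valid local copy already exists is an assumption inferred from the validation step rather than something stated outright.

```python
def fetch_action(local_state, overwrite=False):
    """Decide what to do with one manifest entry before fetching.

    local_state is one of 'absent', 'valid' or 'invalid' (hypothetical
    labels for the result of validating the local file).
    """
    if local_state == "absent":
        return "download"
    if local_state == "valid":
        # Assumption: a valid local copy does not need to be fetched again.
        return "skip"
    # The filename matches but the hash does not: only replace the file
    # when the user asked for it with -o/--overwrite.
    return "download" if overwrite else "fail"


for state in ("absent", "valid", "invalid"):
    print(state, fetch_action(state), fetch_action(state, overwrite=True))
```

Since the local file is truncated as soon as writing starts, an interrupted overwrite leaves neither the old nor the new content intact, which is worth keeping in mind when passing -o in automation.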
Adding
Files are added to manifests with the add command. This command looks for a manifest using either the default name of <a href="https://github.com/jhford/tooltool/blob/1936dd6109544eed8216637fbac92f31c5e920a1/tooltool.py#L493">manifest.tt</a> or the value specified by the -m/--manifest command line option. If the manifest does not already exist on the file system, a new one will be created to store the first file given. An error message will be displayed if a file is added a second time, regardless of whether or not the local contents are the same as what is in the manifest. There is currently no logic to remove a file from a manifest or overwrite a manifest entry.
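The add behaviour described above can be sketched as follows. The function add_record is hypothetical, the manifest is assumed to be a JSON list of record dictionaries, and raising ValueError stands in for the error message that Tooltool displays through logging.

```python
import json
import os


def add_record(manifest_path, record):
    """Append a record to a manifest, creating the manifest if needed.

    Hypothetical helper; a repeated filename is rejected even when the
    contents are identical, mirroring the behaviour described in the text.
    """
    records = []
    if os.path.exists(manifest_path):
        with open(manifest_path) as f:
            records = json.load(f)
    if any(r["filename"] == record["filename"] for r in records):
        # Assumption: an exception is used here in place of a logged error.
        raise ValueError("%s is already in the manifest" % record["filename"])
    records.append(record)
    with open(manifest_path, "w") as f:
        json.dump(records, f, indent=2)
    return records
```

A caller would build the record first (filename, digest, size, algorithm) and pass it together with the manifest path, for example "manifest.tt".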
Future direction and contributions
Tooltool is a generic lookaside cache. My intention is to keep it as general as possible by not including logic to deal with payloads. We currently include a bootstrap script in the b2g manifests that understands how to take the rest of the payload and set up a working toolchain. Using a bootstrap script means that Tooltool can be tool agnostic while still allowing complex operations on the fetched tools. A standardized bootstrap script name and interface that is called by Tooltool might make sense. I'd also like to finish the server-side component, which would provide an interface for uploading files and a way of storing files that is less obtuse than just mirroring the API paths. A command for removing files from a manifest would be helpful, as would the ability to update a manifest with the contents of the directory as they exist right now. Another really cool feature would be the ability to configure a system-wide cache directory into which files are downloaded. Once the server component is working, I'd like to add a way for servers to automatically sync their file stores when they are asked for a file that exists on another server but not locally.
Tooltool is GPLv2 and the source is available on <a href="https://github.com/jhford/tooltool">GitHub</a>. The best way to contribute code is to send a pull request on GitHub. Changes are more likely to be merged if they pass the whole test suite (make check), have tests, improve Tooltool and don't make Tooltool overly specific.