ReleaseEngineering/How To/RelOps Hardware Controller (Roller)

From MozillaWiki

https://github.com/mozilla-platform-ops/relops-hardware-controller

How to Use Roller

Summary

  1. Failover/Setup
  2. Allow the SSL certificate at https://roller1.srv.releng.mdc1.mozilla.com/api/v1/workers/auth-ssl/jobs
  3. Request an action
    1. Log in to https://tools.taskcluster.net/provisioners/releng-hardware/worker-types/
    2. Navigate to a worker
    3. Click an action for that worker
  4. Wait to receive the action results
    1. Read the email
    2. See the message in IRC


Failover/Setup

To set up the actions in taskcluster, or to fail over roller from mdc1 to mdc2, change the url used in the actions assigned to the provisioner.

Copy the following Node.js script to a local file (roller_update_provs.js in the commands below). You will need the taskcluster client library installed (`npm install taskcluster-client`).

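   #!/usr/bin/env node
   
   // Declares the roller actions on the releng-hardware provisioner.
   let taskcluster = require('taskcluster-client');
   
   let provisionerId = 'releng-hardware';
   // Roller host that receives the actions; change mdc2 to mdc1 (or back) to fail over.
   let url = 'https://roller1.srv.releng.mdc2.mozilla.com:443';
   // The <...> placeholders are substituted with the selected worker's values.
   url += '/api/v1/workers/<workerId>/jobs?provisioner_id=<provisionerId>&worker_type=<workerType>&worker_group=<workerGroup>&task_name=';
   let data = {
     actions: [
       {
         name: 'ping',
         title: 'ping',
         context: 'worker',
         url: url + 'ping',
         method: 'POST',
         description: 'ping server',
       },
       {
         name: 'reboot',
         title: 'reboot',
         context: 'worker',
         url: url + 'reboot',
         method: 'POST',
         description: 'reboot hardware',
       },
     ],
   };
   
   // Credentials are expected in the environment (see the signin step below).
   let q = new taskcluster.Queue();
   q.declareProvisioner(provisionerId, data).then(function(response) {
       console.log('provisioner:', JSON.stringify(response, null, 2));
   }).catch(function(err) {
       // Report the failure instead of leaving an unhandled rejection.
       console.error('declareProvisioner failed:', err);
       process.exitCode = 1;
   });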

Then sign in to taskcluster to add the authentication variables to your command-line environment, and execute the script:

   $ taskcluster signin
   # export auth vars
   $ node ./roller_update_provs.js

Then, as described below, users need to accept the self-signed certificate. They can do so by opening a roller url such as https://roller1.srv.releng.mdc2.mozilla.com/api/v1/workers/t-yosemite-r7-100/jobs?provisioner_id=releng-hardware&worker_type=gecko-t-osx-1010-beta&worker_group=mdc2&task_name=ping.
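The `<workerId>`, `<provisionerId>`, `<workerType>` and `<workerGroup>` placeholders in an action url are replaced with the values of the worker the action is requested for. A minimal sketch of that substitution, which can also be used to build a url to open when accepting the certificate, is shown here. `expandActionUrl` is a hypothetical helper written for this page; it is not part of taskcluster-client or roller.

```javascript
// Hypothetical helper: fill the <name> placeholders of an action url template.
function expandActionUrl(template, worker) {
  return template.replace(/<(\w+)>/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(worker, name)) {
      throw new Error('no value for placeholder ' + match);
    }
    // Encode each value so it is safe inside a path segment or query string.
    return encodeURIComponent(worker[name]);
  });
}

// Same shape as the url in the script, with the ping task name appended.
const template = 'https://roller1.srv.releng.mdc2.mozilla.com' +
  '/api/v1/workers/<workerId>/jobs?provisioner_id=<provisionerId>' +
  '&worker_type=<workerType>&worker_group=<workerGroup>&task_name=ping';

const exampleUrl = expandActionUrl(template, {
  workerId: 't-yosemite-r7-100',
  provisionerId: 'releng-hardware',
  workerType: 'gecko-t-osx-1010-beta',
  workerGroup: 'mdc2',
});
console.log(exampleUrl);
```

With the worker values shown, the result is the example roller url given above.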

Finally, the actions work through roller1.srv.releng.mdc2.mozilla.com.
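Because a failover only changes the host in each action url, the data center can be treated as a parameter. The sketch below builds the same `actions` payload as the script above for either data center. `rollerActions` and its arguments are illustrative names, and the host pattern is an assumption based on the mdc1 and mdc2 roller urls used on this page.

```javascript
// Hypothetical helper: build the declareProvisioner payload for one data center.
// `tasks` maps each roller task name to the description shown for the action.
function rollerActions(dataCenter, tasks) {
  // Host pattern assumed from the mdc1 and mdc2 urls on this page.
  const base = 'https://roller1.srv.releng.' + dataCenter + '.mozilla.com:443' +
    '/api/v1/workers/<workerId>/jobs?provisioner_id=<provisionerId>' +
    '&worker_type=<workerType>&worker_group=<workerGroup>&task_name=';
  return {
    actions: Object.entries(tasks).map(([name, description]) => ({
      name: name,
      title: name,
      context: 'worker',
      url: base + name,
      method: 'POST',
      description: description,
    })),
  };
}

const payload = rollerActions('mdc2', {
  ping: 'ping server',
  reboot: 'reboot hardware',
});
console.log(JSON.stringify(payload, null, 2));
```

The returned object could be passed to `q.declareProvisioner(provisionerId, payload)` in place of `data`; failing back would then be a matter of calling `rollerActions('mdc1', ...)` and running the script again.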


Actions

Example showing roller development actions in tools.taskcluster.net.

Roller is configured through releng-puppet. The actions are added manually to the tools.taskcluster.net provisioners and are visible in the worker views. To use the actions, you must allow the SSL certificate in your browser (see below) and have the appropriate taskcluster scopes. The scope for each action is defined in puppet, for example "project:releng:roller:ping".
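To illustrate the scope requirement, the sketch below derives the scope for an action and checks it against a list of scopes. The `project:releng:roller:<action>` pattern is extrapolated from the single example above, and `rollerScope` and `hasRollerScope` are hypothetical helpers. Roller itself authorizes against the TaskCluster auth server rather than with code like this. A scope ending in `*` is treated as a prefix match, following the usual TaskCluster convention, and the list is assumed to be already expanded (roles are not resolved here).

```javascript
// Assumed naming pattern, based on the "project:releng:roller:ping" example.
function rollerScope(action) {
  return 'project:releng:roller:' + action;
}

// Returns true if any scope in `scopes` satisfies the scope for `action`.
function hasRollerScope(scopes, action) {
  const required = rollerScope(action);
  return scopes.some((scope) =>
    scope.endsWith('*')
      ? required.startsWith(scope.slice(0, -1)) // trailing * matches any suffix
      : scope === required);
}

console.log(hasRollerScope(['project:releng:roller:ping'], 'ping'));
console.log(hasRollerScope(['project:releng:roller:ping'], 'reboot'));
console.log(hasRollerScope(['project:releng:roller:*'], 'reboot'));
```

The three calls illustrate an exact match, a missing scope, and a wildcard scope that covers every roller action.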

Results

The taskcluster tools interface issues the request to Roller, and Roller authorizes the action against the TaskCluster auth server. If your logged-in taskcluster id has scopes to perform the action, Roller queues the action and responds that the action is accepted. This is the check-mark success that is shown in the tools.taskcluster.net interface.

The actual results of the action are sent to your email address (assumed to be tc_clientid@mozilla.com) and directly to you on Mozilla's IRC server (your IRC nickname is assumed to be your tc_clientid).

  • Email to tc_clientid@mozilla.com
  • IRC personal message from taskcluster to tc_clientid
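The addressing assumption can be written out as a short sketch. This only restates the convention described above; `resultTargets` is a hypothetical name, the client id `jdoe` is made up, and roller's actual implementation may differ.

```javascript
// Hypothetical helper: where results would be sent for a given client id,
// under the assumption stated above (email local part and IRC nick both
// equal the taskcluster client id).
function resultTargets(clientId) {
  return {
    email: clientId + '@mozilla.com',
    ircNick: clientId,
  };
}

console.log(resultTargets('jdoe'));
```

If your client id differs from your email name or IRC nickname, the results would likely not reach you under this convention.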

This out-of-band return of results is the simplest current approach; the tools.taskcluster.net website shows only the pass/fail result of the synchronous request.

What does the SSL failure look like from the tools.taskcluster.net interface?

Allow the self-signed SSL certificate

Load a url for roller: https://roller1.srv.releng.mdc1.mozilla.com/api/v1/workers/auth-ssl/jobs (This matches the urls for the actions on tools.taskcluster.net like https://tools.taskcluster.net/provisioners/releng-hardware/worker-types/gecko-t-linux-talos/workers/mdc1/t-linux64-ms-004)

Your web browser will report that the certificate is invalid and may ask whether to allow it. Choose to allow it, or add a security exception for it. Once the certificate is accepted, you will receive an error from the roller server; this is expected, because the request does not carry your TaskCluster account information. You cannot make API calls to roller directly.

Once the certificate is allowed, the requests to roller from your view of tools.taskcluster.net will not be blocked.

Example of accepting the roller SSL certificate.