Auto-tools/Projects/Pulse: Difference between revisions

 
(35 intermediate revisions by 6 users not shown)
= Introducing Pulse =


https://pulse.mozilla.org/
Pulse is a managed [http://www.rabbitmq.com RabbitMQ] cluster designed to provide loose coupling between automation and infrastructure tools.  The goal of Pulse is to add visibility to Mozilla's tools and systems and to eliminate polling and other brittle methods of scraping data. This allows more robust, dynamic, and informative tools.


Mozilla currently has many different systems that are interconnected via polling, screen scraping, email, and other brittle methods. To make their lives easier, community members often build tools on top of this house of cards, adding yet another level of scraping and polling. Many systems don't even export important data for others to scrape and use, which prevents better tools from being written.
Pulse is available at pulse.mozilla.org:5671 (AMQP over SSL).  It is hosted by [http://cloudamqp.com CloudAMQP].


[[Auto-tools/Projects/Pulse/PulseGuardian|PulseGuardian]] is a tool that manages Pulse's users and queues (and eventually exchanges).  It is available at https://pulseguardian.mozilla.org and hosted by [http://heroku.com Heroku].


We have a discussion forum available via the standard trio of [news:mozilla.tools.pulse USENET newsgroup], [https://lists.mozilla.org/listinfo/tools-pulse mailing list], and [https://groups.google.com/forum/#!forum/mozilla.tools.pulse Google Group].


File bugs under [https://bugzilla.mozilla.org/enter_bug.cgi?product=Webtools&component=Pulse Webtools :: Pulse].  We don't have a separate component for PulseGuardian; rather, we just start the summaries with "[PulseGuardian]".


Also see the [https://tools.taskcluster.net/pulse-inspector/ Pulse Inspector] web app, which displays Pulse messages in real time, and the (manually updated) [[/Exchanges|list of Pulse exchanges]].
= System Description =


Pulse isn't any one thing.  At its heart, it is a RabbitMQ system with a particular configuration and a set of conventions for using it along with a management tool, [[Auto-tools/Projects/Pulse/PulseGuardian|PulseGuardian]], to make Pulse as automated and self-serve as possible.  Pulse follows the pub-sub pattern, in which publishers send messages to topic exchanges, and consumers create queues bound to these exchanges in order to subscribe to the publishers' messages.  In general, publishers create and own exchanges, and consumers create and own queues.
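As a rough illustration of the pub-sub pattern described above, the following Python sketch models a topic exchange in memory. It does not talk to Pulse or RabbitMQ; the names <code>topic_matches</code> and <code>TopicExchange</code>, the exchange name, and the routing keys are all hypothetical. The matching follows the AMQP 0.9.1 topic rules, in which <code>*</code> matches exactly one dot-separated word and <code>#</code> matches zero or more words.

```python
def topic_matches(binding_key: str, routing_key: str) -> bool:
    """Return True if an AMQP-style topic binding key matches a routing key."""
    pattern = binding_key.split(".")
    words = routing_key.split(".")

    def match(p: int, w: int) -> bool:
        if p == len(pattern):
            # Pattern exhausted: it matches only if all words were consumed.
            return w == len(words)
        if pattern[p] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(p + 1, k) for k in range(w, len(words) + 1))
        if w == len(words):
            return False
        if pattern[p] == "*" or pattern[p] == words[w]:
            # '*' matches exactly one word; a literal must be equal.
            return match(p + 1, w + 1)
        return False

    return match(0, 0)


class TopicExchange:
    """In-memory stand-in for a topic exchange owned by a publisher."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.bindings = []  # list of (binding_key, queue) pairs

    def bind(self, queue: list, binding_key: str) -> None:
        """A consumer binds its own queue to this exchange."""
        self.bindings.append((binding_key, queue))

    def publish(self, routing_key: str, payload: dict) -> None:
        """Deliver the payload once to every queue with a matching binding."""
        delivered = []
        for key, queue in self.bindings:
            already = any(q is queue for q in delivered)
            if not already and topic_matches(key, routing_key):
                queue.append(payload)
                delivered.append(queue)


# Hypothetical publisher-owned exchange and two consumer-owned queues.
exchange = TopicExchange("exchange/example-publisher/builds")
all_builds, linux_only = [], []
exchange.bind(all_builds, "build.#")
exchange.bind(linux_only, "build.*.linux")

exchange.publish("build.mozilla-central.linux", {"status": "ok"})
exchange.publish("build.mozilla-central.mac", {"status": "ok"})
```

Here the queue bound with <code>build.#</code> receives both messages, while the queue bound with <code>build.*.linux</code> receives only the first.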


= Specification =


Pulse credentials are managed and issued by [[Auto-tools/Projects/Pulse/PulseGuardian|PulseGuardian]],
available at https://pulseguardian.mozilla.org. This service SHALL issue
an ''accessToken'' for any ''clientId'' that is registered
with an ''authorized'' email address.
== Publishers ==


Publishers MUST name ''exchanges'' in the form <code>exchange/<clientId>/<name></code>, where ''clientId'' is the user ID used to connect to the server. Attempts to name an exchange otherwise SHALL result in an authorization error. Exchanges MUST be ''topic exchanges'' and MUST be declared ''durable''.
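The naming rule above can be checked on the client side before declaring an exchange. The following is a minimal Python sketch, not part of any Pulse library: the helper names <code>exchange_name</code> and <code>is_owned_by</code> are hypothetical, and the assumption that a ''clientId'' never contains a slash is ours. The actual enforcement happens on the server, which rejects other names with an authorization error.

```python
def exchange_name(client_id: str, name: str) -> str:
    """Build an exchange name of the form exchange/<clientId>/<name>."""
    # Assumption: a clientId is a single path segment with no slashes.
    if not client_id or "/" in client_id:
        raise ValueError("client_id must be non-empty and contain no '/'")
    if not name:
        raise ValueError("name must be non-empty")
    return f"exchange/{client_id}/{name}"


def is_owned_by(exchange: str, client_id: str) -> bool:
    """Return True if the exchange name falls under the given clientId."""
    prefix = f"exchange/{client_id}/"
    return exchange.startswith(prefix) and len(exchange) > len(prefix)


# Hypothetical clientId and exchange name, for illustration only.
example = exchange_name("example-user", "v1/builds")
```

A publisher could run such a check before issuing the declare call, so that a misnamed exchange fails fast locally rather than as an authorization error from the server.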


Messages MUST contain a UTF-8-encoded [http://tools.ietf.org/html/rfc7159 JSON] payload, and
= Let's Use It =


There are currently two Pulse clients available. You can also connect to Pulse from other languages, provided you have an AMQP 0.9.1 library that lets you interact with AMQP exchanges; see https://github.com/rabbitmq/rabbitmq-tutorials#languages for examples.
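Many AMQP 0.9.1 libraries accept a connection URI. As a small sketch, the Python helper below builds an <code>amqps://</code> URI for the host and port given at the top of this page. The function name <code>pulse_amqps_url</code> is hypothetical, and using the ''clientId'' as the user name and the ''accessToken'' as the password is an assumption; check what your library and PulseGuardian account actually expect.

```python
from urllib.parse import quote


def pulse_amqps_url(client_id: str, access_token: str,
                    host: str = "pulse.mozilla.org", port: int = 5671) -> str:
    """Build an amqps:// URI (AMQP over SSL) with percent-encoded credentials."""
    # Encode every reserved character so '@', ':' or '/' in the
    # credentials cannot be confused with URI delimiters.
    user = quote(client_id, safe="")
    password = quote(access_token, safe="")
    return f"amqps://{user}:{password}@{host}:{port}"


# Hypothetical credentials, for illustration only.
url = pulse_amqps_url("example-user", "p@ss/word")
```

The virtual host is deliberately left out of the URI, so the client library's default applies.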


== Python Pulse client library ==


The [https://github.com/mozilla-services/mozillapulse mozillapulse] Python package provides classes for existing publishers, consumers, and messages so you can quickly build Pulse applications.  See the [https://github.com/mozilla-services/mozillapulse/blob/master/README.md README] to get started.  Note that the test publisher is currently offline (see {{bug|1218976}}); you can use another consumer, e.g. BuildConsumer, to verify your setup.


This library is somewhat inflexible, however, and should be rewritten. One idea is to turn TaskCluster's Python client into a standalone package.


== Go (golang) Pulse client library ==


This can be found here:
* http://taskcluster.github.io/pulse-go/


Extensions for the TaskCluster-specific exchanges are available here (see the section "AMQP APIs"):
* http://taskcluster.github.io/taskcluster-client-go/


= Contributing =


To set up a local system for development, see the [https://github.com/mozilla-services/mozillapulse/blob/master/HACKING.md HACKING.md] file included in the mozillapulse source.


The main Pulse library (mozillapulse) and publisher shims (pulseshims) are written in Python, although there is also a Go library as mentioned in the section above.  We also want to provide a canonical JavaScript library at some point.  To hack on the main Pulse library, you should be comfortable in Python, and it's helpful to understand the basics of AMQP.  Knowledge of kombu is also useful.
 
To hack on PulseGuardian, you should know some Python and JavaScript.  Experience with Flask, SQLAlchemy, and RabbitMQ is useful, but you can probably learn what you need as you fix bugs.
 
Feel free to stop by #pulse or #ateam with questions!
 
Here is the list of open, unassigned, mentored Pulse and PulseGuardian bugs to get you started.


<bugzilla>
</bugzilla>


For mentored bugs, we use the User Story to provide a link back to this page, as well as any extra information for contributors, such as required knowledge or tools.  The basic text for mentored bugs should be "This is a mentored Pulse bug.  For general information on Pulse, see https://wiki.mozilla.org/Auto-tools/Projects/Pulse, which includes a section on Contributing."  An example of extra text is "This bug also requires you to have a working mail server."
 
= Consuming Buildbot messages =
 
There are two ways to consume messages published by Buildbot.  The most direct way, which requires the most knowledge about Buildbot, is using the BuildConsumer in [http://hg.mozilla.org/automation/mozillapulse mozillapulse].  This consumer has access to all the native Buildbot messages, and therefore offers the most flexibility.
 
The disadvantage of using the BuildConsumer is that you need to spend time understanding which messages Buildbot publishes to Pulse and how they can vary, and then associate particular messages with what you're trying to accomplish.  The format of Buildbot messages is undocumented and can change without warning, which makes services based on the BuildConsumer potentially fragile.
 
To address some of these disadvantages, a translator (the [https://github.com/mozilla/pulsetranslator pulsetranslator]) runs against the BuildConsumer and re-publishes a subset of Buildbot messages to a NormalizedBuild exchange; these messages are available using the NormalizedBuildConsumer of mozillapulse.  The content of these messages is simplified and normalized, making them easier to consume without a thorough understanding of how Buildbot publishes messages to Pulse.  The re-published messages also protect consumers against some changes to the Pulse stream, although significant enough changes will likely break the pulsetranslator as well as direct users of BuildConsumer.
 
Another advantage of the NormalizedBuildConsumer is that messages for a given build or test job are re-published only after the logs for that job are available; using the BuildConsumer directly can result in receiving messages for a build before the build artifacts are available, which can cause problems in consumers that don't explicitly guard against this.
 
Generally speaking, consumers that wish to be notified when specific build or test jobs are completed should use the NormalizedBuildConsumer; consumers that need direct access to the Buildbot pulse stream or are looking for non-specific jobs (such as all jobs belonging to a particular commit) should probably use the BuildConsumer.


= Road Map =


See the [https://bugzilla.mozilla.org/buglist.cgi?resolution=---&query_format=advanced&component=Pulse&product=Webtools prioritized bug list] for all open issues and feature requests.


= Security Model =


= Admin Procedures =
dustin and the taskcluster team have access to the Pulse cluster on CloudAMQP and the following related services:


* PulseGuardian should be deleting queues that are too long. If you need to manually delete a queue, use the Management UI. Try to ping the queue owner first before killing if possible.
* pulsetranslator service, which normalizes Buildbot messages, is currently running on pulsetranslator.ateam.phx1.mozilla.com and may need to be reset from time to time.
 
* logparser service, used by [http://brasstacks.mozilla.com/orangefactor/ Orange Factor], runs on orangefactor1.dmz.phx1.mozilla.com
== To upgrade an SSL certificate on pulse.mozilla.org ==
 
Open a bug with IT to generate a new certificate: https://bugzilla.mozilla.org/enter_bug.cgi?product=Infrastructure%20%26%20Operations&component=SSL%20Certificates
See {{bug|1532325}} for an example.
 
IT needs to email support@cloudamqp.com with the new cert.  The CloudAMQP support team will install it on all of our CloudAMQP nodes.  After it has been installed, you can log in to the administrative console and restart the nodes one by one, which will not result in any downtime.  (Ensure you wait for each node to finish restarting before restarting another one.)  Verify that the certs are installed on the nodes:
 
* https://sslanalyzer.comodoca.com/?url=pulse.mozilla.org
* https://sslanalyzer.comodoca.com/?url=orange-antelope-01.rmq.cloudamqp.com
* https://sslanalyzer.comodoca.com/?url=orange-antelope-02.rmq.cloudamqp.com
* https://sslanalyzer.comodoca.com/?url=orange-antelope-01.rmq.cloudamqp.com
 
This should show the dates for the new certificate and that the cert is trusted by Mozilla and Microsoft.
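The certificate dates can also be checked from a script. The following is a minimal Python sketch using only the standard library; the function names are hypothetical, and the default port of 5671 is an assumption based on the AMQP-over-SSL port given at the top of this page.

```python
import socket
import ssl


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between 'now' (epoch seconds) and a certificate's notAfter date."""
    # cert_time_to_seconds parses dates such as "Jan  5 09:34:43 2018 GMT".
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(host: str, port: int = 5671, timeout: float = 10.0) -> str:
    """Connect over TLS and return the server certificate's notAfter string."""
    # The default context verifies the chain and the hostname, so an
    # untrusted or mismatched certificate raises ssl.SSLError here.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

For example, passing the result of <code>fetch_not_after("pulse.mozilla.org")</code> together with <code>time.time()</code> to <code>days_until_expiry</code> gives the remaining validity in days. A verification failure during the handshake indicates that the installed certificate is not trusted by the local certificate store.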
 
CloudAMQP has updated its web interface since we last did this, so you should now be able to upload the cert yourself and have it propagate.  See the admin console under "Certificate".


= More reading =