CloudServices/Sagrada/Metlog
= Overview =
The '''Metrics''' project is part of [[Services/Sagrada|Project Sagrada]], providing a service that allows applications to capture and inject arbitrary data into back-end storage suitable for out-of-band analytics and processing.
Requirements:
* Service Apps using the primary Services Python framework will get certain pre-defined metrics captured for them "for free" (i.e. without having to explicitly request them). Configuration options will be provided to prevent some or all of this data collection when required for user privacy and/or security reasons.
* Service Apps will be provided a simple mechanism for inserting arbitrary data points into the metrics system.
* All inserted data will be transparently (to the service app) processed and passed into the appropriate back-end storage and analytics destination.
* Service app owners will have access to an interface (or interfaces) offering a predefined set of reports and/or graphs and charts displaying useful information about certain predefined captured data points (e.g. unique daily users).
* Service app owners will have access to an interface (or interfaces) where they can perform arbitrary queries against the data points that have been captured.
== Proposed API ==
 
The full Services Metrics infrastructure will consist of two APIs. The first will be a mechanism for sending performance- and ops-related data, limited to increment counters and timers (i.e. time elapsed for completion of a certain operation), into a [https://github.com/etsy/statsd statsd] setup, which will ultimately feed into a [http://graphite.wikidot.com/ graphite] installation. The second will provide a way to capture arbitrary text data, analogous to syslog-style log entries, with each record accompanied by a set of string tokens that identify the type of payload the record contains as well as any other metadata that may be useful for analytics and/or processing.
 
The statsd API will be provided by including existing statsd client libraries: [https://github.com/jsocol/pystatsd pystatsd] for the Python services and [https://github.com/sivy/node-statsd node-statsd] for node.js-based services. The core Python service app platform that Services provides will contain pystatsd calls capturing basic information such as successful login counters, total time elapsed for HTTP request handling, etc. Including a 'statsd = false' setting in the app configuration will prevent this data from being collected.
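As a rough illustration of how those calls might look, here is a minimal sketch assuming the ''StatsClient'' interface from the jsocol/pystatsd library; the metric names, prefix, and host/port values are invented for this example and are not part of the framework:

<pre>
# Illustrative sketch only: metric names and connection details are assumptions.
from statsd import StatsClient

# Assume a local statsd daemon listening on the conventional port 8125.
statsd_client = StatsClient(host='localhost', port=8125, prefix='syncserver')

def record_login(success):
    """Increment a counter for each login attempt."""
    statsd_client.incr('login.success' if success else 'login.failure')

def handle_request(environ):
    """Time an HTTP request handler; the elapsed time is reported to statsd."""
    with statsd_client.timer('request.handler'):
        return {'status': '200 OK'}  # stand-in for real request handling
</pre>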
The API for general metrics data collection will be minimal. For Python apps we will provide a `metlog` library exposing the following functions:

'''set_metlog_dest(host, port)'''

Specifies the address and port of the metlog listener, the destination of the UDP packets that will be sent out as a result of subsequent ''metlog'' calls. The Services Python framework will provide a mechanism to specify this via configuration files so service authors won't have to make this call themselves.

'''metlog(tokens, msg)'''
Sends a single log message to the previously specified metlog listener.
''tokens'' should be a tuple of string tokens containing any metadata
required to identify and classify the message, while ''msg'' should contain
the main data payload. This will be serialized into a simple format and
sent via UDP to the listener, "fire and forget" for minimal performance
impact on the calling application.
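To make the proposed interface concrete, the following is a minimal sketch of what such a library could look like; the JSON wire format, module layout, port number, and token values used here are assumptions made for illustration, not part of the proposal:

<pre>
# Sketch of the proposed metlog client API; the serialization format (JSON)
# and all example values are assumptions, not the actual implementation.
import json
import socket

_dest = None
_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def set_metlog_dest(host, port):
    """Record the address of the metlog listener for later metlog() calls."""
    global _dest
    _dest = (host, port)

def metlog(tokens, msg):
    """Serialize one record and send it to the listener, fire and forget."""
    if _dest is None:
        return  # no listener configured; silently drop the message
    payload = json.dumps({'tokens': list(tokens), 'msg': msg})
    try:
        _sock.sendto(payload.encode('utf-8'), _dest)
    except socket.error:
        pass  # never let metrics delivery raise into the calling application

# Hypothetical usage from a service app:
set_metlog_dest('localhost', 5565)
metlog(('sync', 'daily_active_user'), 'userid=12345')
</pre>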
A similar library can be constructed in JavaScript for use in node.js applications.
The first iteration of this solution will not require a great deal of engineering to implement, as it will leverage infrastructure that is already in place. The statsd client will be configured to talk to the statsd services that are already planned to be running on every Services host. The stats gathered from the various hosts will be sent on to the Services Ops graphite installation, which will aggregate and graph stats of a similar nature for developer consumption.
The metlog portion will similarly make use of existing infrastructure. Services Ops is going to have instances of the [http://logstash.net/ logstash] service in place which will be processing the log output from our various processes. We will write a UDP listener input module for logstash, which will serve as the metlog listener. Logstash will then batch these messages and construct HTTP requests delivering collections of messages to a [https://github.com/mozilla-metrics/bagheera Bagheera] instance provided to us by the Metrics team. The messages will ultimately end up in a Hadoop data store. Access to a [https://hive.apache.org/ Hive] interface will be available to allow construction of arbitrary queries against any of the data that has landed.
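The listener itself will be implemented as a logstash input module (i.e. in Ruby); purely to illustrate the flow, here is a standalone Python sketch of what the listener side conceptually does, assuming the JSON-over-UDP format from the client sketch above and an invented batch size:

<pre>
# Conceptual illustration only; the real listener will be a logstash input
# module, not this script. Port and batch size are arbitrary example values.
import json
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('0.0.0.0', 5565))  # matches the port used in the client sketch

batch = []
while True:
    data, _addr = sock.recvfrom(65535)
    try:
        record = json.loads(data.decode('utf-8'))
    except ValueError:
        continue  # ignore malformed packets
    batch.append(record)
    if len(batch) >= 100:
        # In the real pipeline, logstash would POST a batch like this to Bagheera.
        print('would flush %d records to Bagheera' % len(batch))
        batch = []
</pre>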