CloudServices/Sagrada/Metlog

1,303 bytes removed, 17:22, 4 October 2011
* Rob Miller
* Victor Ng
 
== User Requirements ==
 
=== Phase 1 ===
The first version of the Metrics system will focus on providing an easy mechanism for the [[Services/Sync| Sync]] and [https://browserid.org/ BrowserID] projects (and any other internal Mozilla services) to efficiently send profiling data and any other arbitrary metrics information into one or more backend storage locations. Once the data has reached its final destination, those with appropriate access should be able to run analytics queries and generate reports on the accumulated data.
Requirements:
* Services apps should be provided an easy-to-use API that will allow them to send arbitrary text data into the metrics and reporting infrastructure.
* Processing and I/O load generated by the API calls made by the services apps must be extremely small, to allow for minimal impact on app performance even when there is a very high volume of messages being passed.
* API should provide a mechanism for arbitrary metadata to be attached to every message payload.
* Overall system should provide a sensible set of message categories so that commonly generated types of messages can be labeled as such, and so that the processing and reporting functionality can easily distinguish between the various types of message payloads.
* Message taxonomy must be easily extendable to support message types that are not defined up front.
* Message processing system must be able to distinguish between different message types, so the various types can be routed to the appropriate back end(s) for effective analysis and reporting.
* Service app owners must have access to an interface (or interfaces) that will provide reporting and querying capabilities appropriate to the various types of messages that have been sent into the system.
== Proposed API ==
The atomic unit for the Services Metrics system is the "message". The structure of a message is inspired by that of the well-known syslog message standard, with some slight extensions to allow for arbitrary metadata. Each message will consist of the following fields:
* timestamp: Time at which the message is generated.
* logger: String token identifying the message generator, such as the name of the service application in question.
* severity: Numerical code from 0-7 indicating the severity of the message, as defined by [https://tools.ietf.org/html/rfc5424 RFC 5424].
* message: Message text payload.
* metadata: Arbitrary set of key/value pairs that indicates the type of message that is being generated and includes any additional data that may be useful for back end reporting or analysis.
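The message fields can be illustrated with a minimal Python sketch. The helper name <code>build_message</code>, the use of a plain dictionary, and the JSON encoding are assumptions made for illustration only; this page does not specify how the library represents or serializes a message. The severity names are those defined by RFC 5424.

```python
import json
import time

# RFC 5424 severity names, indexed by their numerical code (0-7).
SEVERITY_NAMES = (
    "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Informational", "Debug",
)


def build_message(logger, severity, message, metadata=None, timestamp=None):
    """Return a dict holding the five message fields (hypothetical helper)."""
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    return {
        "timestamp": time.time() if timestamp is None else timestamp,
        "logger": logger,
        "severity": severity,
        "message": message,
        # Copy so later changes by the caller do not alter the message.
        "metadata": dict(metadata or {}),
    }


# Example values are made up for illustration.
example = build_message(
    "sync", 6, "user logged in", {"type": "counter"}, timestamp=1317748920.0
)
# JSON is an assumed encoding, not one named by this page.
encoded = json.dumps(example, sort_keys=True)
```

Keeping the metadata as a separate key/value mapping, rather than folding it into the message text, is what lets a back end route on message type without parsing the payload.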
We will provide a "MetLog" library that will both ease generation of these messages and handle packaging them up and delivering them (via UDP) into the message processing infrastructure. Implementations of this library will likely be available in both Python and JavaScript, but the Python library will be available first and this document will, for now, only describe the Python API. The JavaScript API will be similar, modulo syntactic sugar that is available in Python but not in JS (e.g. decorators, context managers), and will be documented in detail in the future. The proposed Python API is as follows:
'''set_metlog_dest(host, port)'''
call themselves.
'''set_default_logger(logger)'''
Specifies a logger value to use as the default for all subsequent ''metlog'' calls in which a logger value is not explicitly provided.
'''set_message_flavor(flavor_name, metadata)'''
The metadata for a given message can be used to label and categorize that message. This function expects a string value ''flavor_name'' and a dictionary ''metadata''. The flavor name value can be passed in as a ''flavor'' to subsequent ''metlog'' calls as shorthand for including the specified metadata in the outgoing message.
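One way to picture flavors is as a registry that maps each flavor name to a metadata dictionary. The sketch below is not the library's implementation: the module-level <code>_flavors</code> registry and the <code>metadata_for</code> helper are hypothetical, and letting flavor metadata overwrite explicitly passed keys on conflict is an assumption.

```python
# Registry mapping flavor names to metadata dictionaries (assumed storage).
_flavors = {}


def set_message_flavor(flavor_name, metadata):
    """Register a metadata dictionary under a flavor name."""
    _flavors[flavor_name] = dict(metadata)


def metadata_for(flavors, metadata=None):
    """Merge each named flavor's metadata into a copy of ``metadata``.

    Hypothetical helper: flavor values win on key conflicts (an assumption).
    """
    merged = dict(metadata or {})
    for name in flavors:
        merged.update(_flavors[name])
    return merged


# Flavor name and metadata keys below are made up for illustration.
set_message_flavor("timer", {"type": "timer"})
combined = metadata_for(["timer"], {"operation": "login"})
```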
'''metlog(timestamp=None, logger=None, severity=6, message="", metadata=None, flavors=None)'''
Sends a single log message to the previously specified metlog listener.
Most of the arguments correspond to the message fields described above.
None of them are strictly required, but most of them will be populated by
reasonable defaults if they aren't provided:
* ''timestamp'': Defaults to current system time.
* ''logger'': Defaults to what has been specified using the ''set_default_logger'' call, or to an empty string if ''set_default_logger'' hasn't been called.
* ''severity'': Defaults to 6 ("Informational").
* ''message'': Defaults to an empty string.
* ''metadata'': Defaults to an empty dictionary.
* ''flavors'': Any specified flavors will cause this message's metadata value to be updated to contain the flavor's metadata; defaults to an empty list.
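The defaults listed above can be sketched as a small module. This is a hedged illustration, not the MetLog library itself: the <code>_state</code> dictionary, the JSON wire format, the behavior of <code>set_metlog_dest</code> (whose description is not shown on this page), and returning the message from <code>metlog</code> are all assumptions made so the example is self-contained.

```python
import json
import socket
import time

# Module-level configuration (assumed; the page does not say how state is kept).
_state = {"dest": None, "logger": "", "flavors": {}}


def set_metlog_dest(host, port):
    """Record the UDP listener address (assumed behavior)."""
    _state["dest"] = (host, port)


def set_default_logger(logger):
    """Set the logger used when a metlog call omits one."""
    _state["logger"] = logger


def set_message_flavor(flavor_name, metadata):
    """Register metadata that can later be pulled in by flavor name."""
    _state["flavors"][flavor_name] = dict(metadata)


def metlog(timestamp=None, logger=None, severity=6, message="",
           metadata=None, flavors=None):
    """Build a message using the documented defaults and send it over UDP."""
    merged = dict(metadata or {})
    for name in flavors or []:
        # Update the message metadata with each flavor's metadata.
        merged.update(_state["flavors"][name])
    msg = {
        "timestamp": time.time() if timestamp is None else timestamp,
        "logger": _state["logger"] if logger is None else logger,
        "severity": severity,
        "message": message,
        "metadata": merged,
    }
    if _state["dest"] is not None:
        # Send one datagram and do not wait for any reply.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            sock.sendto(json.dumps(msg).encode("utf-8"), _state["dest"])
        finally:
            sock.close()
    # Returning the message is a convenience for this sketch only.
    return msg
```

A service app would call <code>set_metlog_dest</code> and <code>set_default_logger</code> once at startup, then call <code>metlog(message="...", flavors=[...])</code> wherever a data point is produced.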