Auto-tools/Projects/AlertManager: Difference between revisions

m (re work to new format)
Line 1: Line 1:
= Overview =
= Team =
 
* Talos Sheriffs (:jmaher, :vaibhav1994, :mishravikas)
* Developers (:jmaher, :kaustabh93, :mishravikas, :vaibhav1994)
* Others (:avih, :wlach, :dminor)
 
= Problem =


Alert Manager is a simple, single-purpose tool for managing Talos-based automated alerts.
Line 7: Line 13:
Alert Manager provides a WebUI for us to triage, categorize, investigate, and manage the large alert volume while keeping our performance regressions well documented.


= How it works =
 
= Goals & Considerations =
 
''Requirements and relevant constraints of the project, including main function, interface, and security concerns.  The "what".''
 
= Non-Goals =
 
''Anything of note that is specifically not going to be accomplished in this project.  The "what not".''
 
= Dependencies / Who will use this =
 
''Outline what technologies, APIs, and tools this depends on.  We also want to outline who our intended audience is if it isn't called out specifically in the Problem statement.''
 
= Design and Approach =


The automated alerts are sent to the [https://groups.google.com/forum/#!forum/mozilla.dev.tree-management mozilla.dev.tree-management newsgroup].  We have a script which parses all the alerts and puts them into a SQL database.  We care about:
Line 24: Line 43:
An end user acts on an alert by looking at the data before and after the changeset in question.  It is common practice to retrigger at least once, if not three times, to confirm that this really is the offending changeset.  At this stage we also build any builds that were skipped, if we can.  Once we have determined the changeset that caused the problem, we file a bug and add it to the WebUI so we can easily reference it.  Here is a link to a [http://elvis314.wordpress.com/2014/05/08/the-lifecycle-of-a-talos-performance-regression/ graphic view of this workflow].
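
The store-then-link-a-bug flow described above can be illustrated with a minimal sketch. It assumes a local SQLite database and a hypothetical three-column schema (subject, raw body, bug number); the table, column, and function names are invented for illustration and are not taken from the Alert Manager code, and the fields the real parser extracts are not reproduced here.

```python
import sqlite3


def create_store(path=":memory:"):
    """Open (or create) a small alert store. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    # bug stays NULL until someone files a bug for the alert.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS alerts (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               subject TEXT NOT NULL,
               body TEXT NOT NULL,
               bug INTEGER
           )"""
    )
    return conn


def add_alert(conn, subject, body):
    """Store one alert message and return its row id."""
    cur = conn.execute(
        "INSERT INTO alerts (subject, body) VALUES (?, ?)", (subject, body)
    )
    conn.commit()
    return cur.lastrowid


def attach_bug(conn, alert_id, bug):
    """Record the bug number filed for an alert."""
    conn.execute("UPDATE alerts SET bug = ? WHERE id = ?", (bug, alert_id))
    conn.commit()


def untriaged(conn):
    """Return (id, subject) for alerts that have no bug attached yet."""
    rows = conn.execute(
        "SELECT id, subject FROM alerts WHERE bug IS NULL ORDER BY id"
    )
    return rows.fetchall()
```

In this sketch, an alert that still has no bug number is treated as needing triage, so a WebUI could list the result of <code>untriaged()</code> as the outstanding work queue.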


= Alert Manager Roadmap =
= Milestones and Dates =


Our goal is to provide the best set of tools to manage all performance regressions at Mozilla.  To achieve that, here are some changes we need to consider implementing:
Line 39: Line 58:
Eventually this will be integrated into TBPL's replacement, [https://treeherder.mozilla.org/ui/#/jobs Treeherder].  Until Treeherder is proven and has a solid performance data storage backend and display UI, we are continuing our work here.  All features implemented here will act as a beta version, with much of the logic and code to be reused in Treeherder.


= Want to help? =
 
= Implementation =
 
''Technical notes, plans, and designs detailing how the project will be realized.  The specifics of "how". This should also include how we expect this to be used and typical use cases''
 
= Getting Involved =
 
== Getting Started ==


You can always contact :jmaher, :dminor, or :Kaustabh93 on IRC.  To find the code, go here:
Line 45: Line 71:


Good directions are available on GitHub in the [https://github.com/jmaher/alert_manager/blob/master/README.md README].  It includes sample data and everything needed to get going.
== Expectations ==
''Links to coding styles, patch/pull request guidelines, unittest requirements''
== Bugs ==


Here are some bugs to get your feet wet and start making progress: