Deep in the bowels of the thundery mountain Mt. Useradvocacy lives a group of beneficent Perl fairies named Hilda, Jake, and Emma, who--between tiny snack breaks consisting of whiskey and crumpets--keep an eye on the belching orifice known as the Input abyss and send out periodic emails when specific putrid expulsions trend upwards. However, they have grown old and weary, and their eyes sting from the intense heat of the sulfuric gases that hang in the air like a thick fog.
It is time for us to relieve them of these doldrums so that they may retire to a happier life.
This project covers rewriting the system that sends out email alerts for Input feedback.
- Project owner(s): Robert Rayborn, Cheng Wang, Will Kahn-Greene
- Status: In-progress
- January 16th, 2015: Robert and Cheng vidyo'd Will and told him about how we're killing the fairies and they need a break. Will wrote up the initial project plan.
- February 23rd, 2015: Will finished fixing the auth api bits on Input and Cheng's emitter emitted the first round of alerts and Alerts v1 was done!
- The items marked (input) will get done by Input devs on Input
- The items marked (ua) will get done by the elite crack UA squadron. The requirements for the dashboard and whimsification are out of scope for this project plan since it specifically covers Input work; they're mentioned here because the UA squadron will be consumers of the database.
This phase will focus on the following:
- (input) setting up database tables to capture Input alert events over time so that we have a historical archive of them
- (input) setting up GET and POST API endpoints for creating and reading alerts; API endpoints will be authenticated
- (ua) rewriting the Perl script into Python -- this will be an emitter of alerts (script shared here if you have mozilla.com credentials)
- (ua) setting up the script to run in cron on the Input admin node
- (ua) writing a dashboard (with enhanced whimsification for whimsy users) that looks at the database tables and exposes the data in pleasing and helpful ways
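The cron piece might be as small as a single crontab entry on the Input admin node. This is only a sketch: the paths and script name are hypothetical, and the hourly schedule comes from the requirements below.

```shell
# Illustrative crontab entry: run the alert emitter at the top of every hour.
# The virtualenv and script paths are placeholders, not real deployment paths.
0 * * * *  /path/to/venv/bin/python /path/to/alert_emitter.py
```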
Requirements for Input work:
- database tables on Input for holding alert data
- authenticated GET API on Input
- authenticated POST API on Input
- some way of generating and maintaining authentication credentials for the API on Input
- cron job will run every hour
- Python script will initially use maths taken from the Perl script
- Every variable in the logic should be easy to tune, since these values will likely be changed quite a bit
- FIXME: database requirements
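As a sketch of what an emitter's authenticated POST might look like, here is a hedged example that builds (but does not send) the request. The endpoint path, the auth header name, and the payload field names are all assumptions for illustration; the real contract lives in the API docs linked below.

```python
import json
import urllib.request

# Hypothetical endpoint; see the alerts API docs for the real path.
API_URL = "https://input.mozilla.org/api/v1/alerts/alert/"

def build_alert_request(token, summary, description, severity=5):
    """Build (but do not send) an authenticated POST for one alert event."""
    payload = {
        "severity": severity,
        "summary": summary,
        "description": description,
        "emitter_name": "input-word-alerts",  # assumed emitter identifier
        "emitter_version": 1,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed token-auth header; check the docs for the exact name.
            "Authorization": "Token {}".format(token),
        },
        method="POST",
    )

req = build_alert_request("s3cr3t", "'crash' trending up",
                          "'crash' in 12% of comments, up from 3%")
```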
- 1. Extract: Pull data for two time periods from the DB
  - Both time periods use the same filters:
    - Desktop ("product LIKE 'firefox'")
    - English ("locale = 'en-US'")
    - Sad ("happy = 0")
    - Not a campaign ("(campaign IS NULL OR campaign = '')")
    - Not another source ("(source IS NULL OR source = '')")
  - After: Last twelve hours
    - If < 50 comments, skip this run
  - Before: Previous 90 days (minus the last 12 hours)
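The extract step might be sketched in Python (the language the Perl script is being rewritten into). The window arithmetic follows the spec; the SQL filter string simply restates the conditions above and is illustrative only, since the real query depends on Input's actual table and column names.

```python
from datetime import datetime, timedelta

# WHERE clauses shared by both time periods, restated from the spec above.
# Column names are whatever Input's responses table actually uses.
BASE_FILTERS = (
    "product LIKE 'firefox' "
    "AND locale = 'en-US' "
    "AND happy = 0 "
    "AND (campaign IS NULL OR campaign = '') "
    "AND (source IS NULL OR source = '')"
)

def extract_windows(now=None):
    """Return the (After, Before) windows as (start, end) datetime pairs.

    After  = the last twelve hours
    Before = the previous 90 days, minus those last twelve hours
    """
    now = now or datetime.utcnow()
    after_start = now - timedelta(hours=12)
    before_start = now - timedelta(days=90)
    return (after_start, now), (before_start, after_start)

after, before = extract_windows(datetime(2015, 3, 1, 12, 0))
# The emitter would skip this run if the After window held < 50 comments.
```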
- 2. Aggregate: count the number of comments that each stemmed word occurs in
- Throw out abusive comments
- Ignore stop-words
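A minimal sketch of the aggregate step, assuming each comment is a dict with "text" and "abusive" keys (an assumption, not the real schema). The stemmer and stop-word set here are crude stand-ins; the real script would use a proper stemming library and a full stop-word list.

```python
from collections import Counter

STOP_WORDS = {"the", "a", "an", "is", "it", "to", "and", "on"}  # tiny subset

def stem(word):
    # Placeholder stemmer: lowercase and strip a trailing "s".
    return word.lower().rstrip("s")

def aggregate(comments):
    """Count, for each stemmed word, the number of comments it occurs in."""
    counts = Counter()
    for comment in comments:
        if comment.get("abusive"):  # throw out abusive comments
            continue
        # Use a set so a word counts once per comment, however often it appears.
        words = {stem(w) for w in comment["text"].split()}
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

counts = aggregate([
    {"text": "firefox crash often", "abusive": False},
    {"text": "crash on startup", "abusive": False},
    {"text": "spam spam", "abusive": True},
])
```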
- 3. Filter: keep only words that meet all of the following criteria
  - Throw out words with fewer than two occurrences in the "After" group
  - The percentage of comments containing the word after must be at least 5 percentage points higher than the percentage before
  - The percentage of comments containing the word after must be at least 100% higher than (i.e., at least double) the percentage before
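A sketch of the filter step, assuming the intent is to flag words whose share of comments is rising (the fairies alert when terms trend upward). Per the requirements, every threshold is exposed as an easily tuned parameter; the default values are the ones from the spec.

```python
def trending_words(after_counts, before_counts, n_after, n_before,
                   min_occurrences=2, min_point_diff=5.0, min_ratio=2.0):
    """Return words whose share of comments jumped in the After window.

    All thresholds are keyword arguments so they can be tuned easily.
    """
    kept = []
    for word, after_n in after_counts.items():
        if after_n < min_occurrences:
            continue  # fewer than two occurrences in the After group
        pct_after = 100.0 * after_n / n_after
        pct_before = 100.0 * before_counts.get(word, 0) / n_before
        if pct_after - pct_before < min_point_diff:
            continue  # not at least 5 percentage points higher
        if pct_after < min_ratio * pct_before:
            continue  # not at least double the Before percentage
        kept.append(word)
    return kept

# "crash": 10% after vs 1% before -> kept; "slow": only one occurrence.
kept = trending_words({"crash": 10, "slow": 1}, {"crash": 90},
                      n_after=100, n_before=9000)
```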
- [UA] 4. Report: Return the stats/comments
- User advocacy will be doing this piece (we think)
- An example of the current output can be found here (if you have Mozilla credentials)
- Feel free to take liberties in how you'd like to format this output in the most meaningful way. Cheng and Rob would be happy to discuss ideas.
Tracker bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1130599
- 1130762: [alerts] create django app, models and modelfactories for alerts (P1, RESOLVED)
- 1130765: [alerts] create authenticated get/post api for alerts (P1, RESOLVED)
- 1135376: [alerts] authorization isn't working in production (P1, RESOLVED)

3 Total; 0 Open (0%); 3 Resolved (100%); 0 Verified (0%)
- Docs for API: http://fjord.readthedocs.org/en/latest/alerts_api.html
- Docs for Tokens: http://fjord.readthedocs.org/en/latest/alerts_api.html#authentication-token
Deadline: Q2? Q3?
Move to Heka? This is probably blocked on moving Input to AWS first.
Machine learning for classification instead of simple stemming for the emitter