BugzillaAutoLanding: Difference between revisions

From MozillaWiki
=Project Description=
This project has been superseded by [https://lando.services.mozilla.com Lando], which hooks into [https://phabricator.services.mozilla.com Phabricator].
This project will create and deploy a set of tools that grab patches from a bug in Bugzilla, land them on try, and poll for the results of that push. If the results meet the criteria, the tools auto-land the patch(es) on trunk, poll for results again, and comment back to the bug with the trunk landing results, backing out if the results do not meet the criteria for landing on trunk.
 
=Goals=
*A set of smaller tools, each of which can be part of a larger script that runs "autolanding" to try and trunk
*Tools can be used from the command line
*Tools to control the process: lots of toggles to go back to manual sheriffing (Global KillSwitch)
*Using our build/test resources wisely, without increasing the load so much that try becomes unusable for developers working on their patches prior to automated/assisted landings
*Streamlining the try server and trunk landing process as much as possible without removing the human interaction with the build/test/perf results
 
=Non-Goals=
* Replacing humans in the landing process
* Handling performance regressions
* Providing the best possible UI
* Going to extreme lengths to support auto-landing patches which do not follow the rules which the tool requires
 
=People=
* Lukas Blakk
* Marc Jessome
 
=Designing the System=
A simple survey on [http://bit.ly/try_usage try usage] was created and advertised to find out how developers currently use try and what they think about the autolanding workflow. We got 52 responses. The survey caught some things we had missed, such as checking LDAP authentication before pushing something from Bugzilla, and it also provided new observations on try workflow. Many developers stated that landing to try via Bugzilla would be 'more work' than what push-to-try (using try syntax) currently offers. We are taking that into account in our design and adjusting the goal: the primary purpose is now landing to trunk, not just getting try results posted to the bug. Posting try results will be handled through try syntax instead: if a bug is specified, the results can be posted to the bug, and email notification can be turned off if desired.
 
=API=
==Architecture==
[[File:Autoland_API_Overview2.png|800px]]
 
==REST Interface==
List of request methods, urls, and parameters:
 
'''GET /bugs/{bugID}''' -> get bug patchsets
 
'''GET /patchSets/{patchSetID}''' -> get patchset information
 
'''GET /branches/{branchName}''' -> get branch information
 
 
 
'''POST /bugs/{bugID}''' -> create empty patchset
 
'''POST /branches/{branchName}''' -> create a new branch
 
 
 
'''PUT /patchSets/{patchSetID}/{patchID}''' -> add patch to patchset
 
'''PUT /branches/{branchName}''' -> update branch
 
'''PUT /branches/{branchName}/threshold''' -> set branch threshold
 
'''PUT /branches/{branchName}/status''' -> set status enabled/disabled
 
 
 
'''DELETE /patchSets/{patchSetID}''' -> delete a patchset (if not processing)
 
'''DELETE /branches/{branchName}''' -> delete branch from db
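
As a rough illustration of how the routes above fit together, the sketch below lists them in a table and matches an incoming method and path against it. The names <code>ROUTES</code> and <code>match_route</code> are hypothetical and introduced here only for illustration; they are not taken from the autoland code.

```python
import re

# Route table copied from the list above: (method, path template, description).
ROUTES = [
    ("GET", "/bugs/{bugID}", "get bug patchsets"),
    ("GET", "/patchSets/{patchSetID}", "get patchset information"),
    ("GET", "/branches/{branchName}", "get branch information"),
    ("POST", "/bugs/{bugID}", "create empty patchset"),
    ("POST", "/branches/{branchName}", "create a new branch"),
    ("PUT", "/patchSets/{patchSetID}/{patchID}", "add patch to patchset"),
    ("PUT", "/branches/{branchName}", "update branch"),
    ("PUT", "/branches/{branchName}/threshold", "set branch threshold"),
    ("PUT", "/branches/{branchName}/status", "set status enabled/disabled"),
    ("DELETE", "/patchSets/{patchSetID}", "delete a patchset (if not processing)"),
    ("DELETE", "/branches/{branchName}", "delete branch from db"),
]


def _compile(template):
    """Turn '/bugs/{bugID}' into an anchored regex with a named group."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    return re.compile("^" + pattern + "$")


def match_route(method, path):
    """Return (description, path parameters) for a request, or None."""
    for route_method, template, description in ROUTES:
        if route_method != method:
            continue
        found = _compile(template).match(path)
        if found:
            return description, found.groupdict()
    return None
```

Because each placeholder matches a single path segment, <code>/branches/{branchName}</code> and <code>/branches/{branchName}/threshold</code> cannot be confused with each other.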
 
==Object Definitions==
Bugs:
* bugID
* patchSets
 
PatchSets:
* patchSetID
* patches
* destination_branch (meaning final branch of autolanding)
* revision (try revision number)
* status (try, destination, complete -- where it is currently in the system)
* creation_time (time added to database/queue)
* push_time (time pushed to try)
* completion_time (time try results received)
 
Patches:
* patchID
* author
* reviewer
* approver
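
The object definitions above could be modelled roughly as follows. This is only a sketch: the field names come from the lists above, but the Python types, the optional fields, and the default status of <code>try</code> are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

# Where a patchset currently is in the system (from the PatchSets list above).
STATUSES = ("try", "destination", "complete")


@dataclass
class Patch:
    patchID: int
    author: str
    reviewer: str
    approver: Optional[str] = None  # assumption: approval may be absent


@dataclass
class PatchSet:
    patchSetID: int
    destination_branch: str              # final branch of autolanding
    status: str = "try"                  # assumption: new patchsets start on try
    revision: Optional[str] = None       # try revision number
    patches: List[Patch] = field(default_factory=list)
    creation_time: Optional[datetime] = None    # time added to database/queue
    push_time: Optional[datetime] = None        # time pushed to try
    completion_time: Optional[datetime] = None  # time try results received

    def __post_init__(self):
        if self.status not in STATUSES:
            raise ValueError("unknown status: %r" % (self.status,))


@dataclass
class Bug:
    bugID: int
    patchSets: List[PatchSet] = field(default_factory=list)
```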
 
=Project Timeline=
* Tracking bug: {{bug|657828}}
 
{| class="fullwidth-table sortable"
|-
| style="background: none repeat scroll 0% 0% rgb(239, 239, 239);" | '''Component Name'''
| style="background: none repeat scroll 0% 0% rgb(239, 239, 239);" | '''Bug(s)'''
| style="background: none repeat scroll 0% 0% rgb(239, 239, 239);" | '''Assigned To'''
| style="background: none repeat scroll 0% 0% rgb(239, 239, 239);" | '''Description'''
| style="background: none repeat scroll 0% 0% rgb(239, 239, 239);" | '''Start Date (est.)'''
| style="background: none repeat scroll 0% 0% rgb(239, 239, 239);" | '''Completion Date (est.)'''
| style="background: none repeat scroll 0% 0% rgb(239, 239, 239);" | '''On Track'''
| style="background: none repeat scroll 0% 0% rgb(239, 239, 239);" | '''Updates'''
|-
| HgPusher
| {{bug|657832}}
| Marc
| [[BugzillaAutoLanding#HgPusher|Details]]
| Monday May 23rd
| Friday June 10th
| style="background: none repeat scroll 0% 0% rgb(255, 165, 0);" | No
| Integrating LDAP usage
|-
| SchedulerDBPoller
| {{bug|430942}}
| Lukas
| [[BugzillaAutoLanding#SchedulerDBPoller|Details]]
| Monday May 23rd
| Friday June 24th
| style="background: none repeat scroll 0% 0% rgb(255, 165, 0);" | No
| Working on a bug with getting change comments to not be blank once talos/tests have been entered for a buildrequest
|-
| BugCommenter
| {{bug|659167}}
| Marc
| [[BugzillaAutoLanding#BugCommenter|Details]]
| Monday June 20th
| Friday July 8th
| style="background: none repeat scroll 0% 0% rgb(144, 238, 144);" | Yes
| style="background: none repeat scroll 0% 0% rgb(144, 238, 144);" | Complete
|-
| AutolandDB
| {{bug|659166}}
| Marc
| [[BugzillaAutoLanding#AutolandDB|Details]]
| Monday June 20th
| Friday July 22nd
| style="background: none repeat scroll 0% 0% rgb(144, 238, 144);" | Yes
|
|-
| MessageQueue
| {{bug|659166}}
| Marc
| [[BugzillaAutoLanding#MessageQueue|Details]]
| Monday June 13th
| Friday July 22nd
| style="background: none repeat scroll 0% 0% rgb(144, 238, 144);" | Yes
| Module complete, needs to be used by all of project.
|-
| LDAP Tool
| {{bug|666860}}
| Marc
| [[BugzillaAutoLanding#LDAPTool|Details]]
| Monday July 11th
| Friday July 29th
| style="background: none repeat scroll 0% 0% rgb(144, 238, 144);" | Yes
| style="background: none repeat scroll 0% 0% rgb(144, 238, 144);" | Complete to fill needs of this project
|}
 
==Testing==
June 20th - July 22nd
* Set up on staging masters as per {{bug|661634}} running the automation against the sandbox Bugzilla with autolanding/logging set to staging repos to test the individual components and watch for issues in the message queue and system.
 
==Deployment==
Go live the week of Aug 8th with logging only, and watch for issues using live data from actual pushes/db
* Monitor load & machine resources/wait times
 
==API==
Write and enable API to give sheriff access to the functionality of this system. Include Kill Switch, ability to override.
 
==Post-Deployment==
* Document the project as much as possible
* File any bugs for enhancements
 
==Component Descriptions and Implementation Notes==
===BugzillaEventTrigger===
A polling script that will pull all bugs from bugzilla with whiteboard tags matching:
[autoland-$branch]
[autoland-$branch:$patchID:$patchID]
 
Default behaviour would be to take [autoland-$branch] and grab all non-obsolete, non r- patches into the patchset that is then pushed to $branch, results returned and analyzed for further actions (like try->mozilla-{central,inbound}).  To override or search for regressions we support adding explicit patchIDs to create a custom patchset for an autoland run.
 
In a discussion with Mconnor, the suggestion for user interaction with autoland is to have radio buttons by each attachment (with a bit of select all/none js) so that you could pick your patches for autoland and have an input field & submit button to send something like:
[autoland-try:1234:2345:3456] (greasemonkey script for picking patches)
[autoland-mozilla-inbound:1234]
to the queue which would signify the final push destination and the attachment ids to use for the patch set. Whether this is something we can do with Bugzilla would need investigating so at first we are implementing this with whiteboard tags & polling with the bugzilla API for these tags.
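
A minimal sketch of parsing the whiteboard tags described above. The regular expression and the function name <code>parse_whiteboard</code> are assumptions made for illustration; in particular, the set of characters allowed in a branch name is a guess.

```python
import re

# [autoland-$branch] or [autoland-$branch:$patchID:$patchID...]
# Assumption: branch names contain only letters, digits, '_', '.' and '-'.
TAG_RE = re.compile(r"\[autoland-([A-Za-z0-9_.-]+)((?::\d+)*)\]")


def parse_whiteboard(whiteboard):
    """Return a list of (branch, [patch IDs]) for every autoland tag found.

    An empty patch ID list means the default behaviour applies: take all
    non-obsolete, non r- patches on the bug.
    """
    requests = []
    for found in TAG_RE.finditer(whiteboard):
        branch = found.group(1)
        patch_ids = [int(p) for p in found.group(2).split(":") if p]
        requests.append((branch, patch_ids))
    return requests
```

For example, <code>parse_whiteboard("[autoland-mozilla-inbound:1234]")</code> yields the branch <code>mozilla-inbound</code> with the single attachment ID 1234.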
 
===HgPusher===
Accepts a branch and patch id(s); it can clone the branch, apply the patches, and report back with the result of the push (success == revision, or FAIL). It can also handle the special case of doing a backout on a branch.
* Input(s): Bugzilla Messages, command line
* Output(s): BugCommenter, hg.mozilla.org, AutolandDB, stdout
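
The clone, apply, and push sequence described above might be sketched like this. It is not the actual HgPusher code: the function names, the <code>workdir</code> default, and the idea of passing in the command runner are all assumptions, and patch files are assumed to have been downloaded from Bugzilla already.

```python
def build_push_plan(branch_url, patch_files, workdir="autoland-clone"):
    """Build the list of hg commands for one autoland push."""
    plan = [["hg", "clone", branch_url, workdir]]
    for patch in patch_files:
        # 'hg import' applies a patch file and commits it.
        plan.append(["hg", "-R", workdir, "import", patch])
    plan.append(["hg", "-R", workdir, "push", branch_url])
    return plan


def run_plan(plan, run, tip_revision):
    """Run each command; return the pushed revision on success, else "FAIL".

    `run` takes a command list and returns its exit code.
    `tip_revision` is a callable returning the revision that was pushed.
    """
    for command in plan:
        if run(command) != 0:
            return "FAIL"
    return tip_revision()
```

In a real deployment <code>run</code> could be <code>subprocess.call</code>; keeping it as a parameter makes the sequence easy to exercise without touching a repository.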
 
===SchedulerDBPoller===
Regularly polls the scheduler DB (on a timer) and checks for completed buildruns for any actively monitored branches.  There should be a list to check against for what branches need to be watched. The incomplete runs from the last N units of time are kept by revision in a local cache file to check for completion.  When an observed branch has a completed buildrun the SchedulerDBPoller can check two things:
* if try syntax is present (and branch is try) check for a --post-to-bug flag and trigger BugCommenter if flag and bug number(s) are present
* run that revision against the AutolandDB to see if that revision was triggered by landing automation.  If yes, then PUT the results & trigger the BugCommenter otherwise ditch the completed revision
 
* Input(s): command line, AutolandDB
* Output(s): BugCommenter, stdout
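
The two checks made for a completed buildrun can be sketched as a single decision function. The source does not spell out the exact form of the <code>--post-to-bug</code> flag, so the pattern below (the flag followed by comma-separated bug numbers) is an assumption, as are the function and action names.

```python
import re

# Assumed flag form: "--post-to-bug 12345" or "--post-to-bug 12345,67890".
POST_TO_BUG = re.compile(r"--post-to-bug\s+(\d+(?:,\d+)*)")


def actions_for_completed_run(branch, comments, revision, autoland_revisions):
    """Decide what to do with a completed buildrun.

    `comments` is the change comment of the push; `autoland_revisions` is the
    set of revisions that AutolandDB says were pushed by the automation.
    """
    actions = []
    # Check 1: try syntax with a post-to-bug flag triggers BugCommenter.
    if branch == "try" and "try:" in comments:
        found = POST_TO_BUG.search(comments)
        if found:
            for bug in found.group(1).split(","):
                actions.append(("comment", int(bug)))
    # Check 2: revisions pushed by autoland get their results stored
    # and posted; anything else is dropped.
    if revision in autoland_revisions:
        actions.append(("put_results", revision))
        actions.append(("comment_autoland", revision))
    return actions
```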
 
===BugCommenter===
When called with a bug number and a comment, posts to the bug and returns the commentID as well as a result (SUCCESS/FAIL) -- should handle a few retries in case of network issues -- write to a log file that is watched by Nagios?
* Input(s): HgPusher, SchedulerDBPoller
* Output(s): Bugzilla, stdout
 
Note: Let's have a couple of template options here for what is posted to the bug depending on if it's a branch or try
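
A minimal sketch of the retry behaviour described above, assuming the actual Bugzilla call is supplied as a function that returns a comment ID and raises <code>IOError</code> on network trouble. The name <code>post_with_retries</code> and its parameters are hypothetical.

```python
import time


def post_with_retries(post, bug_id, comment, retries=3, delay=0.0):
    """Try to post a comment a few times.

    `post(bug_id, comment)` is expected to return the new comment ID or
    raise IOError on a network problem. Returns (result, commentID).
    """
    for _attempt in range(retries):
        try:
            return ("SUCCESS", post(bug_id, comment))
        except IOError:
            time.sleep(delay)  # wait before the next attempt
    return ("FAIL", None)
```

A caller would log the <code>FAIL</code> case, which is where the Nagios-watched log file mentioned above could come in.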
 
===MessageQueue===
'''To-Do''' - Read up on [http://www.rabbitmq.com/ RabbitMQ] and [http://en.wikipedia.org/wiki/Message_queue Message Queue]
Listens to messages from BugzillaScraper (or Pulse Events?) and broadcasts information to HgPusher, BugCommenter, and AutolandDB.
* Goal here would be to integrate with existing RMQ in build infrastructure, be able to deploy the components on any masters in the build network to share load and for them to be able to send/receive messages via RMQ to see an autoland cycle through to completion
 
===AutolandDB===
Keep track of the state of an autoland-triggered push from start to finish.
 
===LDAPTool===
* Checks the hg permission level of the submitter
* Compares the Bugzilla email to the LDAP email
 
=Notes=
* Set of tools which automate each step of this work and we can 'on-demand' turn off any of the tools
* Sheriff needs to be able to turn off the auto-landing altogether
* No limit in workflow -- sheriff should be able to override the queue to auto-land a priority patch
** can have oranges, but sheriff could 'force' the landing anyway
* We need to have stages:  
** (TBD): how to deal with bugzilla
** '''HgPusher''': deals with the hg stuff (merging patches, grabbing patches from bugs, what happens if any steps fail)
** '''SchedulerDB Poller''': deals with a way of getting results from try/m-c to make sure we have the data on whether to proceed and/or leave comment in the bug regarding outcomes
** (fourth, optional/desired/future) how to monitor perf results and get an idea of when it's ok to push or flag on those results
* Be strict in accepting things from humans
** Rules for automated landings
** If they forget a rule, they get a comment - "Step X failed (reason)"
** For bugs with multiple patches - to land all together, look at non-obsolete patches which have been reviewed in alphabetical order
** Common for devs to name things "part 1", "part 2"
** Don't need to be lenient toward human mistakes
** Need correct descriptions, author information, header of patch -- if it doesn't include a header, try syntax
** Look at "checkin-needed" box, grab all non-obsolete, and if they have the right message then push all those - if all those steps succeed then you get the try push otherwise fail with comment -- clear "checkin-needed" from the bug
* Is the bug the right place? It's the public record for the issue, so yes - emailing the assignee takes the information away from the record
* Specialized tool for patch queues would be nice (not in the scope of this project!)
* Try syntax presence == try on try_repo but NEVER auto-land on trunk
* Developers will be encouraged to watch perf results on trunk, since try perf is not really useful information
* Bot has to watch m-c after it lands something there, results come back to the bug
** Merge tracking for commits (what was auto-landed and what was not)
** Merging between two auto-landed pushes
** We have to watch these to know when all the results are in, and what was successful/not (scheduler db has this information for us to get)
* Autoland Message Queue - consumes information from BugzillaScraper (pulse?) as event triggerer for HgPusher, BugCommenter
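
The rule above for bugs with multiple patches (non-obsolete, reviewed, in alphabetical order) could be sketched as follows. The dictionary keys, the reading of "reviewed" as a granted review (<code>+</code>), and sorting by attachment description are all assumptions made for illustration.

```python
def select_patches(attachments):
    """Pick the patches to land together, in landing order.

    Each attachment is a dict with the assumed keys:
      "description" (str), "is_obsolete" (bool), "review" ("+", "-" or None).
    """
    eligible = [
        a for a in attachments
        if not a["is_obsolete"] and a["review"] == "+"
    ]
    # Alphabetical order by description, so "part 1" lands before "part 2".
    return sorted(eligible, key=lambda a: a["description"])
```

Note that a plain alphabetical sort puts "part 10" before "part 2", which is one reason to be strict about the naming rules developers must follow.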
==Security==
* Must ensure that the patch is attached by someone with L1 hg access, so that we are auto-landing patches from authors with the same level of security as current push-to-try
** Why L1?  This is about pushing to mozilla-central, so I think we should check for L3 access.
** L1 to push to try, then reviewer should have L3 for autolanding to trunk
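
If the proposal in the last bullet were adopted, the check might look roughly like this. It is a sketch of one reading of the discussion above, not a settled policy; the function name and the numeric encoding of access levels are assumptions.

```python
def may_autoland(destination, author_level, reviewer_level):
    """Return True if a push to `destination` is allowed.

    Levels are hg access levels as integers (1 for L1, 3 for L3).
    Assumed policy: the patch author needs L1 for any push, and the
    reviewer needs L3 when the destination is anything other than try.
    """
    if author_level < 1:
        return False
    if destination == "try":
        return True
    return reviewer_level >= 3
```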
=Deployment Coordination=
Talk with Amy and Zandr about how to deploy this without Single Point of Failure and also with consideration for load & resources. We want to have a production-level system here so what will that require?
==Setting up the staging masters==
On autoland-staging01 I have done the following:
* sudo yum install hg
* checked out my [http://hg.mozilla.org/users/lsblakk_mozilla.com/tools tools repo]
* Installed Python 2.6.7 from [http://www.python.org/ftp/python/2.6.7/Python-2.6.7.tgz source]
** ./configure, make (yum install zlib-devel, readline-devel, bzip2-devel), make install
** added /usr/local/bin to the PATH in .bashrc to use 2.6 as default Python
* Set up a virtualenv using [https://wiki.mozilla.org/ReleaseEngineering/Virtualenv these notes]
* in the virtualenv installed:
** argparse
** simplejson
** sqlalchemy
** mysql-python
* Also required (but not currently installed):
** [http://www.python-ldap.org/ python-ldap] for ldap utils
** [http://pika.github.com/ pika] for rabbitmq communication with python
** mock for tests

Latest revision as of 01:00, 25 October 2018
