User:Zeller


Ship It

Bug 826753 - release automation should update ship it at certain points

Notes

  • /status – Lists all releases that have status data, in JSON format; can be queried with parameters (e.g. /status?var1=1&var2=10). Analogous to /releases
  • /statuses.html – Pretty GUI view of the info shown in /status. Analogous to /releases.html
  • /status/<release-name>
    • GET: Lists release status info in JSON. Analogous to /releases/<release-name>
    • POST: covered in Bug 1032985
  • /status/<release-name>.html – Pretty GUI view of the info shown in /status/<release-name> GET, possibly including a visual timeline of the 6 steps: Tagging completed, All builds/repacks completed, Updates on betatest, Ready for releasetest, Ready for release and Postrelease. (status steps taken from Bug 826753 comment 0)
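
The six-step timeline could be derived from the set of steps a release has already reached. A minimal sketch follows, assuming a hypothetical helper timeline() and a hypothetical response layout; only the step names come from Bug 826753 comment 0, and none of this is the actual Ship It schema.

```python
# The six status steps, in order (names from Bug 826753 comment 0).
STATUS_STEPS = [
    "Tagging completed",
    "All builds/repacks completed",
    "Updates on betatest",
    "Ready for releasetest",
    "Ready for release",
    "Postrelease",
]


def timeline(completed):
    """Return per-step state and overall progress for one release.

    `completed` is any collection of step names already reached. The
    returned dict layout is hypothetical, not the real /status schema.
    """
    steps = [{"step": s, "done": s in completed} for s in STATUS_STEPS]
    done = sum(1 for s in steps if s["done"])
    # A single float in [0, 1] can drive a progress bar in the HTML view.
    return {"steps": steps, "progress": done / len(STATUS_STEPS)}
```

A page template could loop over steps to draw the timeline and use progress for the bar width.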

Questions

  • Why does /release 404?
  • Should /statuses be available through a 3rd tab (i.e. Reviewed/Completed/Statuses), or through a new "view statuses" link?
  • Does a ViewStatuses() function need to be added to kickoff.js, in addition to the ViewReleases() function?

To Do

  • Fix line 331 in model.py to sort the product properly
  • When RelMan hits 'go', add proper row to database
  • Make system diagram for interactions in shipit additions
  • Make /status GET queryable (with parameters other than var1 and var2). Other release tables use the columns ready and complete.
  • Run a JOIN via the name column between release_status and *_releases, and print that JSON info upon /status/<release-name> GET request.
  • Take progress bars from bootstrap rather than change the entire css
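
The JOIN item above could look roughly like the following. This is a sketch under stated assumptions: the three table names in RELEASE_TABLES and the simplified columns are hypothetical stand-ins for the real *_releases schema, and sqlite3 is used only to keep the example self-contained. A UNION ALL presents the three tables as one so that a single JOIN on name is enough.

```python
import json
import sqlite3

# Hypothetical names standing in for the three *_releases tables.
RELEASE_TABLES = ("firefox_releases", "fennec_releases", "thunderbird_releases")


def make_demo_db():
    """Build an in-memory database with an assumed, simplified schema."""
    db = sqlite3.connect(":memory:")
    for table in RELEASE_TABLES:
        db.execute(
            "CREATE TABLE %s (name TEXT PRIMARY KEY, ready INTEGER, complete INTEGER)"
            % table
        )
    db.execute("CREATE TABLE release_status (name TEXT, event_name TEXT, sent TEXT)")
    return db


def status_json(db, name):
    """Join release_status to whichever *_releases table holds `name`."""
    union = " UNION ALL ".join(
        "SELECT name, ready, complete FROM %s" % t for t in RELEASE_TABLES
    )
    rows = db.execute(
        "SELECT s.name, s.event_name, s.sent, r.ready, r.complete "
        "FROM release_status AS s JOIN (%s) AS r ON r.name = s.name "
        "WHERE s.name = ? ORDER BY s.sent" % union,
        (name,),
    ).fetchall()
    keys = ("name", "event_name", "sent", "ready", "complete")
    return json.dumps([dict(zip(keys, row)) for row in rows])
```

With this shape, an unknown release name yields an empty JSON list rather than an error.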

Bug 1032985 - Add REST API entry point to shipit that allows shipit-agent to enter release data into shipit database

https://bugzilla.mozilla.org/show_bug.cgi?id=1032985#c2

Add REST API entry point for updating update.db at /status/<release-name> when POST

Notes

  • Use name field in release_status as an unofficial foreign key to the 3 *_releases tables (Can't use JOIN for 3 primary keys (all name) in the *_releases tables)

Questions

  • How exactly should I divide the release pulse messages into the 6 statuses: Tagging completed, All builds/repacks completed, Updates on betatest, Ready for releasetest, Ready for release and Postrelease?

To Do

  • Have StatusAPI update the *_releases database tables, and check that name is in release_data
  • Check whether name already exists in the *_releases tables, and that name is in release_data, before adding to release_status; use this as the switch between creating a new row and updating an existing one.
  • DATETIME stamps must not contain the T separator or a UTC offset such as +01:00
  • Sort data into the 6 statuses
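
The create-versus-update switch described above can be sketched as follows. The single releases table is a hypothetical stand-in for the three *_releases tables, the columns are assumed, and record_event() is an illustrative name rather than anything in StatusAPI.

```python
import sqlite3


def record_event(db, name, event_name, sent):
    """Insert a status row, or update it when (name, event_name) exists.

    Returns "ignored", "created" or "updated". Table and column names
    are assumptions made for this sketch.
    """
    known = db.execute("SELECT 1 FROM releases WHERE name = ?", (name,)).fetchone()
    if known is None:
        return "ignored"  # name was never submitted as a release
    exists = db.execute(
        "SELECT 1 FROM release_status WHERE name = ? AND event_name = ?",
        (name, event_name),
    ).fetchone()
    if exists:
        db.execute(
            "UPDATE release_status SET sent = ? WHERE name = ? AND event_name = ?",
            (sent, name, event_name),
        )
        return "updated"
    db.execute(
        "INSERT INTO release_status (name, event_name, sent) VALUES (?, ?, ?)",
        (name, event_name, sent),
    )
    return "created"
```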

Bug 1032978 - Add a standalone process that listens to pulse for release related buildbot messages

Long-running standalone script (shipit-agent.py) that listens to pulse and uses /status/<release-name> with POST to update update.db for <release-name> status

Notes

  • pulse.m.o gives buildbot events
    • Look for build.release-*.finished (Found in ['_meta']['routing_key']?)
    • Use FF candidate buildlogs to help determine the names of the builder I am interested in (ie the first part before "bmNN" is the name of the builder)
  • If collecting release_data, the table will gain ~0.5 MB/release, which I estimate makes it ~117 MB/year.
  • If version = "None", then routing_key doesn't have build.release-*
  • Some pulse fields are unnecessary, such as 'exchange', and 'serializer'
  • Nothing says *.finished
  • Ignore the following messages:
    • Any routing_key that doesn't match build.release-* (including unittest.release-*, talos.release-*, etc.); they are tests we don't usually look at.
    • Any version or build_number that is None... there was a bug for that, they are test related. These also seem to only show up on release messages that don't match build.release-*, like unittest.release-*, talos.release-*, etc.
    • Any xulrunner jobs, they are implicitly run by Firefox
  • Examples (https://bugzilla.mozilla.org/show_bug.cgi?id=1032985#c2) given by rail on how to sort pulse messages into the 6 statuses
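
The ignore rules above can be expressed as a single predicate. This is a sketch, not the agent's actual code: it assumes each message is a dict with '_meta' and 'payload' keys and that version, build_number and product live under 'payload'. Only ['_meta']['routing_key'] comes from these notes, and is_interesting() is a hypothetical name.

```python
from fnmatch import fnmatchcase


def is_interesting(message):
    """Return True if a pulse message should be forwarded to Ship It."""
    routing_key = message.get("_meta", {}).get("routing_key", "")
    if not fnmatchcase(routing_key, "build.release-*"):
        return False  # drops unittest.release-*, talos.release-*, etc.
    payload = message.get("payload", {})
    for field in ("version", "build_number"):
        # Treat both a real None and the string "None" as missing.
        if payload.get(field) in (None, "None"):
            return False
    if payload.get("product") == "xulrunner":
        return False  # xulrunner jobs are run implicitly by Firefox
    return True
```

Filtering here, before any POST is made, keeps the REST API free of the None-handling logic.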

Questions

  • What to do with release messages that don't fit the expected pattern:
    • Have a product other than "firefox" when supposedly it's a firefox build (i.e. "mobile" or "xulrunner")
  • Did I scrape all of Firefox-31.0b5-build1 or am I missing something?
  • Should I look at routing_key, or key, or?
  • Do we want to keep all buildbot release pulse data, to be able to use it later to deduce new things, or forgo all that entirely? (feature creep)
  • Should protocol be configurable via puppet or with command line option?
  • Is it possible to use the topic option with pulse to listen to only build.release-* messages?
  • Should unique_label have a set name?
  • Should I be using a different consumer like BuildConsumer? NormalizedBuildConsumer will miss some builds, pulsetranslator has a hardcoded list of recognized platforms, and will drop others. http://hg.mozilla.org/automation/mozillapulse/file/default/mozillapulse/consumers.py
    • Dump out the full message stream for a while until you see one of the ones you care about.
  • Should we send the concat'd data instead of payload, and have the common_keys listed in the REST API and hold all the logic for the None's and such there?
    • No because the API should only be hit with messages we care about, and so should be filtered before ever propagating a POST request to the REST API

To Do

  • Give rail an example of the product "mobile" showing up when I expect "firefox"
  • Run with supervisord if supposed to be run in the foreground
  • Read credentials from a file deployed by puppet, rather than relying on .netrc for username and password
  • Make protocol configurable via puppet or with command line option
  • Use selfserve-agent.py as the template for this
  • Mark it as durable

Bug 1032975 - Add new table(s) to shipit database

New table(s) in update.db

Notes

Questions

  • Should we have 2 tables (release_status and release_data), or just 1 (release_status)?
  • Is the release_status schema adequate?

To Do

  • Give rail the estimate for how much data will be added to the release_data table
  • Show sheeri the schema for help in optimizing columns, and to get other feedback once ready
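
For the size estimate, the two figures noted under Bug 1032978 (~0.5 MB per release and ~117 MB per year) imply a release count that can be checked directly. The helper names below are illustrative only, and both constants are rough estimates from these notes rather than measurements.

```python
MB_PER_RELEASE = 0.5   # rough estimate from the pulse notes
MB_PER_YEAR = 117.0    # rough estimate from the pulse notes


def implied_releases_per_year(mb_per_year=MB_PER_YEAR, mb_per_release=MB_PER_RELEASE):
    """How many releases per year the two estimates assume together."""
    return mb_per_year / mb_per_release


def table_growth_mb(releases, mb_per_release=MB_PER_RELEASE):
    """Projected release_data growth for a given number of releases."""
    return releases * mb_per_release
```

Dividing 117 by 0.5 gives 234, so the yearly figure corresponds to roughly 234 releases per year at the assumed size.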

Problems with Bug 1032978

  • When running the test producer and consumer, the following error occurs when the API is called:
(build-tools)localhost:build-tools jzeller$ python buildfarm/release/shipit-notifier.py -c buildfarm/release/shipit-notifier.ini.example 
2014-08-11 14:37:37,400 listening for pulse messages
2014-08-11 14:37:37,427 Start from server, version: 0.9, properties: {u'information': u'Licensed under the MPL.  See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2011 VMware, Inc.', u'capabilities': {u'exchange_exchange_bindings': True, u'consumer_cancel_notify': True, u'publisher_confirms': True, u'basic.nack': True}, u'platform': u'Erlang/OTP', u'version': u'2.6.1'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
2014-08-11 14:37:37,440 Open OK!
2014-08-11 14:37:37,440 using channel_id: 1
2014-08-11 14:37:37,446 Channel open
2014-08-11 14:37:39,794 msg received - release-mosilla-beta-firefox_tag
2014-08-11 14:37:39,794 adding new release event for Firefox-31.0b1-build1 with event_name release-mosilla-beta-firefox_tag
Traceback (most recent call last):
  File "buildfarm/release/shipit-notifier.py", line 115, in <module>
    main()
  File "buildfarm/release/shipit-notifier.py", line 112, in main
    pulse.listen()
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/mozillapulse/consumers.py", line 148, in listen
    self.connection.drain_events()
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/kombu/connection.py", line 275, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 91, in drain_events
    return connection.drain_events(**kwargs)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/amqp/connection.py", line 325, in drain_events
    return amqp_method(channel, args, content)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/amqp/channel.py", line 1908, in _basic_deliver
    fun(msg)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/kombu/messaging.py", line 592, in _receive_callback
    return on_m(message) if on_m else self.receive(decoded, message)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/kombu/messaging.py", line 559, in receive
    [callback(body, message) for callback in callbacks]
  File "buildfarm/release/shipit-notifier.py", line 91, in got_message
    receive_message(config, *args, **kwargs)
  File "buildfarm/release/shipit-notifier.py", line 74, in receive_message
    status_api.update(name, data=payload)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/buildtools-1.0.4-py2.7.egg/kickoff/api.py", line 117, in update
    url_template_vars={'name': name}).content
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/buildtools-1.0.4-py2.7.egg/kickoff/api.py", line 47, in request
    auth=self.auth)
TypeError: request() got an unexpected keyword argument 'config'
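  • After deleting config=self.config at the call site from the previous issue (api.py line 47), the same error is raised from a second call site (api.py line 59, via retry):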
(build-tools)localhost:build-tools jzeller$ python buildfarm/release/shipit-notifier.py -c buildfarm/release/shipit-notifier.ini.example 
2014-08-11 14:54:01,065 listening for pulse messages
2014-08-11 14:54:01,095 Start from server, version: 0.9, properties: {u'information': u'Licensed under the MPL.  See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2011 VMware, Inc.', u'capabilities': {u'exchange_exchange_bindings': True, u'consumer_cancel_notify': True, u'publisher_confirms': True, u'basic.nack': True}, u'platform': u'Erlang/OTP', u'version': u'2.6.1'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
2014-08-11 14:54:01,106 Open OK!
2014-08-11 14:54:01,107 using channel_id: 1
2014-08-11 14:54:01,112 Channel open
2014-08-11 14:54:05,748 msg received - release-mosilla-beta-firefox_tag
2014-08-11 14:54:05,748 adding new release event for Firefox-31.0b1-build1 with event_name release-mosilla-beta-firefox_tag
2014-08-11 14:54:05,775 Starting new HTTP connection (1): 127.0.0.1
2014-08-11 14:54:05,778 "HEAD /csrf_token HTTP/1.1" 200 0
2014-08-11 14:54:05,779 Request to http://127.0.0.1:5000/status/Firefox-31.0b1-build1
2014-08-11 14:54:05,779 Data sent: {'csrf_token': '20140811222405##82b09f99876eead53a695ae607c656733e2d4202', 'data': {u'event_name': u'release-mosilla-beta-firefox_tag', u'group': u'tag', u'results': 0, u'sent': u'2014-08-11T14:54:05+01:00'}}
2014-08-11 14:54:05,779 retry: Calling <bound method Session.request of <requests.sessions.Session object at 0x1077a85d0>> with args: (), kwargs: {'params': None, 'timeout': 60, 'url': u'http://127.0.0.1:5000/status/Firefox-31.0b1-build1', 'config': {'danger_mode': True}, 'data': {'csrf_token': '20140811222405##82b09f99876eead53a695ae607c656733e2d4202', 'data': {u'event_name': u'release-mosilla-beta-firefox_tag', u'group': u'tag', u'results': 0, u'sent': u'2014-08-11T14:54:05+01:00'}}, 'method': 'POST', 'auth': ('admin', 'password')}, attempt #1
Traceback (most recent call last):
  File "buildfarm/release/shipit-notifier.py", line 115, in <module>
    main()
  File "buildfarm/release/shipit-notifier.py", line 112, in main
    pulse.listen()
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/mozillapulse/consumers.py", line 148, in listen
    self.connection.drain_events()
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/kombu/connection.py", line 275, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 91, in drain_events
    return connection.drain_events(**kwargs)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/amqp/connection.py", line 325, in drain_events
    return amqp_method(channel, args, content)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/amqp/channel.py", line 1908, in _basic_deliver
    fun(msg)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/kombu/messaging.py", line 592, in _receive_callback
    return on_m(message) if on_m else self.receive(decoded, message)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/kombu/messaging.py", line 559, in receive
    [callback(body, message) for callback in callbacks]
  File "buildfarm/release/shipit-notifier.py", line 91, in got_message
    receive_message(config, *args, **kwargs)
  File "buildfarm/release/shipit-notifier.py", line 74, in receive_message
    status_api.update(name, data=payload)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/buildtools-1.0.4-py2.7.egg/kickoff/api.py", line 117, in update
    url_template_vars={'name': name}).content
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/buildtools-1.0.4-py2.7.egg/kickoff/api.py", line 59, in request
    auth=self.auth, params=params)
  File "/Users/jzeller/Mozilla/build-tools/lib/python2.7/site-packages/buildtools-1.0.4-py2.7.egg/util/retry.py", line 32, in retry
    return action(*args, **kwargs)
TypeError: request() got an unexpected keyword argument 'config'
  • May need to add default=None to the platform Column in model.py
  • Any reason to have a default set on results? How about for anything nullable?
  • Should we add enUSPlatforms to the ReleaseForm object?
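
Both tracebacks above end in the same TypeError: a config keyword reaches a request() callable that does not accept it. One defensive option is to drop keyword arguments the target does not declare. In this sketch fake_request is a stand-in, not the real requests API, and call_with_supported_kwargs is a hypothetical helper.

```python
import inspect


def call_with_supported_kwargs(func, **kwargs):
    """Call func, dropping keyword arguments its signature does not accept."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(**kwargs)  # func takes **kwargs, so pass everything
    accepted = {k: v for k, v in kwargs.items() if k in params}
    return func(**accepted)


def fake_request(method, url, data=None, auth=None, timeout=None):
    """Stand-in for an HTTP client call that has no `config` parameter."""
    return (method, url, timeout)
```

Removing the config= argument at both call sites in kickoff/api.py is the more direct fix; a filter like this only helps when the same code must run against client versions with different signatures.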

Testing Staging-Release

  1. Edited config/mgerva.ini and saved as config/config.ini
  2. python repos_setup.py -c config/config.ini -b 1032978 -v 100 -r fennec -u jozeller
  3. scp -r staging-release/ jozeller@dev-master1.srv.releng.scl3.mozilla.com:
  4. python staging_setup.py -c config/config.ini -b 1032978 -v 100 -r fennec -u jozeller
  5. screen
  6. ctrl+a A "bash/SHIPIT"
    • cd /tmp/staging/release-kickoff
    • ./shipit.sh
  7. ctrl+a A "bash/BUILDBOT"
    • cd /tmp/staging/staging/
    • make start
  8. ctrl+a A "bash/RELEASERUNNER"
    • cd /tmp/staging/
    • ./release-runner.sh
      • Ran into some trouble so had to make the following changes
        • cp ~/config ~/.ssh/config; chmod 400 ~/.ssh/config

Questions

  • Can only store events with results=0 because otherwise (name, event_name) has duplicates when results!=0
  • status/<release_name> GET could send back a dict like {'Firefox-31.0b5-build1': [...]} rather than {'name': 'Firefox-31.0b5', 'data': [...]}
  • Do we want to go ahead and change all 'status' names to 'events'? i.e. status.html would be events.html
    • For some, but colloquially 'status' makes more sense than 'events' for user-facing things
  • Make sure that repack_complete event messages are being counted as True even after all chunks are counted.
    • Simply counting chunks is not enough, this should still pass a True even without chunks
  • When submitting a build, when choosing ready, the page reloads to show the new build in the Reviewed tab, but none of the Running tab populates. It requires refreshing the entire page to see.
  • Include static copy of bootstrap?
    • Perhaps copy just the snippets I would like
  • Top level keys complete and progress could be 1 float key
  • When using a nonexistent releaseName for status/<releaseName> the GET does not fail, but instead sends a normal JSON response with an empty events value
  • Having to split the '+01:00' from sent on pulse messages in order to parse into datetime object for entry into the database. Better way? dateutil is not installed, so that would change dependencies, but from dateutil.parser import parse works great!
    • Going to just drop the +01:00 for now, but in the future it could be applied by running sent = datetime.strptime('2014-08-08T16:12:59', "%Y-%m-%dT%H:%M:%S") + timedelta(hours=1) (a bare timedelta(+1) would add one day, not one hour)
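
The offset handling discussed above can be done with the standard library alone, avoiding a dateutil dependency. This is a sketch under assumptions: it expects the 'YYYY-MM-DDTHH:MM:SS+HH:MM' shape seen in the logged sent values, and converting to UTC before storing is a design choice made here, not something these notes decide. Both function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_sent(sent):
    """Split 'YYYY-MM-DDTHH:MM:SS+HH:MM' into (naive local time, UTC offset)."""
    stamp, sign, offset = sent[:19], sent[19:20], sent[20:]
    local = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
    if sign not in ("+", "-"):
        return local, timedelta(0)  # no numeric offset present
    hours, minutes = offset.split(":")
    delta = timedelta(hours=int(hours), minutes=int(minutes))
    return local, (delta if sign == "+" else -delta)


def to_db_string(sent):
    """Format for a DATETIME column: no 'T', no offset, converted to UTC."""
    local, offset = parse_sent(sent)
    # Local time minus its UTC offset gives UTC.
    return (local - offset).strftime("%Y-%m-%d %H:%M:%S")
```

For example, to_db_string('2014-08-11T14:54:05+01:00') returns '2014-08-11 13:54:05'.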