QA/Execution/Web Testing/Release Checklist

From MozillaWiki

(rough draft)

New Sites

  1. Has the site been staged on a sanctioned IT server?
  2. Was there enough data, and the right kind of data, to test fixes/features?
  3. Has QA verified as many bugs as possible?

Staging Sites / Pre-release

  1. For the remaining Resolved FIXED bugs, have the project owners signed off on the potential risks/unknowns?
  2. Has security reviewed/tested the app/site/new features?
  3. Was any load or performance testing needed and, if so, was it done?
  4. Are there SQL/data-migration scripts to be run? Have they been run on staging?
  5. Are staging and production using the same load-balancers/caching infrastructure (Zeus/Netscaler)?
  6. Do the MIME types match between staging and production?
  7. Code-wise, do (or will) staging and production match at the latest repository revision? Compare revision numbers, e.g. with svn info, or compare the commit hash of the deployed branch for repositories on GitHub.
  8. Is the push on IT's radar, with a set push time/outage window (if needed), and are all involved parties aware? (Hopefully all are by this step!)
  9. Is all of the above pertinent info in the push bug, with a chronological sequence of steps to follow (installation of packages, enabling of services, etc.)?
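
The MIME-type and revision checks above (items 6 and 7) both come down to comparing a set of values between staging and production. Below is a minimal Python sketch of that comparison, not a sanctioned tool. The function names `fetch_content_types` and `find_mismatches` are hypothetical, and the sketch assumes both sites answer HEAD requests for the paths being checked.

```python
from urllib.request import Request, urlopen


def fetch_content_types(base_url, paths, timeout=10):
    """Return {path: Content-Type header} for each path, using HEAD requests.

    base_url and paths are supplied by the caller; nothing here is
    specific to any real staging or production host.
    """
    result = {}
    for path in paths:
        req = Request(base_url.rstrip("/") + path, method="HEAD")
        with urlopen(req, timeout=timeout) as resp:
            result[path] = resp.headers.get("Content-Type")
    return result


def find_mismatches(staging, production):
    """Return {key: (staging_value, production_value)} where the two differ.

    A key that is missing on one side is reported with None for that side.
    Works for any pair of dicts: MIME types per path, revision per repo, etc.
    """
    mismatches = {}
    for key in sorted(set(staging) | set(production)):
        s, p = staging.get(key), production.get(key)
        if s != p:
            mismatches[key] = (s, p)
    return mismatches
```

As a usage example, calling `fetch_content_types` once with a staging base URL and once with a production base URL (both placeholders you supply), then passing the two results to `find_mismatches`, yields an empty dict when the MIME types agree. The same `find_mismatches` call can compare two dicts of repository name to revision number or commit hash gathered by hand.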

Push / Release

  1. Do the slave/master and data-replication architectures/setup match?
  2. Is Nagios/heartbeat monitoring enabled on staging/prod?
  3. Is Puppet/configuration management enabled/working correctly?
  4. For a pre-existing site, do the system libraries/packages match version-wise between staging and prod?
    1. (Right branch/tag?)
  5. Is there a current and verified backup of all relevant (user/app) data, in case of a rollback?
  6. Is there a read-only mode, and does it need to be enabled during the push?
    1. If so, does there need to be a user-facing message indicating so?
  7. If there's no read-only mode, is there an outage page set up in advance, ready to go?
  8. Are all redirects properly set up/synched?
  9. Are all necessary cron jobs set up and running?
  10. Are all services running/modules enabled (e.g. SMTP, Celery, Memcached, Redis)?
  11. Is error logging (e.g. tracebacks) enabled on all relevant services (Apache, app, database, etc.), with the right permissions, and logging correctly?
  12. Are any other releases/upgrades/outages planned, which might impact this one?
  13. If the push exceeds the downtime window, have the right people/aliases been contacted with a reasonable ETA for recovery/fix?
  14. Is there a documented contingency plan?
    1. Who makes the call to roll back?
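
The package-version check above (item 4) can be sketched as a small Python comparison of two package listings. This is an illustrative sketch only. It assumes each listing is plain text with one "name version" pair per line, which is an assumption about how the lists are exported from each host (for example via a package manager's query command). The function names `parse_package_list` and `compare_packages` are hypothetical.

```python
def parse_package_list(text):
    """Parse 'name version' lines into {name: version}; blank lines are skipped."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only: the name comes first, the rest is the version.
        name, _, version = line.partition(" ")
        packages[name] = version.strip()
    return packages


def compare_packages(staging_text, prod_text):
    """Report packages that are missing on one side or differ in version."""
    staging = parse_package_list(staging_text)
    prod = parse_package_list(prod_text)
    report = {"only_on_staging": [], "only_on_prod": [], "version_differs": []}
    for name in sorted(set(staging) | set(prod)):
        if name not in prod:
            report["only_on_staging"].append(name)
        elif name not in staging:
            report["only_on_prod"].append(name)
        elif staging[name] != prod[name]:
            report["version_differs"].append((name, staging[name], prod[name]))
    return report
```

If all three lists in the report are empty, the two hosts agree on every listed package. Anything else is worth noting in the push bug before the release goes ahead.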

Notes / Next Steps

  • Split the tasks / checks up by teams
  • Talk to Mike Alexis about improving the project initiation form
  • Development / QA comes up with a checklist for each project (assigns owners)
  • Implement what we have in AMO / Socorro for other projects
  • Pre-release meeting, toss out the steps the project doesn't need