QA/BrowserID/OPsBuildOut

Back To ==> Mozilla BT Services

Intro

This is the main QA site for the BrowserID OPs Build-Out. All tasks, projects, status, and other important information related to the Build-Out will be listed here and updated frequently.

  • Overall/Master Schedule (details): to be added
  • Build-Out Schedule for each of the four environments: November/December
  • Deployment Schedule for each of the four environments: November/December
  • Load Test Schedule for each of the four environments: November/December
  • QA Test Schedule for each of the four environments: November/December

QA

Task/Project Owner Status Notes, Links, Tickets, Issues
Track Test/CI build-out jbonacci Planned
Track new "cross-team env" build-out jbonacci Planned
  • This will be a VM environment like old Beta
  • This VM env is to be shared by QA teams for BrowserID, AMO, Soup, etc.


Stability testing Test/CI env QA Planned
  • Need input from Dev and OPs
Stability testing new "cross-team env" QA Planned
  • Need input from Dev and OPs
  • This will be a VM environment like old Beta
Track OPs BrowserID build-out planning and HW orders jbonacci completed
  • see Bug 697477 - fixed
BrowserID arch review jbonacci completed
Track RPM and deployment work in Dev jbonacci completed
  • see Bug 695596
Track Dev build-out jbonacci completed
  • see Bug 695957 - fixed
Track Stage build-out jbonacci completed
  • see Bugs 695941 (fixed), 695944 (fixed), 703497
Track Prod build-out jbonacci completed
  • see Bugs 695941 (fixed), 695943, 695946, 695955, 695956 (fixed), 695960, 695972, 698044, 703497
Track DEV load test development jbonacci completed
  • see Issue 504
Track OPs load test deployment jbonacci completed
  • see Bug 695952
Track OPs dashboard/monitoring work jbonacci completed
  • see Bugs 695961, 695963, 695967
Unit test tutorial/wiki jbonacci completed
Stability testing Dev env QA completed
Stability testing Stage env QA completed
Track Load testing Stage env QA/OPs completed
  • none
Stability testing Prod env QA completed
  • none
Load testing Prod env QA/Dev/OPs completed
  • none
QA node/npm "easy" installs jbonacci completed
Discussion of Android compatibility for BrowserID QA/Dev completed
  • 2.2, 2.3, 3.x and newer vs. supporting older versions
  • By share of users, supporting 2.2 and newer seems OK
Final QA wiki updates jbonacci completed



Dev

Task/Project Owner Status Notes, Links, Tickets, Issues
Initial plan for Scaling to 1 million users Dev team completed
Metrics dashboard Dev team completed
  • see Bug 690107
Maintain "all-in-one" BrowserID lloyd completed
  • Git and RPM methods are always available
GitHub repository re-organization lloyd completed
  • Issue 503: improve repository organization
Better logging lloyd completed
  • GitHub Issues 119, 537, 541, 536, 530, 529, etc.
  • Dev needs to work with OPs to get *-error logs, archiving, and log rotation
Process splitting: browserid, verifier, keysigner, dbwriter lloyd completed
  • GitHub Issue 460
Second client (dev.)(beta.)myfavoritebooze.org lloyd completed
  • support 'keep me signed in' feature
Developer Engagement Plan Dev/PM teams completed
Crypto discussions and work Dev/Security teams completed
  • GitHub issues: various
Work with OPs: RPM creation for deployment lloyd/petef completed
  • Bug 695956 - fixed
Work with OPs: Dev deployment lloyd/petef completed
  • Bug 695957 - fixed
Work with OPs: initial Dev test lloyd/petef completed
  • See bullets below
Work with OPs: code for Stage lloyd/petef completed
  • Important open issues for Dev and Stage:
  • 504: repair load_gen
  • 560: move bcrypt to webhead
  • 561: handle mysql master failover in dbwriter process
  • 566: heartbeat should not check backend servers
  • 582: improved file based configuration
Work with OPs: load test code and deploy lloyd/petef completed
Update GitHub Scaling to 1M lloyd completed
Use of node.js v0.6.x lloyd, benadida completed
  • none
Unit tests, headless front-end tests stomlinson, benadida completed
  • none


OPs

Task/Project Owner Status Notes, Links, Tickets, Issues
Test/CI environment deploy and test petef Planned
  • Bug 695958
  • http://hudson.build.mtv1.svc.mozilla.com/job/browserid-server
  • The new Dev needs to be public: get a public VIP/VLAN on our dev/test Zeus cluster, then set up auto deployment and move from old dev to new dev (new dev --> dev.diresworb.org or equivalent)
  • Request sent to NetOPs for external IPs for Dev and Test
  • Goal is to have Dev, Test/CI, and Prod up and running and tested by QA by 12/9
  • Stage to be up in same time period, tested, with the addition of load testing
New "cross-team env" VM deploy and test lloyd/petef Planned
  • Need an OPs bug for this
  • This Env to be shared by various teams: BrowserID, Soup, AMO, etc.
Jenkins/Hudson for BrowserID-Server petef completed
Build-out planning OPs completed
VM setup/config for Dev and Test/CI petef completed
HW orders for Prod:SCL2 OPs completed
HW orders for Prod:PHX1 OPs completed
HW orders for Stage OPs completed
Logging petef completed
  • Bugs 695916, 695963, 695967
  • /var/log/browserid on webheads, secure webheads, keysigners
  • /var/log/zeus on Zeus servers
  • Still need log archiving and rotation
  • Still need *-error logs
  • TBD for database
Metrics petef completed
  • Bugs 695916, 695963, 695967
RPM design/build/test petef/lloyd completed
  • Bug 695956 - fixed
  • Latest pull request was 11/14
Set up Load test env petef completed
  • Bug 695952
Monitoring/Dashboards petef completed
  • All in Graphite/Pencil
Zeus configuration petef/atoll completed
  • Bugs 695943, 705922, 706408, 706410, 706411, 704638, 708321
MySQL petef completed
  • Bug 695972
  • mysql replication/monitoring strategy fleshed out (doc coming)
  • DB/MySQL failover work, Bug 695972
Initial Dev environment deploy and test petef/lloyd completed
  • Bug 695957 - fixed
  • 11/7-11/14 DEV env build-out and test deployments
  • First official deployment: train-2011.11.17
  • Dev Server: https://dev-browserid.services.mozilla.com
  • Dev RP (temp): http://beer.mtv1.dev.svc.mozilla.com
  • OPs needs to configure persistence on the test beer site; otherwise beer names do not persist across sessions
  • Need to see what happens on Stage (once it is up and running again)
  • OPs needs to puppetize and configure MongoDB on Dev; Stage will probably need this as well
  • Request sent to NetOPs for external IPs for Dev and Test
  • Goal is to have Dev, Test/CI, and Prod up and running and tested by QA by 12/9
  • Stage to be up in same time period, tested, with the addition of load testing
Initial Stage environment deploy and test petef/lloyd completed
  • Bugs 695946, 703497, 695944 - fixed
  • First official deployment: train-2011.11.17
  • Stage Server: https://stage-browserid.services.mozilla.com
  • Stage RP (temp): http://carrera.databits.net:9999
  • Delayed from last week: dbwriter, networking and our new load balancers
  • Stage still needs log rotation and archiving
  • Goal is to have Dev, Test/CI, and Prod up and running and tested by QA by 12/9
  • Stage to be up in same time period, tested, with the addition of load testing
Stage environment load test petef/QA completed
  • Planned: benchmarks and performance in Stage.
  • QA needs more detailed instructions for running load_gen off of the "client" load generators.
Prod environment deploy, test, and load test petef completed
  • Bugs 695946, 695960, 698044, 705922, 703497
  • Goal is to have Dev, Test/CI, and Prod up and running and tested by QA by 12/9
  • Stage to be up in same time period, tested, with the addition of load testing
Other: working with infra-sec on documentation, vuln scans, more reliable CEF logging, heartbeats OPs completed
  • none
Puppet configuration (transferable to Stage) OPs completed
  • Bug 695956 - fixed
RSBAC, crypto, ACLs work OPs completed
  • Bug 695955
  • Completed on Dev
  • Planned for Stage and then Prod


Quick Summaries/History: July - Oct

QA:

  • Focus for Summer and Fall
    • Weekly deployments to Beta environment
    • Verification of deployments to current Production environment
    • Test Plan and Weekly Trains wikis
  • Tracking Dev activity for scaling to one million users
  • Tracking OPs activity for BrowserID environment build-outs

DEV:

  • Scalability planning started in the Summer.
  • Planning the path to security review and production build-out started in the Summer.
  • Metrics dashboard for identity
  • Code migration to MySQL
  • Command-line load generation and performance analysis
  • Debug improvements
  • Other areas shared with OPs: monitoring, logging improvements, RPM generation script, schema changes to improve database performance and scalability

OPs:

  • BrowserID env designs
  • Capacity planning
  • Hardware was ordered in October
  • Dev environment planning, build-out and configuration was started
  • RPM generation work was started
  • Other areas of work: node cluster, mysql strategy, monitoring strategy, logging, DB performance, scalability, zeus vips setup, nginx/apache, webhead/secure webhead planning
  • Dependent on Dev for keysigner design and split, dbwriter design and split, logging improvements, repository restructuring, process split, code integration for packaging, and the load-gen code
  • REF: Bug 695940 - BrowserID production tracking bug

Important Links


OPs Tickets

  • Bug 695940 - BrowserID production tracking bug
    • This is a meta bug for all the other OPs tickets


Other Tickets

  • Bug 690107 - Build dashboards for identity
  • Bug 644776 - Security Review For BrowserID
  • Bug 692247 - Accessibility review for revised design of BrowserID
  • Bug 694073 - Need socketlabs SMTP parameters for browserid
  • Bug 650863 - SSL cert needed for browserid.org
  • Bug 695955 - browserid: write rsbac security policies for all host types
  • Bug 705023 - Set browserid's user UID to something lower than 500
  • Bug 705033 - Change the way nodejs starts browserid (no functionality change)
  • Bug 705922 - zlb1/2/3.pub.scl2.svc.m.c: please kickstart, yum update, reboot, puppetize
  • Bug 706408 - monitor zeus log sync queue size
  • Bug 706410 - deploy zeus log sync ssh key to zlb*.pub.scl2.svc
  • Bug 706411 - fix zeus log sync script to report rsync errors via cronmail
  • Bug 704638 - deploy zlb*.pub.scl2

GitHub Issues


BrowserID Environments

See the main OPs BrowserID site for full details:


Load Testing

  • Load_Gen code and test - Dev work is still in progress
  • See the following bug and issue:
    • Bug 695952 - browserid: loadtest
    • Issue 504: repair load_gen
  • Load Test hardware (per OPs)
    • There are eight dedicated client machines to send load
      • Need machine name information
    • A dedicated VLAN that bypasses the firewall is also available to push more load, if required.
  • Load Test information for each of the four environments: TBD
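
As a rough illustration of how a target load might be divided across the eight dedicated client machines listed above, the following Python sketch splits an aggregate request rate as evenly as possible. The function name split_load and the requests-per-second figures are hypothetical. This page does not describe load_gen's actual options, and the sketch does not invoke load_gen.

```python
def split_load(total_rps: int, clients: int = 8) -> list[int]:
    """Divide an aggregate request rate across load clients.

    total_rps: hypothetical target requests per second for the whole test.
    clients:   number of load generators (eight per the OPs notes above).
    Returns one per-client rate per machine; the rates sum to total_rps.
    """
    if clients <= 0:
        raise ValueError("clients must be a positive integer")
    if total_rps < 0:
        raise ValueError("total_rps must not be negative")
    # Every client gets the base share; the first `extra` clients
    # take one additional request per second to absorb the remainder.
    base, extra = divmod(total_rps, clients)
    return [base + 1 if i < extra else base for i in range(clients)]


if __name__ == "__main__":
    # Example with an assumed target of 1003 requests per second.
    for index, rate in enumerate(split_load(1003), start=1):
        print(f"client {index}: {rate} rps")
```

Giving the remainder to the first few clients keeps the per-client rates within one request per second of each other, so no single machine becomes the bottleneck of the test.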


Dashboards and Monitoring


Bugs vs. GitHub Issues

  • TBD
