TestEngineering/Services/LoopServerLoadTesting

From MozillaWiki
Revision as of 20:56, 16 June 2014 by Jbonacci

Summary for Loop Server, Loop Client, Mock Server, MSISDN Gateway

  • Latest Results
    • Link to loads cluster: https://loads.services.mozilla.com/
      • Note: this now requires login privileges and a password
    • Snapshots from StackDriver - TBD
    • Snapshots from Kibana - TBD
    • Snapshots from Sentry - TBD
  • Latest Deployments
    • TBD
  • In Progress
    • Build out of Stage environments
    • Ongoing testing of Loop releases
    • Bug review and issue debugging: there are many open issues (see Known Bugs, Issues, and Tasks near the bottom of this page)
  • Bugs To Verify:
    • TBD
  • Planned
    • Load Testing
    • Scaling for production traffic after the release of Loop on Fx (version TBD)
    • Focused testing on Loop Client for desktop
    • Focused testing on Loop Client for FxOS
  • Blockers
    • None at this time
  • Completed
    • None
  • Performance
    • TBD

Quick Verification Of Stage Deployments

  • This is a quick sanity test of the environment before getting started on load tests.
  • Loop Server
    • TBD
  • Loop Client
    • TBD
  • MSISDN Gateway
    • TBD
  • Mock Server
    • N/A

Load Test Tool Client/Host

Creating a RHEL AWS instance

  • Pick a Region then Create Instance > Launch Instance
  • Follow the prompts to create a basic, RHEL-flavored instance
  • Use one of the QA/Dev key pairs that have been set up for this:
    • US East Key Pair: QA-Dev-Share (created by jbonacci) for general use
    • US West Key Pair: QA-dev-share (created by RaFromBRC) for general use
  • Once the instance is running, log in as "ec2-user"
  • The following apps, tools, and libs will need to be installed for use with various Services applications:
    • gcc, gcc-c++
    • hg
    • git
    • python-devel
    • automake, autoconf, and libtool (required for libzmq, for easy_install)
    • pip
    • virtualenv
    • node/npm
    • zeromq 3.X
    • gmp, gmp-devel
  • Also, apply general RHEL updates:
$ sudo yum -y update
and/or
$ sudo yum -y upgrade
  • Now, the instance should be ready for installing and using the Loads tool.

Creating an Ubuntu AWS instance

  • Pick a Region then Create Instance > Launch Instance
  • Follow the prompts to create a basic, Ubuntu-flavored instance
  • Use one of the QA/Dev key pairs that have been set up for this:
    • US East Key Pair: QA-Dev-Share (created by jbonacci) for general use
    • US West Key Pair: QA-dev-share (created by RaFromBRC) for general use
  • Once the instance is running, log in as "ubuntu"
  • The following apps, tools, and libs will need to be installed for use with various Services applications:
    • gcc, g++
    • mercurial
    • git
    • python-setuptools, python-virtualenv, and python-dev
    • automake, autoconf, libtool
    • m4
    • node/npm
    • libzmq and zeromq 3.X
    • gmp-5.1.3 or newer
  • Also, apply general Ubuntu updates (refresh the package index, then upgrade the installed packages):
$ sudo apt-get update
then
$ sudo apt-get upgrade
  • Now, the instance should be ready for installing and using the Loads tool.
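Before installing the Loads tool on either the RHEL or the Ubuntu instance, it can help to confirm that the command-line tools from the lists above are actually on the PATH. The following is a minimal sketch, not part of the original setup: `missing_tools` is a hypothetical helper, and the command names passed to it (for example `hg`, `virtualenv`, `node`) are assumptions about what each package installs.

```shell
# Hypothetical helper: print each named command that is NOT found on PATH.
# Prints nothing when every command is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

For example, `missing_tools gcc g++ hg git automake autoconf libtool pip virtualenv node npm` lists only the tools that still need to be installed. Libraries such as gmp and zeromq are not commands, so they are not covered by this check.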

Installing Loop-Server and the Loads tool on the AWS instance

  • Installation:
TBD
  • Note: This will install a local copy of the Loads tool for use with the Loop-Server.

Running the load test against the Loop-Server in Stage

  • Stage environment (BLAH is a placeholder for the URL of the server under test):
$ make test
or
$ make test SERVER_URL=BLAH
$ make bench
or
$ make bench SERVER_URL=BLAH
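The make invocations above can be wrapped so that the target name is checked and the optional SERVER_URL is passed through consistently. This is only a sketch under stated assumptions: `run_loadtest` and the `MAKE` override are hypothetical names, and the accepted targets are simply the ones named on this page (test, bench, megabench).

```shell
# Hypothetical wrapper around the make targets named on this page.
# MAKE can be overridden (e.g. MAKE=echo) to preview the command without running it.
run_loadtest() {
  target="$1"
  server_url="$2"
  case "$target" in
    test|bench|megabench) ;;
    *) echo "unknown target: $target" >&2; return 2 ;;
  esac
  if [ -n "$server_url" ]; then
    # Same form as: make bench SERVER_URL=BLAH
    "${MAKE:-make}" "$target" "SERVER_URL=$server_url"
  else
    # Same form as: make test (uses the Makefile's default server)
    "${MAKE:-make}" "$target"
  fi
}
```

Running `run_loadtest test` followed by `run_loadtest bench` from the repository checkout mirrors the sequence above; pass the Stage URL as a second argument to override the default.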

Using the Loads Services Cluster for the Loop-Server in Stage

  • By using the Loads Services Cluster, we can offload the broker and agent processes and save client-side CPU and memory.
  • The Makefile and the load test were changed to use the cluster, along with some associated config files (for test, bench, and megabench).
  • Stage environment:
$ make megabench SERVER_URL=BLAH
  • Dev environment: TBD
  • Production environment: TBD

Installing MSISDN-Gateway and the Loads tool on the AWS instance

  • Installation:
TBD
  • Note: This will install a local copy of the Loads tool for use with MSISDN-Gateway.

Running the load test against MSISDN-Gateway in Stage

  • Stage environment:
$ make test SERVER_URL=BLAH
$ make bench SERVER_URL=BLAH

Using the Loads Services Cluster for the MSISDN-Gateway

  • By using the Loads Services Cluster, we can offload the broker and agent processes and save client-side CPU and memory.
  • The Makefile and the load test were changed to use the cluster, along with some associated config files (for test, bench, and megabench).
  • Stage environment:
$ make megabench SERVER_URL=BLAH
  • Dev environment: TBD
  • Production environment: TBD

Configuring The Load Tests

Test Coverage and Stats

  • Basic tweakable values for all load tests
    • users = number of concurrent users per agent
    • agents = number of agents to use from the cluster (the run errors out if that many are not available)
    • duration = length of the run, in seconds
    • hits = number of rounds/hits/iterations (1 or X)
  • Loop-Server
    • TBD
  • MSISDN-Gateway
    • TBD
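Because users is counted per agent, the total number of concurrent users in a run is users multiplied by agents. A minimal sketch of that arithmetic follows; `total_users` is a hypothetical helper, not part of the Loads tool.

```shell
# Hypothetical helper: total concurrent users = users per agent * number of agents.
total_users() {
  users="$1"
  agents="$2"
  echo $(( users * agents ))
}
```

With the configuration quoted later on this page (users = 20, agents = 5), `total_users 20 5` prints 100.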

Analyzing the Results

  • There are several methods and tools for analyzing the load test results.

Debugging the Issues

  • There are several methods and tools for debugging the load test errors and other issues.
  • 1. Important logs for Loop-Server (per server)
    • TBD
  • 2. Important logs for MSISDN-Gateway (per server)
    • TBD
  • Acceptable TokenServer errors:
1% - 2% failures of the following kinds:
token.log:
"name": "token.assertion.invalid_signature_error"
"name": "token.assertion.verify_failure"
nginx access.log:
401s
NOTE: Values can be tweaked here:
    https://github.com/mozilla-services/tokenserver/blob/master/loadtest/loadtest.py#L58-L60
The following errors may also be "acceptable" if TS Stage is larger than Verifier Stage:
/media/ephemeral0/logs/tokenserver/token.error.log
Verifier-related errors of these types:
"HttpConnectionPool is full, discarding connection: verifier.stage.mozaws.net"
"Resetting dropped connection: verifier.stage.mozaws.net"
"Starting new HTTPS connection (179): verifier.stage.mozaws.net"
  • Acceptable Verifier errors:
In the verifier and squid logs:
References to mozilla.org and login.mozilla.org - part of the "invalid domain" tests
In the verifier logs:
References to https://secret.mozilla.com, which are defined in the browserid-verifier load test
https://github.com/mozilla/browserid-verifier/blob/master/loadtest/loadtest.py#L77 for example
  • Acceptable Sync node errors:
In the nginx access.log files:
We will see some percentage of 404s. Right now we see the following:
    14% 404s (compared to the total count of 200s)
    with the config set up as follows:
         users = 20
         duration = 1800
         agents = 5
Ideally, the overall percentage of 404s should drop the longer the load test runs.
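The error levels described above can be checked roughly from the log files. The following is a sketch, not an established procedure: `count_token_errors` and `pct_404` are hypothetical helpers, the token.log entries are assumed to contain the `"name": "..."` strings exactly as quoted above, and the nginx access.log is assumed to use the default "combined" format, where the status code is the 9th whitespace-separated field.

```shell
# Hypothetical helper: count the two "acceptable" TokenServer error names in a token.log.
count_token_errors() {
  log="$1"
  for name in token.assertion.invalid_signature_error token.assertion.verify_failure; do
    # grep -c prints the number of matching lines; -F treats the pattern as a fixed string.
    printf '%s %s\n' "$name" "$(grep -c -F "\"name\": \"$name\"" "$log")"
  done
}

# Hypothetical helper: 404s as a percentage of the total count of 200s in an access.log.
# Assumes the status code is field 9 (nginx "combined" log format).
pct_404() {
  awk '$9 == 200 { ok++ }
       $9 == 404 { notfound++ }
       END { if (ok > 0) printf "%.1f\n", 100 * notfound / ok; else print "n/a" }' "$1"
}
```

Run `count_token_errors token.log` on a TokenServer host and `pct_404 access.log` on a Sync node, then compare the output with the 1% - 2% and 14% figures above. Turning the TokenServer counts into a percentage still requires the total request count, which this sketch does not compute.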

Monitoring TS and Sync Stage

./bin/loads-runner --ping-broker --ssh=ubuntu@loads.services.mozilla.com
./bin/loads-runner --check-cluster --ssh=ubuntu@loads.services.mozilla.com
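The two checks above can be run back to back, stopping at the first one that fails. This is a minimal sketch: `check_loads_cluster`, `RUNNER`, and `SSH_TARGET` are hypothetical names introduced here, while the `--ping-broker`, `--check-cluster`, and `--ssh` options are the ones shown on this page.

```shell
# Hypothetical wrapper: run both cluster checks shown above, stopping at the first failure.
# RUNNER and SSH_TARGET default to the values used on this page and can be overridden.
check_loads_cluster() {
  runner="${RUNNER:-./bin/loads-runner}"
  target="${SSH_TARGET:-ubuntu@loads.services.mozilla.com}"
  "$runner" --ping-broker "--ssh=$target" || return 1
  "$runner" --check-cluster "--ssh=$target" || return 1
}
```

Call `check_loads_cluster` from the Loads checkout before starting a megabench run; a non-zero exit status means one of the two checks did not succeed.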

Performance Testing Information

  • TBD

Details on the Load Test tool

Known Bugs, Issues, and Tasks

    • Loop Server: TBD
    • Loop Client: TBD
    • MSISDN Gateway: TBD
    • Mock Server: TBD
  • Ops and Infrastructure
    • TBD

References

  • Repositories
  • Documentation
  • Ops pages for stats collection, logging, monitoring
    • TBD