TestEngineering/Services/LoopServerLoadTesting: Difference between revisions

From MozillaWiki
** https://wiki.mozilla.org/QA/Services/LoopTestEnvironments#MSISDN_Gateway_Server_Stage_Details
** https://wiki.mozilla.org/QA/Services/LoopTestEnvironments#MSISDN_Mock_Server_Stage_Details
** https://wiki.mozilla.org/QA/Services/LoopTestEnvironments#Loads_Services_Cluster_Environment
** https://wiki.mozilla.org/QA/Services/LoadsToolsAndTesting1
** https://github.com/mozilla/browserid-verifier/tree/master/loadtest



Revision as of 00:24, 17 September 2014

Summary for Loop Server, Loop Client, MSISDN Gateway

Quick Verification Of Stage Deployments

  • This is a quick sanity test of the environment before getting started on load tests.
  • Loop Server
For now, just run a quick loadtest 'make test'
cd loop-server
cd loadtests
make test SERVER_URL=https://loop.stage.mozaws.net
  • Loop Client
Check https://call.stage.mozaws.net/config.js
Should return JavaScript similar to the following:
var loop = loop || {};
loop.config = {serverUrl: 'https://loop.stage.mozaws.net'};
and
WIP from the client team: end to end tests
Also:
https://call.stage.mozaws.net
curl https://call.stage.mozaws.net
curl -I https://call.stage.mozaws.net
  • MSISDN Gateway
In the browser: https://msisdn.stage.mozaws.net
or do the following from a command line:
curl https://msisdn.stage.mozaws.net
curl -I https://msisdn.stage.mozaws.net
and
For now, just run a quick loadtest 'make test'
cd msisdn-gateway
cd loadtests
make test SERVER_URL=https://msisdn.stage.mozaws.net
and
WIP using the following tools:
CLI: https://github.com/mozilla-services/msisdn-gateway/tree/master/tools/roundTrip
Web app: http://mozilla-services.github.io/msisdn-verifier-client/
    based on this repo: https://github.com/mozilla-services/msisdn-verifier-client
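
The Loop Client config.js check above can be scripted instead of read by eye. The following is a minimal sketch, assuming config.js keeps the single-quoted serverUrl form shown above. The helper names extract_server_url and config_points_to are hypothetical and not part of any Loop repo.

```shell
# Print the serverUrl value found in a config.js body read from stdin.
# Assumes the form: loop.config = {serverUrl: '<url>'};
extract_server_url() {
  sed -n "s/.*serverUrl: *'\([^']*\)'.*/\1/p"
}

# Succeed only when the config.js body on stdin points at the expected server.
# $1 = expected Loop Server URL
config_points_to() {
  [ "$(extract_server_url)" = "$1" ]
}

# Usage (hits the network, so it is left commented out):
#   curl -s https://call.stage.mozaws.net/config.js \
#     | config_points_to https://loop.stage.mozaws.net && echo "config OK"
```

The same helper works for Production by swapping in the Production URLs.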

Quick Verification of Production Deployments

  • This is a quick sanity test of the environment to run after each Production deployment.
  • Loop Server
In the browser: https://loop.services.mozilla.com
or do the following from a command line:
curl https://loop.services.mozilla.com
curl -I https://loop.services.mozilla.com

Then run a few 'make test' commands from the loadtests folder:
make test SERVER_URL=https://loop.services.mozilla.com
Note: this does hit a live third-party server

Then perform actual loop testing via desktop (Aurora/Nightly so far) and FxOS (2.1)
Verify that requests and strings point to Production environments
  • Loop Client
In the browser: https://call.mozilla.com
should return "Welcome to the Loop web client."

In the browser: https://call.mozilla.com/config.js
should return JavaScript similar to the following:
var loop = loop || {};
loop.config = {serverUrl: 'https://loop.services.mozilla.com'};

In the browser: https://call.mozilla.com/VERSION.txt
should return the version and build string info

curl https://call.mozilla.com
curl -I https://call.mozilla.com

Quick end-to-end tests:
Desktop: browser to browser
Desktop to FxOS
FxOS to Desktop
Two FxOS devices
  • MSISDN Gateway
In the browser: https://msisdn.services.mozilla.com
or do the following from a command line:
curl https://msisdn.services.mozilla.com
curl -I https://msisdn.services.mozilla.com
Or
Run a single 'make test' command from the loadtests folder:
make test SERVER_URL=https://msisdn.services.mozilla.com
Note: this does hit a live third-party server, so limit the check to a single run.
and
WIP using the following tools:
CLI: https://github.com/mozilla-services/msisdn-gateway/tree/master/tools/roundTrip
Web app: http://mozilla-services.github.io/msisdn-verifier-client/
    based on this repo: https://github.com/mozilla-services/msisdn-verifier-client
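
The curl -I checks above print response headers that still have to be read manually. As a rough sketch, the status code can be pulled out of that output and compared against an expected value. The helper names are hypothetical, and treating 200 as the expected status is an assumption, since this page does not state what each endpoint should return.

```shell
# Print the numeric status code from `curl -sI` output on stdin.
# The first header line looks like "HTTP/1.1 200 OK".
status_from_headers() {
  tr -d '\r' | awk 'NR == 1 { print $2 }'
}

# Succeed only when the headers on stdin carry the expected status code.
# $1 = expected code (200 is an assumption, not taken from this page)
expect_status() {
  [ "$(status_from_headers)" = "$1" ]
}

# Usage (hits the network, so it is left commented out):
#   for url in https://loop.services.mozilla.com \
#              https://call.mozilla.com \
#              https://msisdn.services.mozilla.com; do
#     curl -sI "$url" | expect_status 200 || echo "unexpected status: $url"
#   done
```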

Load Test Tool Client/Host

Installing Loop-Server and the Loads tool on Localhost or AWS

  • Installation:
git clone https://github.com/mozilla-services/loop-server.git
cd loop-server
npm install
ulimit -S -n 2048
npm test    # see the note marked * below
cd loadtests
make build
make test

Coverage report can be found here:
/loop-server/coverage/lcov-report/index.html

* This step requires the redis server to be installed and running:
Mac:
brew install redis
redis-server /usr/local/etc/redis.conf

Ubuntu Linux:
sudo apt-get install redis-server
sudo /usr/bin/redis-server /etc/redis/redis.conf
sudo tail -f /var/log/redis/redis-server.log

RHEL Linux:
Install redis from here: http://download.redis.io/releases
then
/usr/local/bin/redis-server /home/ec2-user/redis-2.8.9/redis.conf
or similar

  • Note: This will install a local copy of the Loads tool for use with the Loop-Server.
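
Because 'npm test' needs a running redis server, it can help to confirm that redis answers before starting the tests. The sketch below relies on 'redis-cli ping' replying PONG when the server is up. The REDIS_CLI variable is a hypothetical override added only so the check can be exercised without a real server.

```shell
# Succeed only when the redis server answers a ping with PONG.
# REDIS_CLI is a hypothetical override hook; it defaults to redis-cli.
redis_ready() {
  out=$("${REDIS_CLI:-redis-cli}" ping 2>/dev/null) || return 1
  [ "$out" = "PONG" ]
}

# Usage (needs redis-cli on the PATH, so it is left commented out):
#   redis_ready && npm test
```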

Running the load test against the Loop-Server in Stage

  • Stage environment:
$ cd loop-server/loadtests
$ make test
or
$ make test SERVER_URL=https://loop.stage.mozaws.net
$ make bench
or
$ make bench SERVER_URL=https://loop.stage.mozaws.net

Note: the current version of 'make bench' tends to use a lot of CPU and memory on the localhost.
The recommendation is to use 'make test' and 'make megabench' instead (see below).
  • To hit the partner test servers, the following configuration file will need to be updated by OPs:
    • /data/loop-server/config/settings.json
  • Talk to OPs to toggle that configuration file and restart the Loop-Server in Stage.

Using the Loads V1 Services Cluster for the Loop-Server in Stage

  • By using the Loads Services Cluster, we can offload the broker/agents processes and save client-side CPU and memory.
  • Changes were made to Makefile and the load test to use the cluster and some associated config files (for test, bench, megabench).
  • Stage environment:
$ make megabench SERVER_URL=https://loop.stage.mozaws.net
  • To hit the partner test servers, the following configuration file will need to be updated by OPs:
    • /data/loop-server/config/settings.json
  • Talk to OPs to toggle that configuration file and restart the Loop-Server in Stage.

Installing MSISDN-Gateway and the Loads tool on Localhost or AWS

  • Installation:
    • Install gmp, gmp-dev or gmp-devel
    • Install ruby (very latest), ruby-dev or ruby-devel
    • Install gem (required for fake_dynamo)
    • Verify that gem is in your path
    • Install redis-server to run the unit tests
  • To install gmp
sudo yum -y install gmp gmp-devel
or for Ubuntu
$ wget https://ftp.gnu.org/gnu/gmp/gmp-6.0.0a.tar.bz2
$ tar xvjf gmp-6.0.0a.tar.bz2
$ cd gmp-6.0.0
$ ./configure --prefix=/usr
$ make
$ make check
$ sudo make install
  • To install ruby:
sudo yum -y install ruby ruby-devel
or
sudo apt-get install ruby ruby-dev

If this does not get you 1.9.3 or newer, then install manually:
Example:
    $ wget http://cache.ruby-lang.org/pub/ruby/1.9/ruby-1.9.3-p547.tar.gz
    $ tar xzf ruby-1.9.3-p547.tar.gz
    $ cd ruby-1.9.3-p547
    $ ./configure --prefix=/usr
    $ make
    $ sudo make install
    (On RHEL, the default ruby version is 1.8.x.)
REF:
Main: https://www.ruby-lang.org/en/downloads/
Dev Tools: http://rubyinstaller.org/downloads/ 
  • To install gem:
Grab rubygems from here: http://rubygems.org/pages/download
cd to rubygems directory
$ sudo ruby setup.rb
  • To install fake_dynamo:
You should not have to install fake_dynamo since it is now part of the repo installer.
But if you do:
$ sudo gem install fake_dynamo
REF: https://github.com/ananthakumaran/fake_dynamo
  • Install the msisdn-gateway repo:
$ git clone https://github.com/mozilla-services/msisdn-gateway.git
$ cd msisdn-gateway
$ sudo make install
(There is a bug open about the requirement to install with 'sudo')
  • Note: This will install a local copy of the Loads tool for use with MSISDN-Gateway.
  • Unit testing
Get redis-server installed
Start the server in a separate terminal or in the background with logging active
$ make test
The coverage report is here: msisdn-gateway/coverage/lcov-report/index.html
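
The Ruby requirement above (1.9.3 or newer) can be checked before deciding on a manual install. This is a minimal sketch: version_ge is a hypothetical helper that compares up to three numeric components of a dotted version and ignores patch-level suffixes such as -p547.

```shell
# Succeed when dotted version $1 is greater than or equal to $2.
# Only the first three numeric components are compared.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, x, "."); split(b, y, ".")
    for (i = 1; i <= 3; i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}

# Usage (needs ruby on the PATH, so it is left commented out):
#   version_ge "$(ruby -e 'print RUBY_VERSION')" 1.9.3 \
#     || echo "ruby is older than 1.9.3; install manually"
```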

Running the load test against MSISDN-Gateway in Stage

  • Building the load tests
$ cd loadtests
$ make build
  • To load test the Stage environment:
$ make test SERVER_URL=https://msisdn.stage.mozaws.net
$ make bench SERVER_URL=https://msisdn.stage.mozaws.net

Note: the current version of 'make bench' tends to use a lot of CPU and memory on the localhost.
The recommendation is to use 'make test' and 'make megabench' instead (see below).
  • This environment also contains its own mock server: http://omxen.dev.mozaws.net
  • The configuration file on the Stage server: /data/msisdn-gateway/config/production.json

Using the Loads V1 Services Cluster for the MSISDN-Gateway

  • By using the Loads Services Cluster, we can offload the broker/agents processes and save client-side CPU and memory.
  • Changes were made to Makefile and the load test to use the cluster and some associated config files (for test, bench, megabench).
  • Stage environment:
$ make megabench SERVER_URL=https://msisdn.stage.mozaws.net
  • This environment also contains its own mock server: http://omxen.dev.mozaws.net
  • The configuration file on the Stage server: /data/msisdn-gateway/config/production.json
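
The Stage commands for both services follow the same pattern, so they can be generated rather than retyped. The sketch below only prints the command line; stage_url and loadtest_cmd are hypothetical helper names, and the URLs are the Stage URLs used on this page.

```shell
# Print the Stage URL for a service name (loop or msisdn).
stage_url() {
  case "$1" in
    loop)   echo "https://loop.stage.mozaws.net" ;;
    msisdn) echo "https://msisdn.stage.mozaws.net" ;;
    *)      echo "unknown service: $1" >&2; return 1 ;;
  esac
}

# Print the make command for a service and target without running it.
# $1 = service (loop|msisdn), $2 = target (test|bench|megabench)
loadtest_cmd() {
  case "$2" in
    test|bench|megabench) ;;
    *) echo "unknown target: $2" >&2; return 1 ;;
  esac
  url=$(stage_url "$1") || return 1
  echo "make $2 SERVER_URL=$url"
}
```

For example, 'loadtest_cmd msisdn megabench' prints the megabench command shown above, ready to run from the loadtests folder.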

Configuring The Load Tests

  • Makefile
    • The SERVER_URL constant can be changed.
  • Config files
    • For make test (Loop-Server and MSISDN-Gateway):
      • Number of hits
      • Number of concurrent users
    • For make bench (Loop-Server and MSISDN-Gateway):
      • Number of concurrent users
      • Duration of test
    • For make megabench (Loop-Server and MSISDN-Gateway):
      • Number of concurrent users
      • Duration of test
      • Include file (this is code dependent)
      • Python dependencies (this is code dependent)
      • Broker to use for testing (leave as defined for now; this is the broker in the Loads Cluster)
      • Agents to use for testing (default is 5, max is currently 20, but depends on the number of concurrent load tests running)
      • Detach mode (leave as defined for now to automatically detach from the load test once it starts on the localhost)
      • Observer (this can be email or irc - the default is irc #services-dev channel)
  • Loop-Server load test code
    • The Loop-Server load test cannot currently be configured in the code

Test Coverage and Stats

  • Basic tweakable values for all load tests
    • users = number of concurrent users/agent
    • agents = number of agents requested from the cluster; the run errors out if that many are not available
    • duration = in seconds
    • hits = 1 or X number of rounds/hits/iterations
  • Loop-Server
    • TBD
  • MSISDN-Gateway
    • TBD
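
If 'users' is read as concurrent users per agent, the total simulated load is users multiplied by agents. The sketch below works that out and rejects agent counts outside the range mentioned under Configuring The Load Tests (default 5, current maximum 20). The per-agent reading and both helper names are assumptions.

```shell
# Succeed when $1 is a whole number from 1 to 20 (20 is the current maximum).
valid_agents() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 20 ]
}

# Print total concurrent users across the cluster.
# $1 = users per agent, $2 = agents (defaults to 5)
total_users() {
  agents=${2:-5}
  valid_agents "$agents" || return 1
  echo $(( $1 * agents ))
}
```

For example, 20 users on the default 5 agents gives 100 concurrent users in total.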

Analyzing the Results

  • There are several methods and tools for analyzing the load test results.
  • Loop-Server Custom Metrics
    • Opened web sockets
    • Total web sockets
    • Bytes/websockets
    • addFailure (from the loads tool/client)
  • MSISDN-Gateway Custom Metrics
    • mt-flow
    • ask-for-certificate
    • try-wrong-code
    • try-right-code
    • momt-flow
    • omxen-message-collision
    • register
    • unregister
    • addFailure (from the loads tool/client)

Debugging the Issues

  • There are several methods and tools for debugging the load test errors and other issues.
  • 1. Important logs for Loop-Server (per server)
    • /var/log/circus.log
    • /var/log/loop_err.log
    • /var/log/loop_out.log
    • /var/log/hekad/loop.stdout.log
    • /var/log/hekad/loop.stderr.log
    • /var/log/nginx/access.log
    • /var/log/nginx/error.log
  • 2. Important logs for MSISDN-Gateway (per server)
    • TBD
  • Acceptable/Unacceptable Loop-Server errors:
hekad loop.stderr.log
The following are acceptable:
Decoder 'LoopServer-LoopServerDecoder' error: Failed parsing
Plugin 'AggregatorOutput' error: writing to heka.shared....

nginx logs:
Some percentage of 200s, 204s, and 404s is acceptable. Some of the 404s are actually bot/spam
activity, visible in the /media/ephemeral0/nginx/logs/loop_server.access.log and
/media/ephemeral0/circus/loop_server/loop_server.out.log logs.
Any percentage of 405s, 502s, or 503s is not acceptable.

/var/log/loop_err.log
The following are acceptable: connect: res.on("header"): use on-headers module directly

In the Loads Cluster dashboard, watch for the following errors/failures:
string indices must be integers
OR
No JSON object could be decoded
OR
'hawk-session-token'
  • Acceptable/Unacceptable MSISDN-Gateway errors:
The updated load test does generate a certain percentage of errors:
https://github.com/mozilla-services/msisdn-gateway/blob/master/loadtests/loadtest.py#L19-L22
So, expect to see a predefined percentage of 204s and 400s, along with the usual 200s in the nginx access logs.
The msisdn-gateway app logs should be clean with just msisdn and test data.
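
The nginx rules for Loop-Server above (200s, 204s, and 404s acceptable; 405s, 502s, and 503s not) can be applied to an access log mechanically. The following is a rough sketch that assumes the log uses nginx's default "combined" format, where the status code is the ninth whitespace-separated field; the servers' actual log format is not stated on this page. Both helper names are hypothetical, and the rules are the Loop-Server ones only, since the MSISDN-Gateway load test expects a different mix.

```shell
# Map a status code to the Loop-Server verdict described above.
loop_status_verdict() {
  case "$1" in
    200|204|404) echo "acceptable" ;;
    405|502|503) echo "unacceptable" ;;
    *)           echo "review" ;;
  esac
}

# Read an access log on stdin and print "<code> <count> <verdict>" per code.
# Assumes the "combined" log format (status code in field 9).
summarize_access_log() {
  awk '{ print $9 }' | sort | uniq -c | while read -r count code; do
    echo "$code $count $(loop_status_verdict "$code")"
  done
}

# Usage (on a Loop-Server host, so it is left commented out):
#   summarize_access_log < /var/log/nginx/access.log
```

Codes outside both lists are marked "review" rather than guessed at.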

Monitoring Loop Stage

Agents statuses
Launch a health check on all agents

Performance Testing Information

  • TBD

Details on the Load Test tool

Known Bugs, Issues, and Tasks

References

  • OPs pages for stats collection, logging, monitoring
    • TBD