TestEngineering/Services/TokenServerAndSyncLoadTesting: Difference between revisions

From MozillaWiki

Revision as of 23:53, 30 June 2014

Summary for Tokenserver, Verifier and Sync 1.5

Quick Verification Of Stage Deployments

  • This is a quick sanity test of the environment before getting started on load tests.
  • TokenServer Stage environment:
Use the simple "make test" command from an install of tokenserver on the localhost or AWS instance.
cd loadtest
make test SERVER_URL=https://token.stage.mozaws.net
  • Verifier Stage environment:
Use the simple "make test" command from an install of browserid-verifier on the localhost or AWS instance.
cd loadtest
make test SERVER_URL=https://verifier.stage.mozaws.net
  • Sync Server Stage environment:
Install server-syncstorage to the local host or AWS instance (see below)
$ cd server-syncstorage
Quick test against the TokenServer
$ ./local/bin/python ./syncstorage/tests/functional/test_storage.py --use-token-server <Stage TokenServer>
Current example:
$ ./local/bin/python ./syncstorage/tests/functional/test_storage.py --use-token-server \
    https://token.stage.mozaws.net/1.0/sync/1.5
Quick tests against the Sync nodes
$ ./local/bin/python ./syncstorage/tests/functional/test_storage.py <Stage Sync Node>#<Node Secret>
Current examples:
$ ./local/bin/python ./syncstorage/tests/functional/test_storage.py \
    https://sync-1-us-east-1.stage.mozaws.net#<Node Secret>
$ ./local/bin/python ./syncstorage/tests/functional/test_storage.py \
    https://sync-2-us-east-1.stage.mozaws.net#<Node Secret>
$ ./local/bin/python ./syncstorage/tests/functional/test_storage.py \
    https://sync-3-us-east-1.stage.mozaws.net#<Node Secret>
Get the Node Secret from Ops.
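
The per-node checks above can be scripted. The following is a minimal sketch, assuming it is run from the server-syncstorage checkout and that the three Stage node URLs listed further down this page are still current. The helper name build_node_commands and the NODE_SECRET placeholder are hypothetical, introduced only for illustration; the sketch assembles the commands and does not run them.

```python
# Sketch: assemble (but do not run) the functional-test command for each Stage Sync node.
# The node list is copied from this page and may be out of date; the helper name is hypothetical.
STAGE_SYNC_NODES = [
    "https://sync-1-us-east-1.stage.mozaws.net",
    "https://sync-2-us-east-1.stage.mozaws.net",
    "https://sync-3-us-east-1.stage.mozaws.net",
]

TEST_SCRIPT = "./syncstorage/tests/functional/test_storage.py"


def build_node_commands(nodes, secret, python="./local/bin/python"):
    """Return one argv list per node, in the form <python> <script> <node>#<secret>."""
    return [[python, TEST_SCRIPT, f"{node}#{secret}"] for node in nodes]


if __name__ == "__main__":
    # "NODE_SECRET" is a placeholder; get the real value from Ops.
    for argv in build_node_commands(STAGE_SYNC_NODES, "NODE_SECRET"):
        print(" ".join(argv))
```

Each argv list can be handed to subprocess.run(argv, check=True) to actually execute the test once the real secret is in place.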

Quick Verification Of Production Deployments

  • This is a quick sanity test of the environment after a new deployment.
  • Tokenserver Production Environment
TBD
  • Verifier Production Environment
Use the simple "make test" command from an install of browserid-verifier on the localhost or AWS instance.
cd loadtest
make test SERVER_URL=https://verifier.accounts.firefox.com
  • Sync Server Production Environment
TBD

Load Test Tool Client/Host

Creating a RHEL AWS instance

  • Pick a Region then Create Instance > Launch Instance
  • Follow the prompts to create a basic, RHEL-flavored instance
  • Use one of the QA/Dev key pairs that have been set up for this:
    • US East Key Pair: QA-Dev-Share (created by jbonacci) for general use
    • US West Key Pair: QA-dev-share (created by RaFromBRC) for general use
  • Once the instance is running, log in as "ec2-user"
  • The following apps, tools, and libs will need to be installed for use with various Services applications:
    • gcc, gcc-c++
    • hg
    • git
    • python-devel
    • automake, autoconf, and libtool (required for libzmq, for easy_install)
    • pip
    • virtualenv
    • node/npm
    • zeromq 3.X
    • gmp, gmp-devel
  • Also, apply general RHEL updates:
$ sudo yum -y update
and/or
$ sudo yum -y upgrade
  • Now, the instance should be ready for installing and using the Loads tool.

Creating an Ubuntu AWS instance

  • Pick a Region then Create Instance > Launch Instance
  • Follow the prompts to create a basic, Ubuntu-flavored instance
  • Use one of the QA/Dev key pairs that have been set up for this:
    • US East Key Pair: QA-Dev-Share (created by jbonacci) for general use
    • US West Key Pair: QA-dev-share (created by RaFromBRC) for general use
  • Once the instance is running, log in as "ubuntu"
  • The following apps, tools, and libs will need to be installed for use with various Services applications:
    • gcc, g++
    • mercurial
    • git
    • python-setuptools, python-virtualenv, and python-dev
    • automake, autoconf, libtool
    • m4
    • node/npm
    • libzmq and zeromq 3.X
    • gmp-5.1.3 or newer
  • Also, apply general Ubuntu updates:
$ sudo apt-get update
and/or
$ sudo apt-get upgrade
  • Now, the instance should be ready for installing and using the Loads tool.

Installing BrowserID-Verifier and the Loads tool on the AWS instance

  • Installation:
$ git clone git://github.com/mozilla/browserid-verifier
$ cd browserid-verifier
$ npm install
$ npm test
$ cd loadtest
$ make build
     Note: This should hit Stage by default: SERVER_URL=https://verifier.stage.mozaws.net
  • Note: This will install a local copy of the Loads tool for use with the verifier.

Running the load test against the Verifier in Stage

  • Stage environment:
$ make test
or
$ make test SERVER_URL=https://verifier.stage.mozaws.net
$ make bench
or
$ make bench SERVER_URL=https://verifier.stage.mozaws.net
NOTE: The URL for the Stage environment is likely to change frequently.
NOTE: This also hits the Stage mockmyid server.
  • And while we are at it...
  • Dev environment:
$ make test SERVER_URL=TBD
$ make bench SERVER_URL=TBD
  • Production environment:
$ make test SERVER_URL=https://verifier.accounts.firefox.com
$ make bench SERVER_URL=https://verifier.accounts.firefox.com

Using the Loads Services Cluster for the Verifier

  • By using the Loads Services Cluster, we can offload the broker/agents processes and save client-side CPU and memory.
  • Changes were made to Makefile and the load test to use the cluster and some associated config files (for test, bench, megabench).
  • Stage environment:
$ make megabench SERVER_URL=https://verifier.stage.mozaws.net
  • Dev environment:
$ make megabench SERVER_URL=TBD
  • Production environment:
$ make megabench SERVER_URL=https://verifier.accounts.firefox.com

Installing TokenServer and the Loads tool on the AWS instance

  • Installation:
$ git clone https://github.com/mozilla-services/tokenserver
$ cd tokenserver
$ make build
$ make test
    Note: This is for local testing only
$ cd loadtest
$ make build
    Note: This should hit Prod by default: SERVER_URL=https://token.services.mozilla.com
  • Note: This will install a local copy of the Loads tool for use with TokenServer.

Running the load test against TokenServer in Stage

  • Stage environment:
$ make test SERVER_URL=https://token.stage.mozaws.net
$ make bench SERVER_URL=https://token.stage.mozaws.net
NOTE: The URL for the Stage environment is likely to change frequently.
NOTE: This also hits the Stage Verifier, which in turn hits the Stage mockmyid server.
  • And while we are at it...
  • Dev environment:
$ make test SERVER_URL=https://token.dev.lcip.org
$ make bench SERVER_URL=https://token.dev.lcip.org
  • Production environment:
$ make test SERVER_URL=https://token.services.mozilla.com
$ make bench SERVER_URL=https://token.services.mozilla.com
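
To avoid retyping the URLs, the environment-to-URL mapping above can be kept in one place. This is a minimal sketch, assuming the three TokenServer URLs on this page are current; the names TOKENSERVER_URLS and make_command are hypothetical and are not part of the tokenserver repository.

```python
# Sketch: build the make command line for a TokenServer environment named on this page.
# The URLs are copied from this page; the Stage URL in particular may change.
TOKENSERVER_URLS = {
    "stage": "https://token.stage.mozaws.net",
    "dev": "https://token.dev.lcip.org",
    "prod": "https://token.services.mozilla.com",
}

# Make targets mentioned on this page.
TARGETS = ("test", "bench", "megabench")


def make_command(target, env):
    """Return the command string, e.g. 'make test SERVER_URL=<url>'."""
    if target not in TARGETS:
        raise ValueError(f"unknown make target: {target}")
    if env not in TOKENSERVER_URLS:
        raise ValueError(f"unknown environment: {env}")
    return f"make {target} SERVER_URL={TOKENSERVER_URLS[env]}"


if __name__ == "__main__":
    print(make_command("test", "stage"))
```

Run the printed command from the tokenserver/loadtest directory.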

Using the Loads Services Cluster for TokenServer

  • By using the Loads Services Cluster, we can offload the broker/agents processes and save client-side CPU and memory.
  • Changes were made to Makefile and the load test to use the cluster and some associated config files (for test, bench, megabench).
  • Stage environment:
$ make megabench SERVER_URL=https://token.stage.mozaws.net
  • Dev environment:
$ make megabench SERVER_URL=https://token.dev.lcip.org
  • Production environment:
$ make megabench SERVER_URL=https://token.services.mozilla.com

Installing Sync 1.5 and the Loads tool on the AWS instance

Installation:
$ git clone https://github.com/mozilla-services/server-syncstorage/
$ cd server-syncstorage
$ make build
$ make test
$ cd loadtest
$ make build
  • Note: This will install a local copy of the Loads tool for use with Sync 1.5.

Running the load test against Sync 1.5 in Stage

  • Loads against specific Sync nodes in Stage
$ make test SERVER_URL=https://your.storagenode.here#SECRET
$ make bench SERVER_URL=https://your.storagenode.here#SECRET
Sync Stage nodes:
    https://sync-1-us-east-1.stage.mozaws.net
    https://sync-2-us-east-1.stage.mozaws.net
    https://sync-3-us-east-1.stage.mozaws.net

NOTE: The Stage sync nodes are likely to change frequently, so verify the URLs.
    See https://wiki.mozilla.org/QA/Services/FxATestEnvironments#Sync_1.5_Stage_Environment

NOTE: The Ops team has the SECRET string for Stage. Get it from them before you start testing.

Using the Loads Services Cluster for Sync 1.5 in Stage

  • By using the Loads Services Cluster, we can offload the broker/agents processes and save client-side CPU and memory.
  • Changes were made to Makefile and the load test to use the cluster and some associated config files (for test, bench, megabench).
  • Stage environment:
$ make megabench SERVER_URL=https://your.storagenode.here#SECRET

Running a combined load test against TokenServer and Sync 1.5 in Stage

  • A combined loads test against TokenServer and Sync 1.5 in Stage
  • This is done via the server-syncstorage directory that was cloned and built above
$ cd server-syncstorage
$ cd loadtest
$ make test SERVER_URL=https://your.tokenserver.here
$ make bench SERVER_URL=https://your.tokenserver.here

Examples for Stage:
$ make test SERVER_URL=https://token.stage.mozaws.net
$ make bench SERVER_URL=https://token.stage.mozaws.net
See https://wiki.mozilla.org/QA/Services/TSVerifierSyncTestEnvironments#TokenServer_Stage_Environment
  • And while we are at it...
Dev environment:
Examples:
$ make test SERVER_URL=https://token.dev.lcip.org
$ make bench SERVER_URL=https://token.dev.lcip.org

Prod environment:
Examples:
$ make test SERVER_URL=https://token.services.mozilla.com
$ make bench SERVER_URL=https://token.services.mozilla.com

See https://wiki.mozilla.org/QA/Services/FxATestEnvironments#FxA.2C_TokenServer.2C_and_Sync_Production_Environments
and https://wiki.mozilla.org/QA/Services/TSVerifierSyncTestEnvironments#TokenServer_and_Sync_1.5_Dev_Environments

Using the Loads Services Cluster for a combined load test in Stage

  • By using the Loads Services Cluster, we can offload the broker/agents processes and save client-side CPU and memory.
  • Changes were made to Makefile and the load test to use the cluster and some associated config files (for test, bench, megabench).
  • Stage environment:
$ make megabench SERVER_URL=https://token.stage.mozaws.net
  • Dev environment:
$ make megabench SERVER_URL=https://token.dev.lcip.org
  • Prod environment:
$ make megabench SERVER_URL=https://token.services.mozilla.com

Configuring The Load Tests

  • The TokenServer, Sync, and Combined load tests work with config files that can be edited to change how the load tests are run.
  • For make test (TokenServer, Sync, Combined):
    • Number of hits
    • Number of concurrent users
  • For make bench (TokenServer, Sync, Combined):
    • Number of concurrent users
    • Duration of test
  • For make megabench (using the Loads Cluster with TokenServer, Sync, Combined):
    • Number of concurrent users
    • Duration of test
    • Include file (this is code dependent)
    • Python dependencies (this is code dependent)
    • Broker to use for testing (leave as defined for now - this is the broker in the Loads Cluster)
    • Agents to use for testing (default is 5, max is currently 20, but depends on the number of concurrent load tests running)
    • Detach mode (leave as defined for now to automatically detach from the load test once it starts on the localhost)
    • Observer (this can be email or irc - the default is irc #services-dev channel)
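
As a rough illustration of how these settings combine, the sketch below reads a small INI-style config and reports the resulting load. The "[loads]" section name and the file layout are assumptions, so check the actual config files in each loadtest directory. The key names users, agents, and duration, and the rule that users is counted per agent, come from the "Basic tweakable values" list on this page.

```python
import configparser

# Hypothetical config text: the "[loads]" section name and layout are assumptions.
# The values mirror the Sync example quoted on this page.
SAMPLE_CONFIG = """
[loads]
users = 20
duration = 1800
agents = 5
"""


def summarize(config_text, section="loads"):
    """Return total concurrent users and test length in minutes for one config."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    users = parser.getint(section, "users")
    # A local run (make test / make bench) has no agents setting; treat it as one agent.
    agents = parser.getint(section, "agents", fallback=1)
    duration = parser.getint(section, "duration")
    return {
        "total_users": users * agents,  # users is counted per agent
        "duration_minutes": duration / 60,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_CONFIG))
```

With the sample values, 20 users on each of 5 agents gives 100 concurrent users for 30 minutes.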

Test Coverage and Stats

  • Basic tweakable values for all load tests
    • users = number of concurrent users/agent
    • agents = number of agents out of the cluster, otherwise errors out
    • duration = in seconds
    • hits = 1 or X number of rounds/hits/iterations
  • TokenServer
    • File location: tokenserver/loadtest/loadtest.py
    • Inside NodeAssignmentTest, test_realistic is the main load test; the others are for specific behaviors
    • The test runs as follows:
95% ask for assertions on existing users (on a DB filled by test_single_token_exchange)
4% ask for an assertion on a new user
1% ask for a bad assertion
    • A bug has been filed to get the following additional coverage for the load test:
      • generation numbers in assertion
      • client state string
    • A bug has been filed to get some integration tests written:
      • to cover the edge/error cases not in the load test
      • to be pointed at a remote server
  • Sync
    • File location: server-syncstorage/loadtest/stress.py
    • This is the Sync 2.0 load test that has been back-ported for Sync 1.5.
    • The stress.py file is fully configurable for the following:
      • client probability
      • client distribution
      • collections
    • A bug has been filed to add support for load testing tabs
      • The tabs collection uses memcache; we need to figure out a way to test it without overloading the server
    • There are currently no constants to define how to select percentages per collection type
    • Right now, we need to manually configure the collections list in stress.py:
      • collections = ['bookmarks', 'forms', 'passwords', 'history', 'prefs']
      • Basically, you can add more entries of each type, since the load test (per user/agent/hit/pass) picks randomly from the list for any given request.
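
The two weighting schemes described above (the 95/4/1 TokenServer split and repeated entries in the Sync collections list) can be sketched as follows. This is an illustration only, not the code in loadtest.py or stress.py; the scenario names and both helper functions are hypothetical.

```python
import random
from collections import Counter

# Scenario weights as described on this page; the scenario names are hypothetical.
SCENARIOS = [("existing_user", 95), ("new_user", 4), ("bad_assertion", 1)]


def pick_scenario(roll):
    """Map a roll in [0, 100) to a scenario name using cumulative weights."""
    upper = 0
    for name, weight in SCENARIOS:
        upper += weight
        if roll < upper:
            return name
    raise ValueError("roll must be in [0, 100)")


def collection_shares(collections):
    """Return the chance of each name being picked by a uniform random choice."""
    counts = Counter(collections)
    return {name: count / len(collections) for name, count in counts.items()}


if __name__ == "__main__":
    print(pick_scenario(random.random() * 100))
    # Listing 'history' three times makes it three times as likely as any single entry.
    weighted = ['bookmarks', 'forms', 'passwords', 'history', 'history', 'history', 'prefs']
    print(collection_shares(weighted))
    print(random.choice(weighted))
```

Repeating list entries is a coarse way to set per-collection percentages, but it needs no extra constants in the load test.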

Analyzing the Results

  • There are several methods and tools for analyzing the load test results.

Debugging the Issues

  • There are several methods and tools for debugging the load test errors and other issues.
  • 1. Important logs for TokenServer (per server)
    • /media/ephemeral0/logs/
    • /media/ephemeral0/logs/nginx/access.log
    • /media/ephemeral0/logs/nginx/error.log
    • /media/ephemeral0/logs/tokenserver/token.error.log
    • /media/ephemeral0/logs/tokenserver/token.log.*
    • /media/ephemeral0/logs/tokenserver/process_account_deletions.error.log
    • /media/ephemeral0/logs/tokenserver/process_account_deletions.log
    • /media/ephemeral0/squid/access.log
    • /var/log/hekad/tokenserver.stdout.log
    • /var/log/hekad/tokenserver.stderr.log
  • 2. Important logs for Verifier (per server)
    • /media/ephemeral0/fxa-browserid-verifier/verifier_err.log
    • /media/ephemeral0/fxa-browserid-verifier/verifier_out.log
    • GONE: /media/ephemeral0/heka/hekad_err.log
    • GONE: /media/ephemeral0/heka/hekad_out.log
    • GONE: /media/ephemeral0/nginx/logs/access.log
    • GONE: /media/ephemeral0/nginx/logs/error.log
    • /media/ephemeral0/nginx/logs/fxa-browserid-verifier.access.log
    • /media/ephemeral0/nginx/logs/squid/access.log
    • /var/log/hekad/fxa-browserid_verifier.stderr.log
    • /var/log/hekad/fxa-browserid_verifier.stdout.log
  • 3. Important error logs for Sync (per Sync node)
    • /media/ephemeral0/logs/
    • /media/ephemeral0/nginx/access.log
    • /media/ephemeral0/error.log
    • /media/ephemeral0/sync/sync.err
    • /media/ephemeral0/sync/sync.log


  • Acceptable TokenServer errors:
1% - 2% failures, such as the following:
token.log:
"name": "token.assertion.invalid_signature_error"
"name": "token.assertion.verify_failure"
nginx access.log:
401s
NOTE: Values can be tweaked here:
    https://github.com/mozilla-services/tokenserver/blob/master/loadtest/loadtest.py#L58-L60
Also, it may be the case that the following errors are "acceptable" if TS Stage is larger than Verifier Stage:
/media/ephemeral0/logs/tokenserver/token.error.log
Verifier-related errors of these types:
"HttpConnectionPool is full, discarding connection: verifier.stage.mozaws.net"
"Resetting dropped connection: verifier.stage.mozaws.net"
"Starting new HTTPS connection (179): verifier.stage.mozaws.net"
  • Acceptable Verifier errors:
In the verifier and squid logs:
References to mozilla.org and login.mozilla.org - part of the "invalid domain" tests
In the verifier logs:
References to https://secret.mozilla.com, which are defined in the browserid-verifier load test
https://github.com/mozilla/browserid-verifier/blob/master/loadtest/loadtest.py#L77 for example
  • Acceptable Sync node errors:
In the nginx access.log files:
We will see some percentage of 404s. Right now we see the following:
    14% 404s (compared to the total count of 200s)
    with the config set up as follows:
         users = 20
         duration = 1800
         agents = 5
Ideally, the overall percentage of 404s should drop the longer the load test runs.
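
The 404 percentage for a Sync node can be computed from its nginx access.log. The sketch below is a minimal example, assuming the log uses nginx's default "combined" format, in which the status code follows the quoted request line. The function name is hypothetical, and the percentage is measured against the count of 200s, as in the figure quoted above.

```python
import re

# Assumes the "combined" log format: ... "GET /path HTTP/1.1" 200 123 "-" "agent"
STATUS_RE = re.compile(r'" (\d{3}) ')


def percent_404_vs_200(lines):
    """Return 404s as a percentage of 200s, or None if there are no 200s."""
    counts = {"200": 0, "404": 0}
    for line in lines:
        match = STATUS_RE.search(line)
        if match and match.group(1) in counts:
            counts[match.group(1)] += 1
    if counts["200"] == 0:
        return None
    return 100.0 * counts["404"] / counts["200"]


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py /path/to/access.log
    with open(sys.argv[1]) as log:
        print(percent_404_vs_200(log))
```

Comparing the value early in a run with the value at the end shows whether the percentage is dropping as expected.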

Monitoring TS and Sync Stage

Agents statuses
Launch a health check on all agents

Performance Testing Information

  • TBD

Details on the Load Test tool

Known Bugs, Issues, and Tasks

References