TestEngineering/Services/TokenServerAndSyncLoadTesting: Difference between revisions
* And while we are at it...
 ...
 See https://wiki.mozilla.org/QA/Services/FxATestEnvironments#FxA.2C_TokenServer.2C_and_Sync_Production_Environments
 and https://wiki.mozilla.org/QA/Services/TSVerifierSyncTestEnvironments#TokenServer_and_Sync_1.5_Dev_Environments
== Using the Loads V1 Services Cluster for a combined load test in Stage ==
* By using the Loads Services Cluster, we can offload the broker/agents processes and save client-side CPU and memory.
* Changes were made to Makefile and the load test to use the cluster and some associated config files (for test, bench, megabench).
 ...
* REFs:
** https://wiki.mozilla.org/QA/Services/LoadsToolsAndTesting1
** https://github.com/mozilla-services/server-syncstorage/tree/master/loadtest
== Configuring The Load Tests ==
* Makefile
** The SERVER_URL constant can be changed.
* Config files
** For make test (BrowserID-Verifier, TokenServer, Sync, Combined):
*** Number of hits
*** Number of concurrent users
** For make bench (BrowserID-Verifier, TokenServer, Sync, Combined):
*** Number of concurrent users
*** Duration of test
** For make megabench (using the LoadsCluster with BrowserID-Verifier, TokenServer, Sync, Combined):
*** Number of concurrent users
*** Duration of test
*** Include file (this is code dependent)
*** Python dependencies (this is code dependent)
*** Agents to use for testing (default is 5, max is currently 20, but depends on the number of concurrent load tests running)
*** Detach mode (leave as defined for now to automatically detach from the load test once it starts on the localhost)
*** Observer (this can be email or irc - the default is irc #services-dev channel)
*** SSH (the user account needed to SSH into the loads cluster - the default is ubuntu)
* Tokenserver load test code
** The Tokenserver load test can be configured - see the following lines:
** Basic Settings: https://github.com/mozilla-services/loop-server/blob/master/loadtests/loadtest.py
** MockMyID: https://github.com/mozilla-services/tokenserver/blob/master/loadtest/loadtest.py#L19-L36
** Percentages: https://github.com/mozilla-services/tokenserver/blob/master/loadtest/loadtest.py#L39-L51
* Verifier load test code
** The Verifier load test can be configured - see the following lines:
** Various settings: https://github.com/mozilla/browserid-verifier/blob/master/loadtest/loadtest.py#L13-L53
* Sync Server load test code
** The Sync Server load test can be configured - see the following lines:
** Setting MockMyID: https://github.com/mozilla-services/server-syncstorage/blob/master/loadtest/stress.py#L26-L45
** Setting test distributions: https://github.com/mozilla-services/server-syncstorage/blob/master/loadtest/stress.py#L48-L83
* REFs:
** https://github.com/mozilla-services/tokenserver/blob/master/loadtest/loadtest.py
** https://github.com/mozilla/browserid-verifier/blob/master/loadtest/loadtest.py
** https://github.com/mozilla-services/server-syncstorage/blob/master/loadtest/stress.py
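The megabench settings listed above could be sanity-checked before a run. The following is only a sketch: the `[loads]` section name and the `users` and `duration` keys are assumptions for illustration (only `agents = 5` appears on this page), and the limit of 20 agents is the one quoted in the list.

```python
import configparser

MAX_AGENTS = 20  # current cluster maximum quoted in the list above

# Hypothetical config text; the section and key names other than
# "agents" are assumed for this sketch.
SAMPLE = """
[loads]
users = 20
duration = 300
agents = 5
"""


def load_megabench_settings(text):
    """Parse the config text and reject an out-of-range agent count."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["loads"]
    settings = {
        "users": section.getint("users"),
        "duration": section.getint("duration"),
        "agents": section.getint("agents", fallback=5),  # documented default
    }
    if not 1 <= settings["agents"] <= MAX_AGENTS:
        raise ValueError("agents must be between 1 and %d" % MAX_AGENTS)
    return settings
```

Calling `load_megabench_settings(SAMPLE)` returns the three values as integers; a config asking for more than 20 agents raises `ValueError` instead of being submitted to the cluster.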
== Test Coverage and Stats ==
 ...
*** or http://ec2-54-212-44-143.us-west-2.compute.amazonaws.com (direct)
** You can quickly review the following here: Status, Configuration, Results, Custom Metrics, and Errors.
* Tokenserver Custom Metrics
** addFailure
* Verifier Custom Metrics
** addFailure
* Sync Custom Metrics
** addFailure
* NOTE: If you want more details on the dashboard, please file an issue here: https://github.com/mozilla-services/loads
== Debugging the Issues ==
 ...
* 1. Important logs for TokenServer (per server)
** /media/ephemeral0/logs/
** /media/ephemeral0/nginx/logs/default.access.log
** /media/ephemeral0/nginx/logs/default.error.log
** /media/ephemeral0/nginx/logs/tokenserver.access.log
** /media/ephemeral0/nginx/logs/tokenserver.error.log
** /media/ephemeral0/logs/tokenserver/token.error.log
** /media/ephemeral0/logs/tokenserver/token.log.*
** /media/ephemeral0/logs/tokenserver/process_account_deletions.error.log
** /media/ephemeral0/logs/tokenserver/process_account_deletions.log
** /media/ephemeral0/logs/tokenserver/purge_old_records.log
** /media/ephemeral0/logs/tokenserver/purge_old_records.error.log
** /media/ephemeral0/fxa-browserid-verifier/verifier_err.log
** /media/ephemeral0/fxa-browserid-verifier/verifier_out.log
** /var/log/circus.log
** /var/log/hekad/tokenserver.stdout.log
** /var/log/hekad/tokenserver.stderr.log
* 2. Important logs for Verifier (per server)
** /media/ephemeral0/fxa-browserid-verifier/verifier_err.log
** /media/ephemeral0/fxa-browserid-verifier/verifier_out.log
** /media/ephemeral0/nginx/logs/fxa-browserid-verifier.access.log
** /media/ephemeral0/nginx/logs/default.access.log (not in use)
** /media/ephemeral0/nginx/logs/default.error.log (not in use)
** /media/ephemeral0/squid/access.log
** /var/log/circus.log
** /var/log/hekad/fxa-browserid_verifier.stderr.log
** /var/log/hekad/fxa-browserid_verifier.stdout.log
* 3. Important error logs for Sync (per Sync node)
 ...
 NOTE: Values can be tweaked here:
 https://github.com/mozilla-services/tokenserver/blob/master/loadtest/loadtest.py#L58-L60
 The following types of errors are known:
 /media/ephemeral0/logs/tokenserver/token.error.log
   Exception KeyError: KeyError(49564400,) in <module 'threading'...
 /media/ephemeral0/logs/tokenserver/token.log
   ..."Starting new HTTP connection (9): 127.0.0.1", "hostname": ...
   {"error": "StopIteration()", "traceback": "Uncaught exception:\n
   File \"/data/tokenserver/local/lib/python2.6/site-packages/gunicorn/workers/async.py\"...
   ..."Connection pool is full, discarding connection: 127.0.0.1", "...
 Also, any 499s are probably an artifact of the current (V1) load test.
 REF:
 https://bugzilla.mozilla.org/show_bug.cgi?id=1040396
 https://bugzilla.mozilla.org/show_bug.cgi?id=1040397
 OLD: Also, it may be the case that the following errors are "acceptable" if TS Stage is larger than Verifier Stage:
 /media/ephemeral0/logs/tokenserver/token.error.log
   Verifier-related errors of these types:
   "HttpConnectionPool is full, discarding connection: verifier.stage.mozaws.net"
   "Resetting dropped connection: verifier.stage.mozaws.net"
   "Starting new HTTPS connection (179): verifier.stage.mozaws.net"
* Acceptable Verifier errors:
 The verifier_out.log will show errors of the following types:
   result: 'failure',\n reason: 'untrusted issuer...'
   result: 'failure',\n reason: 'expired'
   result: 'failure',\n reason: 'algorithms do not match'
   result: 'failure',\n reason: 'audience mismatch: scheme mismatch'
 Also, any 499s in the nginx logs are probably an artifact of the current (V1) load test.
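When reviewing verifier_out.log after a run, the acceptable failure reasons above can be filtered out so that only unexpected failures remain. This is a minimal sketch, assuming each log entry is available as one string that contains both the `result` and `reason` fields; the function name is hypothetical.

```python
# Failure reasons this page lists as acceptable during a load test.
ACCEPTABLE_REASONS = (
    "untrusted issuer",
    "expired",
    "algorithms do not match",
    "audience mismatch: scheme mismatch",
)


def unexpected_failures(entries):
    """Return the failure entries whose reason is not on the acceptable list."""
    flagged = []
    for entry in entries:
        if "result: 'failure'" not in entry:
            continue  # not a failure entry, nothing to classify
        # Prefix match, because the "untrusted issuer" reason carries extra text.
        known = any("reason: '" + reason in entry for reason in ACCEPTABLE_REASONS)
        if not known:
            flagged.append(entry)
    return flagged
```

An empty result means every failure in the sample matched one of the four acceptable reasons.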
* Acceptable Sync node errors:
 ...
 agents = 5
 Ideally, the overall percentage of 404s should drop the longer the load test runs.
 Usually, you will not see 304s, 400s, 412s, or 415s for a load test,
 although they may show up in the logs after running the remote integration tests.
 Also, any 499s are probably an artifact of the current (V1) load test.
 In /var/log/hekad/sync_1_5.stderr.log
   You may see some Decoder 'Sync-1_5-SlowQuery-MySqlSlowQueryDecoder' error: Failed parsing
   and a lot of BSO INSERTs
 In /media/ephemeral0/logs/sync/sync.err
   You should see expected skew and QueuePool messages and Deprecation warnings
   Also, these are known:
     Exception SystemExit
     Exception KeyError
   This is probably https://bugzilla.mozilla.org/show_bug.cgi?id=1040397
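To see whether the share of 404s (or 499s) changes over a run, the status codes in an nginx access log can be tallied. The sketch below assumes the logs use nginx's default "combined" format, where the status code follows the quoted request line; that format is an assumption about these servers, and the helper names are hypothetical.

```python
import re
from collections import Counter

# In the "combined" format the status code follows the closing quote
# of the request line: ... "GET /path HTTP/1.1" 404 123 ...
STATUS_RE = re.compile(r'" (\d{3}) ')


def status_counts(lines):
    """Count HTTP status codes; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def percentage(counts, status):
    """Share of `status` among all counted requests, in percent."""
    total = sum(counts.values())
    return 100.0 * counts[status] / total if total else 0.0
```

Running `status_counts` over successive slices of the log and comparing `percentage(counts, "404")` gives a rough view of the trend described above.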
== Monitoring TS and Sync Stage ==
* Loads dashboard:
** http://loads.services.mozilla.com
* Cluster status
** Check directly from the Loads Cluster dashboard:
 Agents statuses
 Launch a health check on all agents
* and also on StackDriver: https://app.stackdriver.com/groups/6664/stage-loads-cluster
* For all other monitoring, see the following section:
** https://wiki.mozilla.org/QA/Services/TSVerifierSyncTestEnvironments#Monitoring_the_Stage_Environment
== Performance Testing Information ==
 ...
== Known Bugs, Issues, and Tasks ==
* Tokenserver:
** Repo: https://github.com/mozilla-services/tokenserver/issues
** Bugzilla: http://mzl.la/1s4qZKn
* BrowserID-Verifier:
** Repo: https://github.com/mozilla/browserid-verifier/issues
** Bugzilla: no specific category
* Sync:
** Repo: https://github.com/mozilla-services/server-syncstorage/issues
** Bugzilla: http://mzl.la/VUrYQ5
* OPs and Infrastructure
** https://github.com/mozilla-services/puppet-config/issues
** https://github.com/mozilla-services/svcops/issues
* Loads Tool and Cluster
** https://github.com/mozilla-services/loads/issues
** https://github.com/mozilla-services/loads-aws/issues
** https://github.com/mozilla-services/loads-web/issues
== References ==
 ...
** https://wiki.mozilla.org/Services/Sagrada/TokenServer
** https://docs.services.mozilla.com/sync/
* The QA Test Environments:
** https://wiki.mozilla.org/QA/Services/FxATestEnvironments
** https://wiki.mozilla.org/QA/Services/TSVerifierSyncTestEnvironments
* Deploying the FxA Load Test environment for broker/agents usage:
** https://github.com/mozilla/fxa-deployment
Latest revision as of 14:32, 1 March 2019
- NOTE: We currently have two Verifier stacks in Stage (and probably Production):
- The standalone Browser_ID Verifier stack: See the Verifier sections below...
- A Tokenserver+Verifier stack: See the TokenServer sections below...
Quick Verification Of Stage Deployments
- This is a quick sanity test of the environment before getting started on load tests.
- TokenServer+Verifier Stage environment:
    From the browser: https://token.stage.mozaws.net
    curl https://token.stage.mozaws.net
    curl -I https://token.stage.mozaws.net
    Use the simple "make test" command from an install of tokenserver on the localhost or AWS instance.
    cd loadtest
    make test SERVER_URL=https://token.stage.mozaws.net
    Alternate method:
    Use the test tool from here: https://github.com/edmoz/fxa-sync-client
    Install and check all collection types for a known account in Stage:
    bin/sync-cli.js -e EMAIL -p PASSWORD --env stage -t COLLECTION
    where -t is one of bookmarks,history,passwords,tabs,addons,prefs,forms
- Verifier Stage environment:
    In the browser: https://verifier.stage.mozaws.net/
    curl https://verifier.stage.mozaws.net
    curl -I https://verifier.stage.mozaws.net
    Use the simple "make test" command from an install of browserid-verifier on the localhost or AWS instance.
    cd loadtest
    make test SERVER_URL=https://verifier.stage.mozaws.net
- Sync Server Stage environment:
    Install server-syncstorage to the local host or AWS instance (see below)
    $ cd server-syncstorage
    Quick test against the TokenServer
    $ ./local/bin/python ./syncstorage/tests/functional/test_storage.py --use-token-server <Stage TokenServer>
    Current example:
    $ ./local/bin/python ./syncstorage/tests/functional/test_storage.py --use-token-server https://token.stage.mozaws.net/1.0/sync/1.5
    Quick tests against the Sync nodes
    $ ./local/bin/python ./syncstorage/tests/functional/test_storage.py <Stage Sync Node>#<Node Secret>
    Current example:
    $ ./local/bin/python ./syncstorage/tests/functional/test_storage.py https://sync-1-us-east-1.stage.mozaws.net#<Node Secret>
    Get the Node Secret information from OPs
- Using TPS
- The TPS FxA/Sync automated tests can be used as well, but the following file will have to be edited to add Stage environment configuration parameters: https://github.com/mozilla/gecko-dev/blob/master/testing/tps/tps/testrunner.py
- See the following wiki page for more information: https://wiki.mozilla.org/User_Services/Sync/Run_TPS
- See also: https://bugzilla.mozilla.org/show_bug.cgi?id=1006675
Quick Verification Of Production Deployments
- This is a quick sanity test of the environment after a new deployment.
- Tokenserver+Verifier Production Environment
    In the browser: https://token.services.mozilla.com
    curl https://token.services.mozilla.com
    curl -I https://token.services.mozilla.com
    Then:
    Use the test tool from here: https://github.com/edmoz/fxa-sync-client
    Install and check all collection types for a known account in Production:
    bin/sync-cli.js -e PROD-EMAIL -p PASSWORD -t COLLECTION
    where -t is one of bookmarks,history,passwords,tabs,addons,prefs,forms
- Verifier Production Environment
    In the browser: https://verifier.accounts.firefox.com
    curl https://verifier.accounts.firefox.com
    curl -I https://verifier.accounts.firefox.com
    Then:
    Use the simple "make test" command from an install of browserid-verifier on the localhost or AWS instance.
    cd loadtest
    make test SERVER_URL=https://verifier.accounts.firefox.com
- Sync Server Production environment
    Sign in with a known FxA account and sync data with a current Production account (sync node).
    Create a new FxA account and set up sync.
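The `curl -I` checks above can also be scripted so that all endpoints of an environment are probed in one pass. This is a rough sketch using only the standard library; the endpoint URLs are the ones listed on this page, while the function names and the injectable `opener` parameter are assumptions added for illustration and offline testing.

```python
import urllib.request

# Endpoints taken from the Stage and Production checks above.
ENDPOINTS = {
    "stage": [
        "https://token.stage.mozaws.net",
        "https://verifier.stage.mozaws.net",
    ],
    "production": [
        "https://token.services.mozilla.com",
        "https://verifier.accounts.firefox.com",
    ],
}


def head_status(url, opener=urllib.request.urlopen, timeout=10):
    """Send a HEAD request (like `curl -I`) and return the HTTP status."""
    request = urllib.request.Request(url, method="HEAD")
    with opener(request, timeout=timeout) as response:
        return response.status


def check_environment(env, opener=urllib.request.urlopen):
    """Map each endpoint of `env` to its status code or to the error raised."""
    results = {}
    for url in ENDPOINTS[env]:
        try:
            results[url] = head_status(url, opener=opener)
        except OSError as exc:  # URLError and HTTPError subclass OSError
            results[url] = exc
    return results
```

Calling `check_environment("stage")` before a load test gives a quick overview of which hosts answer; it complements the `make test` and sync-cli checks rather than replacing them.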
Load Test Tool Client/Host
- It is always best to configure an AWS instance as the host for all load testing.
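- All load tests can now run on the localhost (the AWS instance) or against the new Loads Cluster. See the following links for more information:
  - https://wiki.mozilla.org/QA/Services/LoadsV1ClientTestHost
  - https://wiki.mozilla.org/QA/Services/TSVerifierSyncTestEnvironments#Loads_Services_Cluster_Environment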
Installing BrowserID-Verifier and the Loads tool on Localhost or AWS
- Installation:
    $ git clone git://github.com/mozilla/browserid-verifier
    $ cd browserid-verifier
    Note: You may want to install a specific branch for testing vs defaulting to Master
    $ npm install
    $ npm test
    $ cd loadtest
    $ make build
    Note: This should hit Stage by default: SERVER_URL=https://verifier.stage.mozaws.net
- Note: This will install a local copy of the Loads tool for use with the verifier.
Running the load test against the Verifier in Stage
- Stage environment:
    $ make test
    or
    $ make test SERVER_URL=https://verifier.stage.mozaws.net
    $ make bench
    or
    $ make bench SERVER_URL=https://verifier.stage.mozaws.net
    Note: the current version of 'make bench' tends to use a lot of CPU and memory on the localhost.
    The recommendation is to use 'make test' and 'make megabench' instead (see below)...
    Note: The Stage Verifier hits the Stage mockmyid server
- Production environment:
    $ make test SERVER_URL=https://verifier.accounts.firefox.com
    $ make bench SERVER_URL=https://verifier.accounts.firefox.com
Using the Loads V1 Services Cluster for the Verifier
- By using the Loads Services Cluster, we can offload the broker/agents processes and save client-side CPU and memory.
- Changes were made to Makefile and the load test to use the cluster and some associated config files (for test, bench, megabench).
- Stage environment:
$ make megabench SERVER_URL=https://verifier.stage.mozaws.net
- Dev environment:
$ make megabench SERVER_URL=TBD
- Production environment:
$ make megabench SERVER_URL=https://verifier.accounts.firefox.com
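- REFs:
  - https://wiki.mozilla.org/QA/Services/LoadsToolsAndTesting1
  - https://github.com/mozilla/browserid-verifier/tree/master/loadtest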
Installing TokenServer+Verifier and the Loads tool on Localhost or AWS
- Installation:
    $ git clone https://github.com/mozilla-services/tokenserver
    $ cd tokenserver
    Note: You may want to install a specific branch for testing vs defaulting to Master
    $ make build
    $ make test
    Note: This is for local testing only
    $ cd loadtest
    $ make build
    Note: This should hit Prod by default: SERVER_URL=https://token.services.mozilla.com
- Note: This will install a local copy of the Loads tool for use with TokenServer+Verifier.
Running the load test against TokenServer+Verifier in Stage
- Stage environment:
    $ make test SERVER_URL=https://token.stage.mozaws.net
    $ make bench SERVER_URL=https://token.stage.mozaws.net
    Note: the current version of 'make bench' tends to use a lot of CPU and memory on the localhost.
    The recommendation is to use 'make test' and 'make megabench' instead (see below)...
    Note: This also hits the Stage Verifier, which in turn hits the Stage mockmyid server
- And while we are at it...
- Dev environment:
    $ make test SERVER_URL=https://token.dev.lcip.org
    $ make bench SERVER_URL=https://token.dev.lcip.org
- Production environment:
    $ make test SERVER_URL=https://token.services.mozilla.com
    $ make bench SERVER_URL=https://token.services.mozilla.com
Using the Loads V1 Services Cluster for TokenServer+Verifier
- By using the Loads Services Cluster, we can offload the broker/agents processes and save client-side CPU and memory.
- Changes were made to Makefile and the load test to use the cluster and some associated config files (for test, bench, megabench).
- Stage environment:
$ make megabench SERVER_URL=https://token.stage.mozaws.net
- Dev environment:
$ make megabench SERVER_URL=https://token.dev.lcip.org
- Production environment:
$ make megabench SERVER_URL=https://token.services.mozilla.com
- REFs:
Installing Sync and load testing on Localhost or AWS
- Installation:
$ git clone https://github.com/mozilla-services/syncstorage-loadtest/
$ cd syncstorage-loadtest
Note: You may want to install a specific branch for testing vs defaulting to Master
$ pip install -r requirements.txt
Running the load test against Sync 1.5 in Stage
- Loads against specific Sync nodes in Stage
$ export SERVER_URL=https://your.storagenode.here#SECRET
Sync Stage nodes:
  https://sync-1-us-east-1.stage.mozaws.net
  https://sync-2-us-east-1.stage.mozaws.net
  ...etc...
NOTE: The OPs team has the SECRET string for Stage. Get it from them before you start testing.
- Load testing with Molotov: https://molotov.readthedocs.io/en/stable/
$ bin/molotov [commands] loadtest.py
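The SERVER_URL value above carries the node address and the secret in a single string, separated by "#". As a minimal sketch of how such a value can be taken apart, the following uses only the Python standard library. The function name split_server_url is hypothetical, and treating the secret as the URL fragment is an assumption; the actual loadtest.py may interpret the value differently.

```python
from urllib.parse import urlsplit, urlunsplit


def split_server_url(server_url):
    """Split 'https://node#SECRET' into (node_url, secret).

    Hypothetical helper: assumes the secret is carried as the URL fragment.
    """
    parts = urlsplit(server_url)
    # Rebuild the URL without the fragment so it can be used as the node address.
    node_url = urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, ""))
    return node_url, parts.fragment


node_url, secret = split_server_url("https://sync-1-us-east-1.stage.mozaws.net#SECRET")
print("node:", node_url)
# Avoid printing the real secret; only report whether one was supplied.
print("secret supplied:", bool(secret))
```

A check like this can be run before starting a long test to confirm that the secret was actually included in the exported value.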
Using the Loads V1 Services Cluster for Sync 1.5 in Stage
- Load testing from server-syncstorage has been deprecated; please refer to mozilla-services/syncstorage-loadtest instead.
Running a combined load test against TokenServer+Verifier and Sync 1.5 in Stage
- A combined loads test against TokenServer and Sync 1.5 in Stage
- This is done via the server-syncstorage directory that was cloned and built above
$ cd server-syncstorage
$ cd loadtest
$ make test SERVER_URL=https://your.tokenserver.here
$ make bench SERVER_URL=https://your.tokenserver.here
Examples for Stage:
$ make test SERVER_URL=https://token.stage.mozaws.net
$ make bench SERVER_URL=https://token.stage.mozaws.net
See https://wiki.mozilla.org/QA/Services/TSVerifierSyncTestEnvironments#TokenServer.2BVerifier_Stage_Environment
Note: the current version of 'make bench' tends to use a lot of CPU and Memory on the localhost. The recommendation is to use 'make test' and 'make megabench' instead (see below)...
Note: The Stage Tokenserver hits the Stage Verifier, which, in turn, hits the mockmyid server.
- And while we are at it...
Dev environment:
Examples:
$ make test SERVER_URL=https://token.dev.lcip.org
$ make bench SERVER_URL=https://token.dev.lcip.org
Prod environment:
Examples:
$ make test SERVER_URL=https://token.services.mozilla.com
$ make bench SERVER_URL=https://token.services.mozilla.com
See https://wiki.mozilla.org/QA/Services/FxATestEnvironments#FxA.2C_TokenServer.2C_and_Sync_Production_Environments
and https://wiki.mozilla.org/QA/Services/TSVerifierSyncTestEnvironments#TokenServer_and_Sync_1.5_Dev_Environments
Using the Loads V1 Services Cluster for a combined load test in Stage
- By using the Loads Services Cluster, we can offload the broker/agents processes and save client-side CPU and memory.
- Changes were made to Makefile and the load test to use the cluster and some associated config files (for test, bench, megabench).
- Stage environment:
$ make megabench SERVER_URL=https://token.stage.mozaws.net
- Dev environment:
$ make megabench SERVER_URL=https://token.dev.lcip.org
- Prod environment:
$ make megabench SERVER_URL=https://token.services.mozilla.com
- REFs:
Configuring The Load Tests
- Makefile
- The SERVER_URL constant can be changed.
- Config files
- For make test (BrowserID-Verifier, TokenServer, Sync, Combined):
- Number of hits
- Number of concurrent users
- For make bench (BrowserID-Verifier, TokenServer, Sync, Combined):
- Number of concurrent users
- Duration of test
- For make megabench (using the LoadsCluster with BrowserID-Verifier, TokenServer, Sync, Combined):
- Number of concurrent users
- Duration of test
- Include file (this is code dependent)
- Python dependencies (this is code dependent)
- Agents to use for testing (default is 5, max is currently 20, but depends on the number of concurrent load tests running)
- Detach mode (leave as defined for now to automatically detach from the load test once it starts on the localhost)
- Observer (this can be email or irc - the default is irc #services-dev channel)
- SSH (the user account needed to SSH into the loads cluster - the default is ubuntu)
- Tokenserver load test code
- The Tokenserver load test can be configured - see the following lines:
- Basic Settings: https://github.com/mozilla-services/loop-server/blob/master/loadtests/loadtest.py
- MockMyID: https://github.com/mozilla-services/tokenserver/blob/master/loadtest/loadtest.py#L19-L36
- Percentages: https://github.com/mozilla-services/tokenserver/blob/master/loadtest/loadtest.py#L39-L51
- Verifier load test code
- The Verifier load test can be configured - see the following lines:
- Various settings: https://github.com/mozilla/browserid-verifier/blob/master/loadtest/loadtest.py#L13-L53
- Sync Server load test code
- The Sync Server load test can be configured - see the following lines:
- Setting MockMyID: https://github.com/mozilla-services/server-syncstorage/blob/master/loadtest/stress.py#L26-L45
- Setting test distributions: https://github.com/mozilla-services/server-syncstorage/blob/master/loadtest/stress.py#L48-L83
- REFs:
Test Coverage and Stats
- Basic tweakable values for all load tests
- users = number of concurrent users/agent
- agents = number of agents to use from the cluster; the test errors out if that many agents are not available
- duration = in seconds
- hits = 1 or X number of rounds/hits/iterations
- TokenServer
- File location: tokenserver/loadtest/loadtest.py
- Inside NodeAssignmentTest, test_realistic is the main load test; the others are for specific behaviors
- The test runs as follows:
95% ask for assertions on existing users (on a DB filled by test_single_token_exchange)
4% ask for an assertion on a new user
1% ask for a bad assertion
- A bug has been filed to get the following additional coverage for the load test:
- generation numbers in assertion
- client state string
- A bug has been filed to get some integration tests written:
- to cover the edge/error cases not in the load test
- to be pointed at a remote server
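The 95% / 4% / 1% split described for test_realistic can be illustrated with a weighted pick. This is a minimal sketch under stated assumptions, not the code in tokenserver/loadtest/loadtest.py: the scenario names and the pick_scenario function are hypothetical, and only the percentages come from the description above.

```python
import random

# Percentages from the test description; the scenario names are hypothetical.
SCENARIOS = [
    ("existing_user", 95),
    ("new_user", 4),
    ("bad_assertion", 1),
]


def pick_scenario(rng=random):
    """Return one scenario name, chosen according to the percentages above."""
    roll = rng.random() * 100  # value in [0, 100)
    cumulative = 0
    for name, percent in SCENARIOS:
        cumulative += percent
        if roll < cumulative:
            return name
    return SCENARIOS[-1][0]  # guard against rounding at the upper edge


# Simulate many iterations to see the resulting mix.
rng = random.Random(0)
tally = {name: 0 for name, _ in SCENARIOS}
for _ in range(10000):
    tally[pick_scenario(rng)] += 1
print(tally)
```

Over a long run, the share of bad assertions produced this way should sit near the 1% to 2% failure rate that is treated as acceptable for TokenServer.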
- Sync
- File location: server-syncstorage/loadtest/stress.py
- This is the Sync 2.0 load test that has been back-ported for Sync 1.5.
- The stress.py file is fully configurable for the following:
- client probability
- client distribution
- collections
- A bug has been filed to add support for load testing tabs
- The tabs collection uses memcache; we need to figure out a way to test it without overloading the server
- There are currently no constants to define how to select percentages per collection type
- Right now, we need to manually configure the collections list in stress.py:
- collections = ['bookmarks', 'forms', 'passwords', 'history', 'prefs']
- Basically, you can add more entries of each type, since the load test (per user/agent/hit/pass) picks randomly from the list for any given request...
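Because the load test picks randomly from the collections list, repeating an entry raises its share of requests. The following is a minimal sketch of building such a list from per-collection weights. The WEIGHTS values and the build_collections helper are hypothetical and are not part of stress.py; only the collection names come from the list above.

```python
import random

# Hypothetical weights: "history" appears three times as often as the others.
WEIGHTS = {
    "bookmarks": 1,
    "forms": 1,
    "passwords": 1,
    "history": 3,
    "prefs": 1,
}


def build_collections(weights):
    """Expand {name: count} into a flat list with each name repeated count times."""
    collections = []
    for name, count in weights.items():
        collections.extend([name] * count)
    return collections


collections = build_collections(WEIGHTS)
print(collections)

# Each request then picks one entry uniformly at random from the expanded list.
rng = random.Random(42)
picks = [rng.choice(collections) for _ in range(7000)]
for name in WEIGHTS:
    print(name, picks.count(name))
```

With these weights, "history" makes up 3 of the 7 list entries, so it should receive roughly three sevenths of the requests.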
Analyzing the Results
- There are several methods and tools for analyzing the load test results.
- 1. Using the Loads Services Cluster dashboard
- All loads tests using this cluster generate a live report and a run report available on this site:
- You can quickly review the following here: Status, Configuration, Results, Custom Metrics, and Errors.
- Tokenserver Custom Metrics
- addFailure
- Verifier Custom Metrics
- addFailure
- Sync Custom Metrics
- addFailure
- NOTE: If you want more details on the dashboard, please file an issue here: https://github.com/mozilla-services/loads
Debugging the Issues
- There are several methods and tools for debugging the load test errors and other issues.
- 1. Important logs for TokenServer (per server)
- /media/ephemeral0/logs/
- /media/ephemeral0/nginx/logs/default.access.log
- /media/ephemeral0/nginx/logs/default.error.log
- /media/ephemeral0/nginx/logs/tokenserver.access.log
- /media/ephemeral0/nginx/logs/tokenserver.error.log
- /media/ephemeral0/logs/tokenserver/token.error.log
- /media/ephemeral0/logs/tokenserver/token.log.*
- /media/ephemeral0/logs/tokenserver/process_account_deletions.error.log
- /media/ephemeral0/logs/tokenserver/process_account_deletions.log
- /media/ephemeral0/logs/tokenserver/purge_old_records.log
- /media/ephemeral0/logs/tokenserver/purge_old_records.error.log
- /media/ephemeral0/fxa-browserid-verifier/verifier_err.log
- /media/ephemeral0/fxa-browserid-verifier/verifier_out.log
- /var/log/circus.log
- /var/log/hekad/tokenserver.stdout.log
- /var/log/hekad/tokenserver.stderr.log
- 2. Important logs for Verifier (per server)
- /media/ephemeral0/fxa-browserid-verifier/verifier_err.log
- /media/ephemeral0/fxa-browserid-verifier/verifier_out.log
- /media/ephemeral0/nginx/logs/fxa-browserid-verifier.access.log
- /media/ephemeral0/nginx/logs/default.access.log (not in use)
- /media/ephemeral0/nginx/logs/default.error.log (not in use)
- /media/ephemeral0/squid/access.log
- /var/log/circus.log
- /var/log/hekad/fxa-browserid_verifier.stderr.log
- /var/log/hekad/fxa-browserid_verifier.stdout.log
- 3. Important error logs for Sync (per Sync node)
- /media/ephemeral0/logs/
- /media/ephemeral0/nginx/access.log
- /media/ephemeral0/error.log
- /media/ephemeral0/sync/sync.err
- /media/ephemeral0/sync/sync.log
- Acceptable TokenServer errors:
1% - 2% failures (as the following)
token.log:
  "name": "token.assertion.invalid_signature_error"
  "name": "token.assertion.verify_failure"
nginx access.log:
  401s
NOTE: Values can be tweaked here: https://github.com/mozilla-services/tokenserver/blob/master/loadtest/loadtest.py#L58-L60
The following types of errors are known:
/media/ephemeral0/logs/tokenserver/token.error.log
  Exception KeyError: KeyError(49564400,) in <module 'threading'...
/media/ephemeral0/logs/tokenserver/token.log
  ..."Starting new HTTP connection (9): 127.0.0.1", "hostname": ...
  {"error": "StopIteration()", "traceback": "Uncaught exception:\n File \"/data/tokenserver/local/lib/python2.6/site-packages/gunicorn/workers/async.py\"...
  ..."Connection pool is full, discarding connection: 127.0.0.1", "...
Also, any 499s are probably an artifact of the current (V1) load test.
REF:
  https://bugzilla.mozilla.org/show_bug.cgi?id=1040396
  https://bugzilla.mozilla.org/show_bug.cgi?id=1040397
OLD: Also, it may be the case that the following errors are "acceptable" if TS Stage is larger than Verifier Stage:
/media/ephemeral0/logs/tokenserver/token.error.log
Verifier-related errors of these types:
  "HttpConnectionPool is full, discarding connection: verifier.stage.mozaws.net"
  "Resetting dropped connection: verifier.stage.mozaws.net"
  "Starting new HTTPS connection (179): verifier.stage.mozaws.net"
- Acceptable Verifier errors:
The verifier_out.log will show errors of the following types:
  result: 'failure',\n reason: 'untrusted issuer...'
  result: 'failure',\n reason: 'expired'
  result: 'failure',\n reason: 'algorithms do not match'
  result: 'failure',\n reason: 'audience mismatch: scheme mismatch'
Also, any 499s in the nginx logs are probably an artifact of the current (V1) load test.
- Acceptable Sync node errors:
In the nginx access.log files:
  We will see some percentage of 404s. Right now we see the following:
  14% 404s (compared to the total count of 200s) with the config set up as follows:
    users = 20
    duration = 1800
    agents = 5
  Ideally, the overall percentage of 404s should drop the longer the load test runs.
  Usually, you will not see 304s, 400s, 412s, or 415s for a load test, although they may show up in the logs after running the remote integration tests.
  Also, any 499s are probably an artifact of the current (V1) load test.
In /var/log/hekad/sync_1_5.stderr.log:
  You may see some Decoder 'Sync-1_5-SlowQuery-MySqlSlowQueryDecoder' error: Failed parsing
  and a lot of BSO INSERTs
In /media/ephemeral0/logs/sync/sync.err:
  You should see expected skew and QueuePool messages and Deprecation warnings
  Also, these are known:
    Exception SystemExit
    Exception KeyError
  This is probably https://bugzilla.mozilla.org/show_bug.cgi?id=1040397
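The 404 percentage for a Sync node is quoted relative to the count of 200s, so it can be computed by tallying status codes in the nginx access log. The following is a minimal sketch that assumes the log uses nginx's "combined" format, where the status code follows the quoted request line; the format actually configured on the Stage nodes may differ. The helper names and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# In the combined format, the 3-digit status follows the closing quote of the request.
STATUS_RE = re.compile(r'" (\d{3}) ')


def count_statuses(lines):
    """Tally HTTP status codes found in an iterable of access-log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def percent_404_vs_200(counts):
    """Return the 404 count as a percentage of the 200 count, or None if no 200s."""
    if counts["200"] == 0:
        return None
    return 100.0 * counts["404"] / counts["200"]


# Illustrative lines only; these are not taken from a real Stage log.
sample_lines = [
    '10.0.0.1 - - [01/Jan/2015:00:00:00 +0000] "GET /1.5/1/info/collections HTTP/1.1" 200 120 "-" "loadtest"',
    '10.0.0.1 - - [01/Jan/2015:00:00:01 +0000] "GET /1.5/1/storage/bookmarks HTTP/1.1" 200 512 "-" "loadtest"',
    '10.0.0.1 - - [01/Jan/2015:00:00:02 +0000] "GET /1.5/2/storage/prefs HTTP/1.1" 404 0 "-" "loadtest"',
    '10.0.0.1 - - [01/Jan/2015:00:00:03 +0000] "POST /1.5/1/storage/history HTTP/1.1" 200 64 "-" "loadtest"',
]

counts = count_statuses(sample_lines)
print(dict(counts))
print("404s as % of 200s:", percent_404_vs_200(counts))
```

To run it against a real node, pass an open file instead of the sample list, for example the /media/ephemeral0/nginx/access.log file listed above. The same tally also shows whether any 304s, 400s, 412s, 415s, or 499s appeared during the run.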
Monitoring TS and Sync Stage
- Loads dashboard:
- Cluster status
- Check directly from the Loads Cluster dashboard:
Agents statuses
Launch a health check on all agents
- and also on StackDriver: https://app.stackdriver.com/groups/6664/stage-loads-cluster
- For all other monitoring, see the following section:
Performance Testing Information
- TBD
Details on the Load Test tool
- The documentation can be found here:
- The repositories are here:
- The Services cluster is here:
Known Bugs, Issues, and Tasks
- Tokenserver:
- BrowserID-Verifier:
- Repo: https://github.com/mozilla/browserid-verifier/issues
- Bugzilla: no specific category
- Sync:
- OPs and Infrastructure
- Loads Tool and Cluster
References
- Other URLs
- Repositories
- Documentation
- The QA Test Environments:
- Deploying the FxA Load Test environment for broker/agents usage:
- Sync 1.5 protocol, documentation, etc.
- https://github.com/mozilla-services/docs
- https://docs.services.mozilla.com/#how-to
- https://docs.services.mozilla.com/howtos/run-fxa.html
- https://docs.services.mozilla.com/token/apis.html
- https://docs.services.mozilla.com/storage/apis-1.5.html
- https://docs.services.mozilla.com/howtos/run-sync-1.5.html
- https://github.com/mozilla-services/syncserver
- OPs pages for stats collection, logging, monitoring
- TBD