Weave Load Test Plan/IssuesAndQuestions

  • From Toby
    • How do we measure performance of the components of the system (logic layer, auth, dbs) individually?
    • How do we model user traffic in a way that actually reflects the real world? (POSTs, correct timestamps, etc.; this is kind of the answer to 5 and 6 below. It's non-trivial.)
    • If we want to measure relative performance of 2 APIs, how can we do so?
    • Does the server handle the loss of a non-essential component (e.g. memcache) or at least degrade gracefully?
    • Does the client handle mass system problems gracefully?
    • Can the server handle the additional load needed to generate daily metrics without affecting user-facing performance? Do we need to rethink how we handle these?
  • From Matt
    • Production system
      • Are there production and staging network architecture diagrams?
        • Where are they located?
        • What are the machines involved?
      • What is the network management system used to monitor and control the production system?
      • How do we determine the current load on the system?
      • What is the operating range of load on the system?
      • What is the maximum sustainable level of load?
      • Are there load generation characteristics we want to exercise during the 1.3 load test?
        • gradual increase
        • cyclical
        • smooth
        • spike
        • random
    • Admin operations
      • How do you reset the server system (remove all data) and start fresh?
      • How do you monitor the health of the server system?
      • How do you upgrade the system with new server software?
      • How do you know that the system is updated correctly with the latest version of the software?
      • Are there generated reports available for the system?
        • Users
        • Usage
        • Alerts
      • What log files are available on the server?
  • Transactions
    • What are the key client transactions within the system as perceived by the server?
    • What are the key transaction metrics that can be measured programmatically?
  • Monitoring
    • Server side
      • What are the monitored health metrics?
        • What are the value ranges for each?
        • How are these metrics gathered and presented?
      • What are the potential critical alerts?
      • What are the key performance metrics?
    • Client side
      • What are the key client operational performance metrics?
      • Response time of what?
        • What is an acceptable range?
      • How do you gather performance from the client?
        • Activity log
          • The default version is not helpful for performance measurements.
            • Can the Activity log be enhanced to provide better data?
              • transactions
              • time granularity in milliseconds
    • Load Testing
      • What is the number of concurrent (synthetic) users supported by the server?
      • What number of concurrent (synthetic) users is needed to generate sustained server load that maxes out memcache?
      • What are the performance metrics to be measured to determine server load?
      • What is the duration of a load test?
      • What are the steps required to run the load generating tests?
      • On what systems will the tests be performed?
      • What performance data is gathered from the load-generating clients, and how is it gathered and reported?
      • What Weave scenario tests are the live users to perform?
      • How will the live users report performance data?
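
The load generation characteristics listed above (gradual increase, cyclical, smooth, spike, random) could each be expressed as a target request rate over the duration of a test. What follows is only a minimal Python sketch under that assumption: the function names, parameters, and default values are hypothetical and illustrative, and are not taken from any existing Weave tooling.

```python
import math
import random


def gradual(t, duration, peak):
    """Gradual increase: linear ramp from 0 to peak over the test."""
    return peak * t / duration


def smooth(t, duration, peak):
    """Smooth: constant load at the peak rate."""
    return float(peak)


def cyclical(t, duration, peak, period=60.0):
    """Cyclical: oscillates between 0 and peak once per period (seconds)."""
    return peak * 0.5 * (1.0 - math.cos(2.0 * math.pi * t / period))


def spike(t, duration, peak, baseline_fraction=0.1,
          start_fraction=0.5, width_fraction=0.1):
    """Spike: low baseline load with one burst at peak.

    All three fractions are assumed values chosen for illustration.
    """
    start = start_fraction * duration
    end = start + width_fraction * duration
    if start <= t < end:
        return float(peak)
    return peak * baseline_fraction


def random_load(t, duration, peak, rng=random):
    """Random: uniformly distributed rate between 0 and peak."""
    return rng.uniform(0.0, peak)


def schedule(profile, duration, peak, step=1.0, **kwargs):
    """Sample a profile every `step` seconds.

    Returns a list of (time_in_seconds, target_requests_per_second) pairs.
    """
    samples = int(duration / step)
    return [(i * step, profile(i * step, duration, peak, **kwargs))
            for i in range(samples + 1)]
```

A load generator could then step through `schedule(spike, duration=600, peak=200)` and adjust the number of synthetic users to match each target rate. Whether the rate is interpreted as requests per second or as concurrent users is left open here, as it is in the questions above.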