Auto-tools/Projects/Charts

From MozillaWiki

charts.mozilla.org

Overview

The charts application is pure JavaScript running on the client side. It accesses a separate ElasticSearch cluster for data.

Architecture

Web Server

[Figure: architecture diagram (Auto tools Projects Charts Architecture.png)]

The application itself is served as a set of static HTML and JavaScript files from the Mozilla PaaS Stackato servers. There are two versions, Production and Staging; the latter will usually have more features, while being slightly more buggy. Please use the staging server as much as possible: like Nightly Firefox, using staging will help find bugs sooner.

Code for each version is in a separate branch:

Server      Source Code
production  https://github.com/mozilla/charts
staging     https://github.com/mozilla/charts/tree/allizom

Really, it does not matter where the application is served from (I have various development versions on my people page). If both these servers are down, a simple git clone can allow you to "serve" the app directly from your local filesystem.

esFrontline

esFrontline is a simple Python-based proxy that limits ES requests to search requests and limits which indexes are exposed. Other than these restrictions, the proxy is invisible to the client application. See https://wiki.mozilla.org/BMO/ElasticSearch for more details.
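
The kind of filtering described above can be sketched as follows. This is only an illustration and not esFrontline's actual code: the function is_allowed, the ALLOWED_INDEXES set and the index names in it are hypothetical, and the sketch assumes the usual ElasticSearch URL layout of /{index}/_search or /{index}/{type}/_search.

```python
# Hypothetical allow-list; the real proxy's configuration is not shown here.
ALLOWED_INDEXES = {"public_bugs", "public_comments"}


def is_allowed(method, path, allowed_indexes=ALLOWED_INDEXES):
    """Return True only for search requests against an exposed index."""
    # Drop any query string, then split the path into its segments.
    segments = [s for s in path.split("?", 1)[0].split("/") if s]
    # Expect /{index}/_search or /{index}/{type}/_search.
    if len(segments) not in (2, 3) or segments[-1] != "_search":
        return False
    # The first segment must name exactly one exposed index.
    if segments[0] not in allowed_indexes:
        return False
    # Searches are sent as GET or POST; everything else is rejected.
    return method.upper() in ("GET", "POST")
```

Because the first path segment must match an allow-listed name exactly, wildcard and comma-separated index expressions are rejected along with any non-search endpoint.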

ElasticSearch Clusters

Once the application is downloaded it will attempt to contact both the private and public clusters simultaneously; whichever responds will be chosen for all future connections, with preference given to the private cluster. The queries for the dashboard are then sent to the cluster as the dashboard app requires.
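
The selection logic can be sketched roughly as below. The real application does this in client-side JavaScript; this Python version only illustrates the decision rule. The function choose_cluster and its probe argument are hypothetical, and the sketch waits for both probes to finish before deciding, which is one possible reading of the behaviour described above.

```python
from concurrent.futures import ThreadPoolExecutor


def choose_cluster(private_url, public_url, probe):
    """Probe both clusters at once and return the URL to use, or None.

    `probe` is a caller-supplied function (hypothetical here) that takes a
    URL and returns True if the cluster answered.
    """
    def safe_probe(url):
        try:
            return bool(probe(url))
        except Exception:
            # Treat any connection failure as "did not respond".
            return False

    # Contact both clusters simultaneously; map() keeps the input order.
    with ThreadPoolExecutor(max_workers=2) as pool:
        private_ok, public_ok = pool.map(safe_probe, [private_url, public_url])

    # Preference goes to the private cluster when both respond.
    if private_ok:
        return private_url
    if public_ok:
        return public_url
    return None
```

The chosen URL would then be reused for all later queries, so the probing cost is paid only once per page load.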

The clusters are configured to accept requests from any client. Hopefully this will promote development of alternative dashboards and charts.

Development

The Development server is responsible for secondary indexes built from the main bug_version. Currently it maintains the hierarchy index for determining recursive dependencies on bugs.
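
To illustrate what "recursive dependencies" means here, the sketch below walks a dependency map transitively. The schema of the hierarchy index is not described on this page, so the depends_on mapping (bug id to a list of the bug ids it directly depends on) and the function name are assumptions made purely for the example.

```python
def all_dependencies(bug_id, depends_on):
    """Return every bug that `bug_id` depends on, directly or indirectly."""
    seen = set()
    stack = list(depends_on.get(bug_id, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; this also guards against cycles
        seen.add(current)
        stack.extend(depends_on.get(current, ()))
    # A bug is not reported as its own dependency, even inside a cycle.
    seen.discard(bug_id)
    return seen
```

Precomputing this closure into a secondary index lets a dashboard fetch a whole dependency tree with one query instead of one query per level.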

Slowness

There are some sources of slowness:

  • esFrontline - being simple and written in Python, it may be adding about 1/4 second of latency to all ES searches
  • Virtual Machines - the nodes of the ES clusters are hosted on VMs, which may be contributing to some slowness
  • Code - nothing is minified, and some pages even pause to load JavaScript dynamically

Past Problems

CORS

The charts application makes cross-origin requests: the app is served from charts.mozilla.org and the data is requested from esfrontline.bugzilla.mozilla.org. This requires that the various proxy servers (not shown in the architecture diagram) ensure the Access-Control-Allow-Origin HTTP response header is set appropriately. In the past, this header has been found to be stripped from the esFrontline response and then set according to Operations' guidelines.
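
When debugging this kind of failure, it helps to check the header the browser actually receives. The sketch below mirrors the browser's rule for simple, non-credentialed requests: the header must be "*" or match the page's origin exactly. The function name cors_allows is hypothetical, and the https scheme used in the usage example is an assumption.

```python
def cors_allows(response_headers, page_origin):
    """Return True if a browser would let `page_origin` read the response."""
    # HTTP header names are case-insensitive.
    headers = {name.lower(): value.strip()
               for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    if allowed is None:
        # Header stripped or never set: the browser blocks the response.
        return False
    # "*" admits any origin; otherwise the value must match exactly.
    return allowed == "*" or allowed == page_origin


# Usage: headers as seen by the client, after every proxy has had its say.
print(cors_allows({"Access-Control-Allow-Origin": "*"},
                  "https://charts.mozilla.org"))
```

A response that passes this check at esFrontline but fails it at the browser points to an intermediate proxy rewriting or removing the header.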

Clusters Down or Not Responding

ElasticSearch is still prone to out-of-memory errors. Occasionally, these will bring down nodes in the cluster. The best solution has always been to reboot the problem nodes (or all of them).

The chance of data loss is very low. First, the data is replicated twice (for a total of three copies). Second, the ETL daemon (responsible for filling the cluster) performs some simple data checks and writes the full history of all changed bugs, effectively overwriting any corruption that does occur on those bugs. Corruption on inactive bugs can linger, but only if corruption exists, a change happened while the cluster was misbehaving, and the ETL did not detect the misbehavior. Third, consistency checks built into MoDevMetrics monitor for consistency between bug versions; corruption is sometimes detected, and is inevitably fixed by the next ETL run.