Auto-tools/Projects/OrangeFactor/ElasticSearch

The data for bugs and logs is stored in ElasticSearch.

Documentation

The ElasticSearch query language is described at [1].

Bug Data

Data for a single occurrence of a bug looks like this:

 _source: {
   buildname: "Rev3 MacOSX Leopard 10.5.8 mozilla-central opt test mochitest-other"
   machinename: "talos-r3-leopard-038"
   os: "osx"
   date: "2010-12-14"
   type: "Mochitest"
   debug: "false"
   starttime: "1292378723"
   logfile: "1292378723.1292379802.29699.gz"
   tree: "mozilla-central"
   rev: "abe884259481"
   who: "philringnalda@gmail.com"
   bug: "614643"
 }

Bug Queries

Some sample bug queries are shown below. Most ES queries are made by sending a JSON document that describes the search terms to the database via an HTTP GET. The examples below use curl, so you can run them from the command line.

By default, ES returns at most 10 matches per query. You can change this by setting the "from" (offset of the first match) and "size" (maximum number of matches) fields in the queries below.
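
To walk through more matches than one request returns, "from" can be advanced by "size" on each request. The following is a minimal sketch of that idea, not part of the original tooling: page_body is a hypothetical helper name, and zero-based page numbering is an assumption made for illustration. It only prints the query body; the curl call is shown as a comment.

```shell
# Hypothetical helper: print a query body for one page of results.
# Arguments: tree name, zero-based page number, page size.
page_body() {
  tree=$1
  page=$2
  size=$3
  from=$((page * size))   # offset of the first match on this page
  printf '{ "from": %d, "size": %d, "query": { "field": { "tree": "%s" } } }\n' \
    "$from" "$size" "$tree"
}

# Usage (not run here): fetch the third page of 20 matches.
# curl -XGET 'http://cm-metricsetl03.mozilla.org:9200/bugs/bug_info/_search?pretty=true' \
#   -d "$(page_body mozilla-central 2 20)"
```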

Get all bugs in date range for mozilla-central

 curl -XGET 'http://cm-metricsetl03.mozilla.org:9200/bugs/bug_info/_search?pretty=true' -d '
 {
   "from": 0, "size": 20,
   "query": {
     "filtered": {
       "query": { "field": { "tree": "mozilla-central" }},
       "filter": {
         "range": { 
           "date": { "from": "2010-12-21", "to": "2010-12-22" }
         }
       }
     }
   }
 }
 '

Get all occurrences of a specific bug in date range for mozilla-central

 curl -XGET 'http://cm-metricsetl03.mozilla.org:9200/bugs/bug_info/_search?pretty=true' -d '
 {
   "from": 0, "size": 5,
   "query": {
     "filtered": {
       "query": {
         "bool": {
           "must": [
             { "field": { "bug": "614643" } },
             { "field": { "tree": "mozilla-central" } }
           ]
         }
       },
       "filter": { 
         "and": {
           "filters": [
             { "range": { "date": { "from": "2010-12-21", "to": "2010-12-22" } } }
           ]
         }
       }
     }
   }
 }
 '
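
If the same query is needed for different bugs or dates, the body can be generated rather than edited by hand. The sketch below is one possible way to do that; bug_occurrences_body is a hypothetical helper name. It combines the "bool" query from the example above with the plain "range" filter from the first example, and it assumes the argument values contain no quote characters, since nothing is escaped.

```shell
# Hypothetical helper: print the body of a query for one bug on one tree
# within a date range. Arguments: bug number, tree, start date, end date.
bug_occurrences_body() {
  cat <<EOF
{
  "from": 0, "size": 5,
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": [
            { "field": { "bug": "$1" } },
            { "field": { "tree": "$2" } }
          ]
        }
      },
      "filter": {
        "range": { "date": { "from": "$3", "to": "$4" } }
      }
    }
  }
}
EOF
}

# Usage (not run here):
# curl -XGET 'http://cm-metricsetl03.mozilla.org:9200/bugs/bug_info/_search?pretty=true' \
#   -d "$(bug_occurrences_body 614643 mozilla-central 2010-12-21 2010-12-22)"
```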

Getting a count for a query

To get a count of the items that a query would return, issue the query to a URL that has _count in place of _search, and omit the outer "query" element that wraps the query (the "query" wrapper is implied for _count).

For example, the query to retrieve the first 10 records for all the bugs in mozilla-central looks like this:

 curl -XGET 'http://cm-metricsetl03.mozilla.org:9200/bugs/bug_info/_search?pretty=true' -d '
 {
   "from": 0, "size": 10,
   "query": {
     "field": { "tree": "mozilla-central" }
   }
 }
 '

To count the number of records that exist for this query, you would use:

 curl -XGET 'http://cm-metricsetl03.mozilla.org:9200/bugs/bug_info/_count?pretty=true' -d '
 {
   "field": { "tree": "mozilla-central" }
 }
 '

This might return:

 {
   "_shards" : {
     "total" : 5,
     "successful" : 5,
     "failed" : 0
   },
   "hits" : {
     "total" : 773,
     "max_score" : 1.7568314,
     "hits" : [ ]
   }
 }

The value of json["hits"]["total"] is the total number of items that exist for that query.
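
The two steps above (swapping _search for _count in the URL, then reading the total from the response) can be sketched in shell as follows. Both function names are hypothetical. hits_total is a line-oriented shortcut that assumes the response is pretty-printed with one field per line, as in the sample above; a real JSON parser would be more robust.

```shell
# Hypothetical helper: turn a _search URL into the matching _count URL.
count_url() {
  printf '%s\n' "$1" | sed 's|/_search|/_count|'
}

# Hypothetical helper: read a pretty-printed response on stdin and print
# the "total" value found inside the "hits" object.
# Assumes one field per line; the "total" under "_shards" is skipped
# because it appears before the "hits" object opens.
hits_total() {
  awk '
    /"hits"[ \t]*:[ \t]*[{]/ { in_hits = 1 }
    in_hits && /"total"/ {
      gsub(/[^0-9]/, "")   # keep only the digits on this line
      print
      exit
    }
  '
}

# Usage (not run here):
# url=$(count_url 'http://cm-metricsetl03.mozilla.org:9200/bugs/bug_info/_search?pretty=true')
# curl -XGET "$url" -d '{ "field": { "tree": "mozilla-central" } }' | hits_total
```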