Changes


Data Publishing

1,529 bytes added, 23:09, 18 September 2020
* Plumb it in to the public facing dataset infrastructure, including metadata that links the public data back to the above review bug.
* Once the dataset has been published, it will be announced on the new Data @ Mozilla blog. It will also be added to https://docs.telemetry.mozilla.org/datasets/.
 
<big>'''Definitions'''</big>
 
'''Metric''' - A metric is anything we want to measure.
Examples: the number of clients that used the developer tools console; the number of active clients.
 
'''Dimension''' - A dimension is a qualitative value such as OS, channel, or date. In practice, a dimension often defines a sub-population on which we can calculate a metric, allowing us to segment the metric for further analysis.
Example: if we have an OS dimension, we can analyze the number of active clients by OS.
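
As a minimal sketch of how a dimension segments a metric, the Python below counts distinct active clients within each value of a chosen dimension. The records, the field names (<code>client_id</code>, <code>os</code>, <code>channel</code>) and the helper <code>active_clients_by</code> are hypothetical and used only for illustration; they do not describe any real Mozilla dataset.

```python
from collections import defaultdict

# Hypothetical records; field names are assumptions made for this example.
records = [
    {"client_id": "a", "os": "Windows", "channel": "release"},
    {"client_id": "b", "os": "Windows", "channel": "beta"},
    {"client_id": "c", "os": "Linux", "channel": "release"},
    {"client_id": "a", "os": "Windows", "channel": "release"},  # same client seen twice
]


def active_clients_by(records, dimension):
    """Count distinct clients (the metric) within each value of a dimension."""
    clients = defaultdict(set)
    for row in records:
        clients[row[dimension]].add(row["client_id"])
    return {value: len(ids) for value, ids in clients.items()}


by_os = active_clients_by(records, "os")
by_channel = active_clients_by(records, "channel")
```

The same metric can be segmented by any other dimension by passing a different field name, which is what makes dimensions useful for further analysis.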
 
'''Aggregate''' - An aggregate is a value that combines many measurements (metric values), typically grouped by a dimension or a set of dimensions. See also Aggregate Data.
 
'''Individual-level Data''' - Data containing a dimension which uniquely identifies a single profile, user, client, etc.
 
'''Tabular Data''' - Data that consists of rows (or records) and columns (or fields). Each row has the same number of columns, and each column represents a dimension or metric for that row. Think of a spreadsheet or CSV file as examples of this type of data.
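
The definition of tabular data can be illustrated with a short Python sketch that reads a small CSV and checks that every row has the same number of columns as the header. The sample values, the column names and the helper <code>read_table</code> are assumptions made up for this example.

```python
import csv
import io

# Hypothetical CSV: two dimension columns (os, channel) and one metric column.
CSV_TEXT = """os,channel,active_clients
Windows,release,120
Linux,release,45
macOS,beta,12
"""


def read_table(text):
    """Parse CSV text and verify that each row matches the header's width."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            raise ValueError(
                f"line {line_number}: expected {len(header)} columns, got {len(row)}"
            )
        rows.append(dict(zip(header, row)))
    return header, rows


header, rows = read_table(CSV_TEXT)
```

Note that <code>csv.reader</code> returns every value as a string, so a metric column such as <code>active_clients</code> would still need converting before any arithmetic.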
 
<big>'''Example Data'''</big>
Here are some examples of data aggregated to the levels described above.
 
* Level 7: raw data, with fine-grained timestamps
* Level 6: individual-level data, aggregated to day-level time granularity
* Level 5: anonymized individual-level data, identifiers replaced with pseudonyms
* Level 4: probabilistic aggregates
* Level 3: dimension-level aggregates without a minimum group size
* Level 2: dimension-level aggregates with a minimum group size
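
To make the difference between some of these levels concrete, here is a hedged Python sketch. It coarsens fine-grained timestamps to day granularity, aggregates distinct clients by OS and day (no minimum group size, as in Level 3), and then drops groups below a threshold (as in Level 2). The events, the threshold value and the choice to enforce the minimum group size by suppressing small groups are all assumptions for illustration, not a description of the actual publishing pipeline.

```python
from datetime import datetime

MIN_GROUP_SIZE = 3  # hypothetical threshold, chosen only for this example

# Hypothetical raw events: (client_id, os, fine-grained timestamp).
events = [
    ("a", "Windows", "2020-09-18T23:09:41"),
    ("b", "Windows", "2020-09-18T08:15:02"),
    ("c", "Windows", "2020-09-18T12:00:00"),
    ("d", "Linux", "2020-09-18T01:30:10"),
]


def to_day(timestamp):
    """Coarsen an ISO 8601 timestamp to day-level granularity."""
    return datetime.fromisoformat(timestamp).date().isoformat()


def aggregate(events, min_group_size=None):
    """Count distinct clients per (os, day); optionally drop small groups."""
    groups = {}
    for client_id, os_name, timestamp in events:
        groups.setdefault((os_name, to_day(timestamp)), set()).add(client_id)
    counts = {key: len(ids) for key, ids in groups.items()}
    if min_group_size is not None:
        counts = {key: n for key, n in counts.items() if n >= min_group_size}
    return counts


without_minimum = aggregate(events)                # no minimum group size
with_minimum = aggregate(events, MIN_GROUP_SIZE)   # small groups suppressed
```

In this sketch the single Linux client appears in <code>without_minimum</code> but not in <code>with_minimum</code>, which shows why a minimum group size reduces the chance that an aggregate row describes one individual.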