Changes

Data Publishing

286 bytes removed, 15:52, 22 September 2020

no edit summary

The goal of this process is to (1) make the “easy” (that is, safe) data publishing requests relatively frictionless, (2) have guard rails in place so we don’t publish something that exposes us or our users to risk in some way, and (3) ensure that the dataset publishing request process closely matches other processes that are familiar to the data stewards.

Having a dataset published requires filing a bug. ~~Use~~ Requests will use the nomenclature defined in the preceding sections to answer a series of questions, including the following four ~~questions~~. If the answer to all of them is “no”, ~~you~~ the data may ~~publish~~ be published. A “yes” to any of them means extra review is required.

* Is the level of aggregation 3 or higher?
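The screening rule described above (all answers “no” means the data may be published, any “yes” means extra review) can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout, and the question key are hypothetical and are not taken from the actual bug form.

```python
def needs_extra_review(answers):
    """Return True if any screening question was answered "yes".

    `answers` maps a question label (hypothetical names) to a bool,
    where True stands for "yes" and False for "no".
    """
    return any(answers.values())


def aggregation_question(level):
    """Answer the question "Is the level of aggregation 3 or higher?"."""
    return level >= 3


# Hypothetical request: a dataset aggregated at level 2.
answers = {"aggregation_level_3_or_higher": aggregation_question(2)}
print("extra review required" if needs_extra_review(answers) else "may be published")
```

Modelling each answer as a boolean keeps the rule to a single `any()` call, so adding the remaining questions only means adding keys to the dictionary.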

'''Tabular Data''' - Data that consists of rows (or records) and columns (or fields). Each row has the same number of columns, and each column represents a dimension or metric for that row. Think of a spreadsheet or CSV file as examples of this type of data.
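As a rough illustration of this definition, the following Python sketch checks that every row of a CSV text has the same number of columns. The function name and the sample data are made up for the example.

```python
import csv
import io


def is_tabular(text):
    """Return True if every CSV row in `text` has the same number of columns."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return False
    width = len(rows[0])
    return all(len(row) == width for row in rows)


# Made-up sample: one dimension column and one metric column.
sample = "country,active_users\nDE,120\nFR,95\n"
print(is_tabular(sample))
```

A file where some row has fewer or more fields than the header would fail this check and would not count as tabular in the sense used here.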

<big>'''~~Example Data~~ What's Been Published So Far?'''</big>

~~Here are some examples of data aggregated to the levels described above.~~

* ~~Level 7: raw data, with fine-grained timestamps~~
* ~~Level 6: individual-level data, aggregated to day-level time granularity~~
* ~~Level 5: anonymized individual-level data, identifiers replaced with pseudonyms~~
* ~~Level 4: probabilistic aggregates~~
* ~~Level 3: dimension-level aggregates without a minimum group size~~
* ~~Level 2: dimension-level aggregates with a minimum group size~~

Our publicly available datasets are [https://public-data.telemetry.mozilla.org/all-datasets.json here].

Agray, 39 edits
