Privacy/HowTo/Deidentify

Revision as of 22:00, 12 June 2012 by Smartin

De-Identification

Mozilla is an open company, and we publish data for a variety of reasons. In some cases, we publish our own data from surveys and metrics. In other cases, we receive requests from researchers to access our data. While publishing this data is important, we need to make sure that doing so doesn't compromise any individual's privacy. Please do not release any data until you've completed all of the steps on this page.

Checklist for Releasing Data

Things that can go wrong

Control Knobs

How to Request a Review

Definitions

  • Anonymization = Filtering a data set such that re-identification is impossible.
  • Fingerprinting = Selecting a set of attributes that together are very distinctive, even if we don't yet know what they connect to in the real world. Ex: If we know a person's height, weight, body mass, hair color, facial measurements, etc., we can build a machine that will identify that person after seeing them once. Fingerprinting is the basis of cold cases.
  • Identification or re-identification = Identifying some particular thing in the real world based on the data in the data set. Normally this is a person. Sometimes it's a device.
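
To make the fingerprinting definition concrete, here is a minimal sketch in Python of one way to check how distinctive a combination of attributes is in a data set before release. It is an illustration only, not a Mozilla review tool: the function names (group_sizes, distinctive_records) and the sample records are hypothetical and made up for this example.

```python
from collections import Counter


def group_sizes(records, attributes):
    """Count how many records share each combination of attribute values."""
    return Counter(tuple(r[a] for a in attributes) for r in records)


def distinctive_records(records, attributes):
    """Return the records whose combination of attribute values is unique.

    A record that is alone in its group can act as a fingerprint:
    anyone who learns those values elsewhere can pick it out.
    """
    sizes = group_sizes(records, attributes)
    return [r for r in records if sizes[tuple(r[a] for a in attributes)] == 1]


# Hypothetical sample data, invented for illustration.
records = [
    {"height_cm": 170, "hair": "brown", "os": "Linux"},
    {"height_cm": 170, "hair": "brown", "os": "Windows"},
    {"height_cm": 182, "hair": "red", "os": "Linux"},
    {"height_cm": 170, "hair": "brown", "os": "Linux"},
]

# With two attributes only one record stands out; adding a third
# attribute makes a second record unique as well.
few = distinctive_records(records, ["height_cm", "hair"])
many = distinctive_records(records, ["height_cm", "hair", "os"])
print(len(few), len(many))
```

In this sketch, each attribute added to the released data tends to shrink the groups, so more records become unique. A unique record is not yet identified, but it is a candidate for re-identification if the same attributes can be observed in the real world.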