MetricsDataPing: Difference between revisions

<br>
The list and definitions of data elements in the Metrics Ping are here: [https://metrics.etherpad.mozilla.org/ep/pad/view/ro.9$yFtH/latest MDP Data Point Descriptions]
<br> <br>
 
== UUID ==
 
'''Document Identifier Strategy'''


Each profile will generate a UUID to be used as the document key.  Each day's submission will use that UUID, and this will also be the key for that profile's cumulative data on the server.  When each submission is received, the server merges it on the fly with the cumulative data, not persisting the individual documents.
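
The merge-on-receipt behaviour described above can be sketched roughly as follows. This is only an illustration: the in-memory <code>store</code> dictionary, the function name <code>merge_submission</code>, the field names, and the assumption that every data point is a numeric counter merged by summation are all hypothetical and are not taken from the actual server.

```python
import uuid
from typing import Dict

# Hypothetical in-memory stand-in for the server's cumulative store,
# keyed by the profile's UUID (the "document key").
CumulativeStore = Dict[str, Dict[str, int]]


def merge_submission(store: CumulativeStore, profile_uuid: str,
                     daily: Dict[str, int]) -> Dict[str, int]:
    """Fold one day's submission into the cumulative document.

    Assumption: every field is a counter and merging means summing.
    The daily document itself is not stored anywhere.
    """
    totals = store.setdefault(profile_uuid, {})
    for field, value in daily.items():
        totals[field] = totals.get(field, 0) + value
    return totals


# The profile generates its UUID once and reuses it for every submission.
profile_uuid = str(uuid.uuid4())
store: CumulativeStore = {}
merge_submission(store, profile_uuid, {"crashes": 1, "sessions": 2})
merge_submission(store, profile_uuid, {"crashes": 2, "sessions": 3})
```

After the two calls, the store holds a single cumulative document for that profile, and neither daily submission survives on its own.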
=== Privacy ===
A UUID ''is'' PII. Definition:
"Personally Identifiable Information (PII), as used in information security, is information that can be used to uniquely identify, contact, or locate a single person or can be used with other sources to uniquely identify a single individual."
A stable UUID for a user or a user's device is always PII and, by this definition, never anonymous.
It is therefore regulated by European data protection laws, and collecting it is normally forbidden.
From a user's standpoint, it is irrelevant whether and how Mozilla uses the data; what matters is that the data is sent. There can be
* interceptions during transmission
* other logging server components before the server component discussed here
* legal requests by various governments
* server break-ins, or
* policy changes on the Mozilla side.
The user has no way to verify whether any of that is happening, and that in itself is a privacy violation. So what matters is not the intended usage, but what is theoretically possible.
A UUID would make it possible, for example, to track all of my dynamic IP addresses over time and, combined with access logs, to build a profile. If I use a notebook or mobile browser, it would even make it possible to track the places I go, based on IP geolocation or whois data.
=== Google Chrome ===
Google Chrome used a UUID for each browser. This was perceived as a serious privacy threat and was covered by the mainstream press (including the largest newspapers) in Germany. Eventually, Google dropped the UUID.
The question of whether Firefox uses a UUID ''will'' be picked up by the press, and the result will be negative for Firefox. This is not a guess; history shows it.
=== Perception ===
People in Germany and the rest of Europe are very privacy-aware, much more so than people in the US. Firefox has a big and loyal following there, in large part because Firefox claims to do what users want and to be privacy-aware. A UUID will be considered highly offensive in these countries and will cost Firefox market share; this is nearly certain.
=== Alternative ===
Instead of building the history on the server, the client should build the history and submit only the results. For example, if you need to know whether things improved, the client can keep some old data and submit "12 crashes last week. One week before: 12% more. One year before: 50% less." It should not include exact historical numbers either, because they too could be pieced together to rebuild a history of IP addresses for a given user.
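
A minimal sketch of this client-side approach, under stated assumptions: the history lives only in the local profile, and the names <code>relative_change</code>, <code>build_summary</code>, and the payload keys are hypothetical. Rounding to whole percentages is also an assumption; a real design might coarsen the values further so that earlier counts cannot be reconstructed.

```python
from typing import Dict, Optional


def relative_change(current: int, earlier: int) -> Optional[int]:
    """Whole percent by which `earlier` was above (+) or below (-) `current`.

    Returns None when `current` is 0, because no ratio can be formed.
    """
    if current == 0:
        return None
    return round((earlier - current) * 100 / current)


def build_summary(local_history: Dict[str, int]) -> Dict[str, Optional[int]]:
    """Build the payload from locally kept history.

    Only the latest count and relative changes are included; the payload
    carries no persistent identifier and no earlier absolute counts.
    """
    current = local_history["last_week"]
    return {
        "crashes_last_week": current,
        "week_before_pct": relative_change(current, local_history["week_before"]),
        "year_before_pct": relative_change(current, local_history["year_before"]),
    }


# Hypothetical local history; it never leaves the client.
summary = build_summary({"last_week": 12, "week_before": 15, "year_before": 6})
```

Here the payload says that the week before had 25% more crashes and the same week a year earlier had 50% fewer, without a UUID tying the submission to earlier ones.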


== Client-side<br>  ==