Security/Reviews/MetricsDataPing

Item Reviewed

Metrics Data ping

Target

https://wiki.mozilla.org/MetricsDataPing

ID	Summary	Priority	Status
718066	Initial landing of Firefox Health Report	--	RESOLVED

1 Total; 0 Open (0%); 1 Resolved (100%); 0 Verified (0%);

Introduce the Feature

Goal of Feature, what is trying to be achieved (problem solved, use cases, etc)

MetricsDataPing: collect important metrics (see wiki)
- this data is a critical need for Mozilla for a variety of reasons
Original plans focused on sending a collection of client-side metrics to Mozilla servers once per day
- effective retention data and longitudinal study need a cumulative view (over time)
- initial proposal: a UUID associated with the installation's profile, submitted each time so new data can be merged with past data
- the data set is opt-out rather than opt-in to avoid self-selection bias
Changes
- UUID removed and replaced with a document identifier, generated per request (per profile)
- data is accumulated client side rather than server side
- each submission is sent with the new ID and the previous ID, which allows us to remove the older document stored under the old ID
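
The document-identifier scheme described under "Changes" can be sketched roughly as follows. This is a minimal illustration, not the Firefox implementation: the class name, field names, and payload shape are all hypothetical, and only the ideas (a fresh ID per submission, the previous ID sent alongside it, data accumulated on the client) come from the notes above.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PingClient:
    """Hypothetical per-profile client state (names are illustrative only)."""

    last_document_id: Optional[str] = None
    cumulative: dict = field(default_factory=dict)

    def record(self, key: str, value: int) -> None:
        # Data accumulates on the client rather than on the server.
        self.cumulative[key] = self.cumulative.get(key, 0) + value

    def build_submission(self) -> dict:
        # A fresh document ID is generated for every request.
        new_id = str(uuid.uuid4())
        submission = {
            "document_id": new_id,
            # Sending the previous ID lets the server drop the older document.
            "previous_document_id": self.last_document_id,
            "payload": dict(self.cumulative),
        }
        self.last_document_id = new_id
        return submission
```

Because each submission carries the full cumulative view, the server never needs a long-lived identifier to merge records; it only needs the previous ID long enough to discard the superseded document.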

What solutions/approaches were considered other than the proposed solution?

UUID vs. Document ID (above)
blocklist ping - provides ADI; it is the current metrics system and has lots of attributes, but it is point in time only, with no time analysis and no retention analysis; its owners don't want other data collection on top
telemetry - opt-out by default on Nightly/Aurora but opt-in on other channels; focused on performance data; not designed for time analysis or retention
Test Pilot - double opt-in, large self-selection bias, skewed towards power users or early adopters rather than typical users
opt-in vs. opt-out - based on research on self-selection bias
funnelcake - designed for adoption/retention; the blocklist ping was the last part, but we were losing this data
an actual representative "sample" rather than the full population
- problem of keeping the sample stable and representative over time

Why was this solution chosen?

need for longitudinal analysis and retention analysis
we can look at the submissions and see whether there were problems if data stops coming

Any security threats already considered in the design and why?

If disclosed, the UUID could be used to find information about the user from the server system
- the UUID would persist across a backup, so the design was changed
Server side:
- public, unauthenticated system; clients can only write or request deletion
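
As a rough illustration of the "write only or request to delete" server surface, the sketch below models a store that exposes submission and deletion but no retrieval. It is an in-memory stand-in under stated assumptions: the class and method names are hypothetical, and the real collection system is not described in this review beyond the points above.

```python
from typing import Dict, Optional


class WriteOnlyStore:
    """Hypothetical in-memory stand-in for the collection server."""

    def __init__(self) -> None:
        self._documents: Dict[str, dict] = {}

    def submit(
        self,
        document_id: str,
        payload: dict,
        previous_document_id: Optional[str] = None,
    ) -> None:
        # Drop the superseded document, if the client named one.
        if previous_document_id is not None:
            self._documents.pop(previous_document_id, None)
        self._documents[document_id] = payload

    def delete(self, document_id: str) -> bool:
        # Returns True only if a document was actually removed.
        return self._documents.pop(document_id, None) is not None

    def document_count(self) -> int:
        # Aggregate only; deliberately no method returns a stored payload.
        return len(self._documents)
```

With no read path, a leaked document ID allows at most overwriting or deleting one document, not reading a user's data back out.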

Threat Brainstorming

Obvious Privacy stuff
Why does the system have retrieval?
- the current system with document identifiers does not; a future version might, to let the client get aggregate info so a user can compare things themselves
- so the user can see the data and remove it if they want
Does the about:metrics / user data retrieval feature have to go out at the same time as the metrics collection on our servers?
What are the compliance issues mentioned on the wiki in regards to the data retrieval?
- EU / Germany: privacy compliance regulations, which cover even data about the functioning of the product without a user-facing feature to support it
Where is the UUID/document identifier stored? Do web pages have access to the UUID/Document ID?
- Stored as a preference (visible in about:config), accessible to "chrome" code but not to regular web pages, so website fingerprinting is not an issue.
- a user could corrupt the data by editing the preference in about:config, which could produce bogus data
How often will data be sent?
- not more than once per 24 hours
What API is used?
- simple POST request to the data collection system, same as telemetry (data.mozilla.com)
Are there signatures on the requests/responses? Is it over SSL?
- yes, over SSL; signature does not matter
Is the certificate checked, or is it basic SSL auth?
- basic SSL auth. Perhaps we could extend this.
When the server receives a new Document ID, it deletes the previous ID and data associated with it. Do we no longer need that data, or do we just delete the previous ID and retain the data?
- each submission is a cumulative view from the client, there is only one doc at any time that represents that installation
- allows for expiration of documents
What's the risk of other add-ons grabbing and using the Document ID as a unique identifier, much as iOS apps have been caught doing?
- document ID changes every day, so it is not likely to be useful to other chrome-privileged processes unless they check all the time
  - but the add-on can just grab the document ID every day and chain them.
    - if they had chrome privileges, they could just create a UUID themselves and use it anyway.
How random are the document IDs?
- uses UUID mechanism, same as crash stats
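
The "not more than once per 24 hours" limit discussed above could be enforced on the client with a check along these lines. This is a sketch only: the function name and the idea of keeping a last-submission timestamp are assumptions, not details taken from the review.

```python
from typing import Optional

# Minimum gap between two submissions, per the review notes.
MIN_INTERVAL_SECONDS = 24 * 60 * 60


def may_submit(last_submission_time: Optional[float], now: float) -> bool:
    """Return True if a new ping may be sent at time `now` (seconds)."""
    if last_submission_time is None:
        # Nothing has been sent yet for this profile.
        return True
    return now - last_submission_time >= MIN_INTERVAL_SECONDS
```

Taking `now` as a parameter instead of reading the clock inside the function keeps the check easy to test.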

Action Items

Action Item Status

In Progress

Release Target

Firefox 12

Action Items

Who	Action	By When	Completed date
	code review (about:metrics), bug 718066	before landing on Aurora	[NEW] in progress

ID	Summary	Priority	Status
764645	SecReview: Firefox Health Report - Security Code Review	--	RESOLVED

1 Total; 0 Open (0%); 1 Resolved (100%); 0 Verified (0%);
