|Projected Feature Freeze Date:||On train for 7|
|Product Champions:||Taras Glek|
|Privacy Champions:||Sid Stamm, Asa Dotzler|
|Security Contact:||Curtis Koenig|
|Architectural Overview:||[DONE] 28-April-2011|
|Recommendation Meeting:||[DONE] 18-May-2011|
|Wrap-up Meeting:||(if necessary)|
In this section, the product's architecture is described. Any individual components or actors are identified, their "knowledge" or what data they store is identified, and data flow between components and external entities is described.
The main objective of this feature/product is to allow Engineering to receive aggregate data on browser health in the field: for example, cache hit rates, page load times across all browser instances, or anything else we're interested in.
Design Documents: Link to any design or architectural documents here.
- UI components bug
- EtherPad Design Doc
- Feature Page
- bug 585196: Infrastructure Implementation Bug for Telemetry
- Chromium code similar to our plan
This document does not discuss individual measurements collected by the Telemetry infrastructure, but rather just the framework and feature-scope itself. For discussion of the specific measurements, see the measurements list.
Describe any major components in the system and how they interact. Also include any third-party APIs (those Mozilla does not control) and what type of data is sent or received via those APIs.
Client Component (Firefox)
This component gathers metrics and uploads counters and histograms to the Telemetry server.
The tables below simply summarize the data encountered by this component.
| There are 2 kinds of metrics:
1) Metrics that are recorded while performing an operation, e.g. startup decisions/timing, cycle collection timing.
2) Data that are polled each time the browser has been idle for more than a minute. In Taras' experience, data are gathered a couple of times an hour. Currently we poll various about:memory fields. Other things to poll for: number of open tabs, sizes of key sqlite databases, sizes of cache, etc.
|All telemetry data is stored in memory. Upon shutdown, some measurements are recorded and persisted to disk for a short period of time. When Firefox starts again and a ping is sent, persisted measurements are transmitted and then erased.|
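The two kinds of metrics can be illustrated with a small in-memory store. This is only a sketch under stated assumptions: the names `Histogram`, `TelemetryStore`, `record` and `poll`, the bucket bounds, and the probe names are hypothetical and are not taken from the Firefox implementation.

```python
import bisect


class Histogram:
    """Fixed-bucket histogram; bucket i counts samples >= bounds[i] and < bounds[i + 1]."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.counts = [0] * len(self.bounds)

    def accumulate(self, sample):
        # Pick the last bucket whose lower bound is <= sample.
        # Samples below the first bound are clamped into bucket 0.
        index = max(bisect.bisect_right(self.bounds, sample) - 1, 0)
        self.counts[index] += 1


class TelemetryStore:
    """Holds all histograms in memory only (hypothetical structure)."""

    def __init__(self):
        self.histograms = {}

    def record(self, name, sample, bounds=(0, 10, 100, 1000)):
        # Kind 1: called while an operation runs, e.g. with a startup timing.
        hist = self.histograms.setdefault(name, Histogram(bounds))
        hist.accumulate(sample)

    def poll(self, probes):
        # Kind 2: called when the browser has been idle; each probe is a
        # zero-argument callable returning the current value to sample.
        for name, probe in probes.items():
            self.record(name, probe())
```

For example, `store.record("STARTUP_MS", 250)` would be called from the operation being timed, while `store.poll({"OPEN_TABS": count_tabs})` would be called from an idle callback, with `count_tabs` standing in for whatever probe reads the value.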
Communication with Server Component
|In:||ACK||HTTP 200/OK (no additional data)|
|Out:||HTTP POST to /submit/telemetry/||text/plain JSON-encoded object containing histograms and counters||The types of data represented by histograms and counters will change over time, and the submission will contain a unique ID (a nonce to identify strange duplicate submissions). Pings are sent once per day.|
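To make the outgoing message concrete, the following sketch builds a ping and the corresponding HTTP request. The field names `id`, `histograms` and `counters`, the helper names, and the example host are assumptions for illustration; only the `/submit/telemetry/` path, the `text/plain` content type, the JSON encoding and the `200/OK` acknowledgement come from the table above.

```python
import json
import urllib.request
import uuid

SUBMIT_PATH = "/submit/telemetry/"


def build_ping(histograms, counters):
    # The "id" field plays the role of the unique ID (nonce) used by the
    # server to spot duplicate submissions. Field names are hypothetical.
    return {"id": str(uuid.uuid4()), "histograms": histograms, "counters": counters}


def build_request(base_url, ping):
    # JSON-encoded body sent as text/plain, as described in the table above.
    body = json.dumps(ping).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + SUBMIT_PATH,
        data=body,
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def send_ping(base_url, ping, timeout=10):
    # Returns True when the server acknowledges with HTTP 200.
    # urlopen raises an exception for error responses and network failures.
    with urllib.request.urlopen(build_request(base_url, ping), timeout=timeout) as response:
        return response.status == 200
```

A caller would invoke `send_ping("https://telemetry.example.invalid", build_ping(histograms, counters))` at most once per day; the host shown here is a placeholder, not the real Telemetry server.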
This component receives metrics from the Client Component and creates visualizations and queries for Mozilla people.
The tables below simply summarize the data encountered by this component.
|data type||The entire ping is stored in a JSON log. The client IP is added to the log. See security review in bug 655746.|
Communication with Client Component
|In:||HTTP POST||Telemetry Data||(see above)|
|Out:||ACK||HTTP 200/OK||(no additional data)|
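The server-side handling summarized above can be sketched as a function that appends each ping, together with the client IP, to a JSON log and acknowledges with 200. The function names, the log entry layout, the `id` field and the 400 response for malformed input are assumptions made for this sketch; the review itself only specifies the 200/OK acknowledgement.

```python
import json
from collections import Counter


def handle_ping(body, client_ip, log_file):
    """Append one JSON log line per ping and return an HTTP status code."""
    try:
        ping = json.loads(body)
    except ValueError:
        return 400  # Assumption: the review does not say how bad input is handled.
    if not isinstance(ping, dict):
        return 400
    # The entire ping is stored, with the client IP added to the entry.
    log_file.write(json.dumps({"client_ip": client_ip, "ping": ping}) + "\n")
    return 200


def duplicate_ids(log_lines):
    """Return the unique IDs that appear in more than one logged ping."""
    counts = Counter(
        json.loads(line)["ping"].get("id") for line in log_lines if line.strip()
    )
    return sorted(pid for pid, n in counts.items() if pid is not None and n > 1)
```

Because the log carries the client IP next to every ping, the retention and erasure requirements discussed in the risk section apply to this log as a whole.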
User Data Risk Minimization
In this section, the privacy champion will identify areas of user data risk and recommendations for minimizing the risk.
Fingerprinting / Tracking
Based on metrics that are similar from day to day, an individual user might be fingerprinted and tracked across time. Someone with consistent day-to-day browsing habits may have the same memory usage, speed, etc; it is likely that the machine's attributes will also have an effect on the measurements taken so a combination of browsing habits and machine attributes could be a fairly detailed "fingerprint". It is important to identify and eliminate duplicate entries, however, so some unique ID must be maintained for a short window of time.
Required Action: To minimize fingerprinting risk, it is crucial to ensure that arbitrary web sites absolutely cannot access the telemetry data while it's stored on the client. Additionally, the data should be transmitted from the Client Component to the Server Component over a secured (and preferably authenticated) channel; this means SSL/HTTPS must be used. Any data that is no longer needed should be erased from our servers, and a unique ID used for duplicate elimination should be short-lived.
Recommendation: If possible, the SSL certificate fingerprint should be hard coded into the client and verified before transmitting data so the client can be sure the server where it is sending data is indeed the Telemetry server (and not an attacker intercepting traffic).
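The fingerprint check recommended above could look roughly like the following. This is a sketch, not how Firefox implements it: the function names are hypothetical, SHA-256 is an assumed choice of digest, and the pinned value would have to be the real server certificate's fingerprint.

```python
import hashlib
import hmac
import socket
import ssl


def fingerprint_sha256(der_cert):
    # Hex SHA-256 digest of the DER-encoded certificate.
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert, pinned_hex):
    # Accept "AA:BB:..." or "aabb..." notation for the pinned fingerprint.
    normalized = pinned_hex.replace(":", "").lower()
    return hmac.compare_digest(fingerprint_sha256(der_cert), normalized)


def open_pinned_connection(host, pinned_hex, port=443, timeout=10):
    # Normal certificate validation still runs; the pin is an extra check
    # performed before any telemetry data is written to the socket.
    context = ssl.create_default_context()
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        tls = context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
    if not matches_pin(tls.getpeercert(binary_form=True), pinned_hex):
        tls.close()
        raise ssl.SSLError("server certificate does not match pinned fingerprint")
    return tls
```

One trade-off of pinning the certificate itself is that the client must be updated whenever the server certificate is rotated.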
Conformity to Private Browsing Mode
Private browsing is intended to prevent someone who has local access to the browser from learning what you did in private browsing mode. Since Telemetry collects data that is ultimately affected by how the user browses the web, any data collected should not be retained persistently through private browsing mode.
Some measurements need to be persisted to disk because they are only available during shutdown (e.g., measuring how long it takes to shut down plugins). Any measurements taken between ping and shutdown are persisted to disk upon application shutdown. When sending a ping, the Telemetry ping code checks for stored data; once that data has been sent successfully, it is erased and the telemetry "state" is reset. From Bug 707320 comment 2:
a) if there is no serialized telemetry data, send a ping same as we do now
b) if there is serialized data:
b1) send serialized data
b2) reset UID, wipe all histograms
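The ping logic quoted from the bug can be sketched as follows. The class name, the on-disk JSON layout, the `id` field and the `send` callback are hypothetical; the sketch only mirrors the a / b1 / b2 steps and the rule that stored data is erased after a successful send.

```python
import json
import os
import uuid


class TelemetrySession:
    def __init__(self, saved_path):
        self.saved_path = saved_path  # where shutdown measurements are persisted
        self.uid = str(uuid.uuid4())
        self.histograms = {}

    def persist_on_shutdown(self):
        # Measurements taken between the last ping and shutdown go to disk.
        with open(self.saved_path, "w", encoding="utf-8") as f:
            json.dump({"id": self.uid, "histograms": self.histograms}, f)

    def ping(self, send):
        """send(payload) must return True when the server acknowledged the ping."""
        if not os.path.exists(self.saved_path):
            # a) no serialized data: send the current in-memory state.
            return send({"id": self.uid, "histograms": self.histograms})
        with open(self.saved_path, encoding="utf-8") as f:
            saved = json.load(f)
        # b1) send the serialized data.
        if not send(saved):
            return False  # keep the file so the data can be retried later
        os.remove(self.saved_path)
        # b2) reset the UID and wipe all histograms.
        self.uid = str(uuid.uuid4())
        self.histograms = {}
        return True
```

Resetting the UID after the persisted data is sent is what keeps consecutive sessions from being tied together easily.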
Recommendations: Telemetry should be disabled in private browsing mode. If nothing else, new measurements must not be stored on disk or other non-volatile storage devices while the client is in private browsing mode. Any measurements taken during private mode should be erased from memory when private mode is exited.
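One way to satisfy these recommendations is to drop samples at the point of recording while private browsing is active, so that nothing from a private session ever reaches memory or disk. The sketch below assumes hypothetical names (`PrivateAwareRecorder` and its methods) and is not the Firefox implementation.

```python
class PrivateAwareRecorder:
    """Ignores all measurements while private browsing is active."""

    def __init__(self):
        self.samples = {}
        self.private = False

    def on_enter_private_browsing(self):
        # Stop recording as soon as private browsing starts.
        self.private = True

    def on_exit_private_browsing(self):
        # Nothing to erase: no private-mode sample was ever stored.
        self.private = False

    def record(self, name, value):
        if self.private:
            return False  # sample is discarded, not buffered
        self.samples.setdefault(name, []).append(value)
        return True
```

Discarding at record time is simpler than recording and erasing on exit, since there is no window in which private-mode data exists in the store.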
Alignment with Privacy Operating Principles
In this section, the privacy champion will identify how the feature lines up with Mozilla's privacy operating principles.
Principle: Transparency / No Surprises: People should know that the metrics are being gathered and submitted. This feature is opt-in, though it's not clear whether or not people fully understand what type of data they're letting us collect.
Required Action: It should be clear in the UI what we collect and how we collect it. For example, Test Pilot asks the user to approve not only data collection but also data submission, providing information about what's being collected or submitted at the time it begins. Telemetry should do something similar to make it very explicit what is being collected as well as when it's being submitted.
Principle: Real Choice: Users of this system should not only understand what it does, but be able to choose whether or not to participate.
Required Action: It should be clear in the UI what we collect and how we collect it.
Principle: Sensible Defaults: Telemetry is off by default and is opt-in.
Principle: Limited Data: Telemetry should only collect data that we will actually use for improvements to the product. All data that's collected should be backed up with clearly stated reasons.
Required Action: Maintain a table of the counters and histograms that are gathered and the reasons for collecting each one. The table should be revisited regularly to identify unnecessary metrics, and we should stop collecting those.
Follow-up Tasks and tracking
|[DONE] Initial Overview Discussion||Sid and Taras||Meeting 28-April-2011|
|[DONE] Discuss Missing Information||Sid and Taras||Via IRC 17-May-2011|
|[DONE] Discuss risks and recommendations||Sid and Taras||Scheduled 18-May-2011|
|[DONE] Create list of gathered metrics that reflects current state of collected data||Taras||||Also see Performance/Telemetry, Telemetry Measurements|
|[DONE] Change string for opt-in UI to describe what type of data is collected||Taras||
|[DONE] Implement strict private browsing conformity (stop recording when on-enter-private-browsing).||Taras||
|[DONE] Create privacy discussion framework for adding new metrics without heavyweight review||Sid and Taras||Rapid risk analysis template at Privacy/Reviews/Telemetry/Measurements, most review done in bugs and cataloged in the Measurements list.|
|[DONE] Document change to persist some Telemetry measurements to disk||Sid||bug 707320||Some stuff has to be persisted to disk; it will be done in a way that prohibits tying together sessions easily and the stored data will be deleted from the client once it has been transmitted to the Telemetry server.|
|[DONE] Implement about:telemetry to show users what's being collected.||Taras||bug 661881||Landed in Firefox 19.|