MetricsDataPing: Difference between revisions

(27 intermediate revisions by 4 users not shown)
Line 1: Line 1:
DEPRECATED: This proposal has been updated and the official project name is "Firefox Health Report".  Please see the following links for further discussion.
[https://groups.google.com/d/topic/mozilla.dev.platform/rOO1HGpAb9Q/discussion Post on dev.platform]
[https://blog.mozilla.org/metrics/2012/09/21/firefox-health-report/ Firefox Health Report blog post]
[https://blog.mozilla.org/metrics/firefox-health-report/fhr-faq/ Firefox Health Report FAQ]
= Description =  
= Description =  


Line 31: Line 41:
A directory of elements collected by the various data collection pings (Metrics Data Collection Ping, Blocklist, AUS Ping, Version Check Ping, Services AMO, Telemetry) can be found here: [https://metrics.etherpad.mozilla.org/ep/pad/view/ro.9e6LG/latest Data Collection Paths]<br>
A directory of elements collected by the various data collection pings (Metrics Data Collection Ping, Blocklist, AUS Ping, Version Check Ping, Services AMO, Telemetry) can be found here: [https://metrics.etherpad.mozilla.org/ep/pad/view/ro.9e6LG/latest Data Collection Paths]<br>
<br>
<br>
The list and definitions of data elements in the Metrics Ping are here: [https://metrics.etherpad.mozilla.org/ep/pad/view/ro.9$yFtH/latest MDP Data Point Descriptions]
The list and definitions of data elements in the Metrics Ping are here: [https://docs.google.com/spreadsheet/ccc?key=0AtdL1GrYQUbldFBBUUNkbTBKNjZTd3dTeTZ0QUhaNXc MDP Data Point Descriptions]


== Submission ID ==
== Submission ID ==
Line 44: Line 54:


Sample JSON output as received on the Mozilla server side:<br>
Sample JSON output as received on the Mozilla server side:<br>
<pre>2011/11/04:
<pre>Format updated 2012/02/01:
{
{
     "ver": 1,
     "ver": 2,
    "uuid": "e8a583fe-98ec-45be-9e44-96a23759067a",
     "lastPingTime": "2012-01-31T16:57:26.000Z",
     "lastPingTime": 1320340265,
     "thisPingTime": "2012-02-02T14:18:30.507Z",
    "thisPingTime": "2011-11-04T19:30:11.948Z",
     "currentTime": "2011-11-04T19:30:11.962Z",
     "env": {
     "env": {
         "reason": "idle-daily",
         "reason": "startup",
         "OS": "Linux",
         "OS": "Linux",
         "appID": "{ec8030f7-c20a-464f-9b0e-13a3a9e97384}",
         "appID": "{ec8030f7-c20a-464f-9b0e-13a3a9e97384}",
         "appVersion": "10.0a1",
         "appVersion": "12.0a1",
         "appVendor": "Mozilla",
         "appVendor": "Mozilla",
         "appName": "Firefox",
         "appName": "Firefox",
         "appBuildID": "20111104162615",
         "appBuildID": "20120202101451",
         "appABI": "x86_64-gcc3",
         "appABI": "x86_64-gcc3",
         "appUpdateChannel": "default",
         "appUpdateChannel": "default",
         "appDistribution": "default",
         "appDistribution": "default",
         "appDistributionVersion": "default",
         "appDistributionVersion": "default",
         "platformBuildID": "20111103103700",
        "appHotfixVersion": "",
         "platformVersion": "10.0a1",
         "platformBuildID": "20120126141109",
         "platformVersion": "12.0a1",
         "locale": "en-US",
         "locale": "en-US",
         "name": "Linux",
         "name": "Linux",
         "version": "2.6.38-12-generic",
         "version": "3.0.0-15-generic",
         "cpucount": 4,
         "cpucount": 4,
         "memsize": 7889,
         "memsize": 7889,
         "arch": "x86-64"
         "arch": "x86-64"
     },
     },
     "simpleMeasurements": {
     "addons": [
         "uptime": 0,
        {
         "main": 3,
            "id": "crashme@ted.mielczarek.org",
         "firstPaint": 629,
            "userDisabled": false,
         "sessionRestored": 502,
            "appDisabled": false,
         "isDefaultBrowser": false,
            "version": "0.3",
         "crashCountSubmitted": 1,
            "installDate": "2011-10-25",
         "profileAge": 31,
            "updateDate": "2011-10-25",
         "addonCount": 2,
            "type": "extension",
         "addons": [
            "hasBinaryComponents": false
             {
        },
                "id": "crashme@ted.mielczarek.org",
         {
                "appDisabled": false,
            "id": "ping.telemetry@mozilla.com",
                "version": "0.3",
            "userDisabled": false,
                "installDate": "2011-10-25T15:02:03.000Z",
            "appDisabled": false,
                "updateDate": "2011-10-25T15:02:03.000Z"
            "version": "0.5",
            "installDate": "2011-11-16",
            "updateDate": "2011-12-20",
            "type": "extension",
            "hasBinaryComponents": false
        },
         {
            "id": "about.blank@mozilla.com",
            "userDisabled": false,
            "appDisabled": false,
            "version": "0.5",
            "installDate": "2012-01-27",
            "updateDate": "2012-02-01",
            "type": "extension",
            "hasBinaryComponents": false
        },
         {
            "id": "{e2c52c1c-5ee1-cc23-15fa-35945fd58806}",
            "userDisabled": false,
            "appDisabled": false,
            "version": "1.0.0.0",
            "installDate": "2012-01-26",
            "updateDate": "2012-01-26",
            "type": "plugin"
        },
         {
            "id": "{18965679-bddd-de62-52b4-b56e6316d854}",
            "userDisabled": false,
            "appDisabled": false,
            "version": "",
            "installDate": "2011-12-13",
            "updateDate": "2011-12-13",
            "type": "plugin"
        },
         {
            "id": "{5c0830c7-3003-fc43-0daf-d29b579f5f6b}",
            "userDisabled": false,
            "appDisabled": false,
            "version": "",
            "installDate": "2011-12-13",
            "updateDate": "2011-12-13",
            "type": "plugin"
        },
         {
            "id": "{ed5c33eb-95c1-f1be-50ba-eb0ade42d912}",
            "userDisabled": false,
            "appDisabled": false,
            "version": "",
            "installDate": "2011-12-02",
            "updateDate": "2011-12-02",
            "type": "plugin"
        },
         {
            "id": "{79eb71d7-19b2-ef97-2247-9a8960804972}",
            "userDisabled": false,
            "appDisabled": false,
            "version": "",
            "installDate": "2011-11-08",
            "updateDate": "2011-11-08",
            "type": "plugin"
        },
         {
            "id": "{f2d261dc-c5c4-ca3c-ae02-ccb3ff227c7f}",
            "userDisabled": false,
            "appDisabled": false,
            "version": "",
            "installDate": "2011-10-18",
            "updateDate": "2011-10-18",
            "type": "plugin"
        },
         {
            "id": "{84e372b2-f1a3-032a-0001-b725ee38d1ed}",
            "userDisabled": false,
            "appDisabled": false,
            "version": "",
            "installDate": "2011-10-18",
            "updateDate": "2011-10-18",
             "type": "plugin"
        },
        {
            "id": "{fcdc99b5-45d4-daeb-9239-0a41e6c9b7ce}",
            "userDisabled": false,
            "appDisabled": false,
            "version": "",
            "installDate": "2011-10-18",
            "updateDate": "2011-10-18",
            "type": "plugin"
        },
        {
            "id": "{24f3e033-1a7c-ae8b-3fc7-4ac494c18e91}",
            "userDisabled": false,
            "appDisabled": false,
            "version": "",
            "installDate": "2011-10-18",
            "updateDate": "2011-10-18",
            "type": "plugin"
        }
    ],
    "currentSessionTime": 60,
    "currentSessionActiveTime": 50,
    "dataPoints": {
        "2012-02-02": {
            "search": {
                "searchbar": {
                    "Google": 1
                },
                "abouthome": {
                    "Google": 1
                }
            },
            "sessions": {
                "completedSessions": 1,
                "completedSessionTime": 567,
                "completedSessionActiveTime": 115
            },
            "simpleMeasurements": {
                "uptime": 2,
                "main": 184,
                "firstPaint": 1039,
                "sessionRestored": 903,
                "isDefaultBrowser": false,
                "crashCountSubmitted": 0,
                "profileAge": 121,
                "placesPagesCount": 508,
                "placesBookmarksCount": 77,
                "addonCount": 14,
                "version": "12.0a1"
            }
        },
        "2012-02-01": {
            "simpleMeasurements": {
                "uptime": 2,
                "main": 17,
                "firstPaint": 582,
                "sessionRestored": 468,
                "isDefaultBrowser": false,
                "crashCountSubmitted": 0,
                "profileAge": 120,
                "placesPagesCount": 456,
                "placesBookmarksCount": 76,
                "addonCount": 14,
                "version": "12.0a1"
             },
             },
             {
             "sessions": {
                 "id": "mozmetrics@mozilla.org",
                 "completedSessions": 10,
                 "appDisabled": false,
                "completedSessionTime": 5852,
                 "version": "0.1",
                 "completedSessionActiveTime": 780,
                 "installDate": "2011-10-11T14:59:08.000Z",
                 "abortedSessions": 1,
                 "updateDate": "2011-10-26T13:26:45.000Z"
                 "abortedSessionTime": 222,
                 "abortedSessionActiveTime": 55
             }
             }
         ]
         },
    },
        "2012-01-31": {
    "events": {
            "search": {
        "search": {
                "abouthome": {
            "abouthome": {
                    "Google": 1
                "Google": 1
                }
             },
             },
             "searchbar": {
             "sessions": {
                 "Google": 3,
                 "completedSessions": 3,
                 "Amazon.com": 1,
                 "completedSessionTime": 73096,
                 "Other": 1
                 "completedSessionActiveTime": 310
             },
             },
             "urlbar": {
             "simpleMeasurements": {
                 "Google": 1
                 "uptime": 2,
                "main": 10,
                "firstPaint": 566,
                "sessionRestored": 446,
                "isDefaultBrowser": false,
                "crashCountSubmitted": 0,
                "profileAge": 119,
                "placesPagesCount": 452,
                "placesBookmarksCount": 77,
                "addonCount": 14,
                "version": "12.0a1"
             }
             }
         },
         },
         "sessions": {
         "2012-01-30": {
            "completedSessions": 16,
            "sessions": {
            "completedSessionTime": 829,
                "completedSessions": 10,
            "completedSessionActiveTime": 535,
                "completedSessionTime": 2202,
            "abortedSessions": 2,
                "completedSessionActiveTime": 640,
            "abortedSessionTime": 7,
                "abortedSessions": 2,
            "abortedSessionActiveTime": 15,
                "abortedSessionTime": 60,
             "abortedSessionAvg": 4,
                "abortedSessionActiveTime": 50
             "abortedSessionMed": 4,
            },
            "currentSessionActiveTime": 10,
             "search": {
            "currentSessionTime": 20,
                "abouthome": {
            "aboutSessionRestoreStarts": 0
                    "Google": 1
        },
                },
        "corruptedEvents": 0
                "searchbar": {
     }
                    "Bing": 1,
                    "Other": 1,
                    "Google": 2
                }
            },
             "simpleMeasurements": {
                "uptime": 2,
                "main": 11,
                "firstPaint": 698,
                "sessionRestored": 564,
                "isDefaultBrowser": false,
                "crashCountSubmitted": 0,
                "profileAge": 118,
                "placesPagesCount": 479,
                "placesBookmarksCount": 74,
                "addonCount": 14,
                "version": "12.0a1"
            }
        }
     },
    "currentTime": "2012-02-02T14:19:30.522Z"
}
}
</pre>
</pre>
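As a reading aid for the version 2 format, the following minimal sketch totals search counts per engine across all days in <code>dataPoints</code>. The payload is a trimmed copy of the sample above, and the helper name <code>search_totals</code> is an assumption made for illustration; it is not part of the proposal.

```python
import json
from collections import Counter

# Trimmed copy of the version 2 sample above; only the parts needed here are kept.
SAMPLE = """
{
  "ver": 2,
  "dataPoints": {
    "2012-02-02": {"search": {"searchbar": {"Google": 1}, "abouthome": {"Google": 1}}},
    "2012-02-01": {"simpleMeasurements": {"uptime": 2}},
    "2012-01-30": {"search": {"abouthome": {"Google": 1},
                              "searchbar": {"Bing": 1, "Other": 1, "Google": 2}}}
  }
}
"""


def search_totals(payload):
    """Sum search counts per engine over every day and every search entry point."""
    totals = Counter()
    for day in payload.get("dataPoints", {}).values():
        # A day may have no "search" key at all (see 2012-02-01 in the sample).
        for per_engine in day.get("search", {}).values():
            totals.update(per_engine)
    return totals


totals = search_totals(json.loads(SAMPLE))
print(dict(totals))
```

Because each day is keyed by date and each measurement group is optional, a consumer of this format has to tolerate missing keys, which is why the sketch uses <code>.get()</code> with empty defaults.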


== Server-side ==


*Clients will POST data to the configured URL not more than once every 24 hours.
*Clients will POST data to the configured URL not more than once every 24 hours.
*The first timer check should be one minute after startup.
*The first timer check should be one minute after startup.
*The POST data will consist of a JSON document containing a document ID and all the metrics that were collected since the last submission.
*The POST data will consist of a JSON document containing a document ID and all the metrics that were collected since the last submission.
*The server side will receive the POST request and perform GeoIP location on the IP address. ''The raw IP will never be stored. ''The GeoIP data and submission timestamp will be added to the JSON document.
 
== Server-side ==
*The server side will receive the POST request and perform GeoIP location on the IP address. ''The raw IP will never be stored in the document.'' The GeoIP data and submission timestamp will be added to the JSON document.
*The server will store the JSON document into a daily staging collection with all other documents received during that date, UTC.
*The server will store the JSON document into a daily staging collection with all other documents received during that date, UTC.
*The server will return an HTTP response to the client indicating success of the storage and a document ID. For the initial feature release, this ID will be the same as the one passed in (i.e. an installation GUID). It can easily be changed to be new each time (i.e. a document GUID). If the ID is new, the client should store it to be returned on the next submission.
*If the POST request contains a header with the ID of a previously submitted document, the server will delete that old document as part of the transaction of storing the new one.
*The server will return an HTTP response to the client indicating success of both the deletion of the old document and storage of the new document.
*In the future this response might also include instructions to the client for things such as changing timing or MetricsDataPing configuration.
*In the future this response might also include instructions to the client for things such as changing timing or MetricsDataPing configuration.
*Asynchronously, the server will retrieve a document with the same document ID from the "latest" bucket if one exists and will insert/update the "latest" bucket with a merged document that does not include any metrics we wish to avoid collecting longitudinally per installation, such as GeoIP. This "latest" bucket is used to perform retention analysis since it will have the last submitted data for any installation even if it is no longer in use. We will set a retention policy for when these inactive installation documents shall be deleted from the "latest" bucket.
*Longitudinal data for 6 months (e.g. intensity of use) is stored cumulatively in the JSON objects indexed by document ID. ''Documents older than 6 months are deleted.''
*Longitudinal data for 6 months (e.g. intensity of use) is stored cumulatively in the JSON objects indexed by GUID. ''Anything older than 6 months is deleted.''
*''At the end of the day, UTC, the server will aggregate all the documents submitted on that date and store the aggregate data (with no document IDs) in aggregate history tables in our data warehouse.'' (In subsequent releases of the MDP project, there will be a public API to retrieve information from these aggregate views to support additional analysis by users and the community.)
*''At the end of the day, UTC, the server will aggregate all the documents submitted on that date and store the aggregate data (with no installation ID) in aggregate history tables in our data warehouse.''
*''There will be UI elements inside of Firefox that allow users to delete all their data (remotely and locally).''
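
The storage steps in the list above can be sketched roughly as follows. This is an in-memory illustration only, not the Bagheera implementation: the class <code>StagingStore</code>, the function <code>lookup_geo</code>, and the field names <code>geo</code> and <code>submissionTime</code> are all assumptions made for the example.

```python
from datetime import datetime, timezone


def lookup_geo(ip):
    """Hypothetical stand-in for a GeoIP lookup; a real server would query a GeoIP database."""
    return {"country": "??"}


class StagingStore:
    """In-memory stand-in for the daily staging collections (illustration only)."""

    def __init__(self):
        self.by_day = {}  # UTC date string -> {document ID -> stored document}
        self.day_of = {}  # document ID -> UTC date string it is stored under

    def _remove(self, doc_id):
        """Delete a stored document if it exists; report whether anything was deleted."""
        day = self.day_of.pop(doc_id, None)
        if day is None:
            return False
        del self.by_day[day][doc_id]
        return True

    def submit(self, doc_id, document, client_ip, previous_id=None, now=None):
        now = now or datetime.now(timezone.utc)
        stored = dict(document)
        # The raw IP is only used for the lookup; it is never written into the document.
        stored["geo"] = lookup_geo(client_ip)
        stored["submissionTime"] = now.isoformat()
        # Delete the previously submitted document, if the client named one.
        deleted = previous_id is not None and self._remove(previous_id)
        # Replace any document already stored under the same ID, then store the new one.
        self._remove(doc_id)
        day = now.strftime("%Y-%m-%d")
        self.by_day.setdefault(day, {})[doc_id] = stored
        self.day_of[doc_id] = day
        return {"stored": True, "deletedPrevious": deleted, "id": doc_id}


store = StagingStore()
first = store.submit("doc-a", {"ver": 2}, "192.0.2.1",
                     now=datetime(2012, 2, 1, 12, 0, tzinfo=timezone.utc))
second = store.submit("doc-b", {"ver": 2}, "192.0.2.1", previous_id="doc-a",
                      now=datetime(2012, 2, 2, 12, 0, tzinfo=timezone.utc))
print(second)
```

A real server would perform the deletion and the store in one transaction, as the list requires; the sketch only shows the order of the steps and what the response reports.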


= Data Access Policies =
= Data Access Policies =
Line 150: Line 331:
* Must be a member of the metrics team
* Must be a member of the metrics team
* Must have an SSH account with LDAP integrated key
* Must have an SSH account with LDAP integrated key
* Must have MPT-VPN access
* Must have VPN access


= User Data =
= User Data =
Line 163: Line 344:
= UI Implementation =
= UI Implementation =


The Metrics Team is consulting with UX to determine the proper UI implementation. Given the opt-out requirement, UX proposes a check box to opt-out in the preferences pane and notifying users through non-modal and non-chrome channels (blog posts, privacy policies, download pages).
The Metrics Team is consulting with UX to determine the proper UI implementation of the preference to control data submission. Given the opt-out requirement, UX proposes a check box to opt-out in the preferences pane and notifying users through non-modal and non-chrome channels (blog posts, privacy policies, download pages).


see: https://bugzilla.mozilla.org/show_bug.cgi?id=707970
see: https://bugzilla.mozilla.org/show_bug.cgi?id=707970
Below are some wireframes showing potential UI layouts of the about:metrics page that users can use to review the data and use it for analysis of their own installation.
[[image:about-metrics-initial-gt800px.png|left|thumb|200px]]
[[image:about-metrics-initial-lt800px.png|left|thumb|200px]]
[[image:about-metrics-analysis-gt800px.png|left|thumb|200px]]
[[image:about-metrics-analysis-lte800px.png|left|thumb|200px]]
<br style="clear:both;" />


= Security Reviews =
= Security Reviews =


Review for Bagheera, the back end server that recieves and stores user data: [https://bugzilla.mozilla.org/show_bug.cgi?id=655746 https://bugzilla.mozilla.org/show_bug.cgi?id=655746]  
Review for Bagheera, the back end server that receives and stores user data: [https://bugzilla.mozilla.org/show_bug.cgi?id=655746 https://bugzilla.mozilla.org/show_bug.cgi?id=655746]
 
'''The below feedback was provided by [[User:BenB]] and was retained on this page at his request.  Discussion regarding these points is available here: [[Talk:MetricsDataPing]]'''


= Privacy =
= Privacy =
Line 222: Line 414:
* policy changes on the Mozilla side.
* policy changes on the Mozilla side.


Having a UUID would allow, for example, to track all my dynamic IP addresses over time, and allow to build a profile, when combined with access logs. If I have a notebook or mobile browser, it would even allow to track the places where I go based on IP geolocation / whois data.
Having a UUID would allow, for example, to track all my dynamic IP addresses over time. That would allow to track my notebook or mobile and thus the places where I go based on IP geolocation / whois data. For example, you could see that I am normally in my little village with 3000 households, but suddenly I appear at IBM headquarters, so clearly I am working for or consulting them. Or you could see who exactly my friends are, because my device appears on the same IP address as theirs, and you can even see roughly how often I am there or they in my place.


The user has no way to verify whether any of the above (break-in, intercept, intended or lawful or not) is happening or not, and that already is a privacy violation. So, it's irrelevant what the intended usage was, only what is theoretically possible. The above must be impossible - not just "We won't do it, we promise!", but impossible.
Even if you are not collecting the data right now, the user has no way to verify whether any of the above (break-in, intercept, intended or lawful or not) is happening or not, and that already is a privacy violation. So, it's irrelevant what the intended usage was, only what is theoretically possible. The above must be impossible - not just "We won't do it, we promise!", but impossible.


=== Google Chrome ===
=== Google Chrome ===
Line 247: Line 439:


The current proposal changed a stable UUID for a profile to a submission ID. However, the previous submission ID is also transferred, which allows the server to trivially match them together and still build a unique ID on the server. (Again, whether the server does that or not is immaterial.) So, the submission ID proposal has the same privacy consequences discussed above.
The current proposal changed a stable UUID for a profile to a submission ID. However, the previous submission ID is also transferred, which allows the server to trivially match them together and still build a unique ID on the server. (Again, whether the server does that or not is immaterial.) So, the submission ID proposal has the same privacy consequences discussed above.
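
To illustrate the point, here is a minimal sketch of how a server that sees both the previous and the new submission ID could rebuild a stable per-installation identifier. The function name <code>link_submissions</code> and the IDs are made up for the example; nothing here claims the server actually does this.

```python
def link_submissions(submissions):
    """Map every submission ID to the first ID of its chain.

    `submissions` is a list of (previous_id, submission_id) pairs in arrival
    order; previous_id is None for a first-time submission.
    """
    root_of = {}
    for previous_id, submission_id in submissions:
        # If the previous ID is known, inherit its root; otherwise start a new chain.
        root_of[submission_id] = root_of.get(previous_id, submission_id)
    return root_of


# Two installations, each rotating its submission ID on every ping.
log = [(None, "s1"), (None, "t1"), ("s1", "s2"), ("t1", "t2"), ("s2", "s3")]
roots = link_submissions(log)
print(roots)
```

Every ID in a chain resolves to the same root, so the root behaves like the stable UUID that the submission ID was meant to replace.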
= Anonymous alternative =
The following is an alternative approach, proposed by Ben Bucksch:
For simplicity, I will take the number of crashes (e.g. in the last week or overall) as the data point that you want to gather. The data itself is anonymous and cannot (apart from fingerprinting, more on that later) identify a single user.
== Avoiding UUID ==
You wanted to know which profiles are not used anymore (dormant, retention problem) and which characteristics they have. This is inherently difficult without tracking individual users (installations), but it is possible with the following algo:
The client submits:
* Date of last submission - e.g. 2012-01-18
* Current date (from client perspective) - only date, not time - e.g. 2012-01-20
* Age of profile (Firefox installation) in days - e.g. 500
* (Last submitted age is implied or explicit - e.g. 498 )
* Number of crashes - e.g. 15
* Number of crashes submitted last time - e.g. 10
Then, on the server, you write that information in a database, as such:
Date of submission | Age of installation | Crash count | Number of users
2012-01-20        | 500                | 15          | 100000
Any additional user also submitting today the same combination "age 500, crash count 15" increases the "number of users" column by 1, new value is 100001.
Also, you look up the row for the last submission, namely
2012-01-18        | 498                | 10          | 20000
and decrease the number of users by 1, new value is 19999.
If the user later that day decided that there were too many crashes and switches to Chrome, he will now be stranded on the row
2012-01-20        | 500                | 15          | 5000
while other users who have continued to use FF have been subtracted after a while. So, you can say with certainty that there were 5000 users who used Firefox the last time on 2012-01-20, after having used Firefox for 500 days, and they had 15 crashes (per day/week/total, whatever you submit) when they stopped using Firefox.
That is exactly the information you are so desperately seeking. There, you have it. Without tracking any individual user: it's completely anonymous.
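
The counting scheme described above can be sketched as follows. This is one possible reading of the proposal, with made-up names (<code>CohortTable</code>, <code>submit</code>); the row key is (date of submission, age of installation, crash count) and the value is the number of users.

```python
from collections import Counter


class CohortTable:
    """Counts users per (submission date, installation age, crash count) row."""

    def __init__(self):
        self.users = Counter()

    def submit(self, date, age, crashes, last_date=None, last_age=None, last_crashes=None):
        # Move the user out of the row of the previous submission, if there was one.
        if last_date is not None:
            self.users[(last_date, last_age, last_crashes)] -= 1
        # Count the user in the row of the current submission.
        self.users[(date, age, crashes)] += 1


table = CohortTable()
# Two users submit on 2012-01-18 with the same values.
table.submit("2012-01-18", 498, 10)
table.submit("2012-01-18", 498, 10)
# One of them comes back two days later; the other never submits again.
table.submit("2012-01-20", 500, 15, "2012-01-18", 498, 10)
print(table.users)
```

The user who stopped submitting remains counted in the 2012-01-18 row, which is the "stranded" count the text describes; no per-user identifier is stored at any point.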
== Avoiding Fingerprinting ==
Now, what about all the other information that you need: startup times, addons, etc.? If we just add all that information to the same table and row, it would allow fingerprinting. But that is not necessary. You merely make one table per atomic information. I.e.
Table A
Date of submission | Age of installation | Crash count | Number of users
Table B
Date of submission | Age of installation | Startup time | Number of users
or of course whatever other database schema you want, as long as each value is separate. That takes care of the fingerprinting.
At least on the server side, not on the submission side. I would have to trust you, and anything between you and me. It would be possible to separate the calls and submit each value separately, but I think that would be overdoing it.
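
A minimal sketch of the one-table-per-value idea, assuming the server splits each submission on arrival. The names <code>tables</code> and <code>record</code> and the metric names are hypothetical.

```python
from collections import Counter, defaultdict

# One counter table per metric: metric name -> {(date, age, value) -> number of users}.
tables = defaultdict(Counter)


def record(date, age, metrics):
    """Split one submission into independent per-metric rows."""
    for name, value in metrics.items():
        tables[name][(date, age, value)] += 1


record("2012-01-20", 500, {"crashCount": 15, "startupTime": 629})
record("2012-01-20", 500, {"crashCount": 15, "startupTime": 580})
print(dict(tables["crashCount"]))
print(dict(tables["startupTime"]))
```

Since no stored row holds more than one measured value, the combination of values that a single installation submitted cannot be read back from the tables. As the text notes, this only protects the data at rest on the server, not the submission itself.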