MetricsDataPing: Difference between revisions

= Description =  


Measure adoption, retention, and aggregated search counts by engine. Record possible explanatory dimensions using a statistically unbiased and sound approach. Comparable projects that collect user data are TestPilot and Telemetry. Participants in these programs are self-selected. It has been demonstrated that data retrieved from TestPilot is biased and not representative of the Firefox population.
This project adds to Firefox the ability to measure adoption, retention, stability, performance, and aggregated search counts by engine. It records possible explanatory dimensions using a statistically unbiased approach. A key feature of the project is that users can review, analyze, and remove the data collected about their browser if they wish.


'''Note''': The description below is the current proposal from the metrics team, but has serious privacy problems discussed at the bottom. There is hope that the necessary data can be gathered entirely anonymously. The information below should therefore be considered subject to change.
'''Note''': The description below is the current proposal from the metrics team.  Some employees and community members have raised concerns about potentially serious privacy problems.  The metrics team is attempting to keep this page focused on the project status and specific technical implementation details.  Discussion, opposing views, and possible alternatives are encouraged on the [[Talk:MetricsDataPing|discussion page]] (also linked at the top of the page). A large portion of the original discussion and views created by [[User:BenB]] are also included [[MetricsDataPing#Privacy|below]] per his request that we not relocate them.  If there are any alternatives that meet the requirements of the project while taking a different approach that addresses the perceived privacy concerns, this page will be updated to reflect them.
 
== Requirements ==
*;Enable retention analytics:Mozilla has a critical need to be able to understand the factors that cause installations of Firefox to no longer be used. The system must have some way to detect an abandoned installation.  The current implementation handles this by using a generated document ID for each submission and deleting the previous submission on the server when a new one is posted.  With this method, an abandoned installation can be detected based on the age of the last submitted document.  Retention analysis includes being able to ask and answer questions such as:
** Are abandoned installations typically new or old? Were they created and used once then abandoned or were they used actively for a period of time?
** What were the performance and stability characteristics of abandoned installations?  Were they slow or crashy?
** Did the installations have addons that potentially affect stability or performance?
** Which operating systems and OS versions were abandoned installations running on?
** How frequently were abandoned installations upgraded?  Were they running the latest stable release or an old release?
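
The document-ID mechanism described above can be sketched as follows. This is a minimal illustration and not the project's actual server code: the <code>DocumentStore</code> class, its method names, the payload fields, and the 30-day threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


class DocumentStore:
    """Toy in-memory stand-in for the server-side store (hypothetical)."""

    def __init__(self) -> None:
        # document ID -> (time the submission was received, payload)
        self._docs: Dict[str, Tuple[datetime, dict]] = {}

    def submit(self, new_id: str, payload: dict, received: datetime,
               previous_id: Optional[str] = None) -> None:
        # A new submission replaces the previous one, so the store holds
        # at most one document per installation.
        if previous_id is not None:
            self._docs.pop(previous_id, None)
        self._docs[new_id] = (received, payload)

    def abandoned(self, now: datetime, max_age: timedelta) -> List[str]:
        # An installation counts as abandoned when its last submitted
        # document is older than max_age (the threshold is an assumption).
        return sorted(doc_id for doc_id, (received, _) in self._docs.items()
                      if now - received > max_age)


store = DocumentStore()
day0 = datetime(2012, 1, 1)
store.submit("doc-a1", {"version": "10.0"}, day0)
store.submit("doc-b1", {"version": "10.0"}, day0)
# Installation A keeps reporting; installation B goes silent.
store.submit("doc-a2", {"version": "10.0"}, day0 + timedelta(days=40),
             previous_id="doc-a1")
stale = store.abandoned(now=day0 + timedelta(days=45),
                        max_age=timedelta(days=30))
```

Here only <code>doc-b1</code> is reported as abandoned, because installation A replaced its old document with a recent one. The payload of a stale document is what would be examined to answer the retention questions listed above.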
 
*;Enable reliable unique installation counting:Mozilla currently uses Active Daily Installations (ADI) as a key metric for the health of the product.  ADI is currently calculated by looking at the number of AMO Blocklist requests made each day.  However, this is not a number that can be summed over timespans larger than one day. We cannot accurately measure how many unique active installations there were over a week or month.  Code was added to the Blocklist feature at the beginning of 2011 to enable this, but a technical limitation of the core implementation of the Blocklist feature (the metrics are not conditional on successful response from the server) rendered this method unreliable.  Further changes to the Blocklist system that were not relevant to the Blocklist feature itself were prohibited by the module owners.
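
To see why daily counts cannot simply be added up over a longer period, consider the small sketch below. The installation labels and the three days of data are invented for illustration. Note that the unique count needs some way to tell installations apart, which a bare count of daily requests does not provide.

```python
from typing import Dict, Set


def summed_daily_counts(daily: Dict[str, Set[str]]) -> int:
    """Add up the per-day counts, as one would with a daily total."""
    return sum(len(ids) for ids in daily.values())


def unique_over_period(daily: Dict[str, Set[str]]) -> int:
    """Count each installation once, however many days it was active."""
    return len(set().union(*daily.values()))


# Hypothetical activity: which installations were seen on each day.
activity = {
    "mon": {"a", "b", "c"},
    "tue": {"a", "b"},
    "wed": {"a", "d"},
}

naive_total = summed_daily_counts(activity)   # 3 + 2 + 2 = 7
unique_total = unique_over_period(activity)   # {a, b, c, d} = 4
```

The summed figure counts installation <code>a</code> three times, so it overstates the number of unique installations active during the period.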
 
*;Consolidate metric gathering and reduce or eliminate piggy-backing metrics on other systems:The metrics team agrees with the reasoning behind avoiding further changes to the Blocklist system.  We feel that metrics gathering should have a clear place in the code and give the user reasonable and useful control, without requiring them to turn off unrelated features to disable it.  Further, eliminating piggy-backing reduces unnecessary complexity in unrelated systems.
 
*;Ensure metrics are based on a statistically representative and unbiased majority of Firefox users:Any opt-in system is subject to self-selection bias.  Controlling for this bias and performing analysis with the purpose of optimizing for the majority of users is extremely prone to failure unless there is some unbiased source of data to use as a control.  Having MDP as this unbiased source will allow us to properly control for bias in other systems such as Test Pilot or Telemetry (without any linking of the datasets).
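
The effect of self-selection can be illustrated with a toy calculation. The numbers below are made up and do not come from Firefox, Test Pilot, or Telemetry; the sketch simply assumes that heavy users are the ones who opt in.

```python
from statistics import mean
from typing import List, Tuple

# Hypothetical population: (hours of browser use per day, opted in?)
population: List[Tuple[float, bool]] = [
    (1, False), (1, False), (2, False), (2, False), (3, False),
    (8, True), (9, True), (10, True),
]

# Average over everyone versus average over the opt-in sample only.
population_mean = mean(hours for hours, _ in population)
opt_in_mean = mean(hours for hours, opted_in in population if opted_in)
```

In this invented population the true average is 4.5 hours, while the opt-in sample alone suggests 9 hours. Without an unbiased reference such as the population-wide figure, there is no way to know how far off the opt-in estimate is.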
 
*;Provide end users who desire it with the ability to review the data being submitted and perform their own analysis locally:Tying this data to a concrete feature of Firefox through the about:metrics page is useful to the user, and it also allows the currently proposed implementation to comply with the requirements of various regulatory bodies.  We are developing the about:metrics interface to allow the user to answer the following types of questions.  There are many other questions we would like to enable the user or Mozilla to answer, but the initial implementation of this project was restricted to a set of metrics that are already available through other systems such as Blocklist (with the exception of search counts).
** What data is being collected about my installation by the MDP feature?
** How much am I using this browser?
** Has performance or stability improved since I installed this latest version?
** Has adding or removing specific add-ons caused a change in the browser's performance or stability?
 
*;Provide end users with the ability to remove the data collected about their installation from our servers:The goal is to demonstrate that metrics can be collected in a way that is transparent to users and gives them ownership and control.  The current implementation requires some form of document ID to enable the user to instruct the service to remove the data associated with their installation.
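
The following sketch shows, from the client's side, why removal depends on a document ID: the client has to remember the ID of its latest submission in order to name the document the server should delete. The <code>MetricsClient</code> class and its methods are hypothetical, and the use of random UUIDs as document IDs is an assumption made for the example.

```python
import uuid
from typing import Optional, Tuple


class MetricsClient:
    """Hypothetical client-side bookkeeping for document IDs."""

    def __init__(self) -> None:
        # ID of the most recent submission, if any.
        self.current_id: Optional[str] = None

    def next_submission(self) -> Tuple[str, Optional[str]]:
        # Generate a fresh ID for this submission and return it together
        # with the previous ID, which the server is asked to delete.
        previous = self.current_id
        self.current_id = str(uuid.uuid4())
        return self.current_id, previous

    def deletion_request(self) -> Optional[str]:
        # The user asks for removal: hand back the ID to delete and
        # forget it locally. Returns None if nothing was ever submitted.
        doc_id = self.current_id
        self.current_id = None
        return doc_id


client = MetricsClient()
first_id, nothing_to_replace = client.next_submission()
second_id, replaced_id = client.next_submission()
to_delete = client.deletion_request()
```

After the two submissions, <code>replaced_id</code> equals <code>first_id</code> and <code>to_delete</code> equals <code>second_id</code>, the only document still held on the server for this installation.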


= Data Elements =