Privacy/Reviews:How It Works

From MozillaWiki

Notes for Privacy Review training (this is just for heavy privacy reviews -- also telemetry and bug-borne reviews)

Much of this mirrors security reviews, but the focus is more social and psychological than on principled software operation.

Four sections:

  1. High-level inputs/outputs and goals
  2. Data Flow Modeling
  3. Risk Analysis
  4. Follow-up

High-level inputs/outputs and goals

Inputs:

  • Data flow diagram
  • Data flow element tables

The inputs are valuable because:

  1. The data flow of the system is clear, obvious, and understandable
  2. An inventory of the collected/shared data is made explicit
  3. Data minimization techniques can be made easier

Outputs:

  • Perceived risks
  • Risk minimization requirements
  • Risk minimization recommendations
  • Risk already minimal rationale

These outputs are valuable because:

  1. They cause the team to think through privacy in design
  2. The risk analysis and outcomes document our decision-making
  3. We have rationale for how the project fulfills our privacy principles

Goals

  • Align our major work units with Mozilla's Privacy Operating Principles
    • No Surprises -- Will users reasonably expect what we do? Are we honest?
    • Limited Data -- Are we collecting / sharing only the data necessary for the user benefit?
    • Real Choices -- Can users control to what extent we use their data without completely opting out (where possible)? Are we leveraging "shiny" to get users to commit user data when it's not needed for "shiny"?
    • Sensible Defaults -- Do we anticipate the majority use case and default to that where possible (or most conservative data practices)?
    • Trusted Third Parties -- (Mostly Policy) Do we extend these expectations to our third-party partners?
  • Provide documentation of our thought process and rationale for why the feature is "safe".
    • Valuable for responses to press inquiries
    • Helps users who want to understand features' data implications without getting super-technical. (Clear statement of user benefit for collected data.)
    • Such documentation allows easy replication of the system (LEAD BY EXAMPLE)
    • More evidence of our transparent/open culture.
  • Provide more documentation of the feature
    • Helps new developers and integration teams understand APIs and feature operation
    • More developer documentation for third-party integration and partnering

Data-Flow Modeling

What makes up a data flow diagram? 1. Components 2. APIs

The components are the logical segments of a project controlled by either different parties or different development teams. A component does not necessarily correspond to a single interface; interfaces are often used within components to make parts reusable.

Components are roughly Domains Of Control. A component is not only a unit of operation; it often also includes a data store or database.

APIs are the messages passed between endpoints, including the type of data carried along as payload. These expose data sharing capabilities and collection flows.

Data Flow Diagram: Components are nodes, APIs are edges. Sometimes the flows are one-directional, sometimes they're multidirectional.
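As a rough illustration of the nodes-and-edges idea, the sketch below models each API call as a directed edge and derives the component list and a per-component data inventory from the edges. The component names, message names, and data types are hypothetical placeholders, not taken from any real review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One API edge: a message sent from one component to another."""
    source: str
    target: str
    message: str
    data: tuple  # types of data carried as payload


# Hypothetical flows for a three-component system (illustrative only).
FLOWS = [
    Flow("Client", "SyncServer", "upload_bookmarks", ("bookmark URLs", "account ID")),
    Flow("SyncServer", "Client", "upload_ack", ("timestamp",)),
    Flow("Client", "Metrics", "usage_ping", ("feature counters",)),
]


def components(flows):
    """Return the set of nodes implied by the edges."""
    return {f.source for f in flows} | {f.target for f in flows}


def data_inventory(flows):
    """Map each receiving component to the sorted data types it is sent."""
    inventory = {}
    for f in flows:
        inventory.setdefault(f.target, set()).update(f.data)
    return {name: sorted(items) for name, items in inventory.items()}


print(sorted(components(FLOWS)))
print(data_inventory(FLOWS))
```

Keeping each direction as its own edge means a one-directional flow is simply an edge with no counterpart going the other way.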

For each component, project teams document in tabular form the data flowing into and out of the component, with one flow table per other component it interacts with. Some of the information is redundant, because each flow appears in the tables of both components involved.

EXAMPLE: Three components: X, Y, Z. For component X:

Communication with Y

Direction   Message                 Data
in          message1                associated data
out         return from message1    associated data

Communication with Z

Direction   Message                 Data
in          message3                associated data
out         message4                associated data
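The flow tables for a component can be derived mechanically from a flat list of flows. The following is a minimal sketch using the X, Y, Z example; the tuple layout and the function name `flow_tables` are assumptions made for illustration, and the Y-to-Z flow (`message5`) is a hypothetical addition to show that unrelated flows are left out.

```python
def flow_tables(component, flows):
    """Group a component's flows by peer, as (direction, message, data) rows."""
    tables = {}
    for source, target, message, data in flows:
        if target == component:
            tables.setdefault(source, []).append(("in", message, data))
        elif source == component:
            tables.setdefault(target, []).append(("out", message, data))
    return tables


# Each flow is (source, target, message, data).
FLOWS = [
    ("Y", "X", "message1", "associated data"),
    ("X", "Y", "return from message1", "associated data"),
    ("Z", "X", "message3", "associated data"),
    ("X", "Z", "message4", "associated data"),
    ("Y", "Z", "message5", "associated data"),  # hypothetical; does not involve X
]

for peer, rows in flow_tables("X", FLOWS).items():
    print(f"Communication with {peer}")
    for direction, message, data in rows:
        print(f"  {direction:<4} {message:<22} {data}")
```

Calling `flow_tables("Y", FLOWS)` lists the same two X/Y messages with the directions flipped, which is where the redundancy between tables comes from.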

Risk Analysis

There are two phases to Risk Analysis: 1. Initial privacy champion analysis (priming) 2. Community review

On the surface this looks like the project is hung on the wall and privacy nuts throw rocks at it. This is not really what happens; the analysts are using their experiences, intuition, and threat modeling to identify data practices that work against our privacy principles.

Phase 1: Privacy Champion Analysis

In the first phase, the privacy champion documents how the project aligns with each privacy principle (or where it risks violating the principles). The champion creates risk minimization recommendations and, where the feature operates against our privacy principles, documents the necessary tweaks as requirements. These recommendations should rarely change the project's feature set significantly, but they regularly suggest alternate approaches or architectures that accomplish the same outcome (user benefit) with lower risk.

Once the privacy principle alignment recommendations and requirements have been created, the privacy champion identifies any additional user data risks that do not correspond to a principle. Often this is based on intuition -- these are data practices that some users may find "creepy". Here, the privacy champion recommends ways to minimize data use without sacrificing functionality, or documents ways to gain user consent for the given operation.

Through this initial risk analysis, the privacy champion ensures that for all data collected, there is some benefit or functionality made available to users. Where it's not already clear, the champion asks the project team to explain and include in the review what user benefit is enabled by the collected or shared data.
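The "every collected item has a user benefit" check could be approximated with a simple lookup, as in the sketch below. The data item names and benefit statements are hypothetical, and a real review relies on judgment rather than a mechanical test; this only flags items for which no benefit has been written down yet.

```python
def missing_benefits(collected, benefits):
    """Return the collected data items that have no stated user benefit."""
    return [item for item in collected if not benefits.get(item, "").strip()]


# Hypothetical inventory and benefit statements (illustrative only).
collected = ["search query", "device locale", "install ID"]
benefits = {
    "search query": "returns search results",
    "device locale": "localizes the UI",
}

# Items listed here are the ones to ask the project team about.
print(missing_benefits(collected, benefits))
```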

Phase 2: Community Feedback

Once the privacy champion has identified and documented risks (and resolutions) in the project, the review is submitted to a forum for public discussion. Rarely can an individual identify all the risks and suggest the best minimization techniques; this is where the power of the community is engaged. Privacy reviews are performed on a very wide variety of engineering efforts, and domain experts lurking in the community forums will step in when given an opportunity to offer their expertise to the project.

After a period of feedback, a project manager or privacy champion distills the discussion into identified risks and actions the project team must or may take to minimize the risks.

Manifestation

So that the output of risk analysis is clear and gets incorporated into the project, risks and minimization actions are documented in the following format:

The Risk is ... (what is the problem)

Recommendation: what the team SHOULD do to make it better

Requirement: what the team MUST do to avoid negative backlash or escalation

{{ResolutionBox|{{<status>|summary}}}}

Which looks like:

Resolution:
[RESOLVED] summary

<status> is one of:

  • "new" - Risk analysis is complete, but the team has not yet received the feedback. Summary should contain specific actions necessary for resolution.
  • "ok" - Team is working on it. Summary should link to bugs or contain actions being taken that satisfy the requirements.
  • "risk" - This risk is still present and at risk of shipping. Summary should contain the state of the feature and what is left to do.
  • "resolved" - The risk is minimized. Summary should contain links to closed bugs or resolutions taken.
  • "unresolved" - Project team has decided to take no action to minimize this risk. Summary should link to where the decision was made.
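For anyone generating review pages with a script, the resolution box wikitext can be produced from a status and a summary. This is a minimal sketch: the template shape and the five status names come from the format above, while the function name `resolution_box` and the validation behavior are assumptions.

```python
# The five statuses listed above.
STATUSES = ("new", "ok", "risk", "resolved", "unresolved")


def resolution_box(status, summary):
    """Build the ResolutionBox wikitext for one risk."""
    if status not in STATUSES:
        raise ValueError(f"unknown status {status!r}; expected one of {STATUSES}")
    return "{{ResolutionBox|{{" + status + "|" + summary + "}}}}"


print(resolution_box("resolved", "summary"))
```

Rejecting unknown statuses early keeps typos such as "resolve" from producing a box that refers to a status outside the list.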

Follow-Up

After risk analysis, the results are documented (in the format shown above) to provide the teams clear action items. Either the privacy champion or project manager makes the project team aware of the action items (files bugs, gets team to file bugs, tracks progress on actions).

The privacy policy (and legal if necessary) teams are connected to the project team to ensure that as the privacy review is resolved, any relevant privacy policies get updated to reflect our products' changes in operations.

When actions are complete, the project team updates the wiki to reflect their progress on the tasks. If they're comfortable with it, they also update the resolution boxes and follow-up tasks section at the bottom of the page.

Once all the risks have been resolved in the document (all resolution boxes are "resolved"), the privacy review itself is marked as resolved and no further action is necessary. If any resolution boxes are marked "unresolved" at any time, the entire privacy review is marked "unresolved" and no further action is necessary unless the project team would like to resolve the risks.
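The roll-up rule for the overall review can be written as a small function. In this sketch, the "resolved" and "unresolved" outcomes follow the rule stated above; the "open" label for every other combination, including a review with no resolution boxes, is an assumption, since the page does not name that state.

```python
def review_status(box_statuses):
    """Derive the overall review status from its resolution box statuses."""
    statuses = list(box_statuses)
    # Any "unresolved" box marks the whole review as unresolved.
    if any(s == "unresolved" for s in statuses):
        return "unresolved"
    # The review is resolved only when every box is resolved.
    if statuses and all(s == "resolved" for s in statuses):
        return "resolved"
    # Assumption: anything else is still in progress.
    return "open"


print(review_status(["resolved", "resolved"]))
print(review_status(["resolved", "unresolved"]))
print(review_status(["resolved", "ok"]))
```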