User:Sidstamm/Notes July 2013 PETS


These are crappy notes, and far from complete, but are some of the main takeaways for me.


11am - Crowdsourcing paper

Mobile Application Evaluation Using Automation and Crowdsourcing, Shahriyar Amini, Jialiu Lin, Jason Hong, Janne Lindqvist and Joy Zhang

  • Support application inspection through better tools and processes. (Scalability and Transparency are important)
  • They will use crowd sourcing to gauge sensitivity of things (will the app be dealing with sensitive data or topics?)
  • Analysts inspect apps to determine privacy score.
  • Iterations of heuristics and traversing the app (uses TaintDroid to monitor I/O and stuff)
  • Made a tool called GORT for running app in a simulator and looking at data flows.

Steps for review: 1. Heuristics 2. traversal (GORT) 3. crowd reviews with input from 1 and 2 4. All input goes to analyst for decision + score

Pretty good diagrams in the slides.

Used mechanical turk for crowd sourcing. Showed app screenshots and asked questions about identifying information disclosure "obviousness" and comfort levels with how the app works.


12:07 - community privacy

Enhancing Tools For Community Privacy, Dennis Kafura, Tom Dehart, Manuel Perez-Quinones, Denis Gracanin and Andrea Kavanaugh

  • a condition where a group of people is able to ensure the confidentiality of the sensitive information they share
    • keeping a secret in the group
    • may not be info about anyone in the group
    • think internal affairs audits.
  • This helps with "disclosures" and "inadvertent insiders"

They tag data, and tagged artifacts are only visible to people in the community.
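A rough sketch of the tag idea as I understood it, not the CMAIL implementation. The `Artifact` class, the `COMMUNITIES` mapping, and the rule that a user must belong to the community of every tag on an artifact are all my own assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    name: str
    tags: set = field(default_factory=set)


# Hypothetical mapping: tag -> members of the community that owns the tag.
COMMUNITIES = {
    "audit-2013": {"alice", "bob"},
}


def can_view(user, artifact, communities=COMMUNITIES):
    """Untagged artifacts are visible to everyone; a tagged artifact is
    visible only to users in the community of every tag it carries."""
    return all(user in communities.get(tag, set()) for tag in artifact.tags)


memo = Artifact("findings.txt", {"audit-2013"})
public = Artifact("lunch-menu.txt")
```

Here `can_view("alice", memo)` holds, while an outsider such as `"carol"` can only see `public`.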

Did a user study to see if a custom mail client that can use these tags would be usable.

  • confusion about the meaning of tags
  • people's mental models of tags didn't jibe with this use of them.
  • mental model clash with email distro lists
  • Timeliness (agreements on use outside the group) wasn't fast enough.

Demoed CMAIL email system.


  • How else can we present tags, communities and exceptions? Things that make more mental model sense.
  • How can we make this work across other types of data and other systems that aren't built for this?


Anonymouth Revamped: Getting Closer to Stylometric Anonymity, Andrew W.E. McDonald, Jeffrey Ulman, Marc Barrowclift and Rachel Greenstadt

hiding your writing style

Stylometry - measuring writing style attributes. You can analyze text and figure out who wrote it.

Hard to disown writings.
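To make the attribution idea concrete, here is a toy sketch (mine, not JStylo or anything from the paper). It assumes function-word frequencies as the only style feature and nearest-neighbour matching by Euclidean distance; the word list and the function names are made up for illustration. Real stylometry uses far richer feature sets and classifiers.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny feature set: relative frequency of common function words.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "it", "but"]


def style_vector(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]


def distance(u, v):
    """Euclidean distance between two style vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closest_author(unknown_text, known_texts):
    """Pick the author whose sample text has the nearest style vector."""
    target = style_vector(unknown_text)
    return min(
        known_texts,
        key=lambda author: distance(target, style_vector(known_texts[author])),
    )
```

The point of a tool like Anonymouth is to push a document's feature vector away from its true author's, which is why changing word choice matters more than changing topic.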


  • Machine translation (EN->DE->JP->EN) does too little or too much. (Not enough mixing or too much to confuse the writing.)
  • People had difficulty imitating writers in a study.
  • You can obfuscate, but the quality of the texts is low.

Anonymouth: Java program using JStylo that uses machine learning to introduce style changes. Proposes change suggestions much like spell check.

Shows an anonymity bar so you know how much "anonymous" you've put in.

Takes a long time to process an input (with selected corpus of anonymity set author documents).


Please Fix Your Interfaces: The Ease of Basic Usability, Sumeet Gujrati and Eugene Vasserman

Bunch of complaints about UIs

PETS 2013


Efficient E-cash in Practice: NFC-based Payments for Intelligent Transportation Systems, Gesine Hinterwalder, Christian T. Zenger, Foteini Baldimtsi, Anna Lysyanskaya, Christof Paar and Wayne Burleson

  • must be Low Cost (tokens, paper tickets)
  • payments must be quick (few 100ms to pass through gates)
  • Attributes at registration (tied to the payment mechanism) for audience measurement and stuff (not auto-transmitted)
  • Users can select and reveal pre-committed attributes at time of use.

Payments verified offline, and uses ECDHKeyAgreement in BlackBerry API.

Based on ECC

This doesn't work with multi pass -- must purchase one-time passes each time to keep anonymity in the use of the transport system.

Also no concept of transfers in this.

Lorrie's keynote

Privacy Notice and Choice in Practice, Lorrie Faith Cranor

Showed some quilts -- we have a patchwork/quilt of privacy laws in the US... nothing holistic, but state-based, city-based, and hodgepodge.

Status quo: notice + choice. Wall of text + opt-out of some stuff. Theory is that people will make informed choices.

Global privacy enforcement (go read policies and point out lies). Privacy sweeps. These aren't scalable.

And nobody wants to read privacy policies.

What is the cost of users reading all privacy policies? How much time would it take?

  • Time = 244 hours/yr
  • Cost = $3534 /yr
  • National pop cost = $781 Billion
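A quick back-of-the-envelope check of how the three figures above relate to each other. Only the three inputs come from the slide; the derived hourly value and population are implied by the arithmetic, not numbers I wrote down from the talk.

```python
hours_per_year = 244          # time to read all policies, per person
cost_per_person = 3534        # USD per year, per person
national_cost = 781e9         # USD per year, whole country

# Implied value of one hour of a person's time.
implied_hourly_value = cost_per_person / hours_per_year

# Implied number of people the national figure was scaled to.
implied_population = national_cost / cost_per_person

print(f"implied hourly value: ${implied_hourly_value:.2f}")
print(f"implied population:   {implied_population / 1e6:.0f} million")
```

So the per-person cost works out to roughly $14.48 per hour, and the national figure corresponds to about 221 million people.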

Privacy icons! One way to make this cost lower. Tried simpler icons with some words (since they needed words for understanding) But then word choice matters a lot.

Nutrition label! Brief summaries of the policies.

P.G. Kelley, L.F. Cranor, and N. Sadeh. Privacy as part of the app decision-making process. CHI 2013. -- made it easy to help find apps based on privacy facts (including permissions and ratings).

P3P! She wrote the O'Reilly book on it. Adopted in 2002 as a spec, but lacks incentive for adoption. IE blocks third-party cookies that lack P3P compact policies, but accepts even crappy P3P policies.

Do Not Track. Simpler than P3P. But there's no shared understanding of what "tracking" means.

Lots of tools: browser privacy settings, add-ons, opt-out cookies, DAA AdChoices.

"Smart, Useful, Scary, Creepy" paper: wanted to identify what kind of awareness exists about OBA and how people's mental models correspond with notice and choice. 48 non-technologists, interested in privacy. Part-way through, showed the WSJ video to inform people (since nobody knew anything about OBA). Most people were unaware that ads were targeted based on multiple sites (not just one site or contextual). Nobody recognized the OBA icon. People weren't sure how to find and control cookie settings. People trust more popular brands (Google, Microsoft) and are skeptical of other brands -- brand awareness is king when transparency exists. No idea how to effectively exercise choice.

Next, same people, tested efficacy of tools. Found lots of bad UI. Ghostery is hard to understand. The TPL UI is hard. Do people get what the OBA icons indicate? No.

"Are they actually any different? Comparing thousands of financial institutions' privacy policies" -- why do they all look the same? Gramm-Leach-Bliley Act, mandated annual privacy disclosures. Standardized notice (2009) incentivized by a safe harbor. Two pages. They parsed 'em. A lot of banks actually don't share. The 100 largest do. Top cc companies are the worst. Banks are not all the same. FCRA requires opt-outs, but some institutions don't offer them. Adoption happens when there are incentives.

How effective is privacy notice and choice? Not very.

How do we make it better?

  • Incentives for adoption.
  • Enforcement (legal and technical)
  • Baseline requirements - minimum levels of protection
  • standardization of notices
  • machine-readable notices
  • Reduce ambiguity
  • Links to full disclosures
  • Comparison tools


User study asking whether people knew about OBA - were users distressed to the point that HS or IRB got mad? Not that distressed, just uneasy. You can tell the IRB that you're educating the participants.

What's your opinion on opt-in v. opt-out? Opt-in sounds good from a privacy perspective, but they're not that different when they're effective. Both depend wholly on implementation and presentation.

Were your study participants aware of the privacy harms? Most people haven't spent time thinking about privacy. People don't have a handle on what happens when your privacy is invaded. Some people have their own particular bogeyman.

Did you notice differences across demographics in your user studies? Not huge differences. Samples are small in labs, but the online studies don't show huge differences. Sometimes in age or gender. Nothing dramatic. Mainly people are concerned about different things.

Pattern that goes beyond 'this is hard' ... the incentives aren't quite right. This makes Self Reg seem iffy. What might work? wish I had the answers.  :( The FTC has kept things from getting really bad by threatening lawmaking, but it's not helped things get a lot better. Can-kicking.

What do you think of the economics of privacy? People financially benefiting from the data have no incentives to stop collection or use. Especially in competition with others in the same market. Advantage of regulation is putting the whole industry on a level playing field.

How about apps? What about JIT notice after app installation since nobody reads things? Yeah, in particular for mobile apps.

Lots of apps include advertising, which includes tracking. Does it make sense to raise the base requirements for the advertising part of the app vs what the app does outside advertising? Maybe. Problem is that app devs don't know what the ad net's privacy policies are.

Circumventing Censorship

OSS: Using Online Scanning Services for Censorship Circumvention, David Fifield, Gabi Nakibly and Dan Boneh

Lots of services make HTTP requests to carry out a task

Three ideas:

  1. Many services will get a web page for you.
  2. It's possible to embed a lot of upstream data in HTTP requests.
  3. There are a variety of ways a server can cause an HTTP client to download another URL of the server's choice.

Assumptions:

  • A censor can see all your traffic and blackhole some IPs (not a DPI adversary).
  • Many online scanning services (OSS) exist outside the firewall and can be used as circumvention bridges.

Goal: use OSS to get past censor to a relay.

Analogy: long distance ad, "Bob Weadababyitsaboy" collect call steganography

Upstream Method: send payload in query string to OSS, "OSS please scan;flkjsfj"
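A minimal sketch of the upstream method as I understood it: the client asks the OSS to scan a relay URL whose query string carries the payload. The endpoint URLs, the `url` and `d` parameter names, and the base64 encoding are all hypothetical choices of mine, not details from the paper.

```python
import base64
from urllib.parse import parse_qs, urlencode, urlparse

OSS_ENDPOINT = "https://oss.example/scan"   # hypothetical scanning service
RELAY_URL = "https://relay.example/in"      # hypothetical relay outside the firewall


def build_scan_request(payload: bytes) -> str:
    """Client side: hide the payload in the URL we ask the OSS to scan."""
    token = base64.urlsafe_b64encode(payload).decode("ascii")
    relay_target = RELAY_URL + "?" + urlencode({"d": token})
    return OSS_ENDPOINT + "?" + urlencode({"url": relay_target})


def relay_extract(requested_url: str) -> bytes:
    """Relay side: recover the payload from the URL the OSS fetched."""
    token = parse_qs(urlparse(requested_url).query)["d"][0]
    return base64.urlsafe_b64decode(token)
```

The censor only sees the client talking to the OSS; the relay sees a fetch coming from the OSS and pulls the data out of the query string.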

Downstream+Upstream Method: Redirect bouncing to keep redirecting OSS between client and relay. "Ping Pong" back and forth method.

Can use nested frames/iframes in a document to trigger "redirects". Meta tags work too, or a JS to submit forms.

Tested a lot of OSS and browser for payload capacity and iteration acceptance.

Developed "packet structure" for the URL encoding.
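I didn't capture the actual packet layout, so this is only a guess at what a URL-friendly framing could look like. The field order (session, sequence number, total count, body), the `.` separator, and the 64-byte chunk budget are all hypothetical.

```python
import base64

MAX_CHUNK = 64  # hypothetical per-URL payload budget in bytes


def to_packets(session: str, data: bytes, max_chunk: int = MAX_CHUNK):
    """Split data into URL-safe packets: session.seq.total.base64body.
    The session name must not contain '.' since that is the separator."""
    chunks = [data[i:i + max_chunk] for i in range(0, len(data), max_chunk)] or [b""]
    return [
        f"{session}.{seq}.{len(chunks)}."
        f"{base64.urlsafe_b64encode(chunk).decode('ascii')}"
        for seq, chunk in enumerate(chunks)
    ]


def from_packets(packets):
    """Reassemble packets in sequence order, whatever order they arrived in."""
    parts = {}
    for packet in packets:
        _session, seq, _total, body = packet.split(".", 3)
        parts[int(seq)] = base64.urlsafe_b64decode(body)
    return b"".join(parts[i] for i in range(len(parts)))
```

Sequence numbers matter here because each OSS has its own payload capacity and requests can arrive out of order or be retried.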

OSS tend to know if you're doing this.

flow fingerprinting

The need for flow fingerprints to link correlated network flows, Amir Houmansadr and Nikita Borisov

Linking flows using communication patterns (can't use contents)

Active adversary can tag flow patterns. (This is the threat model).

A flow watermark is a tag that has a single bit "flow contains this tag". A flow fingerprint has multiple bits (ID of tagger, location of tagger, etc)
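A toy illustration of the watermark vs. fingerprint distinction, not the scheme from the paper. It assumes a flow with constant inter-packet gaps and encodes each bit as an extra delay; the constants, the secret pattern, and the function names are hypothetical.

```python
BASE_GAP_MS = 10.0   # hypothetical: untagged flow has constant inter-packet gaps
DELAY_MS = 5.0       # hypothetical extra delay that encodes a 1 bit

WATERMARK = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical secret pattern


def tag_flow(bits):
    """Tagger: turn a bit sequence into a sequence of inter-packet gaps."""
    return [BASE_GAP_MS + (DELAY_MS if b else 0.0) for b in bits]


def read_bits(gaps):
    """Observer: threshold each gap halfway between the two encodings."""
    threshold = BASE_GAP_MS + DELAY_MS / 2
    return [1 if g > threshold else 0 for g in gaps]


def has_watermark(gaps):
    """Watermark: a single yes/no answer, 'does this flow carry the tag?'"""
    return read_bits(gaps[:len(WATERMARK)]) == WATERMARK


def read_fingerprint(gaps, n_bits):
    """Fingerprint: recover a multi-bit value (e.g. a tagger ID) from the flow."""
    bits = read_bits(gaps[:n_bits])
    return int("".join(map(str, bits)), 2)
```

The detector in `has_watermark` can only confirm or deny one known pattern, while `read_fingerprint` recovers whatever value the tagger chose to embed.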

Some analysis about how their tagging size affects detectability and reliability of the fingerprinting system.


Panel: PETS Publishing, Matt Wright, Claudia Diaz and Arvind Narayanan (Moderator: Aaron Johnson)

(Discussions about open access)

profiling in social nets

How Much is too Much? Leveraging Ads Audience Estimation to Evaluate Public Profile Uniqueness, Terence Chen, Abdelberi Chaabane, Pierre Ugo Tournoux, Mohamed Ali Kaafar and Roksana Boreli

Revisited Latanya Sweeney's study about uniquely identifying people with DOB, zip code and name, but used an online social net (OSN)

They attempted to compute the uniqueness of OSN public profiles. Can't crawl Facebook, though! Can't estimate by unbiased sample and project because it's tough to estimate the profiles not in the sample.

Rely on ad platforms to compute uniqueness of public OSN profiles.

Data set: 100 million user IDs in Facebook public profiles. Sampled 500k of them.

Defined information surprisal (IS): IS(u^A) = -log2(P(u^A)), where P(u^A) is the probability of a user having an OSN profile with attribute set A.
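A small sketch of the surprisal computation. It assumes the probability is estimated as an audience size divided by the total user count, which is my reading of how the ad platform numbers would be used; the function names are mine.

```python
import math


def information_surprisal(audience_size, total_users):
    """IS = -log2(P), with P estimated as audience_size / total_users."""
    p = audience_size / total_users
    return -math.log2(p)


def is_unique(audience_size, total_users):
    """A profile is unique when its surprisal reaches log2(total_users),
    i.e. the attribute set matches a single user."""
    return information_surprisal(audience_size, total_users) >= math.log2(total_users)
```

For example, an attribute set matching 1 user out of 1024 carries 10 bits of surprisal, while one matching half the population carries only 1 bit.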

Used the Facebook ad audience platform since it's the full data set. This also includes private profiles. But computed the difference (factored out by the probability a profile is private).

[Lost track, lots of things about "we don't know if Facebook does x"]

city, gender and age only uniquely identify 18% of users on Facebook.

german ID card stuff

On the Acceptance of Privacy-Preserving Authentication Technology: The Curious Case of National Identity Cards, Marian Harbach, Sascha Fahl, Matthias Rieger and Matthew Smith

German government deployed an auth scheme suitable for large-scale use on the internet. By 2020, all Germans are expected to have it.

Auth is deployed and works.

Privacy-preserving (no correlation across services and anonymous access to services).

more privacy and fewer passwords.

Main purpose is to make eGovernment processes better. Centralize things, make some gov't processes online.

Encryption, user consent for data access, authentication of partners (not just users), limited data for the transaction, inability to monitor cardholder, card revocation, anonymous authentication.

Uses card-verifiable certs, extended access control, etc.

Requires card reader, client software (free) and 6 digit pin.

Formed focus groups to figure out why adoption is not wide. (limited to students)

  • Most were confident with the scheme.
  • They worked hard to hide usernames and passwords
  • They physically hid security tokens used for banking.
  • "I think I'm safe, because someone would tell me if I wasn't."
  • Password recovery can fail, people worry about this.
  • No clear added value or motivation to adopt eID
  • Need expert/external opinion to convince them
  • Too complicated.
  • "I'd rather have everything in my own hands"
  • Cannot be sure of what info is actually transmitted (are they spying?)
  • ID card may not be on hand when needed
  • Pain to fetch card reader
  • Card readers are too expensive
  • "a very personal document" is not suitable for "playing around on the internet"
  • Fear of worst-case consequences
  • Too much bad publicity, not enough positive publicity