Papers:Sending the Right Signals

This is Mozilla's submission for the upcoming W3C Workshop on Transparency and Usability of Web Authentication. The purpose of the Workshop is

"to identify steps W3C can take to improve Web Security from the user-facing end of the spectrum: Practical security online often fails because users can be induced to make decisions that jeopardize their security and privacy, based on a lack of working authentication of Web sites' identities (phishing). We will look at technologies that can support Web users to better assess the trustworthiness and identity of sites with which they deal."

Jane, IRL

Jane is traveling, and finds herself in an unfamiliar area. She turns a corner and sees a bank, a corner store, and a taxi. She's hungry and wants to get back to her hotel, so she enters the bank, uses her ATM card to withdraw some money, walks to the corner store and gets a local snack and drink, and finally hops in the cab and heads off.

How did Jane know that the bank could be trusted? How could she be sure that the food she was about to buy wouldn't make her sick? What convinced her that the taxi driver was on the level?

In the physical world, there are a variety of signals that Jane can use to establish a sense of trust. Some of these signals are physical in form, such as the architecture of the buildings, the cleanliness of the taxi, and freshness seals on packages. Other signals are entirely conceptual, such as brand recognition. In all cases, however, Jane's assessment of trust is based on levels of familiarity. If Jane recognizes the name of the bank, she will likely trust it completely. Jane may also decide to trust the bank if she recognizes the pattern of its name (e.g., First National Bank of Whereverland) or if its physical characteristics match her mental image of a bank. There is a chance that Jane will be fooled, but we tend to be very effective at pattern matching, and even small inconsistencies would very likely raise suspicion.

Jane, Online

Jane returns home from traveling, and decides to go online and plan her next trip. After using a search engine to look for recommendations, she finds herself on an unfamiliar message board. She sees a link to a website that builds custom vacation packages. Jane likes this idea, and follows the link, submits her preferences and identification information, and charges her next trip to her credit card.

This time, when Jane had to make her assessment of trust, she had a similar set of signals to choose from. The name of the website may have been a recognizable brand, or may have closely matched a pattern that was familiar to Jane. The look and feel of the website may also have matched Jane's expectation of what a professional website looks like.

Sadly, however, it is entirely possible that Jane had stumbled upon a malicious website impersonating a legitimate travel business. Jane may have just provided a malicious party with her personal identification and credit card information.

Signals, IRL vs. Online

The physical world is obviously different from the online world. What is less obvious is that we all carry a set of expectations -- a "default philosophy" -- based on our real world experiences, and we interpret everything through this philosophy, including our "virtual world" experiences online (for more on this idea, see Small Pieces Loosely Joined by David Weinberger). There are some fundamental differences between the signals available to an individual in the physical and online worlds, however, and these differences are what make internet users so vulnerable to attack.

  • Tangibility: Perhaps the most obvious difference is that the physical world is tangible whereas the virtual world is not. When an individual visits a location in the physical world, they can examine it directly and in many dimensions. In the virtual world we are limited to the dimensions presented to us by the software used to view the virtual objects. As a result, we experience objects in the physical world in many more dimensions than those of the virtual. The additional dimensions (such as weight, smell, depth, tactile sensation) all provide contextual signals which are absent from objects in the virtual world, and which can contribute to one's evaluation of trust.
  • Cost of Impersonation: Closely related to tangibility is the cost of impersonation. Because physical world objects must be convincing in so many dimensions, and because the human brain is so adept at recognizing patterns and exceptions to patterns, the task of impersonating an entity in the real world is both complex and costly. Virtual world objects, on the other hand, are easy to impersonate as they exist in far fewer dimensions, and can be duplicated with almost no cost or complexity.
  • Familiarity: As individuals, we have existed in the physical world for our entire lives. As a civilization, we have existed in the physical world for hundreds of years. This familiarity yields expectations of how objects in the physical world will look, feel and behave. The virtual world, on the other hand, is new and unfamiliar to many of its users. As a result, there is less of an expectation of how an entity should appear in the virtual world. While it is true that many virtual entities such as banks have patterned themselves after one another (e.g., similar features, navigation structure and use of a prominent client login area), these patterns are young and malleable. The physical world, on the other hand, has well established patterns that result in an expectation of what an entity such as a bank would look like (e.g., tellers, thick doors, slips of paper, a security guard).
  • Consistency: Signals from the physical world are consistently presented to us through our own senses. We cannot modify our senses, merely interpret the signals that we receive through them. In the virtual world, however, there is an intermediary between the object and our senses. The software used to present a virtual object presents signals about that object in an arbitrary fashion. As a result, signals from the virtual world are not necessarily consistently presented, but are instead dependent on the tool with which we are viewing the virtual object.

Evaluations of trust in the physical world are assisted by the fact that entities are tangible, costly to impersonate, familiar and consistently interpreted by our own senses. In the virtual world, however, we are hindered by the fact that entities are intangible, easily impersonated, unfamiliar and interpreted by clients that are not necessarily consistent.

Any solution that aims to simplify the task of evaluating trustworthiness in the virtual world therefore needs to address these limitations on our abilities. The virtual world is, however, filled with objects that are by definition intangible, by design easily impersonated, and by immaturity unfamiliar. The only factor within our control is the consistency with which signals are presented to the user.

Available Online Signals for Trust

As established above, in the online world an individual must make a judgement of trustworthiness based on the signals available about a virtual object. In addition to signals such as name recognition or look and feel, the online world currently provides three additional signals that we can use to assist users in evaluating trustworthiness (a brief sketch of how the first two might be read in practice follows the list):

  • Encryption lets us comment on the likelihood that information exchanged with an entity has been intercepted in transit.
  • Certificates allow us to comment on the authenticity of an entity's claim to its identity as asserted by a certificate authority (CA).
  • Recommendations about the trustworthiness of an entity can be made by an organization or network.
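
As an illustration of the first two signals, the sketch below reads them for a single site using Python's standard socket and ssl modules. It is only a sketch: the hostname and output are illustrative, and a browser would obtain the same data from its own TLS stack rather than from code like this.

 # Read the encryption and certificate signals for one site.
 # Illustrative only; the hostname is a placeholder.
 import socket
 import ssl

 hostname = "example.org"
 context = ssl.create_default_context()  # validates the chain against trusted CAs
 with socket.create_connection((hostname, 443)) as sock:
     with context.wrap_socket(sock, server_hostname=hostname) as tls:
         # Encryption signal: which protocol and cipher protect the channel.
         protocol = tls.version()       # e.g. "TLSv1.3"
         cipher_name = tls.cipher()[0]
         # Certificate signal: who the entity claims to be, and which CA
         # vouches for that claim.
         cert = tls.getpeercert()
         subject = dict(pair[0] for pair in cert["subject"])
         issuer = dict(pair[0] for pair in cert["issuer"])

 print("Encrypted:", protocol, cipher_name)
 print("Claimed identity:", subject.get("commonName"))
 print("Identified by CA:", issuer.get("commonName"))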

Most web browsers available today provide some mechanism to present these signals to users. Unfortunately, each browser interprets and represents the signals slightly differently:

  • Inconsistent icons: Internet Explorer 7, Mozilla, Safari and Opera all use a lock icon to indicate when security signals are present, but Internet Explorer 7 also uses red and yellow shields to indicate when an entity is thought to be suspicious or malicious.
  • Inconsistent colors: Mozilla and Opera use a yellow background to indicate when encryption and certificate signals are present. Internet Explorer 7 uses green to indicate a positive interpretation of trust, yellow to indicate suspicion and red to indicate a negative interpretation of trust. Safari doesn't use color to communicate either of these signals.
  • Inconsistent location: Internet Explorer 7 and Opera display the entity's claimed identity in the URL bar, and allow an individual to investigate the name of the CA by clicking on this information. Mozilla displays the entity's claimed identity in the bottom right corner of the browser, and the name of the CA is shown when the user hovers over the lock icon in the URL bar. Safari displays both the entity's claimed identity and the name of the CA in a dialog presented when the user clicks on the lock icon in the upper right corner of the screen.
  • Inconsistent terminology: Internet Explorer 7 says that an entity is "identified by" a CA. Opera calls the CA the "Certificate issuer" and Mozilla says that an entity is "Signed by" a CA. Safari tells users that a given certificate has been "issued by" a CA. All browsers refer to "encryption", but present the encryption standards differently.
  • Inconsistent signals: Some add-on tools for popular browsers create a network of trust that results in a recommendation signal being presented to the user. These signals are not always present, and are not available in all web browsers.

Position on Usability of Website Authentication

The technologies and frameworks that exist in the virtual world for providing signals about website authentication are currently in flux. The next generation of web browsers will leverage whichever of these signals are workable at the time of their release. Perhaps that will be SSL/PKI, or SSL/PKI with multi-tiered certificates, or perhaps it will be a network of trust or some other heuristic measure based on meta-browsing habits. These technologies should continue to be allowed to grow in ways that address the hard questions of implementing security infrastructures.

It is our position, however, that tools which are used to connect individuals to entities in the virtual world should be consistent in the way they present the available signals to users. This consistency will allow a user to move from browser to browser without having to re-learn how to interpret the browser's signals about the trustworthiness of an entity. Consistency also shapes user expectations, and helps breed familiarity.

Consistency also promotes clarity, since users can focus on understanding a single concept instead of having to interpret multiple expressions of a single concept. Clarity is also enhanced by avoiding technology-centric terms, and by focusing on assisting the user with the single task of making a judgement regarding the trustworthiness of a virtual entity.

Organizations like the W3C often focus on ensuring that vendors consistently observe a technology standard. The resources and processes of these organizations should also be used to promote standards for expressing these signals to users. An example expression based on our current technologies might be (a hypothetical sketch of how such an expression could be represented follows the list):

  • A connection to an entity should be said to be secure when the connection is encrypted and it can be reasonably assured that communication is restricted to the user and the entity.
  • If a connection is signed, then the entity should be said to be identified with some name, by some CA.
  • If a signal exists (through FoaF networks, whitelists, preferred CA signatories, etc.) that asserts a site to be trustworthy or untrustworthy, then the entity should be said to be recommended or suspected by some organization.
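
One way to picture such an expression is as a small, technology-neutral structure that any browser could render in the same words, whatever the underlying source of the signal. The sketch below is purely hypothetical; the names and fields are illustrative and are not a proposed standard.

 # A hypothetical, technology-neutral container for the signals described above.
 from dataclasses import dataclass, field
 from typing import List, Optional

 @dataclass
 class TrustSignals:
     # "Secure": the connection is encrypted and restricted to the user and the entity.
     secure: bool = False
     # "Identified": the name the entity claims, and the CA asserting that claim.
     identified_as: Optional[str] = None
     identified_by: Optional[str] = None
     # "Recommended"/"Suspected": organizations vouching for or warning about the entity.
     recommended_by: List[str] = field(default_factory=list)
     suspected_by: List[str] = field(default_factory=list)

 # However a browser gathers these signals, the wording shown to the user
 # could stay the same:
 signals = TrustSignals(secure=True,
                        identified_as="travel.example",
                        identified_by="Example CA")
 if signals.secure and signals.identified_by:
     print(f"Identified as {signals.identified_as} by {signals.identified_by}")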

Even the above example, however, limits us to expressions based on our current technology. Ideally, the standards for the expression of security signals would be general in nature, insulating the user from the requirement to understand the underlying technology used to generate those signals.