Labs/Jetpack/Reboot/JEP/115

From MozillaWiki

JEP 115 - Content Frames

  • Champion: Brian Warner - warner@mozilla.com
  • Status: Under Review
  • Bug Ticket:
  • Type: API
  • Difficulty:

Proposal

Summary: Add-ons need to display UI elements inside content frames in a way that is safe, flexible, and easy to use.

Many UI functions of add-ons (including existing JEPs like Panel, Single UI Element, status-bar frames, etc) will have an API that creates a new area for displaying information. The most obvious way to provide this information is in a string of HTML, or a URL that references one (most likely pointing to a resource bundled with the add-on itself, according to JEP 106).

A very common use case for these "content frame" elements will be to incorporate information obtained from remote web servers. For example, a Twitter status display would perform GETs and POSTs to retrieve text about user status and messages, and then merge this into some layout HTML to form the content for a statusbar frame or side panel, etc.

An equally common vulnerability in existing Firefox add-ons is to allow this interpolated text more authority than mere text ought to have, such as by executing script tags from the remote data. This frequently results in the remote site, or the people who post information to it, acquiring complete control over the user's browser (e.g. http://www.net-security.org/secworld.php?id=8527, http://lwn.net/Articles/348769/, http://security-assessment.com/files/presentations/liverani_freeman_abusing_firefox_extensions_defcon17.pdf).

It must be easy for add-on developers to use data from arbitrary sources without creating such vulnerabilities. Asking developers to perform content scanning is fragile and impractical: it's fail-unsafe, not fail-safe.
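To illustrate why content scanning fails unsafe, here is a minimal sketch of the kind of blacklist filter an add-on author might write. The function names, the payload strings, and stealCookies() are all hypothetical and are not part of this JEP. Every payload passes through the filter with its active content intact.

```javascript
// Hypothetical blacklist filter: remove <script>...</script> pairs.
function stripScriptTags(html) {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, "");
}

// Crude detector used only to show what survived the filter.
function looksActive(html) {
  return /<script|onerror\s*=|javascript:/i.test(html);
}

// Remote data carrying code in ways the filter does not anticipate.
let payloads = [
  '<img src="x" onerror="stealCookies()">',           // event attribute
  '<a href="javascript:stealCookies()">click</a>',    // javascript: URL
  '<scr<script></script>ipt>stealCookies()</script>', // reassembles after stripping
];

let filtered = payloads.map(stripScriptTags);
let survivors = filtered.filter(looksActive);
```

Each new attack shape needs a new rule, and any rule that is missing leaves the add-on exposed. That is the fail-unsafe property described above.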

On the other hand, it needs to be easy to develop rich UI components, with buttons and other active elements, such that specific functions in the add-on's code are invoked when those elements are clicked. Without this, the content frame would be doomed to merely display inert data, scarcely qualifying as a UI element.

This JEP provides a proposal for balancing these two goals.

References for current Mozilla techniques:

API Description

Each content-frame-using API will accept a "frame" property bag, which contains at least a "content" property to describe how to fill the frame. The frame is inert by default: to allow more behavior, an "allow" property must be added, on which various subproperties can be used to control what the new frame can do.

let p = Panel({ frame: { content: "<html><body>Hello World.</body></html>"
                        } });

The returned object will have a "frame" property which holds a Frame object, which holds all the same properties that were used to construct the content frame. Many properties can be changed after construction.

The returned object will have a way to get access to the DOM of the new frame's contents. This DOM can be manipulated to change the contents of the frame.

The "content" property can either be a string of HTML, or a URL instance to reference content coming from somewhere else. JEP 106 provides a way to get URLs that reference content that is bundled into the add-on.
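As a rough sketch of how an implementation might tell the two kinds of "content" value apart, the code below uses a stand-in marker class. FrameURL and describeContent are hypothetical names; FrameURL merely plays the role of the URL class from require("url"), and none of this is specified by the JEP.

```javascript
// Stand-in for the URL marker class; hypothetical, for illustration only.
class FrameURL {
  constructor(spec) {
    this.spec = String(spec);
  }
}

// Classify a "content" value: URL instances are fetched, strings are HTML.
function describeContent(content) {
  if (content instanceof FrameURL) {
    return { kind: "url", value: content.spec };
  }
  if (typeof content === "string") {
    return { kind: "html", value: content };
  }
  throw new TypeError("content must be an HTML string or a URL instance");
}
```

Under this rule a plain string such as "http://example.com/" is treated as literal HTML and never fetched. That is one argument for requiring an explicit URL instance rather than guessing from the shape of the string.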

Examples

Inert Frames

The simplest sort of Panel creation will look like this:

let p = Panel({ frame: { content: "<html><body>Hello World.</body></html>"
                        } });

This creates a new Panel with static contents. No scripts of any sort are run, nor can any be attached later (via event-handler attributes). Event handlers will be ignored. All URLs inside the content (such as in <img> or <style> tags) must begin with a slash and reference content that is bundled with the add-on.

Since no remote GETs or POSTs can be caused directly by content inside the frame, the question of how such access should be controlled (and credentials included) is moot. Since no javascript can run inside the frame, the question of how it should communicate with its creator is also moot.

To use content that comes bundled with the add-on, create the panel with a URL like this:

let data = require("self").data;
let p = Panel({ frame: { content: data.url("mypanel.html")
                        } });

As above, this would not allow any javascript to be run.

To use content that comes from a remote server, create the panel with an HTTP or HTTPS URL:

let URL = require("url").URL;
let p = Panel({ frame: { content: URL("http://example.com/0wnmypanel.html")
                        } });

This content will, like regular HTML coming from a web server, be able to cause GETs to arbitrary servers via IMG and other tags.

Active Frames

We draw a distinction between the outer-level code which creates a frame (the "creator code") and the confined inner code that lives inside the frame ("content code").

To allow active content code, the Panel (or other content frames) will be constructed with an explicit flag to allow such things:

let p = Panel({ frame: { content: data.url("mypanel.html"),
                         allow: { script: true }
                        } });

The behavior enabled by allow: { script: true } is as follows:

  • <script> tags are executed
  • onClick and other "event attributes" are executed
  • for frames created with http/https URLs, same-domain XHR is allowed. (for frames created with bundled content, using data.url(), XHR is disallowed)
  • frames can use <img> tags (and others) to cause GET requests to be sent to arbitrary servers
  • event attributes can be added to content elements by adding them to DOM nodes.
  • cookies?? credentials?? TBD. I think these should not be sent.

By default, all of these behaviors are disabled. If there is no "allow" property, or there is no "script" property, or if the "script" property has a value of false, these behaviors will be disabled. A future version of this document will specify other values for the "script" property. For example, we may retroactively define {script: true} to be expanded into:

{script: { scriptTags: true,
           eventAttributes: true,
           sameDomainXHR: true,
           externalGET: true,
           setEventAttributesViaDOM: true
         } }

and thus allow other combinations of these properties to be expressed.
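The default-deny rule and the possible expansion can be sketched as a small normalization step. This is only an illustration: expandScriptPermission is a hypothetical helper, and the object form of "script" is the speculative future syntax mentioned above, not something this JEP defines yet.

```javascript
// The five behaviors listed above, using the names from the expansion.
const SCRIPT_FLAGS = [
  "scriptTags",
  "eventAttributes",
  "sameDomainXHR",
  "externalGET",
  "setEventAttributesViaDOM",
];

// Turn a "frame" property bag into an explicit flag-per-behavior object.
// Anything not explicitly granted with the value true stays disabled.
function expandScriptPermission(frame) {
  let script = frame && frame.allow ? frame.allow.script : undefined;
  let expanded = {};
  for (let name of SCRIPT_FLAGS) {
    if (script === true) {
      expanded[name] = true;                  // shorthand: enable everything
    } else if (script !== null && typeof script === "object") {
      expanded[name] = script[name] === true; // assumed future fine-grained form
    } else {
      expanded[name] = false;                 // missing or false: inert
    }
  }
  return expanded;
}
```

For example, expandScriptPermission({ content: "..." }) yields every flag set to false, while passing allow: { script: true } yields every flag set to true.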

Content Manipulation

Once the frame is created, the DOM can be manipulated as a normal page:

let p = Panel({ frame: { content: data.url("mypanel.html")
                        } });
p.document.getElementById("progress").textContent = "0%";

If the frame was created with the ability to set event attributes via the DOM (which is implied by allow: { script: true }), then this same technique can be used to attach event handlers:

function go(event) { ... }
p.document.getElementById("button").onclick = go;

When using this approach, care must be taken to avoid granting too much authority to the frame's code. The content code will be able to invoke any functions that are attached in this fashion.

Best Practices

In general, it is best to assume that any code running inside a content frame is doomed to be compromised. This is doubly true when the code is coming from a remote server (rather than from a resource bundled inside the add-on), or when information from remote servers is injected or mixed into the code. Keep the bulk of the code outside the frame. Use a minimum of event handlers, and only allow them to notify the parent code about changes (rather than having them perform significant processing).

When possible, add-ons should use the inert form (with no allow: property), since these are always safe. If using the active form,

Sample use cases:

Constant Inert 
a help panel, containing only HTML which is bundled with the add-on, not incorporating any variable data:
let p = Panel({ frame: { content: data.url("help.html")
                        } });
Variable Inert 
a weather display, containing a temperature retrieved from an external site:
var temp = get_temperature_with_XHR();
var html = "<html><body>Temperature is: " + temp + "</body></html>";
let p = Panel({ frame: { content: html } });
Remote Inert 
a weather display, using HTML provided by a remote server, perhaps via their mobile-formatted URL. This ignores any script tags provided by the remote site.
let u = "http://m.weather.example.com/" + cityname;
let p = Panel({ frame: { content: URL(u) } });
Remote Active 
a weather display, using HTML provided by a remote server, including any scripts they provide, perhaps to enable flashy scrolling widgets or animated raindrops
let u = "http://m.weather.example.com/" + cityname;
let p = Panel({ frame: { content: URL(u), allow: { script: true } } });

Note that the foreign code inside the panel will be able to perform arbitrary interaction with m.weather.example.com and reveal information to arbitrary other sites via GETs. However, it will not have any access to the add-on code which created the panel.

Remote Active with Extra Behavior 
a weather display, with a button to refresh the contents
let u = "http://m.weather.example.com/" + cityname;
let p = Panel({ frame: { content: URL(u), allow: { script: true } } });
function refresh(event) { p.frame.reload(); }
// A text node cannot receive clicks, so create a real button element.
let button = p.document.createElement("button");
button.textContent = "Refresh";
button.onclick = refresh;
p.document.body.appendChild(button);

Note that, in addition to the abilities described above, in this case the foreign code will be able to invoke the refresh method as much as it likes, with arbitrary event objects. The refresh method must tolerate malicious invocation.
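One way to make such a handler tolerate malicious invocation is to ignore whatever arguments the content code supplies and to limit how often the underlying action can run. The sketch below is illustrative only: makeGuardedHandler, its parameters, and the one-second interval are assumptions, not part of this JEP.

```javascript
// Wrap an action so that content code can trigger it but cannot steer it.
// "now" is an injectable clock (defaults to Date.now) to ease testing.
function makeGuardedHandler(action, minIntervalMs, now = Date.now) {
  let last = -Infinity;
  return function guarded(/* event object deliberately ignored */) {
    let t = now();
    if (t - last < minIntervalMs) {
      return false; // called too soon: drop the request
    }
    last = t;
    action();       // invoked with no attacker-controlled arguments
    return true;
  };
}

// Possible use with the proposed API (not runnable outside an add-on):
// button.onclick = makeGuardedHandler(() => p.frame.reload(), 1000);
```

The handler never reads the event it is given, so a forged event object carries no influence. Repeated calls inside the interval are simply dropped.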

Future Expansion

We leave room in the allow: property to use new mechanisms in the future. In particular, rather than encouraging creator-code-to-content-code communication by using DOM manipulation to attach event handler functions, it may be easier to secure a bidirectional "JSON pipe". This would be enabled by providing allow: {control: PipeObject}, in which the content code gets an object it can write JSON-serializable objects into, and the creator code provides an object which will receive these objects.

The content's code will have an object named "control" in its global environment. By invoking "control.send(msg)", where "msg" is any JSON-serializable object, a message can be sent to the creator code. By attaching a handler with "control.subscribe(handler)", the content code can receive messages from the creator code. All objects passed through this interface shall be pass-by-copy and asynchronous. We expect conventions and libraries to be developed to implement basic RPC and event-passing mechanisms via this pipe. UI actions, such as clicking a button, should use this pipe to send a message to the creator code for subsequent processing.

The content code will have no other access to creator code, or to content code in any other frame, except through the JSON pipe. This allows the creator code to manage frames in isolation, unless it decides to allow them to communicate.
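The pipe semantics described above (pass-by-copy, asynchronous, bidirectional) can be sketched in a few lines. This is a toy model running in a single context, not the cross-frame implementation. createPipe, the "creator" endpoint name, and the injectable schedule parameter are assumptions; only send() and subscribe() on "control" come from the text.

```javascript
// Build a pair of endpoints. "schedule" defers delivery; it defaults to
// queueMicrotask and can be replaced (for example in tests).
function createPipe(schedule = queueMicrotask) {
  let handlers = { creator: [], content: [] };

  function endpoint(self, other) {
    return {
      // Register a handler for messages arriving at this side.
      subscribe(handler) {
        handlers[self].push(handler);
      },
      // Copy the message now, deliver it to the other side later.
      send(msg) {
        let wire = JSON.stringify(msg); // throws on cycles
        if (wire === undefined) {
          throw new TypeError("message is not JSON-serializable");
        }
        schedule(() => {
          for (let handler of handlers[other].slice()) {
            handler(JSON.parse(wire)); // each handler gets its own copy
          }
        });
      },
    };
  }

  return {
    creator: endpoint("creator", "content"), // held by the creator code
    control: endpoint("content", "creator"), // exposed to the content code
  };
}
```

Because the message is serialized at send time, a sender that mutates the object afterwards cannot change what the receiver sees, and neither side ever holds a reference to the other's objects.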

Outstanding Questions

It would be nice to require that JS code used in content frames conforms to the rules of ES5 Strict mode.

Regardless of what URL was used to populate the frame, the code within it shall have no communication with or influence over other frames (even if they were populated with the same or a related URL). The content URL shall only be used to provide the content HTML, nothing else.

Credentials: no HTTP requests caused by the content frame shall include any cookies or HTTP-auth credentials. No special authority is derived from the content URL.

If cheaper/faster safe-cooperation mechanisms are developed later (such as exposing a frozen object to the content code instead of the JSON pipe), those can be added without affecting existing code, by passing new properties into the creation API, and making new objects available in the content code's namespace.

If new authorities are defined and made available to content code, they will be enabled by passing new properties into the creation API. For example, to allow content code to use credentials like cookies, a specific cookie-access-granting object must be passed in through the creation API.


Can content code use require() to import code? Suggestion: no. Allow just enough code inside the frame to enable UI actions, and put the majority of the code in the creator. On the other hand, allowing non-authority-bearing libraries to be imported anywhere that code can run would make life easier: it would encourage the creation of, e.g., RPC libraries to cross the JSON-pipe boundary, and UI libraries to implement common widgets.

Is there any point to creating a content frame with a URL that points to an external web page? Maybe as a starting point we should only allow bundled resources, using JEP106 data.url() URLs. This would also address the question of what to use as the base for interpreting relative URLs.

Other issues:

  • if the content HTML can use IMG tags or other resource-fetch-causing elements, and we allow remote-site GETs for those resources, what authorization credentials (cookies and HTTP auth) should be included, if any? (by default, credentials should NOT be included)
  • how should relative URLs inside the content HTML be interpreted?
  • if we allow javascript to run inside the frame, what authorities should it have? (by default, it should have none)
  • how should javascript inside the frame communicate with the top-level code which created it?
  • implementation will depend upon platform details. These content frames are most likely to be implemented with Chrome IFrames (?), which have specific rules regarding how isolated they can be.
  • we could use content: HTML or url: URL to distinguish between content-providing mechanisms, instead of a URL marker class
  • we need to define the context/origin of each frame, to explain how relative URLs are interpreted, and how same-origin works, etc. I think that raw HTML content and URLs that begin with slash should both be treated as being "local", with all contents coming from the bundled resources, and deny access to external resources.
  • similarly, I think that frames loaded from external URLs should not get access to local bundled resources, which can be accomplished by saying that relative URLs are interpreted relative to the external URL, and that there are no other URLs which reference bundled resources.
  • David-Sarah Hopwood suggests splitting the authorities into two pieces: those that allow scripting of the frame content, and those that allow external network access. This is probably easier to implement than changing the way that the JS is handled.
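The origin rule suggested in the list above (local frames see only bundled resources, external frames see only URLs relative to their own location) could look roughly like the sketch below. It is one possible reading of an open question, not a decision. resolveInFrame and its return shape are hypothetical, and the global URL used here is the standard WHATWG URL parser, not the Jetpack URL marker class.

```javascript
// frameBase is null for "local" frames (raw HTML or bundled content),
// or the external URL string that the frame was loaded from.
function resolveInFrame(frameBase, ref) {
  if (frameBase === null) {
    // Local frame: only slash-rooted paths, which name bundled resources.
    // "//host/path" is a network-path reference, so it is rejected too.
    if (ref.startsWith("/") && !ref.startsWith("//")) {
      return { source: "bundled", path: ref };
    }
    return null; // external resources are denied
  }
  // External frame: everything resolves relative to the external URL.
  let url;
  try {
    url = new URL(ref, frameBase);
  } catch (e) {
    return null; // unparseable reference
  }
  // No scheme other than http/https is reachable, so bundled resources
  // cannot be named from an external frame.
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return null;
  }
  return { source: "remote", url: url.href };
}
```

With this rule, a frame loaded from http://m.weather.example.com/sf resolves "/img/rain.png" against that server, while the same reference inside a raw-HTML frame names a bundled file.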

Use Cases

Any add-on which wants to display information to the user that is more complicated than a single icon or menu item.