WebExtensions/Implementing APIs out-of-process

This document is intended for WebExtension developers (as in implementers in Firefox, not addon developers) and gives an overview of how new WebExtensions APIs should be implemented in a way that is compatible with out-of-process addons.

The reader is assumed to be familiar with the layout of WebExtensions code in the tree, explained at https://wiki.mozilla.org/WebExtensions/Hacking#Code_layout

Process types

We distinguish the following process types.

  • main - The main process. Hosts the browser UI, privileged chrome pages and tab browsers.
  • addon - Hosts an addon's background page, popup, extension tabs, ...
  • content - Hosts web content (from the internet).

There is only one main process, and any number of addon and content processes. In practice, these do not necessarily run in separate processes:

  • At first there was only one process.
  • With e10s, one process was added for content (bugzil.la/e10s).
  • With e10s-multi, multiple processes may be used for content (bugzil.la/e10s-multi).
  • With webext-oop, addon code runs in a separate process (bugzil.la/webext-oop).
  • In the far future, addon code may also run in multiple processes.

WebExtensions code should be compatible with these process models. This means that code should be asynchronous where reasonably possible, especially when logic involves the use of information from different processes. The canonical representation of addons is stored in the main process.

Execution environments

Scripts in the following contexts can make use of addon APIs:

  • Extension tabs, background page, pageAction/browserAction popup.
  • Content scripts, extension frames in non-extension tabs.
  • PAC scripts (bugzil.la/1295807).
  • Devtools pages (bugzil.la/1291737).

For each of these, a subclass of BaseContext should be created, each with a unique envType. This envType is used for determining which scripts should be *activated* in the context. E.g. the following ext-script registers an API with the given namespace for the "content_child" context.

// an example of an ext-*.js script:
extensions.registerSchemaAPI("myNamespace", "content_child", (context) => {
  return {
    myNamespace: {
      syncMethod() { return "some return value"; },
      asyncMethod() { return Promise.reject({message: "Some error"}); },
    },
  };
});

In the ext-script, extensions is an instance of SchemaAPIManager. This class is responsible for maintaining the sandbox containing the ext-scripts. There is only one instance per process type. ext-scripts run in their own scope, but they can share state via the global object. This design prevents inadvertent leakage of global state between supposedly independent ext-scripts, which reduces the risk of divergent behavior if the independent scripts move to different processes.

envType

The first implemented values for envType are addon_parent, addon_child, content_parent and content_child. The envTypes with the "child" suffix should be used by addon APIs that run in the same process as the caller, usually because the method synchronously returns a value or because of dependencies on DOM APIs and such. The "parent" suffix refers to APIs that run in the main process.

When adding a new envType, check in how many process types the API may run. There is always at least one type, namely the process containing the addon code. If the addon code runs in the main process, odds are that you only need one envType. If the addon code runs in another process, and your API implementation requires chrome-specific functionality, you probably need two envTypes.


BaseContext, ExtensionContext, ProxyContext and subclasses

In the above snippet, the context parameter is an instance of BaseContext (or a subclass). This context is created whenever an addon runtime is created (e.g. a background page or content script).

BaseContext contains common functionality that can be relied upon by all ext-scripts, whereas some other functionality is only available in some specific contexts. E.g. context.contentWindow is only available to the contexts that are close to the content, such as the contexts with envType "content_child" or "addon_child".

An example of content_child is ExtensionContext in ExtensionContent.jsm. This context is created together with a ProxyContext of envType content_parent in the main process. Together they form the implementation of content script addon APIs.

New classes should derive from BaseContext, unless you are absolutely certain that your use case is compatible with the semantics of the superclass that you are using.
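
The relationship between envType and context-specific functionality can be sketched with toy classes. This is only an illustration: the class names (prefixed with Toy), the constructor arguments and the describe helper are assumptions made for this example and do not mirror the real BaseContext, ExtensionContext or ProxyContext.

```javascript
// Toy base class: the only state shared by every context in this sketch.
class ToyBaseContext {
  constructor(envType, extensionId) {
    this.envType = envType;
    this.extensionId = extensionId;
  }
}

// Lives in the process that runs the content script, so it can reach the window.
class ToyContentChildContext extends ToyBaseContext {
  constructor(extensionId, contentWindow) {
    super("content_child", extensionId);
    this.contentWindow = contentWindow;
  }
}

// Counterpart in the main process: same extension, no direct window access.
class ToyContentParentContext extends ToyBaseContext {
  constructor(extensionId) {
    super("content_parent", extensionId);
  }
}

// Hypothetical helper: an ext-script may only rely on contentWindow
// in contexts that actually provide it.
function describe(context) {
  return "contentWindow" in context
    ? `${context.envType}: ${context.contentWindow.location.href}`
    : `${context.envType}: no contentWindow`;
}

const fakeWindow = {location: {href: "https://example.com/"}};
const toyChildContext = new ToyContentChildContext("addon@example", fakeWindow);
const toyParentContext = new ToyContentParentContext("addon@example");

console.log(describe(toyChildContext));
console.log(describe(toyParentContext));
```

Both toy subclasses extend the base class directly, in line with the advice above to derive from BaseContext unless the semantics of a more specific superclass are known to fit.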


Generating APIs

The previous section explained the runtime environment of ext-scripts. This section explains how the ext-scripts are used to compose the chrome. and browser. objects that can be used by addons.

Initialization

API generation is completely based on JSON schemas, which specify the API and constraints. These schemas are registered in the category manager and loaded in the main process and shared with child processes before any addon is started.

The ext-scripts (which implement the APIs) are also declared in the category manager, and loaded via subscripts.

Both tasks are performed by SchemaAPIManager (or its subclasses), and APIs are only generated once the above steps complete. Since extensions (and their pages and content scripts) are only loaded once the schemas are loaded, and subscripts are loaded synchronously, this effectively means that any addon API can be generated when a BaseContext instance is created.

Internal API generation (untyped)

The first step towards API generation is to generate a context-specific instance of the APIs that were registered through the ext-scripts. This is done by calling the generateAPIs method of SchemaAPIManager (or on its subclasses). The generateAPIs method enumerates all registered APIs with a matching envType and deeply merges the result. So it is possible for one file to provide a partial implementation of a namespace, and complement it with another file.
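
The filtering by envType and the deep merge can be sketched as follows. This is a toy model, not the real SchemaAPIManager: ToySchemaAPIManager and deepMerge are hypothetical names, and only the shape of registerSchemaAPI(namespace, envType, getAPI) and the existence of generateAPIs are taken from the text above.

```javascript
// Recursively copy the properties of source into target (hypothetical helper).
function deepMerge(target, source) {
  for (let [key, value] of Object.entries(source)) {
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      if (typeof target[key] !== "object" || target[key] === null) {
        target[key] = {};
      }
      deepMerge(target[key], value);
    } else {
      target[key] = value;
    }
  }
  return target;
}

// Toy model of the registry; NOT the real SchemaAPIManager.
class ToySchemaAPIManager {
  constructor() {
    this.registered = [];  // Entries of {namespace, envType, getAPI}.
  }

  registerSchemaAPI(namespace, envType, getAPI) {
    this.registered.push({namespace, envType, getAPI});
  }

  // Collect every API registered for context.envType and deep-merge the results.
  generateAPIs(context) {
    let result = {};
    for (let {envType, getAPI} of this.registered) {
      if (envType === context.envType) {
        deepMerge(result, getAPI(context));
      }
    }
    return result;
  }
}

const toyManager = new ToySchemaAPIManager();
// Two "files" each contribute part of the same namespace.
toyManager.registerSchemaAPI("myNamespace", "content_child", () => ({
  myNamespace: {syncMethod() { return "some return value"; }},
}));
toyManager.registerSchemaAPI("myNamespace", "content_child", () => ({
  myNamespace: {otherMethod() { return 42; }},
}));
// Registered for a different envType, so it is skipped for content_child contexts.
toyManager.registerSchemaAPI("myNamespace", "addon_parent", () => ({
  myNamespace: {parentOnly() {}},
}));

const mergedAPI = toyManager.generateAPIs({envType: "content_child"});
console.log(Object.keys(mergedAPI.myNamespace));  // syncMethod and otherMethod
```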

The resulting object looks like the final API, but appearances are deceiving: it is only an intermediate API, to be used by the automatically generated APIs based on the JSON schemas.

Public API generation (checked)

The chrome. or browser. objects for addons are completely generated by Schemas.inject. This method takes a destination object (e.g. Cu.createObjectIn(window)) and a wrapper object, and adds all addon APIs with parameter validation to the destination object. These generated APIs themselves do not provide any API functionality. The wrapper is the binding between the internal API (providing the actual functionality) and the generated API (which only performs validation). The wrapper has to implement the InjectionContext interface as documented in Schemas.jsm.

This interface has two points of interest: shouldInject and getImplementation.

shouldInject

The first line of defense for restricting API access is (manifest) permissions. For fine-grained access control in addition to permissions, shouldInject can be used. shouldInject receives parameters to identify which function is being called, and an optional array of "allowedContexts". These "allowedContexts" can be declared in a JSON schema to annotate certain APIs and enforce access control. The following example shows how all APIs are withheld from content scripts unless explicitly enabled by adding "content" to "allowedContexts" in the JSON schema.

  shouldInject(namespace, name, allowedContexts) {
    // Do not generate content script APIs, unless explicitly allowed.
    if (this.context.envType === "content_child" &&
        !allowedContexts.includes("content")) {
      return false;
    }
    return true;
  }

With the above check, schema APIs are completely disabled unless the JSON schema specifies the context in "allowedContexts". With "permissions", schema APIs are enabled, unless the schema specifies a permission that the extension lacks.

Any API node can be annotated with "allowedContexts". To cater for the use case where a full namespace should be made available, the "defaultContexts" key can be used. Note that when "allowedContexts" is used to opt into an API (like the above example with "content"), then the namespace declaration in the JSON schema must also be annotated with the "allowedContexts" key.
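
The opt-in rule for content scripts, including the requirement that the namespace itself carries the annotation, can be sketched as follows. The schema fragment is a hypothetical, heavily simplified JavaScript object (the real schemas are JSON files with more structure), exposedFunctions is an assumed helper, and "defaultContexts" is deliberately not modeled.

```javascript
// Hypothetical, simplified schema fragment. Only the "allowedContexts" key is
// taken from the text above; the function names are made up for illustration.
const toySchema = {
  namespace: "myNamespace",
  allowedContexts: ["content"],      // Needed so that the namespace itself is injected.
  functions: [
    {name: "forContentScripts", allowedContexts: ["content"]},
    {name: "notForContentScripts"},  // No annotation: not opted in.
  ],
};

// Same check as in the shouldInject example, written as a standalone function
// that tolerates a missing allowedContexts array.
function toyShouldInject(envType, allowedContexts = []) {
  if (envType === "content_child" && !allowedContexts.includes("content")) {
    return false;
  }
  return true;
}

// List the function names that would be exposed for the given envType.
function exposedFunctions(schema, envType) {
  if (!toyShouldInject(envType, schema.allowedContexts)) {
    return [];  // The namespace is withheld, so nothing inside it is reachable.
  }
  return schema.functions
    .filter(fn => toyShouldInject(envType, fn.allowedContexts))
    .map(fn => fn.name);
}

console.log(exposedFunctions(toySchema, "content_child"));  // only forContentScripts
console.log(exposedFunctions(toySchema, "addon_child"));    // both functions
```

Removing "allowedContexts" from the namespace level of toySchema makes the first call return an empty list, which illustrates the note above about annotating the namespace declaration.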

Here are some examples of code that migrated from being untyped to using schemas to enforce types (note that "allowedContexts" / "defaultContexts" was initially called "restrictions" / "defaultRestrictions", but that changed in bug 1302898):

getImplementation

getImplementation is called once shouldInject has returned true for a given API method, event, property, etc. It should return an implementation of the SchemaAPIInterface. Currently, two implementations exist, LocalAPIImplementation and ProxyAPIImplementation.

LocalAPIImplementation is a thin wrapper that delegates any calls directly to the internal API. ProxyAPIImplementation implements invocation of APIs from a child process that are implemented in the main process (remote APIs).

The following wrapper can be used to generate an API that is fully implemented locally:

  getImplementation(namespace, name) {
    let pathObj = ...;  // Somehow find the API with the given name in the given namespace.
    // this.context is the BaseContext that this wrapper was created for.
    return new LocalAPIImplementation(pathObj, name, this.context);
  }

To implement remote APIs (optionally mixed with local APIs), see ChildAPIManager.
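
How shouldInject and getImplementation cooperate during injection can be sketched with a toy injector. Everything here is an assumption made for illustration: toyInject is not Schemas.inject, ToyLocalImplementation is not LocalAPIImplementation, and the "schema" only records a parameter count instead of real JSON schema types.

```javascript
// Toy stand-in for a local implementation: forwards calls to the internal API.
class ToyLocalImplementation {
  constructor(pathObj, name) {
    this.pathObj = pathObj;
    this.name = name;
  }
  callFunction(args) {
    return this.pathObj[this.name](...args);
  }
}

// Toy injector: for every schema entry, ask the wrapper whether to inject it,
// then expose a validating stub that delegates to the wrapper's implementation.
function toyInject(dest, wrapper, schema) {
  for (let [namespace, functions] of Object.entries(schema)) {
    for (let [name, {paramCount, allowedContexts}] of Object.entries(functions)) {
      if (!wrapper.shouldInject(namespace, name, allowedContexts || [])) {
        continue;
      }
      let impl = wrapper.getImplementation(namespace, name);
      dest[namespace] = dest[namespace] || {};
      dest[namespace][name] = (...args) => {
        if (args.length !== paramCount) {
          throw new Error(`${namespace}.${name} expects ${paramCount} argument(s)`);
        }
        return impl.callFunction(args);
      };
    }
  }
  return dest;
}

// Internal (untyped) API and a matching toy schema; all names are hypothetical.
const toyInternalAPI = {
  myNamespace: {
    greet(name) { return `hello ${name}`; },
    hidden() { return 1; },
  },
};
const toyFunctionSchema = {
  myNamespace: {
    greet: {paramCount: 1, allowedContexts: ["content"]},
    hidden: {paramCount: 0},
  },
};
const toyWrapper = {
  envType: "content_child",
  shouldInject(namespace, name, allowedContexts) {
    return this.envType !== "content_child" || allowedContexts.includes("content");
  },
  getImplementation(namespace, name) {
    return new ToyLocalImplementation(toyInternalAPI[namespace], name);
  },
};

const toyBrowser = toyInject({}, toyWrapper, toyFunctionSchema);
console.log(toyBrowser.myNamespace.greet("world"));  // validated, then delegated
```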

Remarks

  • The set of available addon APIs is the intersection between available internal APIs and injected schema-generated APIs.
  • Depending on the implementation of shouldInject, it is possible that a schema API is generated while no internal API exists. The behavior of this situation is unspecified, but odds are that some exception is thrown.
  • It is also possible that an internal API is provided, but inaccessible to addons because of incomplete schema definitions. This has no real consequences, except wasted resources on generating unnecessary objects. Incidentally, this is also why registerSchemaAPI takes an envType as an argument: content scripts only use a very limited subset of the extension APIs, so generating API bindings for all of them in content scripts would be a huge waste.


Writing new APIs

The existing ext-*.js files in the tree are good examples of API implementations. This section gives more insight into why some things are implemented the way they are.

Naming conventions

  • File name: Historically, the API implementation in ext-*.js runs in the main process. To distinguish between code running in content processes and the main process, a new prefix was introduced (ext-c-*.js). This c initially stood for "content" (content processes), but was later generalized to "child" because of the big overlap in the implementation of APIs for content processes (content scripts) and addon processes (privileged addon APIs).

Cross-process API implementations

There are multiple ways to implement APIs:

  • In the child only (e.g. ext-i18n.js)
  • In the parent process only (e.g. ext-history.js)
  • In the parent and child process where some methods run in the main process and others in the child process (e.g. ext-tabs.js and ext-c-tabs.js)
  • In the parent and child process where a method is partially implemented in the child process and partially in the main process (e.g. ext-storage.js and ext-c-storage.js)

If a method is not implemented in the child process, the ChildAPIManager will automatically route the API call to the parent process. Sometimes a method's arguments or return value are so special that the default behavior is insufficient. In that case you can choose to put (part of) the implementation in the child.

The ChildAPIManager (available as context.childManager for ExtensionContexts) has the following helper methods (with JSDoc documentation at their implementations in ExtensionUtils.jsm). All of these methods require at least the API path as an argument, which is used to find the implementation in the parent process.

  • callParentAsyncFunction - Call a method in the parent process, and resolve the promise/callback when that function returns.
  • callParentFunctionNoReturn - Call a method in the parent process without expecting a return value.
  • getParentEvent - Create a proxy for an event listener in the parent. See #Cross-process events.
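
The idea of a child-side method that does some local work and then defers to the parent can be sketched with a toy model. The method names callParentAsyncFunction and callParentFunctionNoReturn come from the list above, but their signatures here are simplified assumptions (the real methods are documented in ExtensionUtils.jsm). ToyParent, ToyChildManager, the "myNamespace" API and the use of setTimeout as a stand-in for inter-process messaging are all hypothetical.

```javascript
// Toy parent side: looks up "namespace.method" in an API tree and calls it.
class ToyParent {
  constructor(api) {
    this.api = api;
  }
  receive(path, args) {
    let [namespace, name] = path.split(".");
    return this.api[namespace][name](...args);
  }
}

// Toy child side. A message is "sent" by deferring the call to a later task,
// so the caller never gets the result synchronously.
class ToyChildManager {
  constructor(parent) {
    this.parent = parent;
  }
  callParentAsyncFunction(path, args) {
    return new Promise((resolve, reject) => {
      setTimeout(() => {
        try {
          resolve(this.parent.receive(path, args));
        } catch (e) {
          reject(e);
        }
      }, 0);
    });
  }
  callParentFunctionNoReturn(path, args) {
    setTimeout(() => this.parent.receive(path, args), 0);
  }
}

const parentCallLog = [];
const toyParent = new ToyParent({
  myNamespace: {
    echo: value => `parent saw ${value}`,
    reset: () => { parentCallLog.push("reset"); },
  },
});
const toyChildManager = new ToyChildManager(toyParent);

// Child-side API: normalize the argument locally, then defer to the parent.
const toyChildAPI = {
  myNamespace: {
    echo(value) {
      return toyChildManager.callParentAsyncFunction("myNamespace.echo", [String(value)]);
    },
    reset() {
      toyChildManager.callParentFunctionNoReturn("myNamespace.reset", []);
    },
  },
};

toyChildAPI.myNamespace.reset();
const pendingEcho = toyChildAPI.myNamespace.echo(123);
pendingEcho.then(result => console.log(result));
```

In this sketch the argument conversion happens in the child and the actual work happens in the parent, which is the split described above for methods that are partially implemented in each process.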

Cross-process events

All event objects with the same API path and the same context (=an instance of BaseContext) share the same set of listeners. The corresponding listener in the parent process is registered once on the first addListener call (and re-used for subsequent calls). When the last listener in the child is removed via removeListener, the listener is also unregistered in the parent process. Here is an example:

// getParentEvent is a ChildAPIManager method, reachable via context.childManager.
let eventObj = context.childManager.getParentEvent("contextMenus.onClicked");
eventObj.addListener(listener);     // Add one listener (and register the event in the parent).
eventObj.addListener(listener);     // Listener was already registered, not added again.
eventObj.removeListener(listener);  // Remove the listener (and remove the registration from the parent).
context.childManager.getParentEvent("contextMenus.onClicked").addListener(listener); // Identical to eventObj.addListener
eventObj.addListener(otherListener); // Add another listener, re-using the existing parent event registration.
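
The bookkeeping behind shared listeners can be sketched with a toy model that replays the same sequence of calls. ToyChildEvents and its parentLog are hypothetical; the log merely records when a real implementation would have to register or unregister the listener in the parent.

```javascript
// Toy model of per-path listener sharing; NOT the real ChildAPIManager.
class ToyChildEvents {
  constructor() {
    this.listenersByPath = new Map();  // API path -> Set of child listeners.
    this.parentLog = [];               // Simulated messages to the parent.
  }

  getParentEvent(path) {
    if (!this.listenersByPath.has(path)) {
      this.listenersByPath.set(path, new Set());
    }
    let listeners = this.listenersByPath.get(path);
    let parentLog = this.parentLog;
    return {
      addListener(listener) {
        // Only the first listener for a path triggers a parent registration.
        if (listeners.size === 0) {
          parentLog.push(`register ${path}`);
        }
        listeners.add(listener);  // A Set ignores duplicate listeners.
      },
      removeListener(listener) {
        // Removing the last listener also removes the parent registration.
        if (listeners.delete(listener) && listeners.size === 0) {
          parentLog.push(`unregister ${path}`);
        }
      },
      hasListener(listener) {
        return listeners.has(listener);
      },
    };
  }
}

const toyEvents = new ToyChildEvents();
const listenerA = () => {};
const listenerB = () => {};

const toyEventObj = toyEvents.getParentEvent("contextMenus.onClicked");
toyEventObj.addListener(listenerA);     // First listener: registers in the parent.
toyEventObj.addListener(listenerA);     // Duplicate: ignored.
toyEventObj.removeListener(listenerA);  // Last listener gone: unregisters.
toyEvents.getParentEvent("contextMenus.onClicked").addListener(listenerA);  // Registers again.
toyEventObj.addListener(listenerB);     // Re-uses the existing registration.

console.log(toyEvents.parentLog);
```

The log ends up with three entries (register, unregister, register), even though addListener was called four times.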

Timing issues

There are subtle timing issues related to the order of execution of asynchronous APIs. If in doubt, design the API in such a way that the order of execution is not relevant. Failing to recognize these problems may result in intermittent test failures.

  • The order of running API implementations in the parent is guaranteed to match the order of API invocations in the child.
  • An API call must not unconditionally assume that the previous API call has finished. This is because the previous API call may have an asynchronous implementation.
  • There is a subtle timing issue with event registrations that is especially visible when the add-on controls the trigger of the event and registers the listener in a callback. Async APIs that trigger events may trigger those events before the callback is called. This is especially important in unit tests. For instance, tabs.create creates a tab, which triggers the tabs.onUpdated event. If the add-on calls tabs.onUpdated.addListener in the callback, then the onUpdated event may have fired before the listener registration is processed by the parent. Because event listeners are shared, the result may differ if a (different) listener is registered before the tabs.create call: when the "onUpdated" event is triggered in the parent, the event is fired and (asynchronously) sent to the child. Meanwhile the callback is invoked, the new listener is added to the shared set, and it is invoked as well when the event arrives.
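
The last timing issue can be reproduced in a deterministic toy simulation. This sketch models only one possible interleaving, in which the event reaches the child after the callback of the create call. The message queues, the message names and the simulate function are assumptions made for illustration and do not correspond to real Firefox code.

```javascript
// Simulate a child that calls a "create" API and adds a listener in its callback.
// If registerEarlyListener is true, another listener is added before "create".
// Returns the names of the child listeners that were invoked, in order.
function simulate(registerEarlyListener) {
  const toParent = [];             // FIFO queue: child -> parent messages.
  const toChild = [];              // FIFO queue: parent -> child messages.
  let parentHasListener = false;   // Whether the parent registered the event.
  const childListeners = new Set();
  const invoked = [];

  // Shared listener set: only the first listener is announced to the parent.
  function addListener(fn) {
    if (childListeners.size === 0) {
      toParent.push({type: "addListener"});
    }
    childListeners.add(fn);
  }

  const earlyListener = () => invoked.push("early");
  const lateListener = () => invoked.push("late");

  if (registerEarlyListener) {
    addListener(earlyListener);
  }
  toParent.push({type: "create"});

  // Deliver messages until both queues are empty.
  while (toParent.length || toChild.length) {
    while (toParent.length) {
      let msg = toParent.shift();
      if (msg.type === "addListener") {
        parentHasListener = true;
      } else if (msg.type === "create") {
        toChild.push({type: "createCallback"});
        // The event is only sent if a listener is registered at this point.
        if (parentHasListener) {
          toChild.push({type: "event"});
        }
      }
    }
    while (toChild.length) {
      let msg = toChild.shift();
      if (msg.type === "createCallback") {
        addListener(lateListener);  // Listener added inside the callback.
      } else if (msg.type === "event") {
        for (let fn of childListeners) {
          fn();
        }
      }
    }
  }
  return invoked;
}

console.log(simulate(false));  // The late listener misses the event.
console.log(simulate(true));   // Both listeners see the event.
```

In the first run the registration reaches the parent only after the event has already been dropped, so nothing is invoked. In the second run the early listener keeps the parent registration alive, and the listener added in the callback joins the shared set before the event arrives.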