From MozillaWiki

Proposal: An XPCOM Security model

An attempt at defining the goals, parameters, and suggesting an implementation for a unified security system for the XPCOM object model.

Goal: provide a security model and API for the XPCOM component model which

  • defines and minimizes the "Trusted Computing Base" of code which must be audited for security;
  • has no impedance mismatches with the CAS/CLR security model;
  • doesn't require changing existing frozen interfaces, if at all possible.

Preliminary Discussion

The existing Mozilla codebase can be divided roughly into two pieces, in terms of security:

  1. methods/objects that participate in the CAPS security model. These include objects/interfaces potentially available to untrusted script. I will call this secured code.
  2. methods/objects that don't know or care about the security model. This includes most of the low-level objects and interfaces of XPCOM/networking plus many high-level objects/interfaces for use by browser chrome only (e.g. the chrome registry, the bookmarks and history services, and whatnot). I will call this unsecured code.

The division between secured and unsecured code is not well-defined. In addition, secured or unsecured code sometimes (frequently?) calls secured code without establishing a proper security context, which causes security bugs such as the recent file-upload issue:

  1. Web-script calls document.write("<input type=\"file\" value=\"/foo/important\">"); This is a call from secured code to secured code.
  2. The secured document.write code properly detects the security hole, and calls inputelement.setvalue(""); This is also a call from secured code to secured code. However, we are running under the wrong security context (the context of the web-script, instead of system code), so the call is rejected.

The solution was to make the code call setValueInternal (which is unsecured code), therefore bypassing security checks. However, this is not a systemic solution, because we don't want to provide "internal" versions of every secured method.

There are two options that I can see to solve this problem:

  1. Make all the code in mozilla "secured code". This involves placing security checks throughout C++, which would perform poorly and is silly.
  2. Make explicit the division between secured code and unsecured code. Define under what circumstances secured code and unsecured code may interact.

Option 2 is the good choice. I think that, per yesterday's mozilla2.0 teleconference, we are all agreed up to this point. Before I dive into controversial territory, I want to mention a few other implementation details which I believe are non-controversial:

  1. The CAPS implementation should not depend on the xpconnect context stack. We are moving towards multi-language support and CLR/CAS, and spidermonkey may go away completely.
  2. We need some kind of extensible/pluggable support to map URI schemes and network channels to a security codebase. Right now we've hard-coded special case code for many URI schemes into CAPS (jar:, chrome), and this is not practical as embedders and extension authors continue to invent new and exciting protocol handlers, or implement "oldies" like shtml.
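
One way the pluggable mapping in item 2 could look is a small registry keyed by URI scheme, so that a protocol handler registers its own origin logic instead of CAPS hard-coding it. This is only a sketch: SchemeRegistry, OriginResolver, and the origin strings are hypothetical names, not part of CAPS or XPCOM.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical: maps a full URI to the origin string used for security checks.
using OriginResolver = std::function<std::string(const std::string& uri)>;

class SchemeRegistry {
public:
    // Registers (or replaces) the resolver for one scheme, e.g. "jar".
    void Register(const std::string& scheme, OriginResolver resolver) {
        assert(resolver);  // an empty resolver would fail at lookup time
        resolvers_[scheme] = std::move(resolver);
    }

    // Writes the origin for `uri` to `*origin` and returns true, or returns
    // false when the URI has no scheme or no resolver is registered.
    bool OriginFor(const std::string& uri, std::string* origin) const {
        const std::string::size_type colon = uri.find(':');
        if (colon == std::string::npos) return false;
        const auto it = resolvers_.find(uri.substr(0, colon));
        if (it == resolvers_.end()) return false;
        *origin = it->second(uri);
        return true;
    }

private:
    std::map<std::string, OriginResolver> resolvers_;
};
```

In this sketch an unknown scheme yields no origin at all, so a caller would have to treat a false return as "deny" rather than fall back to a default.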

Dividing and Defining Secured and Unsecured Code

Assuming, then, that we need to distinguish between secured and unsecured code, how should we do it? I am going to argue that this distinction should properly be made at the interface level, and should not depend on the implementation (classinfo). But before I do, let's look at the current situation.

Current Methods

Access to interfaces on XPCOM objects is currently managed variously by xpconnect and by objects through several mechanisms:

  1. Wrapper-creation check
    1. get nsIClassInfo, check the flags for a DOM node. If it's a DOM node, you can have a wrapper.
    2. else get nsISecurityCheckedComponent and check canCreateWrapper
  2. xpconnect access checks go through nsIXPCSecurityManager CanAccess
    1. Get the security policy of the current calling code (subject)
    2. do same-origin checks if specified
    3. reject access to anonymous content
    4. check the object for nsISecurityCheckedComponent and ask the appropriate method for non-default permissions
    5. There is a method nsIXPCScriptable.checkAccess which is never called. This seems suspicious to me.
  3. Methods can perform additional customized security checks in method implementations through the scriptsecuritymanager. An example is fileinput.setValue, which does its own custom check.

Note: I'm a little hazy on step 1a, I thought there was more to it, like a same-origin check.

JS-implemented methods (for example, JS components or components implemented by chrome-origin script) automatically have system privileges when they are called. This is not a security hole in itself, but it increases the Trusted Computing Base significantly, in code that is frequently not audited for security as tightly as DOM core methods.

One flaw I see in the current system is that we distribute the security checks between the scriptsecuritymanager and the method. When we call a method from C++, we don't do any of the nsIXPCSecurityManager security checks (step 2 above), we only do whatever security checks are performed in step 3. As we add additional FooConnect modules to the mix, this problem could be compounded. I will refer to this model as the "two-part model". The following section outlines the "method-based security" model, in which the security manager doesn't perform security checks... each method is expected to perform the security check it needs.

Calling Rules for Secured and Unsecured Code

For the purpose of this section, posit a CAPS context stack. Origins and Asserts can be pushed onto this stack using a binary API. The actual structure of this stack and API are discussed below, but aren't that important. We also are not defining what code is secured and what isn't (see the next section). When I say "origin", I mean any codebase principal, or the system principal for chrome: code, or a signed-JAR principal for signed code.

This is my concept of The Rules for interacting between secured and unsecured code.

  1. Unsecured Code
    1. Unsecured code may call other unsecured code without regard to the security stack.
    2. When unsecured code calls secured code, it must initialize CAPS with a security stack.
      1. This might be a simple stack of one principal (system principal or a codebase)
      2. It could also be a stored stack. For instance, when a DOM timer is set, the DOM code will store the security stack. When the timer is fired, that same security stack will be used.
  2. Secured Code
    1. When secured code is called, it must have a valid security stack. If it doesn't, that's a design/security flaw. See the second bullet above. The secured code must then perform appropriate security checks.
    2. When secured code calls secured code from another origin, the FooConnect for the new code must push the new origin onto the stack.
    3. Secured code may Assert privileges that it needs by pushing them onto the stack. It must pop these privileges off before returning. (Assert is the CAS equivalent to enablePrivilege()).
      1. This needs to be exposed to FooConnect code through an object or somesuch. Or maybe just keep using netscape.securitymanager, need to investigate.
      2. need a stack-based C++ wrapper for this, so that early returns clean up asserts properly
    4. Secured code may only call unsecured code if the secured code currently has the system/universalxpconnect privilege. At this point, the security context is invalid and must not be used. We probably want to explicitly invalidate the security stack, at least in debug builds, to catch logic errors.
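
The stack-based C++ wrapper mentioned in rule 2.3 could be a small scope guard that pushes an Assert in its constructor and pops it in its destructor. The following is a minimal sketch under stated assumptions: SecurityStack, AutoAssert, and ReadProtectedValue are hypothetical names, and the stack is modelled as a plain vector of strings rather than the real per-thread structure.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-thread CAPS context stack.
struct SecurityStack {
    std::vector<std::string> entries;
};

// Pushes an asserted privilege on construction and pops it on destruction,
// so early returns cannot leave the Assert on the stack.
class AutoAssert {
public:
    AutoAssert(SecurityStack& stack, const std::string& privilege)
        : stack_(stack), depth_(stack.entries.size()) {
        stack_.entries.push_back("assert:" + privilege);
    }
    ~AutoAssert() {
        // Anything pushed after this guard must already have been popped.
        assert(stack_.entries.size() == depth_ + 1);
        stack_.entries.pop_back();
    }
    AutoAssert(const AutoAssert&) = delete;
    AutoAssert& operator=(const AutoAssert&) = delete;

private:
    SecurityStack& stack_;
    std::size_t depth_;
};

// Example secured method with an early return path.
bool ReadProtectedValue(SecurityStack& stack, bool bailOutEarly) {
    AutoAssert guard(stack, "read");
    if (bailOutEarly) return false;  // the guard still pops here
    return stack.entries.back() == "assert:read";
}
```

The balance check in the destructor is the kind of debug-build invariant that would catch a mismatched push/pop close to where it happens.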

Defining Secured Code

Currently, we define secured code by QIing the object to nsIClassInfo and nsISecurityCheckedComponent and checking various flags. I call this the "implementation-declared" security model. I believe that it is unsustainable, because C++ code does not have an easy or performant way of knowing whether it was calling a secured or unsecured method. (See the example below for why).

Rather, I believe that the security model should be part of the calling convention for a method, and that it rightly belongs on the interface ("interface-declared"). There would be an IDL flag [secured]. This flag does not change the binary calling signature of the method. However, it is an indicator of whether that method expects a security context when it is called. It serves as a coding contract for C++ callers, telling them whether a particular method is secured or unsecured (and therefore whether you need to set a security context to call it from unsecured code). FooConnect checks the typelib and follows the Rules above.

JS components (and future CLR components, including python/whatever implemented through monoconnect) can implement either secured or unsecured interfaces. If a component implements an unsecured interface, the code automatically runs with full system privileges (just like today). If it implements a secured interface, it participates in the security model, inherits a security context from the caller, and must Assert any extra privileges it needs (to reduce the TCB).
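
To make the FooConnect side concrete, the Rules can be read as a small decision table driven by the [secured] bit from the typelib. The sketch below is one possible reading, not a definitive implementation: MethodInfo, CallerState, Dispatch, and DecideDispatch are all hypothetical names, and a caller without a stack is assumed to be unsecured code.

```cpp
#include <cassert>

// Hypothetical typelib record: the only bit needed here is whether the
// interface method was declared [secured] in IDL.
struct MethodInfo {
    const char* name;
    bool secured;
};

struct CallerState {
    bool hasValidStack;       // a CAPS stack is active on this thread
    bool hasSystemPrivilege;  // system principal or UniversalXPConnect
};

enum class Dispatch {
    Proceed,               // call straight through
    SuspendStackThenCall,  // secured -> unsecured with system privilege
    Reject                 // the Rules forbid this call
};

Dispatch DecideDispatch(const MethodInfo& callee, const CallerState& caller) {
    // A system privilege only makes sense while a stack is active.
    assert(caller.hasValidStack || !caller.hasSystemPrivilege);
    if (callee.secured) {
        // Secured code must always be entered with a valid stack.
        return caller.hasValidStack ? Dispatch::Proceed : Dispatch::Reject;
    }
    if (!caller.hasValidStack) {
        return Dispatch::Proceed;  // unsecured -> unsecured: no checks
    }
    return caller.hasSystemPrivilege ? Dispatch::SuspendStackThenCall
                                     : Dispatch::Reject;
}
```

Because the decision depends only on the interface flag and the caller's state, no QueryInterface on the target object is needed in this reading.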


Pike and I are working towards making some RDF datasources scriptable. To do this, we should mark the nsIRDFDataSource interface [secured]. Each datasource impl would then perform security checks when it is called. If a datasource should only be available to system code (e.g. localstore), the security check can be as simple as CAPS_CheckPrincipal(kUniversalXPConnect). This security check has relatively small overhead.

If we were to do this at the implementation level, every time C++ code called a datasource, it would have to QI that datasource to nsIClassInfo2 and call a method to see whether the rdfIDataSource interface was secured. If it was secured, the caller would have to set up a security stack. That is a lot more code, and slower, than simply pushing the system principal onto the stack and calling the method (pushes and pops on the security stack will almost never involve allocations, and will be blazingly fast).
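
The datasource example might look roughly like this in C++. Everything here is a sketch with hypothetical names: CAPS_CheckPrincipal is given an assumed signature and an assumed policy (only the system principal holds UniversalXPConnect), LocalStoreSketch stands in for a system-only datasource, and the stack holds origins only.

```cpp
#include <cassert>
#include <string>
#include <vector>

const char kSystemPrincipal[] = "[system]";
const char kUniversalXPConnect[] = "UniversalXPConnect";

// Stand-in for the per-thread security stack (origins only, for brevity).
thread_local std::vector<std::string> gOriginStack;

// Assumed policy: only the system principal holds UniversalXPConnect.
bool CAPS_CheckPrincipal(const std::string& privilege) {
    if (gOriginStack.empty()) return false;  // no context at all: deny
    return privilege == kUniversalXPConnect &&
           gOriginStack.back() == kSystemPrincipal;
}

// Pushes an origin for the lifetime of a C++ scope.
class AutoPushOrigin {
public:
    explicit AutoPushOrigin(const std::string& origin) {
        gOriginStack.push_back(origin);
    }
    ~AutoPushOrigin() {
        assert(!gOriginStack.empty());
        gOriginStack.pop_back();
    }
    AutoPushOrigin(const AutoPushOrigin&) = delete;
    AutoPushOrigin& operator=(const AutoPushOrigin&) = delete;
};

// Stand-in for a system-only datasource such as localstore.
class LocalStoreSketch {
public:
    // Plays the role of a method on a [secured] interface.
    bool GetTarget(const std::string& source, std::string* target) const {
        if (!CAPS_CheckPrincipal(kUniversalXPConnect)) return false;
        *target = "value-for-" + source;
        return true;
    }
};

// An unsecured C++ caller: push the system principal, then call.
bool ReadFromChrome(const LocalStoreSketch& store, std::string* target) {
    AutoPushOrigin system(kSystemPrincipal);
    return store.GetTarget("window-size", target);
}
```

The C++ caller pays for one push and one pop and never needs to ask the object whether it is secured; the interface declaration already says so.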

Todo: Implications for Threading and Event-handling

Multi-threaded code will be able to store a security stack from the beginning of a networked activity (e.g. an async through any parsing activities (loading external DTDs, perhaps?) to the event callbacks.

The same kinds of issues apply with loading linked items from the DOM/CSS.

We can apply a special limiting principal for script that "originates" in an onload handler, and similarly for loading linked items from mail messages.

Todo: Implications for XBL

We can give XBL the opportunity to act securely without doing a rigorous security review of each XBL binding. XBL could only do secure activity if it explicitly called Before we do this, we still need to make sure that untrusted content can't modify the trusted anonymous DOM.

Todo: The Security Stack

This section isn't done yet, but it's not that important to the decisions about implementation-defined versus interface-defined security model above.

Each thread would keep a security stack of the current context. The stack can contain objects of the following types:

  • Origin - This indicates a code origin. This can be
    • the system principal
    • a URI-based origin
    • a signed-code origin
    • (maybe more... do we want to reflect code coming from Mono with named privileges in one origin, or use a combination of origin+assert/deny)?
    • There is also a special "universal" origin. This cannot appear directly in the context stack, but is used in Assert/Deny/CheckAccess. It is a magic origin that matches any same-origin check.
  • Assert(origin, named-privilege) - This indicates that the privilege is granted. You can, for instance assert read-access to content at or assert read-access to the universal content (universalbrowserread). This might be too complicated... maybe dump the origin param? it's theoretically nicer but might be a pain to implement, or a performance risk. However, that's how security checking will be done.
  • Deny(origin, named-privilege) - This indicates that the privilege is denied.

When unsecured code starts a new stack, it calls CAPS_StartStack.

When unsecured code is finished with the stack frame that calls CAPS_StartStack, it calls CAPS_EndStack to clean up. When performing a security check, call CAPS_CheckAccess(origin, named-privilege). For example, to see if the calling code has access to read the DOM of a page at, call CheckAccess([origin of], "read").

When secured code calls unsecured code, it calls CAPS_SuspendStack. When it is finished calling unsecured code, it calls CAPS_ResumeStack.
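
Since this section is unfinished, the following is only one guess at how the three entry types and a check could fit together. The class, its method names, the "*" spelling of the universal origin, and above all the lookup policy (scan from the top, first matching Assert or Deny decides, otherwise fall back to a same-origin test against the nearest Origin) are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

const char kSystemOrigin[] = "[system]";
const char kUniversalOrigin[] = "*";  // valid in Assert/Deny only

enum class Kind { Origin, Assert, Deny };

struct Entry {
    Kind kind;
    std::string origin;
    std::string privilege;  // empty for Origin entries
};

class SecurityStack {
public:
    void PushOrigin(const std::string& origin) {
        assert(origin != kUniversalOrigin);  // never directly on the stack
        entries_.push_back(Entry{Kind::Origin, origin, std::string()});
    }
    void PushAssert(const std::string& origin, const std::string& privilege) {
        entries_.push_back(Entry{Kind::Assert, origin, privilege});
    }
    void PushDeny(const std::string& origin, const std::string& privilege) {
        entries_.push_back(Entry{Kind::Deny, origin, privilege});
    }
    void Pop() {
        assert(!entries_.empty());
        entries_.pop_back();
    }

    // Mirrors CAPS_SuspendStack / CAPS_ResumeStack: while suspended, every
    // check fails so that an invalid context is never trusted.
    void Suspend() { suspended_ = true; }
    void Resume() { suspended_ = false; }

    // Assumed policy, see the paragraph above this block.
    bool CheckAccess(const std::string& target,
                     const std::string& privilege) const {
        if (suspended_) return false;
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it) {
            if (it->kind == Kind::Origin) {
                return it->origin == kSystemOrigin || it->origin == target;
            }
            const bool matches =
                it->privilege == privilege &&
                (it->origin == kUniversalOrigin || it->origin == target);
            if (matches) return it->kind == Kind::Assert;
        }
        return false;  // empty stack: no context, so deny
    }

private:
    std::vector<Entry> entries_;
    bool suspended_ = false;
};
```

Under this policy a Deny pushed after an Assert wins for the origin it names, which is one plausible ordering; the proposal itself does not say how Assert and Deny interact.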

Here is a list of the possible named-privileges I know we need so far (I'm sure there are more):

  • "read" (read the DOM or other content belonging to the origin) (UniversalFileRead would be CAPS_CheckAccess([origin for file://], "read"))
  • "write" (write the DOM or other content belonging to the origin)
  • "net-access" (send/receive information from the network location specified by the origin)
  • "link" (link to the network location specified by the origin)

At any point, code may call CAPS_CloneStack() and obtain a "private" copy of the security stack.
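
A private copy is what a timer or other deferred callback would need: capture the stack when the work is scheduled and reinstate it when the work runs. The sketch below assumes a CAPS_CloneStack that returns the stack by value; that signature, the DeferredCall class, and the string-vector stack are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using SecurityStack = std::vector<std::string>;

// Stand-in for the per-thread current stack.
thread_local SecurityStack gCurrentStack;

// Hypothetical signature: returns a private copy of the current stack.
SecurityStack CAPS_CloneStack() { return gCurrentStack; }

// Captures the stack at scheduling time and reinstates it when fired.
class DeferredCall {
public:
    explicit DeferredCall(std::function<void()> callback)
        : saved_(CAPS_CloneStack()), callback_(std::move(callback)) {
        assert(callback_);
    }

    void Fire() {
        // Puts the interrupted stack back when Fire() exits, including
        // on early exit from the callback.
        struct Restore {
            SecurityStack previous;
            ~Restore() { gCurrentStack = std::move(previous); }
        } restore{gCurrentStack};
        gCurrentStack = saved_;
        callback_();
    }

private:
    SecurityStack saved_;
    std::function<void()> callback_;
};
```

The callback therefore runs with the privileges of whoever scheduled it, not of whatever code happens to be on the stack when the event loop fires it.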

The system would make intensive use of atoms to avoid string manipulation. In addition, a cache system should be implemented. (Without actually having done a mockup, I think this system can be blazingly fast compared to the virtual interface dispatch we use now to accomplish the same tasks).
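
As a rough illustration of the atom idea, privilege names can be interned once into small integers so that later checks compare integers instead of strings. AtomTable and its methods are hypothetical and are not the existing Mozilla atom service; the cache mentioned above is not modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

class AtomTable {
public:
    using Atom = std::size_t;

    // Returns the atom for `name`, creating one on first use.
    Atom Intern(const std::string& name) {
        const auto it = ids_.find(name);
        if (it != ids_.end()) return it->second;
        const Atom atom = names_.size();
        names_.push_back(name);
        ids_.emplace(name, atom);
        return atom;
    }

    // Maps an atom back to its name, e.g. for error messages.
    const std::string& NameOf(Atom atom) const {
        assert(atom < names_.size());
        return names_[atom];
    }

private:
    std::unordered_map<std::string, Atom> ids_;
    std::vector<std::string> names_;
};
```

Callers would intern "read", "write", and the other named-privileges once at startup and pass the resulting atoms to the check functions.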