Security/Sandbox/IPCguide

From MozillaWiki

This document is a guide to writing secure IPC code in the context of e10s and sandboxing.

Background: How is a browser compromised?

This section gives a short explanation of the steps involved in exploiting a browser, to provide context for the various aspects of the security model. If you are already familiar with how web browsers are typically compromised, feel free to skip it.

Successfully exploiting a sandboxed browser involves exploiting multiple vulnerabilities. The first step is usually to compromise a low-privileged process by achieving arbitrary code execution, typically through corruption of the instruction pointer. In Firefox this would likely be the content process, as this is where we load untrusted content from the web. Many types of issue can lead to a compromised content process: use-after-free and integer overflow, to name a few. Successfully exploiting such an issue gives the attacker full control of the content process - that is, the attacker is now limited only by the restrictions the sandbox places on the content process, and any security checks inside that process are no longer effective.

From the attacker's perspective, full control over the content process (which has few privileges) is usually not enough, because there is not much they can do while the process is running inside a sandbox. For example, the sandbox prohibits direct access to the file system, preventing the attacker from overwriting system files. The attacker therefore needs to escalate their privileges, which leads to step two of the exploitation process.

For escalation of privileges, different routes are possible. The gist of this step is to attack a component of the system that has higher privileges.

[Figure: SandboxEscape.png - possible routes for escaping the sandbox]

As shown above, one route is to attack the kernel. There is not much Mozilla can do to improve security here, other than restricting access to the kernel (an improvement currently being integrated into the sandbox).

Another target is the more privileged chrome (parent) process, which has direct access to the filesystem and other sensitive OS resources. An attacker able to exploit a vulnerability in the inter-process communication (IPC) channel shared between the chrome and content processes might be able to:

  • Corrupt the chrome process and run arbitrary code inside it

  • Coerce the chrome process to perform privileged actions on the attacker's behalf

If you are writing code that handles data coming from the content process, you should always remember that an attacker does not play by the rules. Protecting against these attacks comes down to two things:

  • Writing safe IPC mechanisms which validate data from the content process

  • Writing IPC APIs which don’t expose dangerous functionality to the child

This guide goes into both of these topics in practical terms, and provides common examples from the Firefox codebase.

Safe IPC Mechanisms

IPC safety is mainly an issue when you are writing low-level IPC, such as IPDL. The key tips are:

  • All data coming from a content process is untrusted (be especially careful with values like lengths or sizes)

  • Be wary of signed vs unsigned, integer overflow and integer truncation

  • Use safe helper mechanisms where available

All data coming from a content process is untrusted

As you know by now, when an attacker compromises the content process (or any non-chrome process, for that matter), the attacker is able to do whatever they want as long as the sandbox does not restrict it. This also means an attacker can send IPC messages containing values that would never occur in normal usage. Let’s look at an example of an IPC message to illustrate what is meant:

int mValue[10];
...
static bool Read(const Message* aMsg, void** aIter, paramType* aResult) {
...
  ReadParam(aMsg, aIter, &(aResult->mLength));

  // BUG: mLength is never checked against the capacity of mValue (10)!
  for (uint16_t i = 0; i < aResult->mLength; i++) {
     if (!ReadParam(aMsg, aIter, &(aResult->mValue[i]))) {
       return false;
     }
   }
...
}


Here, the content process sends a message to the parent (serialized in aMsg). The first part of the message is the number of values being sent along with it. Upon receiving the message, the length is extracted (by calling ReadParam()) and blindly trusted never to exceed 10. However, as stated earlier, an attacker does not play by the rules. An attacker could say, “Hey, I am sending you 100 values, here are the values…”, leading to a buffer overflow and therefore memory corruption. Bugs like this are typical of the findings from our internal IPC audit; bug 1236724 was one such example.
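The fix is to validate the attacker-controlled length before the copy loop runs. The following standalone sketch illustrates the missing check; FakeMessage, ParamType and SafeRead are hypothetical stand-ins for the IPDL Message/ReadParam machinery, not real Gecko API:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the IPDL deserializer: reads raw values out of
// a byte buffer, the way ReadParam() walks aMsg/aIter.
struct FakeMessage {
  std::vector<uint8_t> bytes;
  size_t pos = 0;

  template <typename T>
  bool Read(T* aOut) {
    if (bytes.size() - pos < sizeof(T)) return false;  // not enough data left
    std::memcpy(aOut, bytes.data() + pos, sizeof(T));
    pos += sizeof(T);
    return true;
  }
};

struct ParamType {
  static constexpr uint16_t kCapacity = 10;
  uint16_t mLength = 0;
  int mValue[kCapacity];
};

// The fix: reject any length that exceeds the fixed-size buffer *before*
// the copy loop runs, instead of trusting the sender.
bool SafeRead(FakeMessage* aMsg, ParamType* aResult) {
  if (!aMsg->Read(&aResult->mLength)) return false;
  if (aResult->mLength > ParamType::kCapacity) return false;  // untrusted!
  for (uint16_t i = 0; i < aResult->mLength; i++) {
    if (!aMsg->Read(&aResult->mValue[i])) return false;
  }
  return true;
}
```

With this check in place, a message claiming to carry 100 values is rejected outright instead of overflowing mValue.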

Be Wary of Signed, Unsigned, Integer Overflow and Integer Truncation

Sometimes the data coming from a potentially compromised process is validated, but due to integer signedness the validation can be bypassed. Let’s take a look at the following example:

static bool Read(const Message* aMsg, void** aIter, paramType* aResult) {
...
  int size;
  char array[20];

  ReadParam(aMsg, aIter, &size);

  // BUG: the cast makes this a signed compare, so a negative size passes
  if (size > static_cast<int>(sizeof(array))) {
    return false;
  }

  for (unsigned int i = 0; i < size; i++) {
     if (!ReadParam(aMsg, aIter, &(array[i]))) {
       return false;
     }
   }
...
}

At first, this seems like a perfectly fine validation. However, since an attacker does not play by the rules, a negative value for ‘size’ can be supplied. The bounds check compares two signed values, so -1 is always smaller than the array size and the check passes. But in the for-loop the comparison is between the unsigned ‘i’ and the signed ‘size’, so ‘size’ is converted to unsigned: -1 becomes UINT_MAX, the loop iterates far past the end of ‘array’, and memory is corrupted.
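A minimal sketch of the corrected check: reject negative values explicitly before any unsigned comparison takes place (ValidateSize is a hypothetical helper, not Gecko API):

```cpp
#include <cstddef>

// Reject negative sizes before comparing against an unsigned capacity.
// Only after the sign check is it safe to cast to an unsigned type.
bool ValidateSize(int aSize, size_t aCapacity) {
  if (aSize < 0) {
    return false;  // catches -1 and friends before any unsigned conversion
  }
  return static_cast<size_t>(aSize) <= aCapacity;
}
```

Alternatively, declare the length as an unsigned type from the start so no sign confusion can arise.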

An example of an integer overflow: a compromised process provides the width and height of a rectangle, and the receiving process calculates the amount of memory to allocate by multiplying width and height:

static bool Read(const Message* aMsg, void** aIter, paramType* aResult) {
...
  unsigned int width, height;

  ReadParam(aMsg, aIter, &width);
  ReadParam(aMsg, aIter, &height);

  // BUG: width * height can overflow, allocating far less than expected
  char* pixels = static_cast<char*>(malloc(width * height));

  for (unsigned int i = 0; i < width; i++) {
    for (unsigned int k = 0; k < height; k++) {
       pixels[i * height + k] = ...;
    }
  }
...
}

With carefully chosen ‘width’ and ‘height’ values, the multiplication can overflow, leading to insufficient memory being allocated and thus to memory corruption inside the nested for-loops.

The recommendation here is to use mozilla::CheckedInt for computations on lengths, buffer sizes, and any other untrusted data which could be used to cause memory corruption.
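mozilla::CheckedInt tracks whether any intermediate computation overflowed, and the result must be tested with isValid() before use. As a standalone illustration of the idea, here is a minimal sketch performing the width * height computation from the example above with overflow detected instead of silently wrapping (CheckedMul is a hypothetical helper, not the Gecko API):

```cpp
#include <cstdint>

// Detect 32-bit overflow by doing the multiply in 64 bits and checking the
// result still fits. This is the guarantee mozilla::CheckedInt provides for
// arbitrary chains of arithmetic.
bool CheckedMul(uint32_t aWidth, uint32_t aHeight, uint32_t* aOut) {
  uint64_t product = static_cast<uint64_t>(aWidth) * aHeight;
  if (product > UINT32_MAX) {
    return false;  // would have wrapped in 32 bits - refuse to allocate
  }
  *aOut = static_cast<uint32_t>(product);
  return true;
}
```

The caller then bails out of the Read() (returning false) instead of allocating an undersized buffer.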

Use helper classes for serialization

Helper classes also exist that take care of serializing and deserializing certain types, such as enums or bit flags. These classes perform the proper bounds checks and guarantee that the value is valid. So whenever a Recv….() function is called that expects an enum value as an argument, that value is guaranteed to be a legal member of the enum. By using these helper classes, you avoid errors that happen due to improper casting and missing validation.

Do not cast integers to enums

An unfortunately all-too-common pattern is an IPC method which takes an integer and then casts it to an enum without checking bounds:

// Bad code! Do not copy!
// .ipdl
async MyMethod(uint32_t value);

// .cpp

IPCResult ProtocolParent::RecvMyMethod(const uint32_t& value) {
  MyEnum m = static_cast<MyEnum>(value);  // unchecked cast!
  // do stuff with m
  return IPC_OK();
}

This pattern is not safe! An attacker can send a value for ‘value’ that is not valid for MyEnum, and later code will then mishandle it.

The solution is to use the ContiguousEnumSerializer helper class. See bug 1303713 for an example of this vulnerability and how to fix it.
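For illustration, this is roughly the bounds check that ContiguousEnumSerializer performs on your behalf before the cast (the enum and its values here are hypothetical):

```cpp
#include <cstdint>

enum class MyEnum : uint32_t {
  eFirst = 0,
  eSecond = 1,
  eLast = 2,  // highest legal value
};

// Validate the wire value against the enum's contiguous range *before*
// casting, so downstream code never sees an out-of-range MyEnum.
bool ReadMyEnum(uint32_t aValue, MyEnum* aOut) {
  if (aValue > static_cast<uint32_t>(MyEnum::eLast)) {
    return false;  // attacker-supplied value outside the enum's range
  }
  *aOut = static_cast<MyEnum>(aValue);
  return true;
}
```

Returning false here kills the offending channel rather than letting an invalid enum propagate.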

Be careful when adding capabilities to APIs accessible over IPC

Expose as few capabilities as possible to the child (e.g. content process)

This one is a little tricky to summarise, but :haik phrased it well, so I am going to quote him here:

When writing an IPC service, one should always think of the child (e.g. content process) as a virtual machine running untrusted code. Ask the question "what is the worst thing that could be done with the API being exposed to the content process?"

It is important to remember that an attacker does not play by the rules: an API will be abused in whatever way possible.

Let’s look at an example. The Firefox sandbox doesn’t allow direct write access to the filesystem. However, suppose the content process needs to write to a file (for whatever reason). To allow this, an API is exposed to the content process: it sends a file path to the chrome process, which opens the file (with write permissions) and sends a file descriptor back to the content process. Something like this (illustrative, not working code):

ContentParent::RecvOpenFile(const nsString& aFilePath) 
{
  int fd = open(aFilePath.get(), O_RDWR);

  [...send ‘fd’ back to content child...]
}

Now, in the normal use case, this is totally fine, because the content process wouldn’t normally request access to a file it shouldn’t access. Everybody is playing by the rules.

BUT, if you consider the case of a compromised content process, this API pretty much bypasses all the sandbox restrictions protecting write access: an attacker can now simply request access to any file and will get back a file descriptor from the chrome process.

This issue has also been seen in Message Manager based IPC, for example see bug 1341191.
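A safer design gives the child access only to locations it legitimately needs, with the parent validating every request. The sketch below shows the kind of path containment check a file broker might perform before opening anything; IsPathAllowed is hypothetical, and a real broker must also defeat symlink tricks by canonicalizing against the live filesystem, so this lexical check alone is not sufficient in production:

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Only honor requests for paths inside a directory the child is allowed to
// write to. lexically_normal() folds away ".." components so the child
// cannot escape the allowed directory with a path like "allowed/../../etc".
bool IsPathAllowed(const std::string& aRequested,
                   const std::string& aAllowedDir) {
  fs::path normalized = fs::path(aRequested).lexically_normal();
  fs::path base = fs::path(aAllowedDir).lexically_normal();
  auto it = normalized.begin();
  for (auto baseIt = base.begin(); baseIt != base.end(); ++it, ++baseIt) {
    if (it == normalized.end() || *it != *baseIt) {
      return false;  // requested path escapes the allowed directory
    }
  }
  return true;
}
```

The parent then opens only validated paths itself and hands back the descriptor, so a compromised child cannot name arbitrary files.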

Whitelist over Blacklist

Often when creating a restricted API, you want to restrict the child to a certain set of input values. Always take the approach of whitelisting valid input rather than attempting to blacklist bad input. For example, when handling URIs you might whitelist the http: and https: schemes, rather than trying to enumerate a list of “bad” schemes for your code to reject.
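A minimal sketch of the whitelist approach for URI schemes (IsSchemeAllowed is a hypothetical helper):

```cpp
#include <string>

// Whitelist: accept only known-good schemes. Anything not explicitly
// listed - file:, chrome:, jar:, schemes invented tomorrow - is rejected
// by default, which is the property a blacklist can never give you.
bool IsSchemeAllowed(const std::string& aScheme) {
  return aScheme == "http" || aScheme == "https";
}
```

The advantage over a blacklist is that new or forgotten dangerous schemes fail closed instead of failing open.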

Some other tips, lessons learned

You will forget it

If you’ve written a whitelist and had to whitelist something weird for it to work, and you think it could be improved… file a bug. If you think there’s a better way to do something, or that later on you’ll be able to lock something down further but you can’t right now… file a bug.

Capture your thinking and document it, so later on we can revisit it. There’s a good chance you won’t remember later, and we want to be able to ratchet security up more and more in the future!

Plan for Future Lockdown

Sandboxing is an iterative process, and the Content Isolation team is continually working to tighten the sandbox to better protect our users. The general process is:

  1. Test the sandbox with a tighter restriction

  2. Fix all the places that are broken with this restriction

  3. Land the restriction

So we don’t want developers landing features which will prevent future sandbox plans.

It’s hard to provide general advice here, as sandboxing is platform specific, but some general things to be aware of:

  • Currently, file system access is restricted to read-only access to limited locations, but expect all non-brokered file access to be forbidden in the child process.

  • On Windows, a project is ongoing to restrict Win32 system calls. This includes GDI syscalls (often used in graphics, printing and font loading).

  • Moving to a lower integrity level on Windows will prevent some things that are currently not forbidden, notably making network connections.

Plans are evolving, and we don’t expect developers to be across every nuance of sandboxing, so we won’t be mad if you miss something here. But if you have any doubt about a feature you are adding to the content process, the Content Isolation team would be happy to answer your questions, provide reviews, and give you guidance!