IPC Protocols

From MozillaWiki
Jump to: navigation, search

There will be many IPC protocols, and they should all be as error-free as possible by design. Having separate protocols for separate uses is a good thing, and they all shouldn't be multiplexed by hand over a shared transport layer. Protocol implementations should only have to check for errors as infrequently as possible; the heavy lifting should be done generically, or by code generated from high-level specifications. Finally, explicit protocols make automatically generating test cases (of "normal" and "failure" conditions) much easier.

This work has the imprimatur of the Electrolysis project.

Terminology

NB: this section is intended to provide an informal basis for the rest of this document. The remainder of the document should concretely define each of these terms.

An actor is a protocol end-point. This is something that can send and receive messages. We may have multiple actors per thread and per process. (E.g.: when a thread opens a TCP connection to mozilla.org, the socket object created my kernel is an actor (the ownership of this actor is up to the process containing the creating thread). The socket object created by the mozilla.org server is another actor.

A message is the unit of communication between two actors. A message will probably consist of a message type and a message payload. The message type will specify the actor the message is intended to be sent to, the protocol the message belongs to, and the types of each datum in the message payload. The message type may additionally specify the location of data in the payload; e.g., large data may be transferred through shared memory, while small ones may be sent over a pipe. The message type should make it clear which actor owns each potentially-shared datum.

A protocol is a formal definition of the messages that two, asymmetric actors are allowed to send/receive to/from each other. Protocol semantics can be defined in many different ways, but we should be able to get away with "finite-state machine semantics." This means that each protocol actor has a "current state," and in that state, it is only allowed to send a set of messages SM and receive a set of messages RM. Upon sending or receiving a particular message m, the actor may transition into a new "state."

Desiderata

  1. All messages sent from one actor to another are well-typed. That is, every datum in a message m sent by actor a has a static type, is serializable by an a, is un-serializable by the actor that receives m, and the ownership of the datum is explicit.
  2. All messages received by an actor are well-typed, in the same sense as above. (NB: these goals are complementary. A misbehaving actor could send a non-well-typed message to another actor, and the receiving actor shouldn't suffer because of the non-well-typed message. "Suffer" might include security concerns, so this is important.)
  3. All messages sent from one actor to another adhere to the semantics of the actor's protocol. For the TCP example, this means that an actor in the "CLOSED" state can't send a "FIN" message.
  4. All messages received by an actor adhere to the semantics of the actor's protocol. This means that an actor doesn't have to worry about processing mal-formed messages sent by another, misbehaving actor. Like the message type-safety desideratum, this is likely to be a security concern and is thus important.
  5. Resource ownership should be clear, and how resources are cleanup on errors/crashes should be well-defined. Need to handle the separate cases: cleanup on shutdown, cleanup on crash, cleanup on GC/refcount. (Whether this is a protocol-layer concern is not yet clear.)
  6. Objects should be shared carefully so as to avoid creating cycles and/or leaking information from priveleged->unprivileged processes.

Implementation Overview

For the overall design, we will follow the example of network protocol stacks. On top of the stack is the protocol layer, and below it is the transport layer. (Luckily, much of the transport layer for IPC seems to be provided by chromium.)

The transport layer is responsible for platform-specific code for communication and shared memory, connection establishment, identifying and routing messages to actors, and type safety of individual messages. Please refer to [http:://example.com TODO] for its specification.

We will follow the example of Sing# for protocol specifications. Since C++ doesn't support this kind of code, and it's extremely hard to make it do so, and we may want to allow other languages (JavaScript) to implement protocol actors, we will implement protocol specifications as a domain-specific language. (We did the same thing for IDL.) The remainder of this document focuses on this protocol-specification language.

Protocol-definition Language (PDL)

A protocol is between a parent actor and a child actor. A protocol is specified with respect to the parent; that is, messages the parent is allowed to receive are exactly the messages a child is allowed to sent, and vice versa.

A protocol consists of declarations of messages and specifications of state machine transitions. (This ties us to state-machine semantics.) The message declarations are essentially type declarations for the transport layer, and the state machine transitions capture the semantics of the protocol itself.

A protocol is specified by first declaring the protocol

Protocol :: (\epsilon | 'sync' | 'rpc') 'protocol' ProtocolName '{' Messages Transitions '}'

This implies the creation of FooParent and FooChild actors. (Hereafter, this document speaks from the perspective of the FooParent.)

By default, protocols can only exchange asynchronous messages. A protocol must explicitly allow synchronous and RPC (see below) messages by using the sync or rpc qualifiers. This is enforced statically by the PDL type system.

Conceptually (but not necessarily syntactically) next are message definitions. Messages definitions are somewhat analogous to function signatures in C/C++. Messages can have one of three semantics

  • asynchronous: the sending actor does not expect nor listen for a response to the sent message
  • synchronous: the sending actor is completely blocked until it receives a response
  • RPC++: the sending actor is partially blocked until it receives a response to message m. It is only allowed to process RPC++ messages sent by the actor receiving m, direct resulting from the receiving actor receiving m. (This is intended to model function call semantics.)

Asynchronous messages are the default. The list above is sorted by decreasing simplicity and efficiency; synchronous and RPC++ messages should not be used without a compelling reason.

Strawman message grammar

 Message :: (\epsilon | 'sync' | 'rpc') MessageBody ';'
 MessageBody :: Type MessageName '(' MessageArguments ')'
 
 MessageArguments :: (MessageArgument ',')* MessageArgument?
 MessageArgument :: Type Identifier
 
 Type :: SharingQualifier BasicType | ActorHandle
 SharingQualifier :: (\epsilon | 'transfer' | 'share')
 BasicType :: BuiltinType | ImportedType
 ActorHandle :: ('parent' | 'child') 'actor' ProtocolName (':' StateName)?

A few items are worth calling out.

SharingQualifiers define transport semantics for objects sent in a message. By default, objects are sent "by value" (i.e., marshalled then unmarshalled). How we classes are marshalled is not a concern of the protocol layer, but very important nonetheless. This is likely to be another security concern. But large objects can also be transported through shared memory.

The qualifier share means that the object o named lives in shared memory, and is co-owned by the parent and child actors. If the receiving actor does not already co-own o, it does after receiving the message. A lower layer needs to enforce that this is implemented correctly

  1. o lives in shared memory
  2. all objects reachable from o live in shared memory
  3. all accesses to members of o are synchronized across the client and server

transfer means that the sending actor owns o, and when the receiving actor receives o, ownership transfers from the sender to the receiver. This means that requirement (3) above is removed for transfer types. This is the preferred sharing semantics; share probably won't be implemented initially.

NOTE: share and transfer are optimizations. These don't need to be included in the initial language implementation, but are worth keeping in mind.

A BasicType is a C++ type that can be transferred in a message. We will provide a set of BuiltinTypes like void, int, and so on. Protocol writers can also import foreign types for which marshall/unmarshall traits are defined, and/or that can be allocated in shared memory. (We may need to support union types for NPAPI.)

An ActorHandle is a handle to an already-created protocol actor; this can be thought of by analogy to sockaddr_t; it's a way to refer to a created actor. The actor is either a parent or child, and may be required to be in a particular state. TODO: the semantics of this still aren't 100% clear to me. I'll need to work through a concrete example to figure them out.

If desired, we can add a notion of "privilege" to the message type system. For example, chrome is more privileged than content, which is more privileged than plugin. With this system, we can cheaply guarantee that privileged objects don't flow to unprivileged actors.

Strawman transition grammar

 Transition :: 'state' StateName '{' Actions '}'
 Actions :: (Action ';')* Action?  
 Action :: MessageAction | RPCAction
 
 MessageAction :: ('send' | 'rcv') MessageName 'goto' StateName
 RPCAction :: ('call' | 'answer') MessageName ('push' StateName)?

TODO: the above grammar may lead to unnecessarily verbose specifications, since there's only one "action" permitted per state transition. We can add additional syntax to reduce verbosity if it becomes a problem.

A transition starts from a particular state (the lexically first state is the start state), and then either sends or receives ("calls" or "answers" for RPC) one of a set of allowed messages. The syntax send MessageName means "send MessageName", and rcv MessageName means "receive MessageName". After the action, the actor then transitions into another state. For RPC, an action causes the current state to be pushed on a stack, then the "push STATE" to be transitioned into.

Unfortunately, the syntax for async/sync messages and RPC calls diverge because the semantics are so different. Sync/async messages only model message passing, whereas RPC models function calls. After a message-passing action occurs, the actor only makes a state transition (goto STATE). However, an RPC action pushes a new state onto an "RPC stack" (push STATE). When the RPC call returns, the "pushed" state is "popped."

TODO: this may be confusing. Any ideas for simplifying it?

We can check almost any property of the protocol specification itself statically, since it's a state machine + well-defined stack. What all of these static invariants should be is not yet known; one invariant is that an asynchronous message can't be nested within a synchronous one. From this static specification, we can generate a C++ dynamic checker to ensure that message processors (code utilizing actors) adhere to the protocol spec. We may be able to check some of this statically as well, but it's harder.

TODO: there are more things we can integrate into the transition grammar, but concrete use cases are necessary.

Error Handling

Errors include

  • Message type errors: bad data sent
  • Protocol errors: e.g., wrong message, wrong state
  • Transport errors: e.g., nesting asynchronicity within synchronicity
  • Security errors: e.g., trying to send privileged info to plugins. TODO: this is not officially part of the design yet

Error handling is different for parents and children. Child actors are not trusted, and if the parent detects the least whiff of malfeasance, the parent will invoke child.KILLITWITHFIRE(). Code utilizing the child actor therefore will never see type or protocol errors; if they occur, the child is killed.

If the child detects a parent error, there's not much it can do. The child will execute self.SEPPUKU(), but we may wish to have it first attempt to notify the parent of the error.

The parent actor must, however, carefully handle and recover from errors.

TODO more details

Deadlocks

Deadlocks can be caused by bad sequences of synchronous messages (possibly RPC also, but this isn't completely understood yet). Luckily, it is easy and cheap to detect whether sending a synchronous IPC message will immediately cause deadlock.The algorithm is:

 Actor:  # something that can receive a synchronous message
   Multiset blocked = { };  # Actors blocked on this target
 
   synchronousSend(target, msg):
     blocked = blocked.union(self)
     if target in blocked:
       deadlock!
     transportLayer.send(blocked.serialize(), msg.serialize())
     blocked = blocked.difference(self)
 
   synchronousReceive(msg):
     blocked = blocked.union(msg.blocked)
     # [1] handle message
     transportLayer.reply()
     blocked = blocked.difference(msg.blocked)

Note that the blocked variable needs to be a Multiset, and Multiset.difference() needs to only remove one instance of each element in the differenced set, to make |synchronousReceive()| work correctly. This should be efficient because the |blocked| multiset shouldn't grow very big, and we should be able to identify an IPC target by perhaps a 4-byte ID. So serializing |blocked| is cheap.

Doing a dynamic "may-deadlock" analysis, like what we for locks and monitors, is more expensive. First, we need to keep state that is global across all processes. Second, when calling |synchronousSend()|, we need to send another asynchronous IPC message to whichever process manages the global state. But it's doable, and we can choose the granularity at which we detect possible deadlocks:

  1. Actor x may deadlock with y
  2. Protocol x may deadlock with y
  3. ProtocolMessage x may deadlock with y

Each of these represents a compromise of precision with number of deadlocks detected. Option (1) detects the most deadlocks, but is very imprecise. Option (3) detects the fewest, but is most precise. Option (2) is somewhere in the middle. It may be best to start with option (1), and if there are too many false positives, go down to (2). And again, if there are too many false positives, we can switch to (3).

Doing a static "may-deadlock" analysis appears to be hard, but tractable (for two processes at least). Researchers have done similar things in the past, and it is known approximately what is required. However, we will need some machinery in our static analysis toolkit that currently doesn't exist.

Implementation

The transport layer should hopefully be mainly chromium IPC code. We will likely need to extend this to support objects sent through shared memory, if that feature is desired.

The PDL compiler will take as input a FooProtocol.pdl source and generate FooProtocolParent.h and FooProtocolChild.h C++ headers. These headers will declare message types for the transport layer, declare protocol states, and define functions that take care of state transitions and invoke message handlers. The headers will leave undefined the implementations of message handlers; these are provided by code utilizing the protocol actors. (Perhaps through subclassing.)

Message handlers will be responsible for processing the received message (obviously) and also deciding which of the allowed states is transitioned to after processing. The generated protocol header will enforce protocol safety here. Lower-level enforcement of transport-layer properties can be enforced in a shared ProtocolBase class. Hopefully much of this we can check statically. Some of these properties are

  • not nesting asynchronous calls within synchronous handlers
  • avoid deadlock
  • don't fill the underlying pipe (rate limit)

Protocol safety (only sending/receiving allowed messages from particular states) will be enforced by generating a finite-state machine in the protocol header, and checking sent/received messages and transitions against this state machine.

An actor will just be an instance of a protocol server or client class. We need some way of addressing particular actors across threads and processes. Hopefully the chromium IPC code already supports this. Also, we can easily support process-to-process and thread-to-thread communication over these protocols. The latter would be very useful for debugging, and also useful for writing threaded code that avoids shared memory as much as possible (even when we have multiple processes we will still have multiple threads). Due to its apparent difficulty, this is a low-priority TODO.

Protocol instances will not be thread safe in the usual multi-threaded, shared-memory sense; they will instead be "thread safe" in the sense of HTML5 Worker Threads: share-nothing. We should use these protocols to avoid multi-threaded, shared-memory code.

We will do much of our static checking with the SPIN model checker. Our PDL compiler will know how to generate SPIN models, and additionally, we can use treehydra to create more complicated models from the Mozilla source itself. This could be used to check invariants across protocols in addition to within protocols.