Confirmed users
699
edits
No edit summary |
|||
(7 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
There will be many IPC protocols, and they should all be as error-free as possible by design. Having separate protocols for separate uses is a good thing, and they all shouldn't be multiplexed by hand over a shared transport layer. Protocol implementations should only have to check for errors as infrequently as possible; the heavy lifting should be done generically, or by code generated from high-level specifications. Finally, explicit protocols make automatically generating test cases (of "normal" ''and'' "failure" conditions) much easier. | There will be many IPC protocols, and they should all be as error-free as possible by design. Having separate protocols for separate uses is a good thing, and they all shouldn't be multiplexed by hand over a shared transport layer. Protocol implementations should only have to check for errors as infrequently as possible; the heavy lifting should be done generically, or by code generated from high-level specifications. Finally, explicit protocols make automatically generating test cases (of "normal" ''and'' "failure" conditions) much easier. | ||
This work has the imprimatur of the [[Content Processes|'''Electrolysis''']] project. | |||
== Terminology == | == Terminology == | ||
Line 30: | Line 32: | ||
== Protocol-definition Language (PDL) == | == Protocol-definition Language (PDL) == | ||
A protocol | A protocol is between a ''parent'' actor and a ''child'' actor. A protocol is specified with respect to the parent; that is, messages the parent is allowed to receive are exactly the messages a child is allowed to sent, and ''vice versa''. | ||
''' | A protocol consists of declarations of ''messages'' and specifications of ''state machine transitions''. (This ties us to state-machine semantics.) The message declarations are essentially type declarations for the transport layer, and the state machine transitions capture the semantics of the protocol itself. | ||
A protocol | A protocol is specified by first declaring the protocol | ||
Protocol :: (\epsilon | 'sync' | 'rpc') 'protocol' ProtocolName '{' Messages Transitions '}' | |||
This implies the creation of <code>FooParent</code> and <code>FooChild</code> actors. (Hereafter, this document speaks from the perspective of the <code>FooParent</code>.) | |||
By default, protocols can only exchange asynchronous messages. A protocol must explicitly allow synchronous and RPC (see below) messages by using the <code>sync</code> or <code>rpc</code> qualifiers. This is enforced statically by the PDL type system. | |||
Conceptually (but not necessarily syntactically) next are message definitions. | Conceptually (but not necessarily syntactically) next are message definitions. Messages definitions are somewhat analogous to function signatures in C/C++. Messages can have one of three semantics | ||
* '''asynchronous''': the sending actor does not expect nor listen for a response to the sent message | * '''asynchronous''': the sending actor does not expect nor listen for a response to the sent message | ||
* '''synchronous''': the sending actor is completely blocked until it receives a response | * '''synchronous''': the sending actor is completely blocked until it receives a response | ||
* '''RPC++''': the sending actor is partially blocked until it receives a response to message ''m''. It is only allowed to process RPC++ messages sent by the actor receiving ''m'', direct resulting from the receiving actor receiving ''m''. (This is intended to model function call semantics.) | * '''RPC++''': the sending actor is partially blocked until it receives a response to message ''m''. It is only allowed to process RPC++ messages sent by the actor receiving ''m'', direct resulting from the receiving actor receiving ''m''. (This is intended to model function call semantics.) | ||
Asynchronous messages are the default. The list above is sorted by decreasing simplicity and efficiency; synchronous and RPC++ messages should not be used without a compelling reason. | |||
=== Strawman message grammar === | === Strawman message grammar === | ||
Message :: ( | Message :: (\epsilon | 'sync' | 'rpc') MessageBody ';' | ||
MessageBody :: Type MessageName '(' MessageArguments ')' | |||
MessageArguments :: (MessageArgument ',' | MessageArguments :: (MessageArgument ',')* MessageArgument? | ||
MessageArgument :: Type Identifier | MessageArgument :: Type Identifier | ||
Type :: SharingQualifier BasicType | Type :: SharingQualifier BasicType | ActorHandle | ||
SharingQualifier :: (' | SharingQualifier :: (\epsilon | 'transfer' | 'share') | ||
BasicType :: (' | BasicType :: BuiltinType | ImportedType | ||
ActorHandle :: ('parent' | 'child') 'actor' ProtocolName (':' StateName)? | |||
A few items are worth calling out. | A few items are worth calling out. | ||
SharingQualifiers define transport semantics for objects sent in a message. By default, objects are sent "by value" (i.e., marshalled then unmarshalled). How we classes are marshalled is not a concern of the protocol layer, but very important nonetheless. This is likely to be another security concern. But large objects can also be transported through shared memory. | |||
'''share''' means that the object ''o'' named lives in shared memory, and is co-owned by the | The qualifier '''share''' means that the object ''o'' named lives in shared memory, and is co-owned by the parent and child actors. If the receiving actor does not already co-own ''o'', it does after receiving the message. A lower layer needs to enforce that this is implemented correctly | ||
# ''o'' lives in shared memory | # ''o'' lives in shared memory | ||
# all objects reachable from ''o'' live in shared memory | # all objects reachable from ''o'' live in shared memory | ||
# all accesses to members of ''o'' are synchronized across the client and server | # all accesses to members of ''o'' are synchronized across the client and server | ||
'''transfer''' means that the sending actor owns ''o'', and when the receiving actor receives ''o'', ownership transfers from the sender to the receiver. This means that requirement (3) above is removed for '''transfer''' types. | '''transfer''' means that the sending actor owns ''o'', and when the receiving actor receives ''o'', ownership transfers from the sender to the receiver. This means that requirement (3) above is removed for '''transfer''' types. This is the preferred sharing semantics; '''share''' probably won't be implemented initially. | ||
'''NOTE''': '''share''' and '''transfer''' are optimizations. These don't need to be included in the initial language implementation, but are worth keeping in mind. | |||
A BasicType is a C++ type that can be transferred in a message. We will provide a set of BuiltinTypes like void, int, and so on. Protocol writers can also ''import'' foreign types for which marshall/unmarshall traits are defined, and/or that can be allocated in shared memory. (We may need to support union types for NPAPI.) | |||
''' | An ActorHandle is a handle to an already-created protocol actor; this can be thought of by analogy to <code>sockaddr_t</code>; it's a way to refer to a created actor. The actor is either a ''parent'' or ''child'', and may be required to be in a particular state. '''TODO''': the semantics of this still aren't 100% clear to me. I'll need to work through a concrete example to figure them out. | ||
' | If desired, we can add a notion of "privilege" to the message type system. For example, chrome is more privileged than content, which is more privileged than plugin. With this system, we can cheaply guarantee that privileged objects don't flow to unprivileged actors. | ||
=== Strawman transition grammar === | === Strawman transition grammar === | ||
Transition :: 'state' StateName '{' Actions '}' | Transition :: 'state' StateName '{' Actions '}' | ||
Actions :: (Action ';' | | Actions :: (Action ';')* Action? | ||
Action :: MessageAction | RPCAction | |||
MessageAction :: ('send' | 'rcv') MessageName 'goto' StateName | |||
RPCAction :: ('call' | 'answer') MessageName ('push' StateName)? | |||
'''TODO''': the above grammar may lead to unnecessarily verbose specifications, since there's only one "action" permitted per state transition. We can add additional syntax to reduce verbosity if it becomes a problem. | |||
A transition starts from a particular state (the lexically first state is the start state), and then either sends or receives ("calls" or "answers" for RPC) one of a set of allowed messages. The syntax <code>send MessageName</code> means "send MessageName", and <code>rcv MessageName</code> means "receive MessageName". After the action, the actor then transitions into another state. For RPC, an action causes the current state to be pushed on a stack, then the "push STATE" to be transitioned into. | |||
Unfortunately, the syntax for async/sync messages and RPC calls diverge because the semantics are so different. Sync/async messages only model message passing, whereas RPC models function calls. After a message-passing action occurs, the actor only makes a state transition (<code>goto STATE</code>). However, an RPC action pushes a new state onto an "RPC stack" (<code>push STATE</code>). When the RPC call returns, the "pushed" state is "popped." | |||
'''TODO''': this may be confusing. Any ideas for simplifying it? | |||
We can check almost any property of the protocol specification itself statically, since it's a state machine + well-defined stack. What all of these static invariants should be is not yet known; one invariant is that an asynchronous message can't be nested within a synchronous one. From this static specification, we can generate a C++ dynamic checker to ensure that message processors (code utilizing actors) adhere to the protocol spec. We may be able to check some of this statically as well, but it's harder. | |||
'''TODO''': there are more things we can integrate into the transition grammar, but concrete use cases are necessary. | |||
== Error Handling == | |||
Errors include | |||
* Message type errors: bad data sent | |||
* Protocol errors: e.g., wrong message, wrong state | |||
* Transport errors: e.g., nesting asynchronicity within synchronicity | |||
* Security errors: e.g., trying to send privileged info to plugins. '''TODO''': this is not officially part of the design yet | |||
Error handling is different for parents and children. Child actors are not trusted, and if the parent detects the least whiff of malfeasance, the parent will invoke <code>child.KILLITWITHFIRE()</code>. Code utilizing the child actor therefore will never see type or protocol errors; if they occur, the child is killed. | |||
If the child detects a parent error, there's not much it can do. The child will execute <code>self.SEPPUKU()</code>, but we may wish to have it first attempt to notify the parent of the error. | |||
The parent actor must, however, carefully handle and recover from errors. | |||
'''TODO''' more details | |||
== Deadlocks == | |||
Deadlocks can be caused by bad sequences of synchronous messages (possibly RPC also, but this isn't completely understood yet). Luckily, it is easy and cheap to detect whether sending a synchronous IPC message will immediately cause deadlock.The algorithm is: | |||
Actor: # something that can receive a synchronous message | |||
Multiset blocked = { }; # Actors blocked on this target | |||
synchronousSend(target, msg): | |||
blocked = blocked.union(self) | |||
if target in blocked: | |||
deadlock! | |||
transportLayer.send(blocked.serialize(), msg.serialize()) | |||
blocked = blocked.difference(self) | |||
synchronousReceive(msg): | |||
blocked = blocked.union(msg.blocked) | |||
# [1] handle message | |||
transportLayer.reply() | |||
blocked = blocked.difference(msg.blocked) | |||
Note that the <code>blocked</code> variable needs to be a Multiset, and <code>Multiset.difference()</code> needs to only remove one instance of each element in the differenced set, to make |synchronousReceive()| work correctly. This should be efficient because the |blocked| multiset shouldn't grow very big, and we should be able to identify an IPC target by perhaps a 4-byte ID. So serializing |blocked| is cheap. | |||
Doing a dynamic "may-deadlock" analysis, like what we for locks and monitors, is more expensive. First, we need to keep state that is global across all processes. Second, when calling |synchronousSend()|, we need to send another asynchronous IPC message to whichever process manages the global state. But it's doable, and we can choose the granularity at which we detect possible deadlocks: | |||
# Actor x may deadlock with y | |||
# Protocol x may deadlock with y | |||
# ProtocolMessage x may deadlock with y | |||
Each of these represents a compromise of precision with number of deadlocks detected. Option (1) detects the most deadlocks, but is very imprecise. Option (3) detects the fewest, but is most precise. Option (2) is somewhere in the middle. It may be best to start with option (1), and if there are too many false positives, go down to (2). And again, if there are too many false positives, we can switch to (3). | |||
Doing a static "may-deadlock" analysis appears to be hard, but tractable (for two processes at least). Researchers have done similar things in the past, and it is known approximately what is required. However, we will need some machinery in our static analysis toolkit that currently doesn't exist. | |||
== Implementation == | == Implementation == | ||
Line 104: | Line 155: | ||
The transport layer should hopefully be mainly chromium IPC code. We will likely need to extend this to support objects sent through shared memory, if that feature is desired. | The transport layer should hopefully be mainly chromium IPC code. We will likely need to extend this to support objects sent through shared memory, if that feature is desired. | ||
The PDL compiler will take as input a FooProtocol.pdl source and generate | The PDL compiler will take as input a FooProtocol.pdl source and generate FooProtocolParent.h and FooProtocolChild.h C++ headers. These headers will declare message types for the transport layer, declare protocol states, and define functions that take care of state transitions and invoke message handlers. The headers will leave undefined the ''implementations'' of message handlers; these are provided by code utilizing the protocol actors. (Perhaps through subclassing.) | ||
Message handlers will be responsible for processing the received message (obviously) and also deciding which of the allowed states is transitioned to after processing. The generated protocol header will enforce protocol safety here. Lower-level enforcement of transport-layer properties can be enforced in a shared ProtocolBase class. Hopefully much of this we can check statically. Some of these properties are | |||
* not nesting asynchronous calls within synchronous handlers | |||
* avoid deadlock | |||
* don't fill the underlying pipe (rate limit) | |||
Protocol safety (only sending/receiving allowed messages from particular states) will be enforced by generating a finite-state machine in the protocol header, and checking sent/received messages and transitions against this state machine. | |||
''' | An ''actor'' will just be an instance of a protocol server or client class. We need some way of addressing particular actors across threads and processes. Hopefully the chromium IPC code already supports this. Also, we can easily support process-to-process and thread-to-thread communication over these protocols. The latter would be very useful for debugging, and also useful for writing threaded code that avoids shared memory as much as possible (even when we have multiple processes we will still have multiple threads). Due to its apparent difficulty, this is a low-priority TODO. | ||
Protocol instances will not be thread safe in the usual multi-threaded, shared-memory sense; they will instead be "thread safe" in the sense of HTML5 Worker Threads: share-nothing. We should use these protocols to avoid multi-threaded, shared-memory code. | |||
We will do much of our static checking with the [http://spinroot.com/spin/whatispin.html SPIN model checker]. Our PDL compiler will know how to generate SPIN models, and additionally, we can use treehydra to create more complicated models from the Mozilla source itself. This could be used to check invariants across protocols in addition to within protocols. |