IPC Protocols

1,060 bytes added, 22:52, 15 May 2009
== Protocol-definition Language (PDL) ==
A protocol is between a ''parent'' actor and a ''child'' actor. Child actors are not trusted: if there is any evidence that a child is misbehaving, it is terminated. Code utilizing the child actor will therefore never see type or protocol errors; if they occur, the child is killed. The parent actor must, however, carefully handle errors.
A protocol is specified with respect to the parent; that is, the messages the parent is allowed to receive are exactly the messages a child is allowed to send, and ''vice versa''.
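The parent/child symmetry can be illustrated with a minimal Python sketch. It assumes (this is not stated in the text) that each message carries a direction, <code>in</code> or <code>out</code>, read from the parent's point of view; the <code>Message</code> class, the <code>child_view</code> helper and the message names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    name: str
    direction: str  # "in" or "out", from the parent's point of view (assumption)


def child_view(parent_messages):
    """Flip every direction to obtain the child's view of the same protocol."""
    flip = {"in": "out", "out": "in"}
    return [Message(m.name, flip[m.direction]) for m in parent_messages]


# Hypothetical protocol: the parent sends Init and receives Ready.
parent = [Message("Init", "out"), Message("Ready", "in")]
child = child_view(parent)

# What the parent may receive is exactly what the child may send.
parent_receives = {m.name for m in parent if m.direction == "in"}
child_sends = {m.name for m in child if m.direction == "out"}
```

Flipping twice gives back the parent's view, which is why a single specification is enough to describe both actors.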
A protocol consists of declarations of ''messages'' and specifications of ''state machine transitions''. (This ties us to state-machine semantics.) The message declarations are essentially type declarations for the transport layer, and the state machine transitions capture the semantics of the protocol itself.
A protocol is specified by first declaring the protocol:
 Protocol :: (\epsilon | 'sync' | 'rpc') 'protocol' ProtocolName '{' Messages Transitions '}'
For a protocol named <code>Foo</code>, this implies the creation of <code>FooParent</code> and <code>FooChild</code> actors. (Hereafter, this document speaks from the perspective of the <code>FooParent</code>.)
By default, protocols can only exchange asynchronous messages. A protocol must explicitly allow synchronous and RPC (see below) messages by using the <code>sync</code> or <code>rpc</code> qualifiers. This is enforced statically by the PDL type system.
Conceptually (but not necessarily syntactically) next are message definitions. Message definitions are somewhat analogous to function signatures in C/C++. Messages can have one of three semantics:
* '''asynchronous''': the sending actor does not expect nor listen for a response to the sent message
* '''synchronous''': the sending actor is completely blocked until it receives a response
* '''RPC++''': the sending actor is partially blocked until it receives a response to message ''m''. It is only allowed to process RPC++ messages sent by the actor receiving ''m'' that result directly from that actor receiving ''m''. (This is intended to model function call semantics.)
Asynchronous messages are the default. The list above is sorted by decreasing simplicity and efficiency; synchronous and RPC++ messages should not be used without a compelling reason.
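The rule that a protocol must opt in to the more expensive message semantics could be checked along the following lines. This is only a sketch in Python, not the PDL type system: it assumes that the three semantics are ordered (asynchronous, then synchronous, then RPC) and that a protocol qualifier also permits everything simpler than itself, which the text does not state. The unqualified (<code>\epsilon</code>) case is written as <code>"async"</code>, and the message names are hypothetical.

```python
# Assumed ordering: each level also permits the simpler levels below it.
LEVEL = {"async": 0, "sync": 1, "rpc": 2}


def check_protocol(protocol_qualifier, message_qualifiers):
    """Return the names of messages whose semantics the protocol did not opt in to."""
    allowed = LEVEL[protocol_qualifier]
    return [name for name, q in message_qualifiers.items() if LEVEL[q] > allowed]


# Hypothetical message set, mapping message name -> semantics.
messages = {"Paint": "async", "GetSize": "sync", "Invoke": "rpc"}

violations_in_default_protocol = check_protocol("async", messages)
violations_in_sync_protocol = check_protocol("sync", messages)
violations_in_rpc_protocol = check_protocol("rpc", messages)
```

An empty result means the declaration is acceptable; a static checker would report each returned name as an error.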
=== Strawman message grammar ===
 Message :: (\epsilon | 'sync' | 'rpc') MessageBody ';'
 MessageBody :: ('in' | 'out') Type MessageName '(' MessageArguments ')'
 MessageArguments :: (MessageArgument ',')* MessageArgument?
 MessageArgument :: Type Identifier
 Type :: SharingQualifier BasicType
 SharingQualifier :: (\epsilon | 'transfer' | 'share')
 BasicType :: BuiltinType | ImportedType
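As a rough illustration of the argument part of this grammar, the Python sketch below parses a <code>MessageArguments</code> string into (qualifier, type, identifier) triples. It reads the grammar as a comma-separated list with an optional trailing comma, represents the empty SharingQualifier with the hypothetical label <code>"value"</code>, and does not validate type names; <code>parse_arguments</code> is an illustrative name only.

```python
SHARING = {"share", "transfer"}


def parse_arguments(text):
    """Parse 'Type Identifier' pairs separated by commas (trailing comma allowed)."""
    parts = [piece.split() for piece in text.split(",")]
    if parts and not parts[-1]:
        parts.pop()  # empty argument list, or a single trailing comma
    args = []
    for words in parts:
        qualifier = "value"  # hypothetical label for the empty SharingQualifier
        if words and words[0] in SHARING:
            qualifier, words = words[0], words[1:]
        if len(words) != 2:
            raise ValueError(f"expected 'Type Identifier', got {words!r}")
        basic_type, identifier = words
        args.append((qualifier, basic_type, identifier))
    return args


example = parse_arguments("int x, share Blob b")
```

Anything that is not a well-formed argument, such as two consecutive commas, raises <code>ValueError</code>.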
A few items are worth calling out.
SharingQualifiers define transport semantics for objects sent in a message. By default, objects are sent "by value" (i.e., marshalled then unmarshalled). How classes are marshalled is not a concern of the protocol layer, but it is very important nonetheless, and is likely to be another security concern. Large objects can also be transported through shared memory.
The qualifier '''share''' means that the named object ''o'' lives in shared memory, and is co-owned by the parent and child actors. If the receiving actor does not already co-own ''o'', it does after receiving the message. A lower layer needs to enforce that this is implemented correctly:
# ''o'' lives in shared memory
# all objects reachable from ''o'' live in shared memory
# all accesses to members of ''o'' are synchronized across the parent and child
'''transfer''' means that the sending actor owns ''o'', and when the receiving actor receives ''o'', ownership transfers from the sender to the receiver. This means that requirement (3) above is removed for '''transfer''' types. This is the preferred sharing semantics; '''share''' probably won't be implemented initially.
'''NOTE''': '''share''' and '''transfer''' are optimizations. These don't need to be included in the initial language implementation, but are worth keeping in mind.
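The difference between the two qualifiers can be pictured with a toy ownership model in Python. It only tracks which actors own an object; it does not model shared memory or synchronization, and the <code>SharedObject</code> class and <code>send</code> function are hypothetical names introduced for illustration.

```python
class SharedObject:
    """Toy model of an object o whose owners are tracked by actor name."""

    def __init__(self, owner):
        self.owners = {owner}


def send(obj, sender, receiver, qualifier):
    """Update ownership of obj as if it were sent with the given qualifier."""
    if sender not in obj.owners:
        raise PermissionError(f"{sender} does not own the object")
    if qualifier == "share":
        obj.owners.add(receiver)   # sender and receiver now co-own o
    elif qualifier == "transfer":
        obj.owners = {receiver}    # ownership moves to the receiver
    else:
        raise ValueError(f"unknown qualifier: {qualifier}")
    return obj


shared = send(SharedObject("parent"), "parent", "child", "share")
moved = send(SharedObject("parent"), "parent", "child", "transfer")
```

After a '''transfer''' only one actor can touch the object, which is why the synchronization requirement disappears in that case.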
A BasicType is a C++ type that can be transferred in a message. We will provide a set of BuiltinTypes like void, int, and so on. Protocol writers can also ''import'' foreign types for which marshall/unmarshall traits are defined, and/or that can be allocated in shared memory.
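The idea of importing a type by supplying marshall/unmarshall traits can be sketched as a small registry. The sketch is in Python rather than C++, and everything in it (the <code>TRAITS</code> table, <code>import_type</code>, <code>send_by_value</code>, the JSON encoding and the <code>Point</code> type) is an assumption made for illustration, not part of the proposal.

```python
import json

TRAITS = {}  # type name -> (marshal, unmarshal)


def import_type(name, marshal, unmarshal):
    """Register the traits that make a type usable in messages."""
    TRAITS[name] = (marshal, unmarshal)


def send_by_value(type_name, value):
    """Marshal on the sending side, then unmarshal on the receiving side."""
    marshal, unmarshal = TRAITS[type_name]  # KeyError if the type was never imported
    wire_bytes = marshal(value)
    return unmarshal(wire_bytes)


# A builtin-like type and a hypothetical imported type.
import_type("int", lambda v: str(v).encode(), lambda b: int(b.decode()))
import_type(
    "Point",
    lambda p: json.dumps(p).encode(),
    lambda b: tuple(json.loads(b.decode())),
)

received_int = send_by_value("int", 42)
received_point = send_by_value("Point", (3, 4))
```

A type without registered traits cannot be sent by value at all, which mirrors the requirement that traits be defined before a foreign type is imported.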
=== Strawman transition grammar ===
 Transition :: 'state' StateName '{' Actions '}'
 Actions :: (Action ';')* Action?
 Action :: MessageAction | RPCAction
 MessageAction :: ('send' | 'rcv') MessageName 'goto' StateName
 RPCAction :: ('call' | 'answer') MessageName ('push' StateName)?
'''TODO''': the above grammar may lead to unnecessarily verbose specifications, since there's only one "action" permitted per state transition. We can add additional syntax to reduce verbosity if it becomes a problem.
This is a dirt-simple grammar but should capture all we need in a first pass. A transition starts from a particular state (the lexically first state is the start state), and then either sends or receives ("calls" or "answers" for RPC) one of a set of allowed messages. The syntax <code>send MessageName</code> means "send MessageName", and <code>rcv MessageName</code> means "receive MessageName". After the action, the actor transitions into another state. For RPC, an action causes the current state to be saved on a stack, and the actor then enters the state named by <code>push STATE</code>.
Unfortunately, the syntax for async/sync messages and the syntax for RPC calls diverge because the semantics are so different. Sync/async messages only model message passing, whereas RPC models function calls. After a message-passing action occurs, the actor makes a state transition (<code>goto STATE</code>). However, an RPC action pushes a new state onto an "RPC stack" (<code>push STATE</code>). When the RPC call returns, the "pushed" state is "popped."
'''TODO''': this may be confusing. Any ideas for simplifying it?
Transitions only happen when the underlying message operation has "completed." For messages sent asynchronously, this means sent over the wire (resp., received). For messages sent synchronously, this means sent over the wire ''and'' replied to by the other side (resp., received and reply sent).
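The difference between <code>goto</code> and <code>push</code> can be made concrete with a small Python model of an actor's state. It assumes that the top of the RPC stack is the current state and that returning from an RPC call pops it; the <code>Actor</code> class and the state names are hypothetical.

```python
class Actor:
    """Toy model: the top of the RPC stack is the actor's current state."""

    def __init__(self, start):
        self.stack = [start]

    @property
    def state(self):
        return self.stack[-1]

    def goto(self, new_state):
        """Message-passing action: replace the current state."""
        self.stack[-1] = new_state

    def push(self, new_state):
        """RPC action: enter new_state, keeping the current state underneath."""
        self.stack.append(new_state)

    def pop(self):
        """RPC return: drop the pushed state and resume the one below it."""
        if len(self.stack) == 1:
            raise RuntimeError("no RPC call in progress")
        self.stack.pop()


actor = Actor("Idle")
actor.goto("Ready")      # e.g. after a send or rcv action
actor.push("InCall")     # e.g. after a call or answer action
state_during_call = actor.state
actor.pop()              # the RPC call returns
state_after_call = actor.state
```

Because nested calls simply push further states, the stack depth equals the number of RPC calls currently in progress.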
We can check almost any property of the protocol specification itself statically, since it's a state machine plus a well-defined stack. What all of these static invariants should be is not yet known; one invariant is that an asynchronous message can't be nested within a synchronous one. From this static specification, we can generate a C++ dynamic checker to ensure that message processors (code utilizing actors) adhere to the protocol spec. We may be able to check some of this statically as well, but it's harder.
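Both kinds of checking can be sketched for the message-passing (<code>goto</code>) subset of the grammar. The sketch is in Python rather than generated C++, and the transition table, the state and message names, and the particular static invariant shown (every <code>goto</code> target is a declared state) are assumptions chosen for illustration.

```python
# Hypothetical specification: state -> {(action, message): next state}
SPEC = {
    "Start": {("send", "Init"): "Waiting"},
    "Waiting": {("rcv", "Ready"): "Running"},
    "Running": {("send", "Paint"): "Running", ("rcv", "Done"): "Start"},
}


def undefined_targets(spec):
    """Static check: list goto targets that are not declared states."""
    targets = {t for moves in spec.values() for t in moves.values()}
    return sorted(t for t in targets if t not in spec)


class Checker:
    """Dynamic check: reject any action the current state does not allow."""

    def __init__(self, spec):
        self.spec = spec
        self.state = next(iter(spec))  # the lexically first state is the start state

    def step(self, action, message):
        moves = self.spec[self.state]
        if (action, message) not in moves:
            raise RuntimeError(
                f"protocol error: {action} {message} in state {self.state}"
            )
        self.state = moves[(action, message)]


checker = Checker(SPEC)
checker.step("send", "Init")
checker.step("rcv", "Ready")
```

A parent-side checker would presumably treat the raised error as evidence that the child is misbehaving, while the same error on the child side indicates a bug in the child's own code.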
'''TODO''': there are many more things we can integrate into the transition grammar, but concrete use cases are necessary. This should be a main point of discussion.
== Implementation ==