Changes

IPC Protocols

1,060 bytes added, 22:52, 15 May 2009

→‎Protocol-definition Language (PDL)

== Protocol-definition Language (PDL) ==

A protocol ~~has~~ is between a ~~''server''~~ ''parent'' actor and a ~~''client''~~ ''child'' actor. ~~Loosely speaking for IPC~~ Child actors are not trusted, and if there is any evidence the ~~''client'' '''MAY BE'''~~ child is misbehaving, it is terminated. Code utilizing the ~~process that forks~~ child actor therefore will never see type or protocol errors; if they occur, the ~~''server'' process~~ child is killed. ~~Again loosely speaking~~ The parent actor must, however, ~~the ''server'' provides capabilities not inherent to the ''client''~~ carefully handle errors.

A protocol is specified with respect to the parent; that is, messages the parent is allowed to receive are exactly the messages a child is allowed to send, and ''vice versa''. ~~'''TODO''': this isn't a complete definition. Forking a plugin process fits it, but a chrome-local addon to content doesn't.~~

~~A protocol is specified with respect to the server; that is, messages the server is allowed to receive are exactly the messages a client is allowed to sent, and ''vice versa''.~~ A protocol consists of declarations of ''messages'' and specifications of ''state machine transitions''. (This ties us to state-machine semantics.) The message declarations are essentially type declarations for the transport layer, and the state machine transitions capture the semantics of the protocol itself.

A protocol is specified by first declaring the protocol:

~~A protocol is specified by first naming the~~ Protocol :: (\epsilon | 'sync' | 'rpc') 'protocol' ProtocolName '{' Messages Transitions '}'

This implies the creation of <code>FooParent</code> and <code>FooChild</code> actors. ~~protocol Foo {~~ (Hereafter, this document speaks from the perspective of the <code>FooParent</code>.)

~~This implies the creation of a <code>FooServer</code> and <code>FooClient</code>. (Hereafter, this document speaks from the perspective of the <code>FooServer</code>.)~~ By default, protocols can only exchange asynchronous messages. A protocol must explicitly allow synchronous and RPC (see below) messages by using the <code>sync</code> or <code>rpc</code> qualifiers. This is enforced statically by the PDL type system.

Conceptually (but not necessarily syntactically) next are message definitions. ~~What underlying types we allow in these, and with what qualifiers, is likely to be a central topic of debate. Another hot topic will likely be what message semantics we provide; possibilities are~~ Message definitions are somewhat analogous to function signatures in C/C++. Messages can have one of three semantics:

* '''asynchronous''': the sending actor neither expects nor listens for a response to the sent message

* '''synchronous''': the sending actor is completely blocked until it receives a response

* '''RPC++''': the sending actor is partially blocked until it receives a response to message ''m''. While blocked, it is only allowed to process RPC++ messages that are sent by the actor receiving ''m'' and that directly result from that actor receiving ''m''. (This is intended to model function call semantics.)

~~(From e-mail discussions, it appears that we may want RPC++ messages, excluding synchronous messages (since they're a special case of RPC++ messages). However, I'm not convinced RPC++ is necessary, and I'm writing the strawman grammar below assuming we only require synchronous messages.)~~ Asynchronous messages are the default. The list above is sorted by decreasing simplicity and efficiency; synchronous and RPC++ messages should not be used without a compelling reason.

=== Strawman message grammar ===

Message :: (\epsilon | 'sync' | 'rpc') MessageBody ';'

MessageBody :: Type MessageName '(' MessageArguments ')'

MessageArguments :: (MessageArgument ',' ~~| \epsilon~~)* MessageArgument?

MessageArgument :: Type Identifier

Type :: SharingQualifier BasicType

A few items are worth calling out.

* ~~As mentioned above, will SyncMessage be sufficient for us?~~
* ~~SyncMessages have a return type, whereas AsyncMessages don't.~~
* ~~SharingQualifiers are a discussion unto themselves.~~
* SharingQualifiers define transport semantics for objects sent in a message. By default, objects are sent "by value" (i.e., marshalled then unmarshalled). How classes are marshalled is not a concern of the protocol layer, but it is very important nonetheless, and is likely to be another security concern. Large objects can also be transported through shared memory.
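The default "by value" transport can be pictured with a minimal C++ sketch. Everything here is an assumption made for illustration: the <code>Point</code> type, the fixed little-endian wire layout, and the <code>Marshall</code>/<code>Unmarshall</code> names are hypothetical and are not part of the PDL proposal or of any existing Mozilla API. The point is only that the receiver rebuilds its own copy from bytes, and that it should reject malformed input rather than trust the sender.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical message argument type, used only for this sketch.
struct Point {
    std::int32_t x;
    std::int32_t y;
};

// Appends a 32-bit value in little-endian byte order (assumed wire format).
inline void PutU32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<std::uint8_t>((v >> (8 * i)) & 0xFF));
    }
}

// Reads a little-endian 32-bit value starting at byte offset `off`.
inline std::uint32_t GetU32(const std::vector<std::uint8_t>& in, std::size_t off) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        v |= static_cast<std::uint32_t>(in[off + i]) << (8 * i);
    }
    return v;
}

// Sender side: turn the object into bytes.
inline std::vector<std::uint8_t> Marshall(const Point& p) {
    std::vector<std::uint8_t> out;
    PutU32(out, static_cast<std::uint32_t>(p.x));
    PutU32(out, static_cast<std::uint32_t>(p.y));
    return out;
}

// Receiver side: rebuild a private copy. Returns false on a buffer of the
// wrong size instead of reading out of bounds, since the sender is untrusted.
inline bool Unmarshall(const std::vector<std::uint8_t>& in, Point* out) {
    if (in.size() != 8) return false;
    out->x = static_cast<std::int32_t>(GetU32(in, 0));
    out->y = static_cast<std::int32_t>(GetU32(in, 4));
    return true;
}
```

A caller would pass the result of <code>Marshall</code> across the transport and call <code>Unmarshall</code> on the other side; the two objects share no memory afterwards.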

The qualifier '''share''' means that the named object ''o'' lives in shared memory, and is co-owned by the ~~client~~ parent and ~~server~~ child actors. If the receiving actor does not already co-own ''o'', it does after receiving the message. A lower layer needs to enforce that this is implemented correctly:

# ''o'' lives in shared memory

# all objects reachable from ''o'' live in shared memory

# all accesses to members of ''o'' are synchronized across the parent and child

'''transfer''' means that the sending actor owns ''o'', and when the receiving actor receives ''o'', ownership transfers from the sender to the receiver. This means that requirement (3) above is removed for '''transfer''' types. ~~No SharingQualifier (\epsilon) means that that object sent is serialized. How we classes are serialized is probably not a concern of the protocol layer, but very important nonetheless. This is likely to be another security concern.~~ This is the preferred sharing semantics; '''share''' probably won't be implemented initially.
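As a rough in-process analogy for '''transfer''', ownership hand-off can be sketched with <code>std::unique_ptr</code>. This is only an illustration under stated assumptions: <code>Payload</code>, <code>Receiver</code>, and <code>SenderStillOwnsAfterTransfer</code> are hypothetical names, and a real implementation would move an object in shared memory between processes rather than a heap pointer within one process. What carries over is the invariant: after the send, exactly one side can reach ''o'', so no cross-actor synchronization is needed.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical payload type; not part of the PDL proposal.
struct Payload {
    std::string bytes;
};

// Hypothetical receiving actor. It takes the unique_ptr by value, so a
// caller must std::move() into it and gives up ownership by doing so.
class Receiver {
public:
    void Receive(std::unique_ptr<Payload> p) { owned_ = std::move(p); }
    bool Owns() const { return owned_ != nullptr; }
    const std::string& Bytes() const { return owned_->bytes; }

private:
    std::unique_ptr<Payload> owned_;
};

// Sends `p` with transfer semantics and reports whether the sender can
// still reach the object afterwards (it should not be able to).
inline bool SenderStillOwnsAfterTransfer(Receiver& r, std::unique_ptr<Payload> p) {
    std::unique_ptr<Payload> mine = std::move(p);
    r.Receive(std::move(mine));
    return mine != nullptr;  // a moved-from unique_ptr is guaranteed to be null
}
```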

'''NOTE''': '''share''' and '''transfer''' are optimizations. These don't need to be included in the initial language implementation, but are worth keeping in mind.

A BasicType is a C++ type that can be transferred in a message. We will provide a set of BuiltinTypes like void, int, and so on. Protocol writers can also ''import'' foreign types for which marshall/unmarshall traits are defined, and/or that can be allocated in shared memory. ~~'''NOTE''': what <code>BasicType</code> means should be a fruitful topic for discussion.~~

=== Strawman transition grammar ===

Transition :: 'state' StateName '{' Actions '}'

Actions :: (Action ';')* Action?

Action :: MessageAction | RPCAction

MessageAction :: ('send' | 'rcv') MessageName 'goto' StateName

RPCAction :: ('!call' | '?answer') MessageName ('->push' StateName)?

'''TODO''': the above grammar may lead to unnecessarily verbose specifications, since there's only one "action" permitted per state transition. We can add additional syntax to reduce verbosity if it becomes a problem.

~~This is a dirt-simple grammar but should capture all we need in a first pass.~~ A transition starts from a particular state (the lexically first state is the start state), and then either sends or receives ("calls" or "answers" for RPC) one of a set of allowed messages. The syntax <code>send MessageName</code> means "send MessageName", and <code>rcv MessageName</code> means "receive MessageName". ~~The~~ After the action, the actor then transitions into another state. For RPC, an action causes the current state to be pushed on a stack, and the actor then transitions into the state named after <code>push</code>.

~~From a particular state, an actor can either ''only receive'' or ''only send'' messages (we could relax this, but it complicates the implementation). This is extremely easy to check statically (we could make it part of the grammar, too).~~ Unfortunately, the syntax for async/sync messages and RPC calls diverges because the semantics are so different. Sync/async messages only model message passing, whereas RPC models function calls. After a message-passing action occurs, the actor only makes a state transition (<code>goto STATE</code>). However, an RPC action pushes a new state onto an "RPC stack" (<code>push STATE</code>). When the RPC call returns, the "pushed" state is "popped."
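The difference between <code>goto STATE</code> and <code>push STATE</code> can be sketched in a few lines of C++. This is one possible reading, not the proposal's implementation: the <code>ActorState</code> class and its method names are hypothetical, and the sketch assumes that "popping" restores the state the actor was in just before the RPC action.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-actor state tracker; names are illustrative only.
class ActorState {
public:
    explicit ActorState(std::string start) : current_(std::move(start)) {}

    // Message-passing action: `goto STATE` simply replaces the current state.
    void Goto(const std::string& next) { current_ = next; }

    // RPC action: remember where we were, then enter the pushed state.
    void Push(const std::string& next) {
        stack_.push_back(current_);
        current_ = next;
    }

    // RPC return: restore the state saved by the matching Push().
    // Returns false if no RPC is in progress.
    bool Pop() {
        if (stack_.empty()) return false;
        current_ = stack_.back();
        stack_.pop_back();
        return true;
    }

    const std::string& Current() const { return current_; }
    std::size_t Depth() const { return stack_.size(); }

private:
    std::string current_;
    std::vector<std::string> stack_;  // saved states of in-progress RPCs
};
```

Nested RPCs fall out naturally in this model: each <code>Push</code> adds one saved state, and returns unwind them in reverse order, which mirrors function call semantics.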

Transitions only happen when the underlying message operation was "completed." For messages sent asynchronously, this means sent over the wire (resp., received). For messages sent synchronously, this means sent over the wire ~~''and'' replied to by the other side (resp., received and reply sent)~~. '''TODO''': this may be confusing. Any ideas for simplifying it?

~~We can support sending/receiving multiple messages per transition. As this complicates the implementation, it's probably best to add that only when necessary.~~ We can check almost any property of the protocol specification itself statically, since it's a state machine plus a well-defined stack. What all of these static invariants should be is not yet known; one invariant is that an asynchronous message can't be nested within a synchronous one. From this static specification, we can generate a C++ dynamic checker to ensure that message processors (code utilizing actors) adhere to the protocol spec. We may be able to check some of this statically as well, but it's harder.
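To make the idea of a dynamic checker concrete, here is a hand-written C++ sketch of what generated code might look like for the message-passing (<code>send</code>/<code>rcv</code> ... <code>goto</code>) part of a specification. It is an assumption-laden illustration: <code>ProtocolChecker</code>, <code>Direction</code>, <code>Allow</code>, and <code>OnMessage</code> are hypothetical names, the transition table is filled in by hand rather than generated, and RPC actions are left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Whether the checked actor is sending or receiving the message.
enum class Direction { Send, Rcv };

// Hypothetical runtime checker for `state S { send|rcv M goto T }` rules.
class ProtocolChecker {
public:
    explicit ProtocolChecker(std::string start) : state_(std::move(start)) {}

    // Registers one transition: in `from`, `dir` of `msg` leads to `to`.
    void Allow(const std::string& from, Direction dir,
               const std::string& msg, const std::string& to) {
        table_[std::make_tuple(from, dir, msg)] = to;
    }

    // Returns true and advances the state if the action is permitted in the
    // current state; otherwise returns false and leaves the state unchanged.
    bool OnMessage(Direction dir, const std::string& msg) {
        auto it = table_.find(std::make_tuple(state_, dir, msg));
        if (it == table_.end()) return false;  // protocol violation
        state_ = it->second;
        return true;
    }

    const std::string& State() const { return state_; }

private:
    using Key = std::tuple<std::string, Direction, std::string>;
    std::map<Key, std::string> table_;
    std::string state_;
};
```

In the parent, a <code>false</code> result from <code>OnMessage</code> for a received message would be the "evidence the child is misbehaving" described earlier; what happens next (for example, terminating the child) is a policy decision outside this sketch.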

'''TODO''': there are ~~many~~ more things we can integrate ~~here~~ into the transition grammar, but concrete use cases are necessary. ~~This should be a main point of discussion.~~

== Implementation ==

Cgj

