IonMonkey/MIR: Difference between revisions
(Created page with "IonMonkey's MIR (Middle-level IR) is an SSA IR. IR nodes are represented as objects, whose class hierarchy is in <tt>ion/MIR.h</tt> File:ionmonkey_mir.png") |
No edit summary |
||
| Line 1: | Line 1: | ||
IonMonkey's MIR (Middle-level IR) is an SSA IR. IR nodes are represented as objects, | IonMonkey's MIR (Middle-level IR) is an SSA IR. IR nodes are represented as objects, and organized into basic blocks, which comprise a control-flow graph. | ||
=Control-Flow Graphs= | |||
''Found in: <tt>ion/MIRGraph.*</tt> | |||
The control-flow graph for MIR is represented via basic blocks (<tt>MBasicBlock</tt>). Basic blocks are collected in a <tt>MIRGraph</tt>, in reverse postorder. | |||
Each basic block has the following information: | |||
*List of SSA phis. | |||
*List of MIR instructions. | |||
*List of predecessors. | |||
*A "last" instruction, which must be a control instruction. | |||
*List of successors. | |||
*Dominator information. | |||
*An entry "snapshot" (described later). | |||
During MIR building, an MBasicBlock also models the interpreter stack, mapping local variables and stack slots to SSA names. | |||
=MIR Nodes= | |||
''Found in: <tt>ion/MIR.*</tt> | |||
MIR objects are organized into the following class hierarchy: | |||
[[File:ionmonkey_mir.png]] | [[File:ionmonkey_mir.png]] | ||
IR nodes are separated into definitions (which provide an SSA name) and snapshots, which are purely informative and described later. | |||
*'''MDefinition''': A node which provides an SSA name. | |||
*'''MPhi''': A pseudo-instruction representing an SSA phi. Phis are placed at the top of basic blocks. They have ''n'' inputs, where ''n'' is the number of predecessors to its containing block. | |||
*'''MInstruction''': An actual instruction that either produces a value, has an effect, or guards on a condition. Instructions may be added to basic blocks. | |||
Definitions have the following information: | |||
*An opcode, describing which class the IR node can be cast to. | |||
*A list of input operands. | |||
*A return type, if the definition returns anything. | |||
*A list of IR nodes which use this definition. Def-use chains are created automatically when adding input operands. | |||
*An instruction ID, assigned incrementally in reverse postorder. | |||
*A snapshot, described later. | |||
=Building MIR= | |||
''Found in: <tt>ion/IonBuilder.*</tt>'' | |||
==Parsing Bytecode== | |||
MIR is currently built directly off the bytecode in a <tt>JSScript</tt>. The entire bytecode is traversed in a single pass. The builder uses abstract interpretation, modelling the stack transitions of each bytecode to break it down into SSA. | |||
The bytecode is traversed as if it were an AST. For example, both arms of an "if" are read before proceeding at the join point. (Note, we can do this in one pass thanks to SpiderMonkey's source notes.) However, join blocks are created ahead of time, and its predecessors are added incrementally. This allows us to lazily place phis, which is necessary when two predecessors have different definitions for the same slot. | |||
Backedges need to be handled specially. If the definition of a slot differ between a loop header and its backedge, a phi is created at the header. Any uses of the original slot in the header, that are dominated by the header, are replaced with the phi. (Also see, [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.4503&rep=rep1&type=pdf "Single Pass Generation of SSA Form for Structured Languages"]). | |||
For example, here is a diagram of MIR generation for the following script. Red indicates key control flow triggers in the bytecode. | |||
<pre> | |||
function f(a) { | |||
var x; | |||
if (a) | |||
x = 5; | |||
else | |||
x = 10; | |||
return x; | |||
}</pre> | |||
[[File:ionmonkey_building_mir.png]] | |||
Timeline: | |||
* The initial block is built, until the builder reaches IFEQ. | |||
* The builder creates blocks for the "true" arm (5-10), the "false" arm (14-19), and the join position, which starts at 20. It then pushes an "if" indicator onto an internal traversal stack, which describes both arms. | |||
* The builder traverses the "true" arm, until it reaches position 11. | |||
* The builder traverses the "false" arm, until it reaches position 19. | |||
* The builder pops the traversal stack, and resumes building MIR into the join block at position 20. | |||
'''Note 1:''' Certain control-flow triggers will automatically terminate traversal for the current control structure. These are "break", "continue", and "return". Bytecode after these are automatically determined to be not reachable. | |||
'''Note 2:''' Copy or constant propagation is NOT ALLOWED during MIR generation. This would break phi replacement, for reasons described in <tt>MIRGraph.cpp</tt>. | |||
==Snapshots== | |||
Snapshots are special IR nodes whose function is at the heart of IonMonkey. When deoptimization occurs, it is necessary that we resume the JavaScript program in another execution mode. This means translating the highly optimized Ion state back into the interpreter. Snapshots capture the information necessary to make this happen. | |||
For a given point in the program, MSnapshots save the SSA names corresponding to each local variable and slot in the expression stack. Snapshots can be taken anywhere, but minimally, '''MUST''' be taken at these critical points: | |||
*At the start of basic blocks. Execution cannot resume before an already-taken control keyword. | |||
*After non-idempotent instructions. For example, calls, or modifications to the heap, are both observable. | |||
This is because when deoptimization occurs, we resume the JavaScript program at the most recently-taken snapshot. If this re-executes idempotent code, that is okay. If it re-executes an observable operation, it could lead to a correctness bug. | |||
Snapshots participate in def-use chains, but are not actual definitions or instructions. Rather, they are attached to instructions or basic blocks. When attached to instructions, they are meant to be captured ''after'' the instruction. '''This means an MInstruction's effects must be atomic.''' There is no facility to resume in the "middle" of an instruction. | |||
Revision as of 05:26, 15 August 2011
IonMonkey's MIR (Middle-level IR) is an SSA IR. IR nodes are represented as objects, and organized into basic blocks, which comprise a control-flow graph.
Control-Flow Graphs
Found in: ion/MIRGraph.*
The control-flow graph for MIR is represented via basic blocks (MBasicBlock). Basic blocks are collected in a MIRGraph, in reverse postorder.
Each basic block has the following information:
- List of SSA phis.
- List of MIR instructions.
- List of predecessors.
- A "last" instruction, which must be a control instruction.
- List of successors.
- Dominator information.
- An entry "snapshot" (described later).
During MIR building, an MBasicBlock also models the interpreter stack, mapping local variables and stack slots to SSA names.
MIR Nodes
Found in: ion/MIR.*
MIR objects are organized into the following class hierarchy:
IR nodes are separated into definitions (which provide an SSA name) and snapshots, which are purely informative and described later.
- MDefinition: A node which provides an SSA name.
- MPhi: A pseudo-instruction representing an SSA phi. Phis are placed at the top of basic blocks. They have n inputs, where n is the number of predecessors to its containing block.
- MInstruction: An actual instruction that either produces a value, has an effect, or guards on a condition. Instructions may be added to basic blocks.
Definitions have the following information:
- An opcode, describing which class the IR node can be cast to.
- A list of input operands.
- A return type, if the definition returns anything.
- A list of IR nodes which use this definition. Def-use chains are created automatically when adding input operands.
- An instruction ID, assigned incrementally in reverse postorder.
- A snapshot, described later.
Building MIR
Found in: ion/IonBuilder.*
Parsing Bytecode
MIR is currently built directly off the bytecode in a JSScript. The entire bytecode is traversed in a single pass. The builder uses abstract interpretation, modelling the stack transitions of each bytecode to break it down into SSA.
The bytecode is traversed as if it were an AST. For example, both arms of an "if" are read before proceeding at the join point. (Note, we can do this in one pass thanks to SpiderMonkey's source notes.) However, join blocks are created ahead of time, and its predecessors are added incrementally. This allows us to lazily place phis, which is necessary when two predecessors have different definitions for the same slot.
Backedges need to be handled specially. If the definition of a slot differ between a loop header and its backedge, a phi is created at the header. Any uses of the original slot in the header, that are dominated by the header, are replaced with the phi. (Also see, "Single Pass Generation of SSA Form for Structured Languages").
For example, here is a diagram of MIR generation for the following script. Red indicates key control flow triggers in the bytecode.
function f(a) {
var x;
if (a)
x = 5;
else
x = 10;
return x;
}
Timeline:
- The initial block is built, until the builder reaches IFEQ.
- The builder creates blocks for the "true" arm (5-10), the "false" arm (14-19), and the join position, which starts at 20. It then pushes an "if" indicator onto an internal traversal stack, which describes both arms.
- The builder traverses the "true" arm, until it reaches position 11.
- The builder traverses the "false" arm, until it reaches position 19.
- The builder pops the traversal stack, and resumes building MIR into the join block at position 20.
Note 1: Certain control-flow triggers will automatically terminate traversal for the current control structure. These are "break", "continue", and "return". Bytecode after these are automatically determined to be not reachable.
Note 2: Copy or constant propagation is NOT ALLOWED during MIR generation. This would break phi replacement, for reasons described in MIRGraph.cpp.
Snapshots
Snapshots are special IR nodes whose function is at the heart of IonMonkey. When deoptimization occurs, it is necessary that we resume the JavaScript program in another execution mode. This means translating the highly optimized Ion state back into the interpreter. Snapshots capture the information necessary to make this happen.
For a given point in the program, MSnapshots save the SSA names corresponding to each local variable and slot in the expression stack. Snapshots can be taken anywhere, but minimally, MUST be taken at these critical points:
- At the start of basic blocks. Execution cannot resume before an already-taken control keyword.
- After non-idempotent instructions. For example, calls, or modifications to the heap, are both observable.
This is because when deoptimization occurs, we resume the JavaScript program at the most recently-taken snapshot. If this re-executes idempotent code, that is okay. If it re-executes an observable operation, it could lead to a correctness bug.
Snapshots participate in def-use chains, but are not actual definitions or instructions. Rather, they are attached to instructions or basic blocks. When attached to instructions, they are meant to be captured after the instruction. This means an MInstruction's effects must be atomic. There is no facility to resume in the "middle" of an instruction.

