JaegerMonkey: Difference between revisions

<blockquote>This is the coder's badge of glory, That he protect and tend his monkey, Code with honor, as is due, And through the bits to God is true. --damons, IRC </blockquote>
JaegerMonkey (or JägerMonkey) is '''inline threading''' for SpiderMonkey. The goal is to get reliable baseline performance on the order of other JS JIT systems. "Inline threading" really just means a baseline whole-method JIT that doesn't necessarily do many traditional compiler optimizations. Instead, it does dynamic-language-JIT-oriented optimizations like polymorphic inline caches (PICs) and specialization of constant operands.

[https://bugzilla.mozilla.org/show_bug.cgi?id=536277 Bug 536277] is the meta bug for this project.
 
The rest of this wiki page presents our initial development plan.  


= Planning =


== First Deliverable  ==


An inline/call-threaded version of TraceMonkey where:  
Once this is in place, we can then make it faster and faster by adding more optimizations.  


== Status ==
 
We have imported the Nitro assembler and verified that it works with a basic test harness and the beginnings of the compiler.


We have almost finished the JS stack cleanup and simplification. See [https://bugzilla.mozilla.org/show_bug.cgi?id=536275 Bug 536275].  


Work has begun on the compiler. See [https://bugzilla.mozilla.org/show_bug.cgi?id=543637 Bug 543637].


= Design Discussion =


== Initial Design Decisions  ==


The general idea is to do something along the lines of dmandelin's first prototype, Sully's prototype, and Nitro. The important design specifics that we're planning to go with for now are:  
The layout of the unboxed stack will be the same in the interpreter and on trace. To get this, we mostly have to delete the extra fields in JSStackFrame or move them out of band. We will need to reorder a bit too. Once we have that, entering trace requires no work, and to leave trace we just memcpy typemaps into the interpreter's type tag stack.
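
The trace-exit step described above can be sketched as follows. This is only an illustration under stated assumptions: <code>TypeTag</code>, <code>UnboxedStack</code> and <code>leaveTrace</code> are hypothetical names, not SpiderMonkey types or functions, and the one-byte tag encoding is invented for the example. It shows that when the unboxed slots already share one layout, the only work on exit is copying the trace's typemap over the interpreter's tags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical one-byte type tag; the real encoding is not specified here.
enum class TypeTag : uint8_t { Int32, Double, Object, String, Boolean, Undefined };

// Hypothetical stack: unboxed payloads and their type tags live in
// parallel arrays, so traced code and the interpreter can share the payloads.
struct UnboxedStack {
    std::vector<uint64_t> slots;  // unboxed values, same layout in both modes
    std::vector<TypeTag> tags;    // interpreter-side type tags, one per slot
};

// Leaving trace: the payloads are already where the interpreter expects
// them, so only the type tags are copied from the trace's typemap.
void leaveTrace(UnboxedStack& stack, const TypeTag* typemap, std::size_t depth) {
    assert(typemap != nullptr);
    assert(depth <= stack.tags.size());
    std::memcpy(stack.tags.data(), typemap, depth * sizeof(TypeTag));
}
```

Entering trace has no counterpart function in this sketch, because with a shared layout there is nothing to copy.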


== Planned Optimizations  ==


#Fast calls to stub functions. This is based on a trick that Nitro uses. The idea is that stub functions logically have an array parameter or several parameters, which include input jsvals and also interpreter stuff like the sp, fp, cx, etc. Much of this is constant so the call can be made fast by setting up an area in the C stack with all the arguments filled in. To make a call, we just have to store the input jsvals and do a call instruction.  
#PIC. This is really a subset of item 2. In fact, "PIC" is a bit wrong, because as Andreas pointed out, we can start by inlining fast paths that access/guard against the property cache.  
#Eliminate PC update. In an inline-threaded interpreter, we don't need to update the PC, because the native instruction pointer (EIP on x86) encodes that. To enable this, we have to make sure no ops snoop the PC. We also need to help the GC/decompiler by making sure we have some way to provide them a PC (using a mapping or something) on demand.
#Eliminate SP update. Inside basic blocks of JSOPs, we shouldn't need to keep a proper stack. Instead, we can teach the compiler to track which logical stack element is in which register and generate faster code.
#Fast closures. This is important for advanced web apps as well as Dromaeo and the V8 benchmarks. See [https://bugzilla.mozilla.org/show_bug.cgi?id=517164 bug 517164].
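
For the "Eliminate PC update" item, one possible form of the on-demand mapping is a table from native code offsets to bytecode offsets, filled in while compiling and searched only when the GC or decompiler asks for a PC. The sketch below is an assumption about how such a mapping could look, not the planned implementation; <code>PCMap</code>, <code>PCMapEntry</code>, <code>add</code> and <code>lookup</code> are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical table entry: where the native code for one bytecode op starts.
struct PCMapEntry {
    uint32_t nativeOffset;    // offset of the op's first native instruction
    uint32_t bytecodeOffset;  // offset of the op in the script's bytecode
};

// Hypothetical native-offset -> bytecode-offset map, built during compilation.
class PCMap {
public:
    // The compiler calls this once per op, in emission order, so the
    // entries stay sorted by nativeOffset without an explicit sort.
    void add(uint32_t nativeOffset, uint32_t bytecodeOffset) {
        entries_.push_back({nativeOffset, bytecodeOffset});
    }

    // Finds the op whose native code contains nativeOffset. Returns false
    // if the offset lies before the first recorded op.
    bool lookup(uint32_t nativeOffset, uint32_t* bytecodeOffset) const {
        // First entry that starts strictly after the queried offset.
        auto it = std::upper_bound(
            entries_.begin(), entries_.end(), nativeOffset,
            [](uint32_t off, const PCMapEntry& e) { return off < e.nativeOffset; });
        if (it == entries_.begin())
            return false;
        *bytecodeOffset = (it - 1)->bytecodeOffset;
        return true;
    }

private:
    std::vector<PCMapEntry> entries_;
};
```

The lookup is a binary search, so the cost is paid only on the rare paths that need a PC, while the generated code itself never stores one.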