Some things we might eventually do to exploit multiple CPUs.
I think it's imperative that we not expose a multithreaded model to DOM/JS programmers. That means DOM trees can only be changed synchronously on the "main thread" that runs all JS. (JS/DOM islands that we know can never communicate directly could run on their own threads, but perhaps we should just spawn new Gecko processes in that case, so that they become crashproof as well.)
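To make the main-thread rule concrete, here is a minimal sketch in Python, not Gecko code. Helper threads never touch the tree. They post events to a queue, and the main thread drains the queue and applies each mutation. `Dom`, `helper`, `run_main_loop` and `demo` are hypothetical names invented for this illustration.

```python
import queue
import threading


class Dom:
    """Toy stand-in for a DOM tree; only the creating thread may mutate it."""

    def __init__(self):
        self.owner = threading.get_ident()  # thread that created the tree
        self.nodes = []

    def append(self, node):
        # Enforce the rule: mutation is legal only on the owning (main) thread.
        if threading.get_ident() != self.owner:
            raise RuntimeError("DOM touched off the main thread")
        self.nodes.append(node)


def helper(name, count, events):
    """Hypothetical helper thread: builds nodes, then posts them as events."""
    for i in range(count):
        node = f"{name}-{i}"
        # Post a closure; it runs later, on the main thread.
        events.put(lambda dom, n=node: dom.append(n))
    events.put(None)  # sentinel: this helper has finished


def run_main_loop(dom, events, helpers_running):
    """Main thread: apply queued mutations until every helper has finished."""
    while helpers_running:
        event = events.get()
        if event is None:
            helpers_running -= 1
        else:
            event(dom)


def demo():
    dom = Dom()
    events = queue.Queue()
    workers = [
        threading.Thread(target=helper, args=(f"doc{k}", 3, events))
        for k in range(2)
    ]
    for w in workers:
        w.start()
    run_main_loop(dom, events, len(workers))
    for w in workers:
        w.join()
    return dom.nodes
```

Events from different helpers may interleave, but each helper's own events arrive in the order it posted them, and script running on the main thread never sees a tree change underneath it.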
- Put each parser on its own thread? The thread could receive incoming Necko data, apply content transfer decoding and charset conversion, then parse into a set of DOM elements, which are then passed to the main thread in an event for insertion into the real DOM.
- Image decoding on dedicated threads.
- Reflow. Reflow has to block the main thread because we can't allow outside DOM mutation during reflow. We can make reflow read-only for the DOM (currently we change a few attributes, but that's a mistake), and it's possible to imagine reflowing some frame subtrees in parallel if we can ensure all the reflow helper services are thread-safe. The overhead of that might hurt single-CPU performance, though, unless we're really clever. There may be other reflow spot fixes we can make:
- A lot of reflow time is measuring text. We have DOM text nodes and style data at the beginning of reflow, so we might be able to precompute and cache text metrics in helper threads.
- Rendering. Rendering has to block the main thread because we can't have DOM or frame tree mutation during rendering. Rendering is read-only for the DOM and frame trees and can definitely be parallelized.
- Shaver suggested parallelizing across time so different threads compute different animation frames. That'd be hard because the current animation state is really part of the DOM.
- Having multiple threads doing actual drawing could be problematic because they'll end up contending on the graphics hardware (or its proxy, such as a display server) and/or forcing it to context switch frequently.
- The Linux solution of cairo sending an XRender command stream to Xglx gives some CPU parallelism without GPU contention; maybe we could emulate it on Windows/Mac.
- If we can have multiple drawing threads, then we could parallelize spatially (each thread draws a region of the screen and traverses the whole frame tree) or (probably better) in Z order, so each thread traverses a subtree of the frame tree, draws it into an RGBA buffer, and then we composite the buffers together.
- We could use helper threads to precompute tessellations or use FreeType to rasterize font glyphs for later consumption.
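The Z-order idea above can be sketched roughly as follows. This is an illustration in Python under simplifying assumptions, not Gecko's painting code. Each "subtree" is just a list of axis-aligned rectangles with premultiplied RGBA colors, each worker paints its subtree into a private buffer, and the buffers are then composited bottom to top with the Porter-Duff "over" operator. All names (`draw_subtree`, `composite`, `render`) and the tiny 4x2 canvas are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 4, 2
TRANSPARENT = (0.0, 0.0, 0.0, 0.0)


def over(src, dst):
    """Porter-Duff 'over' for premultiplied RGBA pixels."""
    k = 1.0 - src[3]
    return tuple(s + d * k for s, d in zip(src, dst))


def draw_subtree(rects):
    """Paint one subtree's rectangles, back to front, into a private buffer."""
    buf = [[TRANSPARENT] * WIDTH for _ in range(HEIGHT)]
    for x0, y0, x1, y1, color in rects:
        for y in range(y0, y1):
            for x in range(x0, x1):
                buf[y][x] = over(color, buf[y][x])
    return buf


def composite(layers):
    """Blend the per-subtree buffers together, bottom layer first."""
    out = [[TRANSPARENT] * WIDTH for _ in range(HEIGHT)]
    for layer in layers:
        for y in range(HEIGHT):
            for x in range(WIDTH):
                out[y][x] = over(layer[y][x], out[y][x])
    return out


def render(subtrees):
    """Draw each subtree on a worker thread, then composite in Z order."""
    with ThreadPoolExecutor() as pool:
        # Executor.map returns results in input order, so Z order is kept
        # even if the workers finish out of order.
        layers = list(pool.map(draw_subtree, subtrees))
    return composite(layers)
```

Because "over" is associative on premultiplied pixels, compositing the per-subtree buffers gives the same picture as painting everything serially into one buffer, as long as the subtrees are contiguous in Z order. CPython threads won't actually speed up this pure-Python loop; the sketch only shows the structure, and the extra buffers and the final compositing pass are the price paid for the parallelism.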
One big question is what our heavy CPU workloads will be in the next five years. Will today's pageloads become less of an issue (because pages don't get more complex) or more of an issue (because of increased network bandwidth)? Will CPU time be dominated by rendering complex animations or running 3D physics or really complex layouts or really complex scripts?