From MozillaWiki




As mentioned elsewhere, the graphics system is responsible for rendering (painting, drawing) the frame tree (rendering tree) elements as created by the layout system. Each leaf in the tree has content, bounded either by a rectangle or, in the case of SVG, perhaps by another shape.

The simple approach for producing the result would thus involve traversing the frame tree in the correct order, drawing each frame into the resulting buffer, and displaying that buffer (printing notwithstanding) when the traversal is done. It is worth spending some time on the "correct order" note above. If there are no overlapping frames, this is fairly simple - any order will do, as long as there is no background. If there is a background, we just have to make sure it is drawn first. Since we do not control the content, chances are the page is more complicated. There are overlapping frames, likely with transparency, so we need to make sure the elements are drawn "back to front", in layers, so to speak. Layers are an important concept, and we will revisit them shortly, as they are central to fixing a major issue with the above simple approach - performance.
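To make the "back to front" ordering concrete, here is a minimal sketch of the simple approach in Python. It is not Gecko code: the Frame class, its z field, and the one-character "pixels" are assumptions made purely for illustration, and transparency is left out.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    # Hypothetical stand-in for a leaf of the frame tree.
    name: str
    x: int
    y: int
    w: int
    h: int
    z: int      # assumed stacking order: lower values are further back
    color: str  # one character standing in for the frame's pixels


def paint(frames, width, height, background="."):
    """Draw the background first, then every frame from back to front."""
    buffer = [[background] * width for _ in range(height)]
    for frame in sorted(frames, key=lambda f: f.z):
        for row in range(frame.y, min(frame.y + frame.h, height)):
            for col in range(frame.x, min(frame.x + frame.w, width)):
                # Later (closer) frames overwrite earlier ones.
                buffer[row][col] = frame.color
    return ["".join(row) for row in buffer]


frames = [
    Frame("front", x=2, y=1, w=3, h=2, z=1, color="B"),
    Frame("back", x=0, y=0, w=4, h=2, z=0, color="A"),
]
result = paint(frames, width=6, height=3)
```

Where the two frames overlap, the "front" frame wins regardless of the order in which the frames were listed. Note that every call to paint redraws everything, which is exactly the performance problem discussed next.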

While the above simple approach will work, the performance will suffer. Each time anything changes in any of the frames, everything needs to be repeated, everything needs to be redrawn. Further, we will probably not be taking advantage of the modern graphics (GPU) hardware, or multi-core computers. Finally, since the frame tree is only accessible from the UI thread, the UI is basically blocked while we're doing this work.


** Make the layer tree
** Fill in the leaves of the layer tree content
*** Using Azure/Moz2D API
**** On Android
**** On B2G
**** On Linux
**** On OSX
***** Cairo
***** CoreGraphics (Quartz)
**** On Windows XP
**** On Windows Vista/7
**** On Windows 8
***** Cairo
***** Skia
***** D2D 1.0
***** D2D 1.1
*** Using Thebes API
** Fill in the leaves of the layer tree canvas
*** Using Azure/Moz2D API
*** Using Thebes API
** Composition
*** On the main thread, using the original API
*** Using OMTC
**** On a separate thread
**** Still using the same thread


* TODO: layer boundaries, paint flashing

Getting the data

* TODO: on demand image requests, multi-threaded image loading, off main thread image animation


* TODO: region invalidation, video, streaming buffers

(Retained) Layers

The layers framework was introduced to address the above performance issues, with a part of the design addressing each item. At the high level:

  1. We create a layer tree. The leaf elements of the tree contain all frames (possibly multiple frames per leaf).
  2. We render each layer tree element and cache (retain) the result.
  3. We composite (combine) all the leaf elements into the final result.

Let's examine each of these steps. We will do it in reverse order, because some of the work done in the earlier stages only makes sense once you see how it benefits the later ones.


We use the term composite as it implies that the order is important. If the elements being "composited" overlap, whether there is transparency involved or not, the order in which they are combined will affect the result. Compositing is where we can use some of the power of the modern graphics hardware, which is well suited to this job. In the scenarios where only the position of individual frames changes, without the content inside them changing, we see why caching each layer would be advantageous - we only need to repeat the final compositing step, completely skipping the layer tree creation and the rendering of each leaf, thus speeding up the process considerably.
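The benefit of retaining rendered layers can be sketched as follows. This is a toy model, not the Gecko layers API: RetainedLayer, its render counter, and the use of upper-casing as a stand-in for expensive rasterization are all assumptions for illustration.

```python
class RetainedLayer:
    """Hypothetical layer that caches (retains) its rendered content."""

    def __init__(self, name, content, x=0):
        self.name = name
        self.content = content
        self.x = x              # position used only at composite time
        self._cache = None
        self.render_count = 0

    def set_content(self, content):
        self.content = content
        self._cache = None      # content changed: the cache is invalid

    def rendered(self):
        if self._cache is None:
            self.render_count += 1
            # Stand-in for the expensive rendering step.
            self._cache = self.content.upper()
        return self._cache


def composite(layers, width):
    """Combine the cached layers, in back-to-front order."""
    out = [" "] * width
    for layer in layers:
        for i, ch in enumerate(layer.rendered()):
            if 0 <= layer.x + i < width:
                out[layer.x + i] = ch
    return "".join(out)


background = RetainedLayer("background", "......")
box = RetainedLayer("box", "ab", x=1)

first = composite([background, box], 6)
box.x = 3                                   # only the position changes
second = composite([background, box], 6)
renders_after_move = box.render_count       # still one render
box.set_content("cd")                       # now the content changes
third = composite([background, box], 6)
```

Moving the box only repeats the compositing step; the box is rendered again only after its content changes, and the background is rendered once for all three composites.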

Another benefit is equally apparent in the context of the stated deficiencies of the simple approach. We can use the available hardware accelerated graphics APIs to do the compositing step. Direct3D and OpenGL can be used on different platforms and are well suited to accelerate this step.

Finally, we can envision performing the compositing step on a separate thread, unblocking the UI thread for other work, and doing more work in parallel. More on this below.

It is important to note that the number of operations in this step is proportional to the number of layer tree (leaf) elements, so there is additional work and complexity involved, when the layer tree is large.

Render and retain layer elements

As was mentioned, the compositing step could benefit from the caching of the intermediate results for each layer tree element. This does result in extra memory usage, which needs to be considered during the layer tree creation. Beyond the caching, we can accelerate the rendering of each element by (indirectly) using the available platform APIs (e.g., Direct2D, CoreGraphics, even some of the 3D APIs like OpenGL or Direct3D) as available. This is done through a platform independent API (see Moz2D below), but it is important to realize that the rendering does get accelerated appropriately, depending on the platform.

Creating the layer tree

We need to create a layer tree (from the frame tree), which will give us the correct result while striking the right balance between a layer per frame element and a single layer for the complete frame tree. As was mentioned above, there is an overhead in traversing the whole tree and caching each of the elements, balanced by the performance improvements. Some of the performance improvements are only noticed when something changes (e.g., if the only change is one element moving, we would only need to redo the compositing step).

On the platforms that support OMTC (more on that below), you can visualize this layer breakdown using a preference. Setting "layers.draw-borders" to true (in about:config) will cause the borders of layers and tiles to be displayed on top of the content. On platforms where OMTC is not used, this preference has no effect.

Layer (element) types


Graphics API - Moz2D

Moz2D is a cross platform graphics API designed and implemented as a part of the (ongoing) Azure project. The API was designed as stateless, close to both platform APIs and the hardware. Adding a new graphics platform to Gecko is reduced to implementing a "back end" for Moz2D. As of today, we have a number of supported and experimental back ends: D2D 1.0, D2D 1.1, Direct3D, Skia, SkiaGL, NV Path, possibly others.

Sidenote: the Gecko graphics API preceding Moz2D was "Thebes" and you will still find references in the code to it. Once the Azure project is completed, all the back ends implemented and all the calling code converted to use the new Moz2D API, we will be removing the old "Thebes" API. For historical reasons, there are some classes with Thebes in the name which will be surviving this process, as they're not actually part of the graphics API. They may get renamed eventually, to reduce the number of magical words you need to know about, but be warned - just because it has Thebes in the name, it does not mean that the code is on its way out.

Stateless or stateful?


The Thebes API was a C++ wrapper around the Cairo graphics library. Cairo uses a "stateful context" model much like PostScript or Quartz 2D. To draw content, you make individual API calls to set up various bits of state in a context object, followed by another API call to actually perform drawing. For example, to stroke a shape with a dashed line, you would typically set the color, set the line width, set the dashing style, start a new path, emit path segments, and finally draw the stroke, all as separate API calls. In Cairo, even drawing an image requires the caller to set the source surface, emit a rectangle path, and fill, which takes at least 6 API calls. Cairo uses a stateless surface API internally; whether the underlying API is stateful or stateless, there’s always a conversion from one to the other. Further, the HTML5 Canvas element is a stateful API that’s built on top of Cairo, and it has to have its own state tracking too.
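The stateful model can be illustrated with a toy context object. This is a sketch in Python, not Cairo's real API: the class and method names are assumptions chosen to mirror the sequence of steps described above.

```python
class StatefulContext:
    """Toy stateful drawing context (hypothetical, not Cairo's API)."""

    def __init__(self):
        self.color = "black"
        self.line_width = 1.0
        self.dash = ()
        self.path = []
        self.calls = 0      # counts API calls, for illustration only
        self.strokes = []   # what actually got drawn

    def set_color(self, color):
        self.calls += 1
        self.color = color

    def set_line_width(self, width):
        self.calls += 1
        self.line_width = width

    def set_dash(self, dash):
        self.calls += 1
        self.dash = tuple(dash)

    def new_path(self):
        self.calls += 1
        self.path = []

    def line_to(self, x, y):
        self.calls += 1
        self.path.append((x, y))

    def stroke(self):
        self.calls += 1
        # Drawing picks up whatever state happens to be set right now.
        self.strokes.append(
            (tuple(self.path), self.color, self.line_width, self.dash)
        )
        self.path = []


ctx = StatefulContext()
ctx.set_color("red")
ctx.set_line_width(2.0)
ctx.set_dash([4, 2])
ctx.new_path()
ctx.line_to(0, 0)
ctx.line_to(10, 0)
ctx.stroke()
calls_for_one_stroke = ctx.calls

# A second stroke silently inherits the color, width and dash set earlier.
ctx.new_path()
ctx.line_to(0, 5)
ctx.line_to(10, 5)
ctx.stroke()
```

One dashed line takes seven calls in this sketch, and the second stroke is drawn red and dashed only because nobody reset the state, which is why callers of a stateful API end up resetting state defensively.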

On OS X, Cairo uses its Quartz back end for all drawing. Quartz can be fast: Safari is faster than Firefox on some demos, even though we both use the same back end for drawing. We believe one reason is that Quartz is stateful: Cairo needs to convert from its stateful API to its internal stateless surface API, then back to Quartz’s stateful API. Moz2D was designed to map from stateless to stateful in a more optimal, and optimizable, way.


Almost all the operations on a Moz2D DrawTarget (the nearest equivalent to a drawing context) do actual drawing and take most relevant state as parameters. The only state carried by a DrawTarget is the destination surface itself (of course) plus a current transform and a current clip stack. We let the transform and clip state remain in the DrawTarget because those are the only pieces of state not constantly reset by CSS rendering. Our CSS rendering needs to always render under some given transform and clip, and we don't want all our rendering code to have to pass those around everywhere.
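A minimal sketch of this design, in Python rather than C++, might look like the following. The class only borrows the DrawTarget name; the method names, the StrokeOptions type, and the translation-only transform are assumptions for illustration and not the actual Moz2D signatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrokeOptions:
    # Hypothetical bundle of per-call stroke state.
    line_width: float = 1.0
    dash: tuple = ()


class DrawTarget:
    """Toy mostly-stateless target: only transform and clips are retained."""

    def __init__(self):
        self.transform = (0, 0)   # translation only, to keep the sketch small
        self.clip_stack = []
        self.commands = []        # stands in for the destination surface

    def set_transform(self, dx, dy):
        self.transform = (dx, dy)

    def push_clip_rect(self, rect):
        self.clip_stack.append(rect)

    def pop_clip(self):
        self.clip_stack.pop()

    def stroke_line(self, start, end, color, options=StrokeOptions()):
        # Everything except transform and clip arrives as a parameter.
        dx, dy = self.transform
        self.commands.append((
            "stroke_line",
            (start[0] + dx, start[1] + dy),
            (end[0] + dx, end[1] + dy),
            color,
            options,
            tuple(self.clip_stack),
        ))


dt = DrawTarget()
dt.set_transform(10, 0)
dt.push_clip_rect((0, 0, 100, 100))
dt.stroke_line((0, 0), (5, 5), "red", StrokeOptions(2.0, (4, 2)))
dt.stroke_line((0, 1), (5, 6), "blue")   # no dash or width leaks over
dt.pop_clip()
```

Each drawing call is a single self-describing operation: the second line gets the default stroke options rather than the first line's dash, while both are drawn under the same transform and clip.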

Where are we with Moz2D?

The Thebes API is still actively used, and the Moz2D API is also used directly. The general approach was to design the Moz2D API, implement it with a particular back end and start moving code to use it. In the meantime, a Thebes wrapper for Moz2D was also created, so that the code that uses Thebes eventually ends up in Moz2D - it is then just a question of replacing the "get to Moz2D through Thebes" by direct interaction with Moz2D. Moving from a stateful to a stateless API is the more difficult direction.
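The wrapper idea can be sketched as a stateful facade that forwards to a stateless target. The two classes below are hypothetical stand-ins, not the real Thebes or Moz2D classes; they only show how old and new calling code can end up in the same place during a migration.

```python
class StatelessTarget:
    """Hypothetical stand-in for the new, stateless API."""

    def __init__(self):
        self.commands = []

    def fill_rect(self, rect, color):
        self.commands.append(("fill_rect", rect, color))


class StatefulWrapper:
    """Hypothetical stand-in for the old stateful API, kept as a thin shim."""

    def __init__(self, target):
        self.target = target
        self._color = "black"
        self._rect = None

    def set_color(self, color):
        self._color = color

    def rectangle(self, x, y, w, h):
        self._rect = (x, y, w, h)

    def fill(self):
        # The wrapper gathers its own state and makes one stateless call.
        self.target.fill_rect(self._rect, self._color)
        self._rect = None


target = StatelessTarget()

# Code that has not been converted yet goes through the wrapper.
legacy = StatefulWrapper(target)
legacy.set_color("green")
legacy.rectangle(0, 0, 20, 10)
legacy.fill()

# Converted code talks to the stateless target directly.
target.fill_rect((5, 5, 2, 2), "blue")
```

Converting a call site then amounts to replacing the three wrapper calls with the single direct call, without the back end noticing any difference.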

OMT Everything

You will see a lot of references to OMTC, OMTA, OMTP, etc. The common part is "OMT", and it stands for "off main thread". You can guess what this means - the "main" thread is the one running the UI, getting events, running JavaScript, etc., and it is usually busy enough figuring out what your page needs to look like. Actually producing the rendering of the page should be moved to a different thread, perhaps even made multi-threaded, as much as possible. Not only will that leave more time in the main thread for the operations that need to remain there, but it will also allow us to do things in parallel, and do more work in the same amount of time. So, smoother and faster, at least in the complex cases. There may be some slowdown in very simple cases, and you are most likely using more memory all the time.


The "C" in OMTC stands for Compositing. It refers to the scenario where the compositing step is done in a separate thread, away from the main thread, away from the thread that is creating the content of each layer element. Whether the content thread and the UI thread are the same is irrelevant at this point, as is whether there is only a single content thread or multiple ones.

The original design for compositing did not account for separate threads. The change to OMTC was introduced behind a preference, at different times on different platforms. As such, depending on when you look at the code, you may see OMTC at different levels of completeness for different platforms; it may not exist for a particular platform at all, and even if it does, it may be off by default. While this is important to know in order to make sense of the existing code, it is even more important to understand that the goal is to move to OMTC enabled code on all platforms. Since there may be platforms where OMTC is not faster than compositing on the main thread would be (e.g., no hardware acceleration at all), we are currently maintaining both the OMTC and non-OMTC code paths. This is a burden: it increases code complexity and the chance of bugs, and the goal is to simplify it. We will end up in a scenario where all the platforms are going through the OMTC code path - even if we then decide to have the compositing thread actually be the same as the main thread. This may lead to some performance regressions, so it will not be done without careful consideration.

You may be asking yourself "how can we talk about off main thread compositing being on the main thread?" and you would have a point. In order to run the compositor on the separate thread, we need to put a number of mechanisms in place. At a minimum, we need to have a copy of the layer tree (one for each thread) and a way to synchronize them as efficiently as possible. However, once we have that mechanism in place, we don't actually have to run the compositor on the separate thread. Just that we can.
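The point that the mechanism and the thread are separate decisions can be sketched in a few lines. This is a toy model using Python's threading and queue modules; the Compositor class, its use_thread flag, and the list-of-names "layer tree" are assumptions for illustration, not Gecko's design.

```python
import queue
import threading


class Compositor:
    """Toy compositor that always works from its own copy of the tree."""

    def __init__(self, use_thread):
        self.shadow_tree = None
        self.frames = []
        self._inbox = queue.Queue()
        self._thread = None
        if use_thread:
            self._thread = threading.Thread(target=self._loop, daemon=True)
            self._thread.start()

    def _handle(self, tree):
        self.shadow_tree = tree
        self.frames.append("+".join(tree))   # stand-in for compositing

    def _loop(self):
        while True:
            tree = self._inbox.get()
            if tree is None:                 # shutdown marker
                break
            self._handle(tree)

    def submit(self, tree):
        snapshot = list(tree)                # the synchronization step
        if self._thread is not None:
            self._inbox.put(snapshot)        # composite on the other thread
        else:
            self._handle(snapshot)           # same code path, same thread

    def shutdown(self):
        if self._thread is not None:
            self._inbox.put(None)
            self._thread.join()


def run(use_thread):
    compositor = Compositor(use_thread)
    tree = ["background", "text"]
    compositor.submit(tree)
    tree.append("video")                     # content keeps changing its tree
    compositor.submit(tree)
    compositor.shutdown()
    return compositor.frames


threaded_frames = run(use_thread=True)
inline_frames = run(use_thread=False)
```

Both runs produce the same frames. Once the compositor only ever sees a synchronized copy of the tree, which thread it runs on becomes a configuration choice.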

Moz2D - canvas, content?
  • Android: canvas (), content ()
  • Firefox OS: canvas (), content ()
  • Linux: canvas (), content ()
  • OS X: canvas (), content ()
  • Windows XP: canvas (), content ()
  • Windows 7: canvas (), content ()
  • Windows 8/Metro: canvas (), content ()
  • Moz2D Skia
  • Moz2D SkiaGL
  • Moz2D CoreGraphics (OSX)
  • Moz2D Cairo
  • Moz2D Direct2D 1.0
  • Moz2D Direct2D 1.1
  • Moz2D OpenGL
  • Moz2D Direct3D

  • OMTC Android
  • OMTC FirefoxOS
  • OMTC OS X (Quartz, sw)
  • OMTC Metro (D3D11)
  • OMTC Windows (D3D9, D3D11, sw)
  • OMTC Linux (
  • OMTC OpenGL
  • OMTC SkiaGL

Texture transfer & ownership details

This is about the right time to dig into some of the details.

New layers architecture, with OGL backend

There are a lot of things to consider.

  • We need a copy of a layer tree on both the content and the compositing thread. This is what we call "shadow layers". The terminology refers to the fact that one tree is a shadow of another, in that it should have the same shape and content, after the synchronization is completed.
  • We need a way to synchronize these two trees. To perform this synchronization, we use the IPDL IPC framework. IPDL lets us describe communications protocols and actors in a domain specific language. For more information about IPDL, read
  • IPDL has a concept of "host" and "client", as you may imagine, which leads to a naming convention inside of the OMTC code, and the layers system in particular. The "host" refers to the layer tree on the compositor thread, and the "client" refers to the layer tree on the content thread. This is important to remember, as most of the C++ classes implementing the layers use "host" or "client" in their names.

The details

When the client layer tree is modified, we record modifications into messages that are sent together to the host side within a transaction. A transaction is basically a list of edits to the layer tree that have to be applied all at once to preserve consistency in the state of the host layer tree.
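A transaction of this kind can be sketched as follows. This is a simplified Python model, not IPDL or the real layers classes: ClientTree, HostTree, and the tuple-shaped edits are assumptions made for illustration.

```python
class HostTree:
    """Toy host-side (compositor) copy of the layer tree."""

    def __init__(self):
        self.layers = {}

    def apply(self, edits):
        # Work on a staged copy so the visible tree changes all at once.
        staged = dict(self.layers)
        for op, name, value in edits:
            if op == "set":
                staged[name] = value
            elif op == "remove":
                del staged[name]
            else:
                raise ValueError(f"unknown edit: {op}")
        self.layers = staged


class ClientTree:
    """Toy client-side (content) tree that records its own modifications."""

    def __init__(self, host):
        self.host = host
        self.layers = {}
        self._pending = []

    def set_layer(self, name, value):
        self.layers[name] = value
        self._pending.append(("set", name, value))

    def remove_layer(self, name):
        del self.layers[name]
        self._pending.append(("remove", name, None))

    def end_transaction(self):
        # Send every recorded edit together, then start a fresh list.
        edits, self._pending = self._pending, []
        self.host.apply(edits)


host = HostTree()
client = ClientTree(host)
client.set_layer("root", "container")
client.set_layer("video", "image")
client.remove_layer("video")
client.set_layer("text", "painted")
host_before = dict(host.layers)      # host has seen nothing yet
client.end_transaction()
```

The host never observes the intermediate state in which the "video" layer existed; it goes from the empty tree straight to the state the client had when the transaction ended.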

Texture transfer is done by synchronizing texture objects across processes/threads using a TextureClient/TextureHost pair that wraps shared data and provides access to it on both sides. A TextureHost provides access to one or several TextureSource objects, which have the necessary API to actually composite the texture (TextureClient and TextureHost being more about IPC synchronization than actual compositing). The logic behind texture transfer (as in single/double/triple buffering, texture tiling, etc.) is in the CompositableClient and CompositableHost classes.

It is important to understand the separation between layers and compositables. Compositables handle all the logic around texture transfer, while layers define the shape of the layer tree. A compositable is created independently from a layer and attached to it on both trees. While layers transactions can only originate in the content (client) thread, this separation makes it possible for us to have separate compositable transactions between any thread and the compositor thread. The ImageBridge IPDL protocol is used to that end. The idea of ImageBridge is to create a Compositable that is manipulated on the ImageBridgeThread, and that can transfer/synchronize textures without using the content thread at all. This is very useful for smooth video compositing: video frames are decoded and passed into the ImageBridge without ever touching the content thread, which could be busy processing reflows or heavy JavaScript workloads.
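The separation between the shape of the tree and the textures flowing into it can be sketched like this. It is a toy model and not the ImageBridge protocol: the class names echo the text, but their methods, the lock, and the decoder thread are assumptions for illustration.

```python
import threading


class CompositableHost:
    """Toy holder for the most recent texture of one compositable."""

    def __init__(self):
        self._lock = threading.Lock()
        self._texture = None

    def update(self, texture):
        with self._lock:
            self._texture = texture

    def current(self):
        with self._lock:
            return self._texture


class HostLayer:
    """Toy layer: defines the tree's shape and points at a compositable."""

    def __init__(self, name, compositable):
        self.name = name
        self.compositable = compositable


def composite(layers):
    return [(layer.name, layer.compositable.current()) for layer in layers]


page = CompositableHost()
video = CompositableHost()
page.update("page pixels")           # would come from a layers transaction
layers = [HostLayer("page", page), HostLayer("video", video)]


def decode(frame_count):
    # Stand-in for a decoder thread feeding frames straight to the compositor.
    for i in range(frame_count):
        video.update(f"frame {i}")


decoder = threading.Thread(target=decode, args=(3,))
decoder.start()
decoder.join()
frame = composite(layers)
```

The layer list never changes while the video plays; only the video compositable's texture does, and nothing on the content side has to run for the newest frame to reach the screen.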

It is worth reading the inline code documentation in the following files:

More on compositing

The above details give you an idea as to why the work needs to be repeated on each platform. The communication between threads or processes, the most efficient way to transfer the information between threads (or processes) all come into the picture.

* TODO This is where the "where are we with OMTC" should go

This is the status of off main thread compositing on different platforms

Android FirefoxOS Linux Mac Windows
  • Item
  • Item
  • Item
  • Item
  • [ON TRACK] Vista/7
  • [ON TRACK] 8
  • [ON TRACK] Vista/7
  • [ON TRACK] 8
Basic (sw)
  • [DONE] XP
  • [DONE] Vista/7
  • [DONE] 8

Image Decoding

Image Animation

Future musings

GL over IPC


Funny words

There are a lot of code words that we use to refer to projects, libraries, areas of the code. Here's an attempt to clear up some of the confusion.

  • Azure - See Moz2D in the Graphics API section above.
  • Backend - See Moz2D in the Graphics API section above.
  • Cairo - Cairo is a 2D graphics library with support for multiple output devices. Currently supported output targets include the X Window System (via both Xlib and XCB), Quartz, Win32, image buffers, PostScript, PDF, and SVG file output.
  • Moz2D - See Moz2D in the Graphics API section above.
  • Thebes - Graphics API that preceded Moz2D.

Historical Documents

A number of posts and blogs that will give you more details or more background, or reasoning that led to different solutions and approaches.