Platform/GFX/textures
TextureClient & TextureHost
TextureClient and TextureHost are the way to share texture memory across threads or processes.
For more information, read the inline documentation:
- http://dxr.mozilla.org/mozilla-central/source/gfx/layers/client/TextureClient.h
- http://dxr.mozilla.org/mozilla-central/source/gfx/layers/composite/TextureHost.h
In order to efficiently render web content we need to be able to paint content into a surface that can be shared and composited directly. It is important to avoid copies, so we often face situations where the surface has to be shared with several threads living in different processes at the same time. As a result, managing the lifetime of the shared data can become tricky. Our shared surfaces need cross-process reference counting and locking semantics. The TextureClient and TextureHost abstractions offer exactly that: TextureClient is the view on the surface for the client process and TextureHost is the view on the surface for the compositor process. There is currently no common interface between TextureClient and TextureHost (they are both reference counted and offer locking semantics, but they don't share a common "SharedTextureObject" base class), because there is currently no code that runs on both the client and host sides. This tends to confuse some people, so by popular demand we'll add a common base class (this could also help us implement main-thread compositing using the Compositor API).
Very important: TextureClient and TextureHost are *NOT* the equivalent of SurfaceTexture. They don't have a swap chain and they don't encapsulate several buffers, only one. TextureClient and TextureHost are the equivalent of android::GraphicBuffer, except that Gecko needs to work with GraphicBuffer, and EGLImage, and shmem, and D3D textures, etc. So TextureClient and TextureHost provide an abstraction that unifies textures backed by gralloc, shmem, etc. If you need to implement a swap chain, the swap chain should manipulate several TextureClient/Host pairs, and not be implemented inside TextureClient.
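The layering described above can be sketched as follows. This is only an illustration under stated assumptions: `SketchTexture` and `SketchSwapChain` are hypothetical names invented for this example, not Gecko classes, and the "shared" buffer is a plain in-process vector. The point is that the texture wraps exactly one buffer, while the swap chain lives outside of it and manipulates several textures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a TextureClient: it wraps exactly ONE buffer.
struct SketchTexture {
  explicit SketchTexture(std::size_t aSize) : mPixels(aSize, 0) {}
  std::vector<std::uint8_t> mPixels;
};

// Hypothetical swap chain: it owns several single-buffer textures and
// decides which one currently plays the front or the back role.
class SketchSwapChain {
public:
  explicit SketchSwapChain(std::size_t aBufferSize)
    : mFront(std::make_shared<SketchTexture>(aBufferSize))
    , mBack(std::make_shared<SketchTexture>(aBufferSize)) {}

  SketchTexture& Back() { return *mBack; }
  const SketchTexture& Front() const { return *mFront; }

  // Swapping only exchanges the roles of the two textures; no pixel is copied.
  void Swap() { std::swap(mFront, mBack); }

private:
  std::shared_ptr<SketchTexture> mFront;
  std::shared_ptr<SketchTexture> mBack;
};
```

A producer would paint into `Back()`, call `Swap()`, and the consumer would then read from `Front()`; none of that logic needs to live inside the texture itself.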
Also very important: TextureClient is *NOT* an implementation detail of gfx/layers/ that should be hidden from the rest of Gecko. Just like android::GraphicBuffer is not an implementation detail of SurfaceFlinger. If you are producing content that needs to be shared with the compositor process without copy, then you should be using one or several TextureClients.
Almost as important: The two important items above are really important, so read them several times.
New vs Deprecated textures
Deprecated texture clients and hosts suffer from a mix of imperfect design decisions (design decisions that did not work well in the specific case of B2G's memory model coupled with some of the funky sharing in Gecko) and the pile of quick hacks that had to be done to keep them around. As a result, deprecated texture clients and hosts are hard to reason about and bug prone (especially on B2G).
In order to incrementally fix the badness of TextureClient and TextureHost, the classes have been marked deprecated and new texture clients and hosts were designed, using a stricter and better-defined memory model. The goal is now to replace all usage of DeprecatedTexture* by the new classes. When writing new code, please use the new classes if possible.
The biggest difference between deprecated and new textures is that deprecated textures did not own any shared data. Look at them as channels through which several SurfaceDescriptors were sent from one side to the other. This caused ownership problems because SurfaceDescriptor doesn't have any notion of ownership (it is just an IPDL-generated structure for serialization). New textures, on the other hand, fully own their shared data. There should be no object referring to data shared with the compositor that is not doing so through a TextureClient (or TextureHost). This is important because TextureClient and TextureHost define a strict ownership protocol that is designed to cover all the (numerous) use cases in Gecko. So there is always one and only one TextureClient/Host pair per shared buffer. If there is a need to send a new buffer, then a new TextureClient/Host pair is created along with it. This way it is possible to track the lifetime of buffers that are shared between several threads on the content side, plus the compositor thread (which lives in a separate process). This was not possible with the deprecated textures, which assumed that we could get away with losing ownership of the buffers when sending them across IPC; that turned out to be incompatible with the way Gecko works for some things like video and buffer rotation.
Why is sharing textures between the content side and the compositor side so complicated?
That is a valid question. If you read code from other open-source compositors and compositor protocols such as SurfaceFlinger and Wayland, you will notice that they have simpler memory ownership models. In the case of SurfaceFlinger for instance, you have a simple producer/consumer model where ownership is transferred from one side to the other along with the data. Why not do the same in Gecko? Well, we tried, and that's what the deprecated textures should have been. It turned out that, the way Gecko works, we need to share textures with the compositor while still being able to read them from *any* thread on the content side.
Example 1: the video pipeline. When a video frame is produced we try to put it in shared memory as early as possible to avoid making too many copies of the frame before sending it to the compositor. The video frame is then passed to the compositor but the content side will keep a reference to it, because at any moment we can, for example, take a screenshot of the page, or decide to synchronously pick the current frame and place it in a canvas. These operations need to access the shared texture data and are done synchronously in the content process while the data is being used by the compositor process. Also, a given video frame (in a shmem) can be set in any order, at any time, to any number of ImageContainers and be, as a result, used by any number of layers. Tricky business.
Example 2: Buffer rotation. We do an optimization with (non tiled) thebes layers in order to redraw as few pixels as possible. It is called buffer rotation and I won't explain it here, but it requires a synchronization between the front and the back buffer in which we need to copy data from one to the other. So we send the front buffer to the compositor in a transaction, then on the main thread we read from it to write into the back buffer, and swap. Again, here the texture data is read by both the content and the compositor side at the same time.
There are other examples of this, but the bottom line is that in order to reduce the number of copies and the amount of memory we consume, we need to share things between a lot of threads simultaneously and this introduces more complicated ownership problems. It is not possible to fit a clean and simple memory model like SurfaceFlinger's without abandoning some of the optimizations that we do and without changing some of the things Gecko does that were designed before we thought of compositing on a separate thread.
A notable difference between Gecko's compositing architecture and, say, Android's, is that Gecko's texture sharing model is not taken into account in the way content is produced. Optimizations like producing content directly in shared memory before giving it to the compositing system do not involve the compositing system itself. As a result it is very hard for the compositing system to be smart and take fast paths. Managing the texture memory outside of the compositing system, for instance, causes massive headaches to get things working without race conditions. Sharing the same texture data with several layers becomes hard, etc.
In short: the texture client owns the memory. If the host deallocates some memory, it is always by request of the client.
At any time both the TextureClient and the TextureHost can access the shared data. Accessing the data must be done between calls to Lock and Unlock. In general it is best to design CompositableClients and CompositableHosts in such a way that the TextureClient never acquires a write lock while the TextureHost wants to read, to avoid blocking the compositor thread. This can be done by playing with double buffering and synchronous compositable transactions.
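A minimal sketch of the Lock/Unlock discipline, assuming a try-lock style API that returns false instead of blocking. `SketchLockableTexture`, `OpenMode` and `AutoLock` are hypothetical names made up for this illustration; the lock here is plain in-process, single-threaded state, not a real cross-process lock. It only shows the rule that data is accessed between Lock and Unlock, and that a writer and a reader exclude each other.

```cpp
#include <cassert>

enum class OpenMode { Read, Write };

// Hypothetical texture with reader/writer locking semantics.
class SketchLockableTexture {
public:
  // Returns false (instead of blocking) when the lock cannot be taken.
  bool Lock(OpenMode aMode) {
    if (mWriter) {
      return false;                 // a writer excludes everyone else
    }
    if (aMode == OpenMode::Write) {
      if (mReaders > 0) {
        return false;               // readers exclude a writer
      }
      mWriter = true;
    } else {
      ++mReaders;
    }
    return true;
  }

  void Unlock(OpenMode aMode) {
    if (aMode == OpenMode::Write) {
      assert(mWriter);
      mWriter = false;
    } else {
      assert(mReaders > 0);
      --mReaders;
    }
  }

private:
  int mReaders = 0;
  bool mWriter = false;
};

// RAII helper: the data may only be touched while Succeeded() is true,
// and Unlock is called automatically when the guard goes out of scope.
class AutoLock {
public:
  AutoLock(SketchLockableTexture& aTexture, OpenMode aMode)
    : mTexture(aTexture), mMode(aMode), mLocked(aTexture.Lock(aMode)) {}
  ~AutoLock() {
    if (mLocked) {
      mTexture.Unlock(mMode);
    }
  }
  AutoLock(const AutoLock&) = delete;
  AutoLock& operator=(const AutoLock&) = delete;

  bool Succeeded() const { return mLocked; }

private:
  SketchLockableTexture& mTexture;
  OpenMode mMode;
  bool mLocked;
};
```

With double buffering, the client would only ever take the write lock on the back buffer, so the host's read lock on the front buffer never fails or blocks.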
Some TextureClient/Host do not implement locking yet. When using them, we must ensure the client never writes while the host reads as above. This can currently be done in two ways:
- Either force the host to read during synchronous transactions (using the texture flag TEXTURE_IMMEDIATE_UPLOAD)
- Or completely forbid the client to write into the texture after it has been shared by marking the data as immutable (using the texture flag TEXTURE_IMMUTABLE).
Immutable textures are particularly well suited for things like video streams, since the producer will create a new TextureClient for each frame so we know we will not need to write into the same texture twice.
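The two strategies can be illustrated with a small bitmask sketch. The enumerators below are hypothetical stand-ins named after the flags mentioned above; their numeric values and the `SketchFlaggedTexture` class are assumptions made for this example and do not reflect the real definitions in Gecko.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flags modelled on TEXTURE_IMMEDIATE_UPLOAD and
// TEXTURE_IMMUTABLE; the numeric values are made up for this sketch.
enum SketchTextureFlags : std::uint32_t {
  SKETCH_NO_FLAGS         = 0,
  SKETCH_IMMEDIATE_UPLOAD = 1u << 0,
  SKETCH_IMMUTABLE        = 1u << 1
};

class SketchFlaggedTexture {
public:
  explicit SketchFlaggedTexture(std::uint32_t aFlags) : mFlags(aFlags) {}

  // Called once the texture has been handed over to the compositor side.
  void MarkShared() { mShared = true; }

  // Strategy 1: the host must read the data during the synchronous
  // transaction, while the client is guaranteed not to be writing.
  bool HostMustReadDuringTransaction() const {
    return (mFlags & SKETCH_IMMEDIATE_UPLOAD) != 0;
  }

  // Strategy 2: an immutable texture rejects any write after sharing.
  bool ClientMayWrite() const {
    const bool immutable = (mFlags & SKETCH_IMMUTABLE) != 0;
    return !(immutable && mShared);
  }

private:
  std::uint32_t mFlags;
  bool mShared = false;
};
```

A video producer would create one immutable texture per frame, fill it, share it, and never touch it again.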
Deallocation is a trickier aspect. Deallocating shared data depends on which process should be responsible for it. The lifetime of both TextureClient and TextureHost is governed by reference counting.
The TextureClient can be kept alive by:
- The producer of the data, directly or indirectly (for instance the Image class holding a reference to a TextureClient).
- The compositable client if the texture is currently in use (front or back buffer).
The TextureHost can be kept alive by:
- The IPDL glue between TextureClient and TextureHost (so a TextureHost *never* dies before its corresponding TextureClient).
- The compositable host if the texture is currently in use.
When the reference count of the TextureClient reaches zero, the message OpRemoveTexture is sent to the TextureHost, which removes the IPDL glue that keeps one of the references to the TextureHost. In most cases this will bring the reference count of the TextureHost to zero and the latter is destroyed. Otherwise it will be destroyed later, when the CompositableHost doesn't use it anymore. When the message OpRemoveTexture is sent, there should be nothing capable of accessing the shared data on the client side (since the texture client is dead).
The choice of what to do with the shared data depends on the TextureFlags of the TextureClient/Host pair when the message OpRemoveTexture is received by the host side.
- 1) The TEXTURE_DEALLOCATE_CLIENT bit is not set (the most common case): we destroy the shared data on the host side. The client sends the OpRemoveTexture message asynchronously to the host and the host side can destroy the data (or it could decide to keep it alive until the TextureHost is destroyed, but this has no impact on the client side because nothing is holding on to the shared data there).
- 2) The TEXTURE_DEALLOCATE_CLIENT bit is set (when the shared data can only be destroyed by the client thread): the message OpRemoveTexture is sent synchronously to the host. When the host receives the message it lets go of all of its references to the shared data and sends back the reply ReplyTextureRemoved. At the end of the transaction the client side receives this reply with the guarantee that nothing on the host side is holding on to the shared data, and destroys the shared data.
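The two cases above can be sketched as a small single-process simulation. Everything here is hypothetical and simplified: `SketchHost`, `SketchSharedData` and `OnClientTextureReleased` are invented for this example, the "messages" are plain function calls rather than IPDL messages, and "destroying" the data just records which side did it.

```cpp
#include <cassert>

enum class Deallocator { None, Host, Client };

// Hypothetical shared data; we only record which side destroyed it.
struct SketchSharedData {
  Deallocator destroyedBy = Deallocator::None;
};

// Hypothetical host side of a texture pair.
class SketchHost {
public:
  explicit SketchHost(SketchSharedData* aData) : mData(aData) {}

  // Case 1: asynchronous removal, the host destroys the shared data.
  void RecvRemoveTextureAsync() {
    mData->destroyedBy = Deallocator::Host;
    mData = nullptr;
  }

  // Case 2: synchronous removal, the host only lets go of its reference.
  // The return value stands in for the ReplyTextureRemoved reply.
  bool RecvRemoveTextureSync() {
    mData = nullptr;
    return true;
  }

  bool HoldsData() const { return mData != nullptr; }

private:
  SketchSharedData* mData;
};

// Runs when the last reference to the client-side texture goes away.
// aDeallocateClient models the TEXTURE_DEALLOCATE_CLIENT bit.
void OnClientTextureReleased(bool aDeallocateClient,
                             SketchHost& aHost,
                             SketchSharedData& aData) {
  if (!aDeallocateClient) {
    aHost.RecvRemoveTextureAsync();   // fire and forget
    return;
  }
  const bool replied = aHost.RecvRemoveTextureSync();
  if (replied && !aHost.HoldsData()) {
    // Only now is it safe for the client to destroy the data.
    aData.destroyedBy = Deallocator::Client;
  }
}
```

The important property in the second case is the ordering: the client destroys the data only after the host has confirmed that it no longer holds any reference.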
Textures and Compositables
While texture clients and host abstract out the type of shared memory used, they should not have complex logic concerning how they are used. This kind of logic should go into CompositableClient and CompositableHost.
A compositable pair (client & host) manages one or several textures and implements the logic side of things (such as double buffering or producer/consumer models). There should be different compositable implementations for the different strategies.
To give an analogy with Android's SurfaceTexture, CompositableClient should be the equivalent of ANativeWindow, and a given implementation of CompositableClient would be the equivalent of SurfaceTexture and implement the same producer/consumer model, while TextureClient/Host operate at a lower level to just abstract out the type of memory (since the multiple backends require us to support more than just EGLSurface).
Planned work
 A8 textures D3D9       A8 textures D3D11
 (bug 940959)           (bug 940959)
     (I)                   (III)
      |                     |
      v                     v
 
 New textures D3D9     New textures D3D11
 (bug 900244)          (bug 938591)
       (II)             (IV)
          \             /      New textures
           \           /       on Linux (XVII)
            v         v            
                                   |   New textures software
        New textures by default    |   OMTC (e10s) (XVIII)
        on windows                 |        
              (V)                  |        |       Tiling with new
                |                  |        |       textures (VIII)
                v                  v        v
                                                         |      Surface stream
        new textures are now the default everywhere!     |      with new textures
                      (VI)                               |             (IX)
          |         \                         \   \      |              |
          |          \    ContentClient/Host   \   \_____|____          |
          |           \   new textures (VII)    \        |    \         |
          |            \        |                \       |     \        |
          v             v       v                 v      v      v       v
 
 Remove deprecated    Remove deprecated     Remove deprecated  Remove deprecated  
 ImageClient/Host     ContentClient/Host    tiling             surface stream
         (X)          Single/Buffered          (XII)           client/host
          |           (XI)                        |              (XIII)
          |             |                         |               |
          v             |     IncrementalContent  |               |
                        |     Client/Host         |               |
 Remove deprecated      |     new textures        |              / 
 YCbCr textures         |         (XV)            |             /
         (XIV)          |           |             |            /
              \         |           |             |           /
               \        |           |             |          /
                v       v           v             v         v
 
                  Remove all the deprecated texture classes (XVI)
 
Port ContentClient/ContentHost compositable classes (Bug 893301)
- ContentClientSingleBuffered / ContentHostSingleBuffered
Probably the most important followup, the logic should not change so it should not take "too long" (maybe a week or two).
- ContentClientDoubleBuffered / ContentHostDoubleBuffered
Should be easy once the SingleBuffered version is implemented (a day or two).
- ContentClientIncremental / ContentHostIncremental
The logic here needs to be modified a bit, would probably take a week.
Port TiledContentClient / TiledContentHost (Bug 893303)
The current implementation does its own thing that does not fit even in the deprecated textures model, so porting it to the new textures may require some investigation. The bright side is that it would then work on B2G. Bas seems to have plans to do that.
Port CanvasClientWebGL / CanvasHostWebGL (Bug 893304)
Just like the tiled classes, right now it does its own thing and needs some investigation. So porting it to new textures depends on whether we want to do it "properly" or do it like we did for the deprecated textures. Sotaro has a plan to get WebGL to use ImageBridge + ImageContainer (so this would use ImageClient, which is already ported to the new textures).
Rebase and land the basic and D3D11 backends
New textures have been ported to these backends and even r+'d, but with the rush to get the new textures on B2G I haven't taken the time to land them yet. (See Bug 858914)
Port the D3D9 backend
D3D9 compositing will land soon and has been made for the deprecated texture model. I expect it to be similar to D3D11, so I would be surprised if this took more than a week, but I haven't read the code so I don't know for sure.
Implement cross-process locks for TextureClient/TextureHost (Bug 902169)
Having them would give us more choice in the speed vs. memory usage trade-off. They can be implemented in parallel and should be really low priority. This doesn't depend on other items. (Windows already has a CrossProcessLock.)
- for Linux (for Shmem, shared textures)
- for OSX (for Shmem, shared textures)
This is an improvement for:
- flexibility/memory usage: we will be able to use single buffering in places where we can only double buffer if we don't have a lock
Implement TextureClient pools
This is a performance improvement that is important at least for video: rather than allocating/deallocating shared memory every time, recycle TextureClients (along with their shared buffers) and reuse them. We had this feature only with async-video before the almighty layers refactoring, and lost it during the refactoring. If we implement it now it should also work for non-OMTC (thanks to the refactoring). This is an improvement for:
- performance in places where we tend to allocate and deallocate a lot of buffers (like video)
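The recycling idea can be sketched as a simple free list. This is an assumption-laden illustration, not a proposed design: `SketchBuffer` and `SketchTexturePool` are hypothetical names, the buffers are ordinary heap memory rather than shared memory, and the pool is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a TextureClient and its shared buffer.
struct SketchBuffer {
  explicit SketchBuffer(std::size_t aSize) : bytes(aSize, 0) {}
  std::vector<std::uint8_t> bytes;
};

// Hypothetical pool: hands out recycled buffers when it has some,
// and only allocates when the free list is empty.
class SketchTexturePool {
public:
  explicit SketchTexturePool(std::size_t aBufferSize)
    : mBufferSize(aBufferSize) {}

  std::unique_ptr<SketchBuffer> Acquire() {
    if (!mFree.empty()) {
      std::unique_ptr<SketchBuffer> buffer = std::move(mFree.back());
      mFree.pop_back();
      return buffer;
    }
    ++mAllocations;
    return std::unique_ptr<SketchBuffer>(new SketchBuffer(mBufferSize));
  }

  // Give a buffer back once nothing uses it anymore.
  void Recycle(std::unique_ptr<SketchBuffer> aBuffer) {
    if (aBuffer) {
      mFree.push_back(std::move(aBuffer));
    }
  }

  std::size_t AllocationCount() const { return mAllocations; }

private:
  std::size_t mBufferSize;
  std::size_t mAllocations = 0;
  std::vector<std::unique_ptr<SketchBuffer>> mFree;
};
```

In a real implementation the hard part is knowing when a texture can be recycled, that is, when neither the content side nor the compositor side is still reading from it; the sketch leaves that decision to the caller of `Recycle`.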
Make it possible for several compositables to use the same Texture.
We need to do this by replacing the pair {PCompositable, ID} by a proper PTexture IPDL actor to refer to texture clients and hosts across IPC. This will be a big performance win in certain specific cases, and will let us implement the "Texture atlas" optimization for mobile platforms where we want to reduce the number of draw calls. This doesn't have dependencies, although we'll probably see more clearly when more compositables are ported. I expect ImageClient/Host to be the only compositable that would use it, at least at first, but I may be wrong. See bug 897452.
This is an improvement for:
- performance
- memory usage
- Fix a broken feature (same video in several layers).
Remove GrallocBufferActor (Bug 879681)
GrallocBufferActor is the source of a lot of our gralloc-related problems. The main problem is that it does not have any memory management (it is not ref counted and there is no ownership defined, even though a lot of objects on different threads and processes are holding pointers to it). android::GraphicBuffer, on the other hand, has built-in cross-process reference counting, which is great, except that by wrapping the safely managed GraphicBuffer in an unsafe non-managed GrallocBufferActor we completely lose the benefits of GraphicBuffer.
GrallocBufferActor was not just a gratuitous evil decision; it serves one purpose that we should keep in mind when we remove it: when we send a gralloc buffer across IPC we need to do some file descriptor serialization/deserialization that has an overhead. So instead of sending the GraphicBuffer every time, we create it at the same time as the GrallocBufferActor and we use the GrallocBufferActor as a way to pass the buffer through IPC (matching IPDL actors is faster than doing the FD serialization dance).
The most important aspect of this work is to ensure that objects that need to hold references to gralloc buffers hold references to a reference counted object, even if this object wraps the GrallocBufferActor as a first step. This reference counted object should be GrallocTextureClient, because it has a well defined life-time protocol that is flexible enough to cover all the edge cases that we have met so far.
Once all users of gralloc are doing so through GrallocTextureClientOGL, most of the ownership bugs should be fixed, and the hardest part is done, because we then just have to replace GrallocBufferActor by a more generic PTexture protocol that would serve the same purpose for all the other texture types as well.
The allocation protocol would be independent from layers (probably live in gfx/ipc) and just use the same ipc channel as layers. This would give us a cleaner design and make the allocator's lifetime not dependent on the ShadowLayerForwarder.
Move TextureClient out of gfx/layers
TextureClient is all about managing the lifetime of shared texture data. This is not specific to layers, even though layers happens to be the only user at the moment. This could involve adding a ShareableSurface interface that would expose reference counting, Lock and Unlock. TextureClient would inherit from it and implement the IPC stuff. This is really just low-priority sugar that aims at getting people comfortable with the idea that using TextureClient does not mean depending on layers, and that TextureClient is not an implementation detail. Maybe what we need is to rename TextureClient to something else.
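One possible shape for such an interface is sketched below. It is purely an assumption about what the proposal could look like: `ShareableSurfaceSketch` and `TextureClientSketch` are hypothetical names, the reference count is not thread-safe, the lock is a plain boolean, and the IPC part that a real TextureClient would add is left out entirely.

```cpp
#include <cassert>

// Hypothetical interface exposing only reference counting, Lock and Unlock.
class ShareableSurfaceSketch {
public:
  void AddRef() { ++mRefCount; }
  void Release() {
    if (--mRefCount == 0) {
      delete this;
    }
  }

  virtual bool Lock() = 0;
  virtual void Unlock() = 0;

protected:
  // Protected: the object may only be destroyed through Release().
  virtual ~ShareableSurfaceSketch() {}

private:
  int mRefCount = 0;   // not atomic; a real version would need to be
};

// Hypothetical implementation; a real TextureClient would add the IPC glue.
class TextureClientSketch : public ShareableSurfaceSketch {
public:
  // aDestroyed is only there so that callers can observe destruction.
  explicit TextureClientSketch(bool* aDestroyed) : mDestroyed(aDestroyed) {}

  bool Lock() override {
    if (mLocked) {
      return false;
    }
    mLocked = true;
    return true;
  }
  void Unlock() override { mLocked = false; }

protected:
  ~TextureClientSketch() override { *mDestroyed = true; }

private:
  bool* mDestroyed;
  bool mLocked = false;
};
```

Code outside of layers would then hold a `ShareableSurfaceSketch*` and never need to know about the layers-specific parts.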
Implement partial updates for DataTextureSource implementations
This will let us use IncrementalContentHost for non-OpenGL backends. This doesn't have any dependency but will not be useful before IncrementalContentHost is ported to the new textures.
- DataTextureSourceD3D11
- DataTextureSourceBasic
Remove all the IPDL messages that are specific to deprecated classes
This is just cleaning up unused code, and will have to wait until all the compositable classes are ported to the new textures and enabled everywhere. This is an improvement for code maintainability: a lot of bad code will be removed.