Platform/GFX/textures: Difference between revisions

From MozillaWiki
Revision as of 13:18, 12 September 2013

TextureClient & TextureHost

TextureClient and TextureHost are the way to share texture memory across threads or processes.

For more information, read the inline documentation.

TODO[nical] much more goes here

New vs Deprecated textures

Deprecated texture clients and hosts suffer from a mix of imperfect design decisions (decisions that did not work well in the specific case of B2G's memory model coupled with some of the funky sharing in Gecko) and the pile of quick hacks that had to be done to keep them around. As a result, deprecated texture clients and hosts are hard to reason about and bug-prone on B2G.

In order to incrementally fix the badness of TextureClient and TextureHost, the classes have been marked deprecated and new texture clients and hosts were designed, using a stricter and better-defined memory model. The goal is now to replace all uses of DeprecatedTexture* with the new classes. When writing new code, please use the new classes if possible.

The biggest difference between deprecated and new textures is that deprecated textures did not own any shared data. Think of them as channels through which several SurfaceDescriptors were sent from one side to the other. This caused ownership problems because SurfaceDescriptor has no notion of ownership (it is just an IPDL-generated structure for serialization).

New textures, on the other hand, fully own their shared data. No object should refer to data shared with the compositor except through a TextureClient (or TextureHost). This is important because TextureClient and TextureHost define a strict ownership protocol that is designed to cover all the (numerous) use cases in Gecko. There is always one and only one TextureClient/Host pair per shared buffer. If a new buffer needs to be sent, a new TextureClient/Host pair is created along with it. This makes it possible to track the lifetime of buffers that are shared between several threads on the content side, plus the compositor thread (which lives in a separate process). This was not possible with the deprecated textures, which assumed that we could get away with losing ownership of the buffers when sending them across IPC; that turned out to be incompatible with the way Gecko works for some things like video and buffer rotation.

Why is sharing textures between the content side and the compositor side so complicated?

That is a valid question. If you read code from other open-source compositors and compositor protocols such as SurfaceFlinger and Wayland, you will notice that they have simpler memory ownership models. In the case of SurfaceFlinger, for instance, you have a simple producer/consumer model where ownership is transferred from one side to the other along with the data. Why not do the same in Gecko? Well, we tried, and that's what the deprecated textures should have been. It turned out that, the way Gecko works, we need to share textures with the compositor while still being able to read them from *any* thread on the content side.

Example 1: The video pipeline. When a video frame is produced we try to put it in shared memory as early as possible to avoid making too many copies of the frame before sending it to the compositor. The video frame is then passed to the compositor, but the content side keeps a reference to it, because at any moment we may take a screenshot of the page on the main thread, which needs to access the texture data.

Example 2: Buffer rotation. We do an optimization with (non-tiled) Thebes layers in order to redraw as few pixels as possible. It is called buffer rotation and I won't explain it here, but it requires a synchronization between the front and the back buffer in which we need to copy data from one to the other. So we send the front buffer to the compositor in a transaction, read from it to write into the back buffer on the main thread, and swap. Again, here the texture data is read by both the content side and the compositor side at the same time.

There are other examples of this, but the bottom line is that in order to reduce the number of copies and the amount of memory we consume, we need to share things between a lot of threads simultaneously, and this introduces more complicated ownership problems. It is not possible to fit a clean and simple memory model like SurfaceFlinger's without abandoning some optimizations that we do and without changing some of the things Gecko does that were designed before we thought of compositing on a separate thread.

A notable difference between Gecko's compositing architecture and, say, Android's, is that Gecko's texture sharing model is not taken into account in the way content is produced. Optimizations like producing content directly in shared memory before giving it to the compositing system do not involve the compositing system itself. As a result it is very hard for the compositing system to be smart and take fast paths. Managing the texture memory outside of the compositing system, for instance, causes massive headaches when trying to get things working without race conditions. Sharing the same texture data with several layers becomes hard, etc.

Who owns the shared data with the new textures?

In short: the texture client owns the memory. If the host deallocates some memory, it is always by request of the client.

Accessing the shared data

At any time both the TextureClient and the TextureHost can access the shared data. Accessing the data must be done between calls to Lock and Unlock. In general it is best to design CompositableClients and CompositableHosts so that the TextureClient never acquires a write lock while the TextureHost wants to read, in order to avoid blocking the compositor thread. This can be done by playing with double buffering and synchronous compositable transactions.

Some TextureClient/Host do not implement locking yet. When using them, we must ensure the client never writes while the host reads as above. This can currently be done in two ways:

  • Either force the host to read during synchronous transactions (using the texture flag TEXTURE_IMMEDIATE_UPLOAD)
  • Or completely forbid the client from writing into the texture after it has been shared, by marking the data as immutable (using the texture flag TEXTURE_IMMUTABLE).

Immutable textures are particularly well suited for things like video streams, since the producer will create a new TextureClient for each frame so we know we will not need to write into the same texture twice.
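The locking rules above can be sketched roughly as follows. This is a toy, single-process model and not Gecko's actual API: SketchTextureClient, AutoLockTexture, OpenMode, MarkShared and the flag's bit value are all hypothetical, and only the flag name TEXTURE_IMMUTABLE comes from the text. It shows data access bracketed by Lock/Unlock, and a write lock being refused once an immutable texture has been shared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bit value; only the flag's name comes from the text above.
enum SketchTextureFlags : uint32_t {
  SKETCH_TEXTURE_IMMUTABLE = 1u << 0,
};

enum class OpenMode { Read, Write };

// Toy stand-in for a TextureClient (not the real Gecko class).
class SketchTextureClient {
public:
  SketchTextureClient(uint32_t aFlags, std::size_t aSize)
    : mFlags(aFlags), mData(aSize, 0) {}

  // Returns false when the lock cannot be granted.
  bool Lock(OpenMode aMode) {
    if (mLocked) {
      return false;
    }
    bool immutable = (mFlags & SKETCH_TEXTURE_IMMUTABLE) != 0;
    if (aMode == OpenMode::Write && mShared && immutable) {
      return false;  // No writes once an immutable texture is shared.
    }
    mLocked = true;
    return true;
  }

  void Unlock() {
    assert(mLocked);
    mLocked = false;
  }

  // Called once the texture has been handed to the compositor side.
  void MarkShared() { mShared = true; }

  // The data may only be touched between Lock and Unlock.
  std::vector<uint8_t>& Data() {
    assert(mLocked);
    return mData;
  }

private:
  uint32_t mFlags;
  std::vector<uint8_t> mData;
  bool mLocked = false;
  bool mShared = false;
};

// RAII helper so that every successful Lock is paired with an Unlock.
class AutoLockTexture {
public:
  AutoLockTexture(SketchTextureClient& aTexture, OpenMode aMode)
    : mTexture(aTexture), mSucceeded(aTexture.Lock(aMode)) {}
  ~AutoLockTexture() {
    if (mSucceeded) {
      mTexture.Unlock();
    }
  }
  bool Succeeded() const { return mSucceeded; }

private:
  SketchTextureClient& mTexture;
  bool mSucceeded;
};
```

With this model, a video producer would fill a fresh texture under a write lock, share it, and then only ever take read locks on it; the next frame gets a new texture.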

Deallocating the shared data

This is a trickier aspect. Deallocating shared data depends on which process should be responsible for it. The lifetime of both TextureClient and TextureHost is governed by reference counting.

The TextureClient can be kept alive by:

  • The producer of the data, directly or indirectly (for instance the Image class holding a reference to a TextureClient).
  • The compositable client if the texture is currently in use (front or back buffer).

The TextureHost can be kept alive by:

  • The IPDL glue between TextureClient and TextureHost (so a TextureHost *never* dies before its corresponding TextureClient).
  • The compositable host if the texture is currently in use.

When the reference count of the TextureClient reaches zero, the message OpRemoveTexture is sent to the TextureHost, which removes the IPDL glue that keeps one of the references to the TextureHost. In most cases this will bring the reference count of the TextureHost to zero and the latter is destroyed. Otherwise it will be destroyed later, when the CompositableHost doesn't use it anymore. When the message OpRemoveTexture is sent, there should be nothing capable of accessing the shared data on the client side (since the texture client is dead).

The choice of what to do with the shared data depends on the TextureFlags of the TextureClient/Host pair when the message OpRemoveTexture is received by the host side.
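As a rough illustration of this lifetime rule, here is a minimal single-process sketch that uses std::shared_ptr in place of Gecko's reference counting. SketchBridge stands in for the IPDL glue, and every class and method name below is hypothetical except OpRemoveTexture, which is borrowed from the text; the real message crosses IPC rather than being a direct call.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

// Toy host; records its own destruction so the sketch can be observed.
struct SketchTextureHost {
  explicit SketchTextureHost(bool* aDestroyed) : mDestroyed(aDestroyed) {}
  ~SketchTextureHost() { *mDestroyed = true; }
  bool* mDestroyed;
};

// Stands in for the IPDL glue: it holds one reference to each host.
class SketchBridge {
public:
  void AddTexture(uint64_t aId, std::shared_ptr<SketchTextureHost> aHost) {
    assert(aHost);
    mHosts[aId] = std::move(aHost);
  }
  // In Gecko this would be a message crossing IPC; here it is a plain call.
  void OpRemoveTexture(uint64_t aId) { mHosts.erase(aId); }
  bool Holds(uint64_t aId) const { return mHosts.count(aId) != 0; }

private:
  std::map<uint64_t, std::shared_ptr<SketchTextureHost>> mHosts;
};

// Toy client: when its last reference goes away, it asks the bridge to
// drop the glue reference to the matching host.
class SketchTextureClient {
public:
  SketchTextureClient(uint64_t aId, SketchBridge& aBridge)
    : mId(aId), mBridge(aBridge) {}
  ~SketchTextureClient() { mBridge.OpRemoveTexture(mId); }

private:
  uint64_t mId;
  SketchBridge& mBridge;
};
```

In this model the host cannot die before its client because the bridge keeps a reference until the client is gone, and a host that is still referenced by a compositable outlives the removal message.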

  • 1) TEXTURE_DEALLOCATE_HOST (the most common case): We destroy the shared data on the host side. The client sends the OpRemoveTexture message asynchronously to the host and the host side can destroy the data (or it could decide to keep it alive until the TextureHost is destroyed, but this has no impact on the client side because nothing is holding on to the shared data there).
  • 2) TEXTURE_DEALLOCATE_CLIENT (when the shared data can only be destroyed by the client thread): The message OpRemoveTexture is sent synchronously to the host. When the host receives the message, it lets go of all of its references to the shared data and sends back the reply ReplyTextureRemoved. At the end of the transaction the client side receives this reply with the guarantee that nothing on the host side is holding on to the shared data, and destroys the shared data.
  • 3) Neither of the two flags is set: This means we want the shared data to be managed by something outside of layers. This is the case for the Gonk camera, for instance. Basically this is identical to scenario 2), but the texture is not deallocated upon receipt of ReplyTextureRemoved. In this case it is best to add a mechanism to inform the external system that the buffer is no longer used by layers when ReplyTextureRemoved is received. This has been specifically designed to fix some of the Gonk camera problems that we are having, where both the camera and layers think they own the data at the same time, but the hook has not been implemented yet (TODO update this paragraph when it has been implemented!).

Given how things are evolving with B2G's camera (which is one of the very tricky cases to implement), we may be able to merge the case where TEXTURE_DEALLOCATE_CLIENT is set with the case where neither of the two bits is set. This should simplify a few things. We would then have just one bit telling whether the shared data is deallocated on the client or the host side.
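The three scenarios above amount to a small decision table, which could be sketched as follows. The bit values, the RemovalPlan struct, the Deallocator enum and PlanRemoval are hypothetical names made up for illustration; only the flag names mirror the text. The text does not say what happens if both flags are set, so this sketch simply treats that as a programming error.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit values; only the flag names mirror the text above.
enum SketchTextureFlags : uint32_t {
  SKETCH_TEXTURE_DEALLOCATE_HOST = 1u << 0,
  SKETCH_TEXTURE_DEALLOCATE_CLIENT = 1u << 1,
};

// Who ends up destroying the shared data.
enum class Deallocator { Host, Client, External };

struct RemovalPlan {
  Deallocator mDeallocator;
  // True when the client must wait for ReplyTextureRemoved.
  bool mSynchronous;
};

// Maps the flags of a TextureClient/Host pair to one of the three
// scenarios described above.
inline RemovalPlan PlanRemoval(uint32_t aFlags) {
  bool host = (aFlags & SKETCH_TEXTURE_DEALLOCATE_HOST) != 0;
  bool client = (aFlags & SKETCH_TEXTURE_DEALLOCATE_CLIENT) != 0;
  assert(!(host && client));  // Assumption: the two flags are exclusive.
  if (host) {
    // 1) Asynchronous removal; the host side frees the data.
    return RemovalPlan{Deallocator::Host, false};
  }
  if (client) {
    // 2) Synchronous removal; the client frees the data after the reply.
    return RemovalPlan{Deallocator::Client, true};
  }
  // 3) Same handshake as 2), but something outside layers owns the data.
  return RemovalPlan{Deallocator::External, true};
}
```

If scenarios 2) and 3) were merged as suggested, the External case would collapse into Client and a single bit would be enough.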

Textures and Compositables

While texture clients and hosts abstract out the type of shared memory used, they should not have complex logic concerning how they are used. This kind of logic should go into CompositableClient and CompositableHost.

A compositable pair (client & host) manages one or several textures and implements the logic side of things (such as double buffering or producer/consumer models). There should be different compositable implementations for the different strategies.

To give an analogy with Android's SurfaceTexture, CompositableClient should be the equivalent of ANativeWindow, and a given implementation of TextureClient would be the equivalent of SurfaceTexture and implement the same producer/consumer model, while TextureClient/Host operate at a lower level to just abstract out the type of memory (since the multiple backends require us to support more than just EGLSurface).
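To make the split of responsibilities a bit more concrete, here is a minimal sketch of a double-buffering strategy living in a compositable-like class while the texture stays a dumb buffer. SketchTexture, SketchDoubleBufferedClient, BackBuffer and Swap are hypothetical names, and the sketch ignores IPC and locking entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Toy texture: just memory, with no policy about how it is used.
struct SketchTexture {
  explicit SketchTexture(std::size_t aSize) : mPixels(aSize, 0) {}
  std::vector<uint8_t> mPixels;
};

// Toy compositable client: owns the buffering strategy.
class SketchDoubleBufferedClient {
public:
  explicit SketchDoubleBufferedClient(std::size_t aSize)
    : mFront(std::make_shared<SketchTexture>(aSize)),
      mBack(std::make_shared<SketchTexture>(aSize)) {
    assert(aSize > 0);
  }

  // The producer only ever draws into the back buffer.
  SketchTexture& BackBuffer() { return *mBack; }

  // End of a transaction: the freshly drawn buffer becomes the front
  // buffer (what the host would read) and the old front is recycled.
  std::shared_ptr<const SketchTexture> Swap() {
    std::swap(mFront, mBack);
    return mFront;
  }

private:
  std::shared_ptr<SketchTexture> mFront;
  std::shared_ptr<SketchTexture> mBack;
};
```

A different strategy (single buffering, a producer/consumer queue) would be a different compositable class reusing the same texture type.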

Migration to the new textures (Meta bug 893300)

The idea is to add new compositable classes beside the deprecated ones and to remove the deprecated ones when the new ones are considered stable enough on all platforms. This is an improvement for:

  • Code maintainability (we can remove deprecated code and use the new one that is much easier to maintain)
  • In some cases it opens the door for performance improvement

Port ContentClient/ContentHost compositable classes (Bug 893301)

  • ContentClientSingleBuffered / ContentHostSingleBuffered

Probably the most important followup. The logic should not change, so it should not take "too long" (maybe a week or two).

  • ContentClientDoubleBuffered / ContentHostDoubleBuffered

Should be easy once the SingleBuffered version is implemented (a day or two).

  • ContentClientIncremental / ContentHostIncremental

The logic here needs to be modified a bit; this would probably take a week.

Port TiledContentClient / TiledContentHost (Bug 893303)

The current implementation does its own thing that does not fit even in the deprecated textures model, so porting it to the new textures may require some investigation. The bright side is that it would then work on B2G.

Port CanvasClientWebGL / CanvasHostWebGL (Bug 893304)

Just like the tiled classes, right now it does its own thing and needs some investigation. How much work porting it to the new textures takes depends on whether we want to do it "properly" or do it like we did for the deprecated textures.

Rebase and land the basic and D3D11 backends

New textures have been ported to these backends and even r+'d, but with the rush to get the new textures on B2G I haven't taken the time to land them yet. (See Bug 858914)

Port the D3D9 backend

D3D9 compositing will land soon and has been written for the deprecated texture model. I expect it to be similar to D3D11, so I would be surprised if this took more than a week, but I haven't read the code so I don't know for sure.

Remove all the IPDL messages that are specific to deprecated classes

This is just cleaning up unused code, and will have to wait until all the compositable classes are ported to the new textures and enabled everywhere. This is an improvement for code maintainability: a lot of bad code will be removed.

Other texture-related plans

The items below are things we want to do with new textures, but if we don't do them we will still be on par with the deprecated textures. So I expect them to be lower priority, at this point we are mostly interested in the items above.


Remove GrallocBufferActor (Bug 879681)

GrallocBufferActor is the source of a lot of our gralloc-related problems. The main problem is that it does not have any memory management (it is not reference counted and there is no ownership defined, even though a lot of objects on different threads and processes are holding pointers to it). android::GraphicBuffer, on the other hand, has built-in cross-process reference counting, which is great, except that by wrapping the safely managed GraphicBuffer in an unsafe, unmanaged GrallocBufferActor we completely lose the benefits of GraphicBuffer.

GrallocBufferActor was not just a gratuitous evil decision. It serves one purpose that we should keep in mind when we remove it: when we send a gralloc buffer across IPC we need to do some file descriptor serialization/deserialization that has an overhead. So instead of sending the GraphicBuffer every time, we create it at the same time as the GrallocBufferActor and we use the GrallocBufferActor as a way to pass the buffer through IPC (matching IPDL actors is faster than doing the FD serialization dance).

The most important aspect of this work is to ensure that objects that need to hold references to gralloc buffers hold references to a reference counted object, even if this object wraps the GrallocBufferActor as a first step. This reference counted object would be GrallocTextureClient, because it has a well defined life-time protocol that is flexible enough to cover all the edge cases that we have met so far.

Once all users of gralloc are doing so through GrallocTextureClientOGL, most of the ownership bugs should be fixed, and the hardest part is done, because we then just have to replace GrallocBufferActor with a more generic PTexture protocol that would serve the same purpose for all the other texture types as well.

Implement partial updates for DataTextureSource implementations

This will let us use IncrementalContentHost for non-OpenGL backends. This doesn't have any dependency but will not be useful before IncrementalContentHost is ported to the new textures.

  • DataTextureSourceD3D11
  • DataTextureSourceBasic

Implement cross-process locks for TextureClient/TextureHost (Bug 902169)

Having them would give us more choice in the trade-off between speed and memory usage. They can be implemented in parallel and should be really low priority. This doesn't depend on other items. (Windows already has a CrossProcessLock.)

  • for Linux (for Shmem, shared textures)
  • for OSX (for Shmem, shared textures)

This is an improvement for:

  • flexibility/memory usage: we will be able to use single buffering in places where we can only double buffer if we don't have a lock

Implement TextureClient pools

This is a performance improvement that is important at least for video: rather than allocating/deallocating shared memory every time, recycle TextureClients (along with their shared buffers) and reuse them. We had this feature only with async-video before the almighty layers refactoring, and lost it during the refactoring. If we implement it now it should also work for non-OMTC (thanks to the refactoring). This is an improvement for:

  • performance in places where we tend to allocate and deallocate a lot of buffers (like video)
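A pool of this kind could look roughly like the sketch below. It is only an illustration of the recycling idea: SketchBuffer stands in for a TextureClient plus its shared buffer, and SketchTexturePool, Acquire, Release and the idle cap are hypothetical names and design choices, not an existing Gecko API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for a TextureClient together with its shared buffer.
struct SketchBuffer {
  explicit SketchBuffer(std::size_t aSize) : mBytes(aSize, 0) {}
  std::vector<uint8_t> mBytes;
};

class SketchTexturePool {
public:
  SketchTexturePool(std::size_t aBufferSize, std::size_t aMaxIdle)
    : mBufferSize(aBufferSize), mMaxIdle(aMaxIdle) {}

  // Hands out a recycled buffer when one is idle, else allocates.
  std::unique_ptr<SketchBuffer> Acquire() {
    if (!mIdle.empty()) {
      std::unique_ptr<SketchBuffer> buffer = std::move(mIdle.back());
      mIdle.pop_back();
      return buffer;
    }
    ++mAllocations;
    return std::unique_ptr<SketchBuffer>(new SketchBuffer(mBufferSize));
  }

  // Returns a buffer to the pool once nobody reads from it anymore.
  void Release(std::unique_ptr<SketchBuffer> aBuffer) {
    assert(aBuffer);
    if (mIdle.size() < mMaxIdle) {
      mIdle.push_back(std::move(aBuffer));
    }
    // Otherwise aBuffer is destroyed here, which caps memory usage.
  }

  std::size_t Allocations() const { return mAllocations; }

private:
  std::size_t mBufferSize;
  std::size_t mMaxIdle;
  std::size_t mAllocations = 0;
  std::vector<std::unique_ptr<SketchBuffer>> mIdle;
};
```

In a real implementation, Release would have to wait until the host side is done with the buffer, which ties the pool to the deallocation handshake described earlier.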


Make it possible for several compositables to use the same Texture.

We need to do this by replacing the pair {PCompositable, ID} with a proper PTexture IPDL actor to refer to texture clients and hosts across IPC. This will be a big performance win in certain specific cases, and will let us implement the "Texture atlas" optimization for mobile platforms where we want to reduce the number of draw calls. This doesn't have dependencies, although things will probably become clearer once more compositables are ported. I expect ImageClient/Host to be the only compositable that would use it, at least at first, but I may be wrong. See bug 897452.

This is an improvement for:

  • performance
  • memory usage