Multiprocess Images

From MozillaWiki

Overview

The Electrolysis project is in full swing, so we should soon be able to run each tab in a separate content process. This impacts everything that is normally shared among tabs, and the image cache is no exception. Currently, all requests for images from content-land go through the shared imgLoader. The imgLoader owns the actual imgRequest, and it hands out imgRequestProxies that act as handles for interested callers. Depending on whether the image data is in the cache, these handles point to imgContainers that contain all, some, or none of the data (in the latter two cases, the containers are filled asynchronously with data from the network as it arrives).

A simple solution for images in a content-process world is to run a separate imgLoader (requests + cache) in each content process. This is undesirable, because it means we can end up downloading and decoding the same image multiple times unnecessarily (TODO-elaborate).

Some of these effects may be ameliorated by the necko-layer cache. If someone with a good understanding of how that works wants to weigh in, your input would be appreciated. Furthermore, these effects are highly dependent on browsing patterns. This point is crucial, because the effects won't show up on Tp if pages are loaded sequentially in the same content process. We should investigate a way to measure performance in this regard, perhaps with the help of browsing pattern statistics from Test Pilot.

A better solution involves sharing the imgLoader, in effect creating an "image server" that the content processes would interface with over IPC to do the image loading. This could run either in the chrome process or in an entirely separate process, and that choice still needs to be made. Since all the networking happens in the chrome process, it would probably make the most sense to put the server there, so that incoming data doesn't have to jump processes twice. However, it's possible that the security advantages, or the performance gains on multicore CPUs, of decoding in a separate process could outweigh this.

Goals

  • Share image decoding and caching between all content processes
  • Disrupt the API of requesting and accessing images as little as possible (preferably not at all). This will mean more significant changes on the libpr0n side, but will allow us to largely localize and encapsulate the disruption.
  • Negligible or no performance regression for single-tab browsing when moving from the current implementation to the image server implementation
  • The image server should provide the clients with data in an asynchronous manner so that clients are never blocked on the server

Proposed Implementation Strategy

We use the imgLoader as the primary interface and bridge between the clients and the server. The imgLoader class proper remains in client space to keep the interface sane, but much of the functionality within it, notably the cache, gets moved to server space. imgRequests sit squarely in server space. imgRequestProxies have their functionality split, but mostly reside in client space. To be more specific, the two sides of the imgLoader are outlined below.

imgLoader (client-side)

  • Maintains the imgILoader interface, with the primary function call being loadImage.
  • Calls to loadImage send an asynchronous message to the server and do not block. The function creates an imgRequestProxy and returns.
  • The loader sets up the relevant callbacks so that asynchronous messages from the image server can fill in the appropriate pieces of data in the imgRequestProxy
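The client-side behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual imgLoader code: ClientLoaderSketch, RequestProxySketch, LoadMessage, and the OnSize/OnData/OnComplete callbacks are hypothetical names, and the "send" function stands in for whatever asynchronous IPC channel is eventually used.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an imgRequestProxy: a client-side handle whose
// fields are filled in as messages arrive from the image server.
struct RequestProxySketch {
    std::string uri;
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> data;  // image bytes received so far
    bool complete = false;
};

// Hypothetical client-to-server message.
struct LoadMessage {
    std::uint64_t requestId;
    std::string uri;
};

// Hypothetical client-side loader. "send" is assumed to enqueue the message
// on an asynchronous channel and return without waiting for a reply.
class ClientLoaderSketch {
public:
    explicit ClientLoaderSketch(std::function<void(const LoadMessage&)> send)
        : mSend(std::move(send)) {}

    // Sends the request and returns an empty proxy immediately.
    std::shared_ptr<RequestProxySketch> LoadImage(const std::string& uri) {
        auto proxy = std::make_shared<RequestProxySketch>();
        proxy->uri = uri;
        const std::uint64_t id = mNextId++;
        mPending[id] = proxy;
        mSend(LoadMessage{id, uri});
        return proxy;
    }

    // Callbacks run when server messages arrive; each fills in one piece
    // of the proxy. Messages for unknown ids are ignored.
    void OnSize(std::uint64_t id, int width, int height) {
        auto it = mPending.find(id);
        if (it == mPending.end()) return;
        it->second->width = width;
        it->second->height = height;
    }

    void OnData(std::uint64_t id, const std::vector<std::uint8_t>& chunk) {
        auto it = mPending.find(id);
        if (it == mPending.end()) return;
        auto& data = it->second->data;
        data.insert(data.end(), chunk.begin(), chunk.end());
    }

    void OnComplete(std::uint64_t id) {
        auto it = mPending.find(id);
        if (it == mPending.end()) return;
        it->second->complete = true;
        mPending.erase(it);  // the caller still holds its own reference
    }

private:
    std::function<void(const LoadMessage&)> mSend;
    std::map<std::uint64_t, std::shared_ptr<RequestProxySketch>> mPending;
    std::uint64_t mNextId = 1;
};
```

The caller gets a usable handle before any data exists, which mirrors how an imgRequestProxy can already point at a partially filled container today.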

imgServer (server-side)

  • Initialized and shut down by the chrome process (if the image server is a separate process, this would involve process creation and termination)
  • Listens for client processes to register and de-register themselves (this process may be able to piggyback on existing chrome-content IPC if we put the server in the chrome process)
  • Has an IPC channel with each client (see IPC Protocols)
  • Has all of the cache code currently in imgLoader
  • Receives asynchronous messages from the client-side imgLoader, inspects the cache, and either fires off a fully-fledged imgRequest or retrieves it from the cache
  • Attaches imgClientProxies to the requests, in the manner currently done with imgRequestProxies, in order to track which clients are interested in what
    • Do we like the name imgClientProxy? Does it suggest that the structure is a proxy for the client (as intended) or that it sits in client space (not true)? Thoughts?
  • As data becomes available for a given imgRequest, send asynchronous IPC messages to all interested clients with the relevant data.
    • This is probably best accomplished with an 'incrementally-updated-mirror' approach, where the server builds the relevant data structures (imgContainer, gfxImageFrame, nsImage, etc.) with a special flag set so that each change updates the local data structure and also pushes the change across the pipe.
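The server-side flow above (cache lookup, interest tracking, and mirroring each change to interested clients) might look something like the sketch below. Everything here is an assumption made for illustration: ImageServerSketch, Update, ClientId, and the handler names are hypothetical, a std::set of client ids stands in for the imgClientProxies, and "push" stands in for the asynchronous server-to-client IPC channel.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using ClientId = int;  // hypothetical identifier for a content process

// Hypothetical server-to-client message carrying one incremental change.
struct Update {
    ClientId client;
    std::string uri;
    std::vector<std::uint8_t> chunk;
};

class ImageServerSketch {
public:
    explicit ImageServerSketch(std::function<void(const Update&)> push)
        : mPush(std::move(push)) {}

    // Handles a load message. Returns true on a cache miss, i.e. when a
    // real network request would have to be started for this URI.
    bool OnLoadImage(ClientId client, const std::string& uri) {
        const bool miss = (mCache.count(uri) == 0);
        Entry& entry = mCache[uri];
        entry.interested.insert(client);
        // A client joining late is brought up to date with what we have.
        if (!entry.data.empty()) mPush(Update{client, uri, entry.data});
        return miss;
    }

    // Called as data arrives: update the local copy first, then mirror
    // the same change to every interested client.
    void OnDataAvailable(const std::string& uri,
                         const std::vector<std::uint8_t>& chunk) {
        auto it = mCache.find(uri);
        if (it == mCache.end()) return;
        Entry& entry = it->second;
        entry.data.insert(entry.data.end(), chunk.begin(), chunk.end());
        for (ClientId client : entry.interested) {
            mPush(Update{client, uri, chunk});
        }
    }

    // Called when a client de-registers or its process goes away.
    void OnClientGone(ClientId client) {
        for (auto& kv : mCache) kv.second.interested.erase(client);
    }

private:
    struct Entry {
        std::vector<std::uint8_t> data;   // stands in for the imgContainer
        std::set<ClientId> interested;    // stands in for imgClientProxies
    };
    std::function<void(const Update&)> mPush;
    std::map<std::string, Entry> mCache;
};
```

Note that cached data deliberately outlives the clients interested in it here; when and how entries are evicted is left open, as it is in the text.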


Performance

Sending image attributes (dimensions, animation properties, etc.) over the IPC pipe should be fine. However, we probably want to implement some sort of shared memory passing for the decoded image data itself. There's some mention on the IPC wiki of having this kind of support built into the transport system, but the specifications of how this is supposed to work are still a bit vague. In general, we'll probably want the images to be decoded into server-side buffers and have those buffers mapped as read-only shmem regions into the address space of the content processes. Each gfxImageFrame of the client image would then have its mImageData pointed to the appropriate shmem buffer. We need to decide whether we want to map each buffer individually, or whether it would be better to map one large region, use a malloc implementation to allocate buffers within it, and send offsets to the clients. This can be considered an optimization; the prototype implementation will ignore it and send the image data through the IPC pipe like everything else. We can then use Tp to get an idea of the performance impact.
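To make the "one large region plus offsets" option more concrete, here is a minimal sketch. It rests on several assumptions: RegionAllocatorSketch, BufferRef, and Resolve are hypothetical names, a plain std::vector stands in for the shared memory region, and a simple bump allocator (which never frees) is swapped in for the malloc implementation mentioned above. The point is only that the server sends an (offset, length) pair and each client resolves it against its own mapping base.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// What the server would send over IPC instead of the pixel data itself.
struct BufferRef {
    std::size_t offset;  // position of the buffer within the shared region
    std::size_t length;  // size of the buffer in bytes
};

// Hypothetical allocator over one large region. A std::vector stands in
// for the shmem mapping; buffers are handed out front to back and are
// never returned (bump allocation).
class RegionAllocatorSketch {
public:
    explicit RegionAllocatorSketch(std::size_t capacity) : mStorage(capacity) {}

    // Reserves "length" bytes at a 16-byte aligned offset. Returns false
    // and leaves *out untouched if the region is exhausted.
    bool Allocate(std::size_t length, BufferRef* out) {
        const std::size_t aligned = (mNext + 15) & ~static_cast<std::size_t>(15);
        if (aligned > mStorage.size()) return false;
        if (length > mStorage.size() - aligned) return false;
        mNext = aligned + length;
        *out = BufferRef{aligned, length};
        return true;
    }

    // Writable view used by the server-side decoder.
    std::uint8_t* ServerView(const BufferRef& ref) {
        return mStorage.data() + ref.offset;
    }

    // Stand-in for the address at which a client mapped the region
    // read-only. In a real setup this differs per process.
    const std::uint8_t* ClientBase() const { return mStorage.data(); }

private:
    std::vector<std::uint8_t> mStorage;
    std::size_t mNext = 0;
};

// Client side: turn a received BufferRef into a read-only pointer, which
// is roughly what mImageData would be pointed at.
inline const std::uint8_t* Resolve(const std::uint8_t* clientBase,
                                   const BufferRef& ref) {
    return clientBase + ref.offset;
}
```

Sending offsets rather than pointers matters because each process may map the region at a different address. The trade-off raised under Issues and Questions still applies: with one shared region, every client can read every buffer in it.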


Issues and Questions

  • Should the image server sit in the chrome process or in a separate process? If a separate process, two further questions arise:
    • We need to decide how, if at all, it interacts with low-memory notifications. Does it clear the cache when the chrome process is low on memory? Does it have its own barometer? Does it ignore the issue entirely?
    • Is there any reason for chrome image requests to go through the image server?
  • If a content process goes down, are we guaranteed by the IPC mechanism to hear about it? If not, what do we do when a client goes down without releasing the proxy hold on an image in the server cache?
  • How does all this fit into the existing security architecture?
  • Do we want to map the shmem regions individually or use one big region?
    • Is it a violation of the eventual security model if one content process can read (not write) the image data of other processes? This would effectively nix the one-big-region strategy.
  • How easy is it going to be to get Tp numbers on Electrolysis code? The tryserver is the usual route here, but we'll probably need some support from RelEng to make this happen.

Contact

Contact bholley (bobbyholley@stanford.edu) with any feedback or input.