Necko: support sending OnDataAvailable() to other threads

At the April 2009 Platform work week we had a meeting on making necko support requests from threads other than the main thread. Here are some design notes on what we came up with.

Goals

Goal: allow the off-main-thread HTML 5 parser to get data from the network without having to proxy the data from the main thread.

Benefits: greater parallelism and reduced interference between the main thread and the parser thread (resulting in better UI responsiveness).

Note that this is only useful for single-process Gecko (non-electrolysis): at least until IPDL supports sending/receiving on non-main threads, we will always need to use the main thread when multiple processes are used.

API target: sending OnDataAvailable() to other threads

This API would allow non-main threads to directly receive the OnDataAvailable() call from the nsIStreamListener interface. All setup of the network channel before this point (and possibly after) would need to be handled on the main thread (i.e. the non-main thread would not call any necko functions itself; it would instead arrange for the main thread to call them).

The OnDataAvailable() delivery appears to be sufficient for the existing use cases for which a multithreaded necko has been requested: most notably, for an off-main-thread HTML parser.

There still remain, however, some questions about the exact API:

API Discussion

1) Decision to divert data to other thread should only be made within OnStartRequest().

There seemed to be general agreement that the decision to offload OnDataAvailable() to a different thread should only be possible within OnStartRequest(). By this point, we will have the final channel for the request (i.e. any redirects, auth requests, etc, will have been resolved). We will also (usually!) have the Content-type of the request, which we will need to determine whether the data should be handled by an off-thread consumer (like the HTML parser) in the first place.
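
The restriction above can be sketched as follows. This is a minimal standalone model, not necko code: the Channel and Listener classes, RetargetDelivery(), and Open() are hypothetical names invented for illustration. It only shows the intended rule that a retarget request succeeds while OnStartRequest() is running and is refused afterwards.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Channel;

// Hypothetical, simplified listener; the real interface is nsIStreamListener.
class Listener {
 public:
  virtual ~Listener() = default;
  virtual void OnStartRequest(Channel& channel) = 0;
};

// Hypothetical channel model that only tracks the retargeting decision.
class Channel {
 public:
  explicit Channel(std::string contentType)
      : contentType_(std::move(contentType)) {}

  const std::string& ContentType() const { return contentType_; }
  bool Retargeted() const { return retargeted_; }

  // Succeeds only while OnStartRequest() is on the stack.
  bool RetargetDelivery() {
    if (!inOnStartRequest_) return false;
    retargeted_ = true;
    return true;
  }

  void Open(Listener& listener) {
    inOnStartRequest_ = true;
    listener.OnStartRequest(*this);
    inOnStartRequest_ = false;
    // OnDataAvailable() delivery would begin after this point.
  }

 private:
  std::string contentType_;
  bool inOnStartRequest_ = false;
  bool retargeted_ = false;
};

// A consumer that asks for off-main-thread delivery only for HTML.
class HtmlListener : public Listener {
 public:
  void OnStartRequest(Channel& channel) override {
    if (channel.ContentType() == "text/html") channel.RetargetDelivery();
  }
};
```

In this model, a call to RetargetDelivery() made after Open() returns yields false, which mirrors the idea that the decision is final once data delivery has begun.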

2) How to notify other-thread listener when request is complete?

Two obvious ways come to mind: 1) we could also divert OnStopRequest() to the listener's thread, or 2) we could send an additional OnDataAvailable() with data length == 0 to signal the end of the request. #1 is more intuitive for the programmer; #2 is more conservative, in that the main thread will still receive OnStopRequest(). Since compliant nsChannels will need to be modified anyway, no one could think of a good reason not to change them to no longer expect OnStopRequest(), and so #1 is the current winner, pending discovery of any problems with not delivering OnStopRequest() to the main thread.
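
To make the two options concrete, here is a small single-threaded simulation. Everything in it is an assumption made for illustration: EventQueue stands in for a thread's event loop, and Deliver(), EndSignal, and the log format are invented names. It only shows which simulated thread sees which callback under each option.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

const char* gCurrentThread = "none";  // name of the simulated thread now running

// Stand-in for one thread's event loop: tasks run in FIFO order when drained.
class EventQueue {
 public:
  explicit EventQueue(const char* name) : name_(name) {}
  void Dispatch(std::function<void()> task) { tasks_.push(std::move(task)); }
  void Drain() {
    gCurrentThread = name_;
    while (!tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop();
      task();
    }
    gCurrentThread = "none";
  }

 private:
  const char* name_;
  std::queue<std::function<void()>> tasks_;
};

// Records each callback together with the simulated thread it ran on.
struct Listener {
  std::vector<std::string> log;
  void OnDataAvailable(const std::string& chunk) {
    log.push_back(std::string(gCurrentThread) + ":data:" +
                  std::to_string(chunk.size()));
  }
  void OnStopRequest() { log.push_back(std::string(gCurrentThread) + ":stop"); }
};

enum class EndSignal { StopOnTargetThread, EmptyDataOnTargetThread };

void Deliver(EventQueue& mainThread, EventQueue& target, Listener& listener,
             const std::vector<std::string>& chunks, EndSignal mode) {
  for (const std::string& chunk : chunks) {
    target.Dispatch([&listener, chunk] { listener.OnDataAvailable(chunk); });
  }
  if (mode == EndSignal::StopOnTargetThread) {
    // Option #1: OnStopRequest() follows the data to the target thread.
    target.Dispatch([&listener] { listener.OnStopRequest(); });
  } else {
    // Option #2: a zero-length OnDataAvailable() marks the end off-thread,
    // while OnStopRequest() still runs on the main thread.
    target.Dispatch([&listener] { listener.OnDataAvailable(std::string()); });
    mainThread.Dispatch([&listener] { listener.OnStopRequest(); });
  }
}
```

With option #1 the listener sees all of its callbacks on one thread. With option #2 it has to treat a zero-length OnDataAvailable() as the end marker on the target thread and still expect OnStopRequest() on the main thread.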

3) Code would need to query both the nsIChannel and any StreamListeners chained (via nsITraceableChannel) between the nsIChannel and the target listener, to make sure they are ALL safe to be switched to a different thread for OnDataAvailable. (We could modify any relevant existing internal listeners to become safe, but since some listeners could come from extensions, we cannot guarantee all listeners will always be safe). Code must be prepared to handle the case where OnDataAvailable cannot be redirected to the desired thread.

  • 3/6/13: jduell: note that this would mean that any addons that use nsITraceableChannel.setNewListener() (noscript and 71 other addons according to mxr) would silently revert any channels they hook to using main-thread delivery. That sucks. And in general any JS listeners are not going to be able to handle off-main-thread delivery. JST suggests we simply fail setNewListener if the new listener is JS (i.e. it QIs to nsIXPCWrappedJS). Another possibility is that we add a different asyncOpenToThread() call that takes a different kind of nsIRequestObserver-like IDL, and/or which winds up triggering different (or no) on-modify-request-(offmainthread) notifications. The trick is that we were planning to allow the main vs non-main decision to be delayed until OnStartRequest, and that's late in the game... Hmm...

4) The API would probably consist of two new XPCOM interfaces, one for thread-retargetable Channels, one for Listeners (nsIThreadSinkableChannel and nsIThreadSinkableStreamListener?). The Channel interface would have a method (RedirectDeliveryToThread?) that would take a thread object argument, and redirect OnDataAvailable (and OnStopRequest) to that thread. The method would first check the chain of Listeners to ensure that they are all ThreadSink listeners, else the method would fail and the redirect would not happen.
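
A rough sketch of how the two interfaces might fit together is shown below. It is plain C++ rather than XPCOM, and every detail is an assumption: TargetThread stands in for a real thread object, dynamic_cast stands in for QueryInterface, and the class and method names simply reuse the tentative ones from the paragraph above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a thread object such as nsIThread.
struct TargetThread {
  std::string name;
};

// Simplified listener; the real interface is nsIStreamListener.
class StreamListener {
 public:
  virtual ~StreamListener() = default;
  virtual void OnDataAvailable(const std::string& chunk) = 0;
};

// Marker interface: implementing it promises that OnDataAvailable() is safe
// to call off the main thread.
class ThreadSinkableStreamListener : public StreamListener {};

class ThreadSinkableChannel {
 public:
  void AppendListener(std::shared_ptr<StreamListener> listener) {
    chain_.push_back(std::move(listener));
  }

  // Fails, leaving delivery on the main thread, unless every listener in the
  // chain has opted in.
  bool RedirectDeliveryToThread(const TargetThread* target) {
    if (target == nullptr) return false;
    for (const std::shared_ptr<StreamListener>& listener : chain_) {
      if (dynamic_cast<ThreadSinkableStreamListener*>(listener.get()) ==
          nullptr) {
        return false;
      }
    }
    target_ = target;
    return true;
  }

  // nullptr means "deliver on the main thread".
  const TargetThread* DeliveryTarget() const { return target_; }

 private:
  std::vector<std::shared_ptr<StreamListener>> chain_;
  const TargetThread* target_ = nullptr;
};

// Example listeners: one that opted in, one that did not (e.g. an extension).
class ParserListener : public ThreadSinkableStreamListener {
 public:
  void OnDataAvailable(const std::string& chunk) override { received += chunk; }
  std::string received;
};

class LegacyListener : public StreamListener {
 public:
  void OnDataAvailable(const std::string&) override {}
};
```

A caller would check the return value and fall back to main-thread delivery when the redirect is refused, for example when a LegacyListener has been inserted into the chain.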

Implementation Questions

  • The main problem with existing Listener classes is apparently (according to Boris) that some of them will have already read data from the HTTP body by the time OnStartRequest() is called. (What's the resulting problem here exactly? Has their OnDataAvailable() already been called on the main thread? Do they have some sort of state that would get messed up by the thread retarget?). Boris suggests that we can make some of these--most notably nsUnknownDecoder--safe by converting them from StreamListeners to nsIContentSniffers, so that they are no longer in the listener chain.
  • If no Content-Type header was present in the HTTP reply, we sniff the content and possibly set up StreamConverters, etc. (the text/plain sniffer was mentioned in particular). We may need to teach StreamConverters to switch threads.
    • Or--since missing Content-Type is quite rare for HTML content, which is all we're currently proposing to parse off-main-thread, we could simply refuse to redirect in this case?
  • TransactionPumps and the HttpProtocolHandler will probably need changes to support thread redirection.
  • We may want to move some logic that currently runs in the main thread into the necko thread: gzip conversion came up in particular.
  • cache: no one seems to know exactly what mods might need to be made to the cache to support multiple simultaneous readers.
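
The fallback suggested above for a missing Content-Type (simply refusing to redirect) could look roughly like this. The ResponseInfo struct, the Delivery enum, and ChooseDelivery() are hypothetical names, and the exact-match comparison assumes the content type has already been normalized by the channel.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of what the channel knows when OnStartRequest() fires.
struct ResponseInfo {
  bool hasContentTypeHeader;  // false when the server sent no Content-Type
  std::string contentType;    // header value, or the sniffed type if absent
};

enum class Delivery { MainThread, ParserThread };

Delivery ChooseDelivery(const ResponseInfo& info) {
  // Sniffed responses may have stream converters in the chain, so keep
  // delivery on the main thread rather than teaching converters to retarget.
  if (!info.hasContentTypeHeader) return Delivery::MainThread;
  // Only HTML is proposed for off-main-thread parsing.
  if (info.contentType != "text/html") return Delivery::MainThread;
  return Delivery::ParserThread;
}
```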


Notes: Obstacles to a fully-threaded necko API

  • A large number of classes would need modification: nsURI, nsChannels, nsIIOService and the various nsIProtocolHandlers.
  • Authorization boxes would need to be popped up from non-main threads.
  • Many of the necko APIs have been implemented by JavaScript extensions, which would suddenly need to be thread-safe.