Quantum/DOM


Goals

The goal of the Quantum DOM project is to eliminate jank caused by background tabs. One of the main ways we intend to do this is to run each tab in its own cooperatively scheduled thread. If a runnable on a background thread takes too long to run, then we will pause its execution and switch to a different thread. To do this correctly, we need to guarantee that web pages never observe a change in behavior. For example, it would be bad if we paused a runnable R1 and then allowed another runnable R2 from the same page to see that R1 had started but not yet finished.

One of the biggest pieces of the project is to "label" runnables with the page that they're operating on. This page describes how to label runnables. Additional information can be found in the brownbag talk here.

Concepts

First, a runnable in this context specifically refers to nsIRunnable. However, note that other common things, such as nsITimer implementations, use nsIRunnable under the hood. Any code that uses nsIRunnable indirectly (possibly without the knowledge of the author of the consuming code) also needs to be modified so that the underlying runnables are labeled.

To more precisely specify when one runnable can observe state from another runnable, we need to define some terminology:

A TabGroup is the set of tabs that are related by window.opener. In a session with four tabs, where T1 opens T2 and T3 opens T4, the TabGroups are {T1, T2} and {T3, T4}. (XXX Is this specifically about sharing the same opener, or about being reachable through an opener chain? That is, if T5 opens T6 which opens T7, are they all in the same TabGroup? And what if a tab is opened using rel=noopener? Does that start a new group?) Once a tab joins a TabGroup, it never leaves it. TabGroups have the property that two tabs from different TabGroups can never observe each other's state. So a runnable from one TabGroup can run while a runnable from a different TabGroup is paused.

A DocGroup is the collection of documents from a given TabGroup that share the same eTLD+1 part of their origins. So if a TabGroup contains tabs with documents {x.a.com, y.a.com, x.b.com, y.b.com}, then these documents will belong to two DocGroups: {x.a.com, y.a.com}, {x.b.com, y.b.com}. DocGroups are essentially a refinement of TabGroups to account for the fact that only same-origin documents can synchronously communicate. (The eTLD+1 part is to account for pages changing their origin by modifying document.domain.) So a runnable from one DocGroup can run while a runnable from a different DocGroup is paused.
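The DocGroup rule can be sketched in a few lines of standalone C++. This is not Gecko code. NaiveETLDPlus1 is a hypothetical helper that assumes the eTLD is always the final label (true for .com, wrong for suffixes such as .co.uk, where real code would consult the Public Suffix List), and tabGroupId is an illustrative integer standing in for a TabGroup.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns the last two dot-separated labels of a host.
// Assumption: the eTLD is a single label, which does not hold in general.
std::string NaiveETLDPlus1(const std::string& host) {
  assert(!host.empty());
  std::size_t last = host.rfind('.');
  if (last == std::string::npos || last == 0) {
    return host;  // single label or leading dot: nothing to strip
  }
  std::size_t prev = host.rfind('.', last - 1);
  return prev == std::string::npos ? host : host.substr(prev + 1);
}

// Illustrative document record: which TabGroup it lives in and its host.
struct Doc {
  int tabGroupId;
  std::string host;
};

// A DocGroup is identified here by (TabGroup, eTLD+1).
using DocGroupKey = std::pair<int, std::string>;

// Partitions documents so that two documents share a DocGroup only if they
// are in the same TabGroup and have the same eTLD+1.
std::map<DocGroupKey, std::set<std::string>>
PartitionIntoDocGroups(const std::vector<Doc>& docs) {
  std::map<DocGroupKey, std::set<std::string>> groups;
  for (const Doc& doc : docs) {
    groups[{doc.tabGroupId, NaiveETLDPlus1(doc.host)}].insert(doc.host);
  }
  return groups;
}
```

Feeding in the four documents from the example above, all in one TabGroup, yields two groups keyed by a.com and b.com. The same host loaded in a different TabGroup lands in a separate DocGroup.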

The SystemGroup is a group for runnables that do not affect web content.

Labeling

A major task for the Quantum DOM project is to "label" runnables. Labeling refers to giving runnables names, and to associating them with a DocGroup, TabGroup, or the SystemGroup before they're dispatched. Associating runnables with a group is ultimately what will enable Quantum DOM to work, but at the very least runnables should be given a name (grouping can be more tricky than naming). If runnables are at least named then we can gather telemetry about which ungrouped runnables are the most common so that we can focus our efforts on grouping the most important runnables. (Ideally we'll name and group runnables at the same time though.)

You can help by taking one of the unowned bugs to label runnables. A list of good labeling bugs is linked off bug 1321812.


Runnable naming

Runnables should be named as follows:

  • If the runnable is a class that you define, its name should be the name of the class. Namespaces can be left off unless they are necessary for disambiguation. For example, if the runnable is simply called Runnable, then a namespace should be included. But if the runnable is mozilla::dom::OffThreadScriptRunnable, then the namespaces can be omitted.
  • If the runnable is a method (created via some variant of NewRunnableMethod), then the class name and method name should form the name. "nsDocument::UpdateVisibilityState" is a good example. Namespaces should be omitted as before unless they're necessary.
  • If the runnable is a function created via NewRunnableFunction, follow the rule as if it were a class.

If the runnable name would be ambiguous and it lives in an anonymous namespace, then make up a namespace that seems right.
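As a rough illustration of these naming rules, the helpers below compute a name from a qualified class name or from a class and method pair. They are hypothetical and exist only for this sketch; in real code you simply write the resulting string literal by hand. The set of ambiguous bare names is an assumption supplied by the caller.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: drops namespaces from a qualified class name unless
// the bare class name is listed as ambiguous (for example "Runnable").
std::string RunnableNameForClass(const std::string& qualifiedName,
                                 const std::set<std::string>& ambiguousNames) {
  assert(!qualifiedName.empty());
  std::size_t sep = qualifiedName.rfind("::");
  if (sep == std::string::npos) {
    return qualifiedName;  // no namespace to strip
  }
  std::string bare = qualifiedName.substr(sep + 2);
  return ambiguousNames.count(bare) ? qualifiedName : bare;
}

// Hypothetical helper: method runnables are named "Class::Method".
std::string RunnableNameForMethod(const std::string& className,
                                  const std::string& methodName) {
  return className + "::" + methodName;
}
```

With "Runnable" marked as ambiguous, mozilla::dom::OffThreadScriptRunnable becomes OffThreadScriptRunnable, while mozilla::Runnable keeps its namespace.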

Runnable grouping

Using a DocGroup is preferred over using a TabGroup since a TabGroup is less specific, but for some runnables the best we can do is give them a TabGroup. Runnables that do not affect web content at all should be labeled using the SystemGroup. However, only use the SystemGroup if you are absolutely sure that the runnable will not run any content JS code or have any effect on content DOM or layout.

Given a document, you can find its DocGroup via nsIDocument::GetDocGroup.

Given an inner or outer window, you can find its TabGroup and DocGroup via nsPIDOMWindow::{TabGroup,GetDocGroup}. These methods should only be called on the main thread.

Dispatching

Based on how it is dispatched, there are multiple ways to label a runnable. The simplest way is to provide the DocGroup or TabGroup when dispatching the runnable.

Both the TabGroup and DocGroup classes have Dispatch methods to dispatch runnables. Runnables dispatched in this way will always run on the main thread. You can call Dispatch from any thread. Both TabGroup and DocGroup are threadsafe refcounted. The Dispatch method requires you to name the runnable and provide a "task category". For now, these are for debugging purposes, but the category may be used for scheduling purposes later on.

As a convenience, nsIDocument and nsIGlobalObject have a Dispatch method that will dispatch to their DocGroup. The nsIDocument::Dispatch method can be used on any thread (although you must be careful because nsIDocument is not threadsafe refcounted). The nsIGlobalObject::Dispatch method is main thread only.

Example

Suppose you have a runnable that is dispatched to the main thread. To convert this code, we simply call Dispatch on the document. Here is a diff showing the changes:

 /* virtual */ void
 nsDocument::PostVisibilityUpdateEvent()
 {
   nsCOMPtr<nsIRunnable> event =
     NewRunnableMethod(this, &nsDocument::UpdateVisibilityState);
-  NS_DispatchToMainThread(event);
+  Dispatch("nsDocument::UpdateVisibilityState", TaskCategory::Other, event.forget());
 }

Event Targets

A lot of existing Gecko code currently uses an nsIEventTarget to decide where to dispatch runnables. The DocGroup and TabGroup classes expose EventTargetFor(category) methods that return an nsIEventTarget. Using this event target is equivalent to calling Dispatch on the DocGroup or TabGroup (except that unfortunately no name is provided for the runnable). {TabGroup,DocGroup}::EventTargetFor can be called on any thread. As a convenience, you can also use nsIDocument::EventTargetFor (also callable from any thread) or nsIGlobalObject::EventTargetFor (main thread only).

Runnable names

One disadvantage of using EventTargetFor is that any runnables dispatched this way are not given a name. However, there are other options for assigning a name to runnables. To have a name, a runnable needs to inherit the nsINamed interface and implement its GetName method. mozilla::Runnable already does this, so you can use the SetName method on an existing mozilla::Runnable to set its name.

Usually, though, you'll be using EventTargetFor in cases where you don't have direct access to the runnable. Typically you'll be giving the event target to a sub-system that will dispatch multiple runnables. Timers, the IPC code, and workers are examples of this. In these cases it's best to modify the sub-system to pass down an appropriate name for the runnable. The IPC code, for example, can set the runnable name to the name of the message being dispatched.

Example

The main-thread XMLHttpRequest class uses several timers that should all be dispatched to the XHR's DocGroup. We can add a SetTimerEventTarget method that dispatches timers to the correct DocGroup:

void
XMLHttpRequestMainThread::SetTimerEventTarget(nsITimer* aTimer)
{
  if (nsCOMPtr<nsIGlobalObject> global = GetOwnerGlobal()) {
    nsCOMPtr<nsIEventTarget> target = global->EventTargetFor(TaskCategory::Other);
    aTimer->SetTarget(target);
  }
}

When using EventTargetFor, please try to set the name of the runnable as well. For timers, the name of the timer runnable is derived from the name of the timer callback (if it implements nsINamed). XMLHttpRequestMainThread is itself the timer callback, so we just need to add a GetName method:

nsresult
XMLHttpRequestMainThread::GetName(nsACString& aName)
{
  aName.AssignLiteral("XMLHttpRequest");
  return NS_OK;
}
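The lookup described above (a timer takes its runnable name from the callback if the callback is also named) can be modeled outside Gecko with a cross-cast. ToyTimerCallback, ToyNamed, ToyXHR, and TimerRunnableName are stand-ins invented for this sketch; they are not the XPCOM interfaces, and the real timer code uses QueryInterface rather than dynamic_cast. The fallback string is the "Anonymous_interface_timer" label mentioned in the FAQ below.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for nsITimerCallback.
struct ToyTimerCallback {
  virtual ~ToyTimerCallback() = default;
  virtual void Notify() = 0;
};

// Toy stand-in for nsINamed.
struct ToyNamed {
  virtual ~ToyNamed() = default;
  virtual std::string GetName() const = 0;
};

// Returns the name a timer runnable would report for this callback: the
// callback's own name if it also implements ToyNamed, else a fallback.
std::string TimerRunnableName(const ToyTimerCallback& callback) {
  if (const auto* named = dynamic_cast<const ToyNamed*>(&callback)) {
    return named->GetName();
  }
  return "Anonymous_interface_timer";
}

// Like XMLHttpRequestMainThread, this class is its own timer callback and
// gains a name by additionally implementing the naming interface.
class ToyXHR final : public ToyTimerCallback, public ToyNamed {
public:
  void Notify() override {}
  std::string GetName() const override { return "XMLHttpRequest"; }
};

// A callback that does not implement ToyNamed stays anonymous.
class ToyUnnamedCallback final : public ToyTimerCallback {
public:
  void Notify() override {}
};
```

The design point is that the timer sub-system does not need to know about XHR at all: implementing the naming interface on the callback is enough for every timer that uses it.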

Script loader example

As a more complex example, consider off-thread script parsing. When parsing is done, a NotifyOffThreadScriptLoadCompletedRunnable runnable is posted to the main thread. We can modify this code to obtain an event target while still on the main thread, store it in the runnable, and then dispatch to that event target from off the main thread:

 class NotifyOffThreadScriptLoadCompletedRunnable : public Runnable
 {
   RefPtr<nsScriptLoadRequest> mRequest;
   RefPtr<nsScriptLoader> mLoader;
+  nsCOMPtr<nsIEventTarget> mEventTarget;
   void *mToken;

 public:
   NotifyOffThreadScriptLoadCompletedRunnable(nsScriptLoadRequest* aRequest,
                                              nsScriptLoader* aLoader)
     : mRequest(aRequest)
     , mLoader(aLoader)
+    , mEventTarget(aLoader->GetEventTarget())
   { MOZ_ASSERT(NS_IsMainThread()); }
  ...
 };

For this to work, we need to add a GetEventTarget method to nsScriptLoader. That's very easy though:

nsIEventTarget*
nsScriptLoader::GetEventTarget() const
{
  return mDocument->EventTargetFor(TaskCategory::Other);
}

Finally, when dispatching, we use the event target. Note that we need to set the runnable name manually:

static void Dispatch(already_AddRefed<NotifyOffThreadScriptLoadCompletedRunnable>&& aSelf) {
  RefPtr<NotifyOffThreadScriptLoadCompletedRunnable> self = aSelf;
  nsCOMPtr<nsIEventTarget> target = self->mEventTarget;
  self->SetName("NotifyOffThreadScriptLoadCompletedRunnable");
  target->Dispatch(self.forget(), DISPATCH_NORMAL);
}

IPC Actors

Many content process runnables are dispatched from IPC. The IPC code allows you to specify an event target for each actor. Any messages received by that actor or its sub-actors will be dispatched to the given event target. You need to specify the event target after the actor is created but before sending the constructor message to the parent process. To do so, call SetEventTargetForActor on the manager of the new actor. All of this must happen on the thread the actor is bound to.

Example

Most networking data comes in via the HttpChannelChild actor. We first create a method that finds the correct event target via the LoadInfo.

void
HttpChannelChild::SetEventTarget()
{
  nsCOMPtr<nsILoadInfo> loadInfo;
  GetLoadInfo(getter_AddRefs(loadInfo));
  if (!loadInfo) {
    return;
  }

  nsCOMPtr<nsIDOMDocument> domDoc;
  loadInfo->GetLoadingDocument(getter_AddRefs(domDoc));
  nsCOMPtr<nsIDocument> doc = do_QueryInterface(domDoc);

  // Dispatcher is the superclass of TabGroup and DocGroup.
  RefPtr<Dispatcher> dispatcher;
  if (doc) {
    dispatcher = doc->GetDocGroup();
  } else {
    // Top-level loads won't have a DocGroup yet. So instead we target them at
    // the TabGroup, which is the best we can do at this time.
    uint64_t outerWindowId;
    if (NS_FAILED(loadInfo->GetOuterWindowID(&outerWindowId))) {
      return;
    }
    RefPtr<nsGlobalWindow> window = nsGlobalWindow::GetOuterWindowWithId(outerWindowId);
    if (!window) {
      return;
    }
    dispatcher = window->TabGroup();
  }

  if (dispatcher) {
    nsCOMPtr<nsIEventTarget> target =
      dispatcher->EventTargetFor(TaskCategory::Network);
    // gNeckoChild holds the NeckoChild singleton actor.
    gNeckoChild->SetEventTargetForActor(this, target);
  }
}

We call this method right before sending the constructor message:

nsresult
HttpChannelChild::ContinueAsyncOpen()
{
  ... // lots of code to setup the channel

  ContentChild* cc = static_cast<ContentChild*>(gNeckoChild->Manager());
  if (cc->IsShuttingDown()) {
    return NS_ERROR_FAILURE;
  }

  SetEventTarget();

  // The socket transport in the chrome process now holds a logical ref to us
  // until OnStopRequest, or we do a redirect, or we hit an IPDL error.
  AddIPDLReference();

  PBrowserOrId browser = cc->GetBrowserOrId(tabChild);
  if (!gNeckoChild->SendPHttpChannelConstructor(this, browser,
                                                IPC::SerializedLoadContext(this),
                                                openArgs)) {
    return NS_ERROR_FAILURE;
  }
  ... // rest of the method
}

Actors constructed by the parent

If the new actor is created on the parent side, it normally inherits its event target from its manager. If the manager has no event target, you must override the GetConstructedEventTarget method on ContentChild (or whatever the top-level protocol is). All constructor messages are passed to this method. It can return an event target for the new actor, or null if no special event target should be used. Be careful, because this method is called on the Gecko I/O thread!
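The inheritance rule (an actor without its own event target uses the one from its manager) amounts to a walk up the manager chain. The sketch below models only that lookup. ToyActor and ResolveEventTarget are invented for illustration, event targets are represented as plain strings, and none of this reflects how the IPDL-generated code is actually structured.

```cpp
#include <cassert>
#include <string>

// Toy actor: knows its manager and, optionally, its own event target.
struct ToyActor {
  const ToyActor* manager = nullptr;  // null for the top-level protocol
  std::string eventTarget;            // empty means "none set on this actor"
};

// Walks from the actor up through its managers and returns the first event
// target found. An empty result means no actor in the chain has one, so
// messages would be delivered unlabeled.
std::string ResolveEventTarget(const ToyActor& actor) {
  for (const ToyActor* current = &actor; current; current = current->manager) {
    if (!current->eventTarget.empty()) {
      return current->eventTarget;
    }
  }
  return std::string();
}
```

In this model, setting a target on a channel actor also covers every sub-actor it manages, while a sibling actor under the same top-level protocol remains unlabeled until it gets a target of its own.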

AbstractThread::MainThread

AbstractThread::MainThread() is a singleton of the AbstractThread wrapper class for the main thread and is widely used with MozPromise and its Promise-chain. If you use AbstractThread::MainThread() in your code and you have access to a document or a window, you can replace it with AbstractMainThreadFor provided by the document or window object (or by TabGroup or DocGroup directly).

Example

  already_AddRefed<Promise>
  WebAuthentication::MakeCredential(JSContext* aCx, const Account& aAccount,
                    const Sequence<ScopedCredentialParameters>& aCryptoParameters,
                    const ArrayBufferViewOrArrayBuffer& aChallenge,
                    const ScopedCredentialOptions& aOptions)
  {
    MOZ_ASSERT(mParent);
    nsCOMPtr<nsIGlobalObject> global = do_QueryInterface(GetParentObject());
    if (!global) {
      return nullptr;
    }
    ... // lots of code to initiate the request

    requestMonitor->CompleteTask();

-   monitorPromise->Then(AbstractThread::MainThread(), __func__,
+   monitorPromise->Then(
+     global->AbstractMainThreadFor(TaskCategory::Other), __func__,
      [promise] (CredentialPtr aInfo) {
        promise->MaybeResolve(aInfo);
      },
      [promise] (nsresult aErrorCode) {
        promise->MaybeReject(aErrorCode);
    });
 
    return promise.forget();
  }

FAQ

Q1: What is the impact of leaving a runnable unlabeled?

A: Say the event queue contains these runnables: [F1, B1, F2, B2, F3]. Assume that the F runnables are for the foreground tab and the B runnables are for a background tab. Then we could run F1, F2, and F3 before we run any of the B runnables. That's good.

However, say the event queue contains: [F1, B1, F2, B2, X, F3]. Assume that X is an unlabeled runnable dispatched using NS_DispatchToMainThread. We can run F1 and F2. However, before we can run F3, we need to run B1, B2, and X. That's bad. Why is that?

Well, X could be an F runnable (so maybe X = F2.5). If we ran F1, F2, F3 before the other runnables, then we would run F's events out of order (since F3 would run before F2.5). Therefore we need to run X before we run F3.

However, X might also be a B runnable (so X = B3). Because of that, we need to run B1 and B2 before we run X. In total, B1, B2, and X must run before F3.

So if we find an unlabeled event X in the queue, we need to run all the events before it before we can run any event after it. However, if we label X as a SystemGroup runnable, then we know it has no effect on the F or the B tabs. So it's fine to run it whenever we want. Therefore we could run F1, F2, F3 before any other runnables.
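The argument can be checked with a small simulation. This is a toy model, not the Quantum DOM scheduler: Label, ToyRunnable, and PrioritizeForeground are names made up for this sketch, and the policy (run foreground runnables as early as ordering constraints allow) is a simplification of the reasoning in this answer.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Label { Foreground, Background, System, Unlabeled };

struct ToyRunnable {
  std::string name;
  Label label;
};

// Returns an execution order that runs foreground runnables as early as
// possible. An unlabeled runnable acts as a barrier: everything queued
// before it must run first, and nothing queued after it may overtake it.
std::vector<std::string>
PrioritizeForeground(const std::vector<ToyRunnable>& queue) {
  std::vector<std::string> order;
  std::vector<std::string> deferred;  // non-foreground work since last barrier

  auto flushDeferred = [&order, &deferred]() {
    order.insert(order.end(), deferred.begin(), deferred.end());
    deferred.clear();
  };

  for (const ToyRunnable& runnable : queue) {
    switch (runnable.label) {
      case Label::Foreground:
        order.push_back(runnable.name);  // may overtake deferred work
        break;
      case Label::Background:
      case Label::System:
        deferred.push_back(runnable.name);  // safe to postpone
        break;
      case Label::Unlabeled:
        flushDeferred();                 // everything before the barrier
        order.push_back(runnable.name);  // then the barrier itself
        break;
    }
  }
  flushDeferred();
  return order;
}
```

For the queue [F1, B1, F2, B2, X, F3] with X unlabeled, this yields F1, F2, B1, B2, X, F3, so F3 waits behind the background work. With X labeled as a SystemGroup runnable it yields F1, F2, F3, B1, B2, X.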

Q2: How should I start labeling?

A: For people who want to participate in these labeling tasks, a list of good labeling bugs is linked off bug 1321812. We will use telemetry to decide which runnables are most important, and updates will be posted to that bug. (Note: the analysis of the telemetry is available in bug 1333984.)

For sub-system experts: besides clearing the bugs in the list above that relate to your sub-system, we need your help to look for the use cases mentioned in Q3. In general, no matter how frequently a runnable appears in the analysis, all main-thread runnables in the content process need to be labeled to prevent the impact explained in Q1, because the rate at which that impact occurs is the sum of the rates of all the remaining unlabeled runnables. Otherwise, the benefit we get from Quantum DOM will be significantly reduced.

Q3: What runnables shall be labeled?

A: All the runnables dispatched to the main thread of the content process shall be labeled.

In addition to NS_Dispatch(Main|Current)Thread, there are several ways to dispatch a runnable to the main thread implicitly:

  • Any call to the Dispatch() method of an nsIEventTarget subclass.
  • The use of AbstractThread::MainThread(). (Most of these will be covered in bug 1314833.)
  • The use of nsITimer. (According to the nsTimer implementation, the timer callback is invoked by a runnable dispatched to the current thread.)
  • Handling of received messages in IPC child actors. (A received message is handled on the main thread with a new runnable if its child actor was created on the main thread.)
  • Subclasses of nsExpirationTracker. (The overridden NotifyExpired() method is triggered implicitly, with a new runnable, by the internal timer in the nsExpirationTracker implementation. See the dependency tree of bug 1345464 for the list of subclasses.)
  • The use of AsyncWait() on any nsIAsync(In|Out)putStream, or the use of NS_NewInputStreamPump(). (These dispatch new runnables named (In|Out)putStreamReadyEvent to the nsIEventTarget specified in the call to AsyncWait(). These runnables need to be labeled if the nsIEventTarget points to the main thread.)
  • The use of NS_ProxyRelease on the main thread, NS_ReleaseOnMainThread, or nsMainThreadPtrHolder.
  • All IPC child actors created on the main thread have to be labeled.

Q4: How do I specify a label for an nsITimer TimerCallback?

A: This can be done by calling nsITimer::SetTarget(nsIEventTarget*) with an event target obtained from EventTargetFor. See the XMLHttpRequest example above.

Q5: Will the TaskCategory be used for prioritizing in the future?

If so, how can developers ensure that ordering won't become a problem when tasks are labeled with different categories at the current stage of development?

A: It may be used for scheduling in the future. Generally, anything that's not marked as Other should come from some outside source (usually IPC, but also from a timer). In that case, we have a lot of freedom about when to schedule it. So there shouldn't be too many concerns about breaking things as long as people follow that rule.

Q6: Do runnables need to be labeled in the main process?

A: No. Only content process runnables need to be labeled. But labeling other runnables won't break anything.

Q7: How do I name an anonymous runnable before it supports (Doc|Tab|System)Group labeling?

A: How to name it depends on the type of anonymous runnable:

  • Subclasses of (Cancelable)Runnable, which are displayed as "anonymous runnable" in the runnable analysis of bug 1333984.
    • To name these runnables, invoke the base class constructor (Cancelable)Runnable(const char* aName) from the subclass constructor.
    • If your runnable instance was created with NewRunnableFunction(aFunction, ...) or NewRunnableMethod(aMethod, ...), call the overloads NewRunnableFunction(aName, ...) or NewRunnableMethod(aName, ...) instead.
  • Subclasses of nsIRunnable that are not subclasses of Runnable, which are displayed as "non-nsINamed runnable" in the runnable analysis of bug 1333984.
    • To name these runnables, either change the base class to Runnable, or inherit from nsINamed and implement its GetName method.
  • Timer callbacks passed to nsITimer::Init() or nsITimer::InitWith(FuncCallback|Callback), which are displayed as "Anonymous_observer_timer", "Anonymous_callback_timer", and "Anonymous_interface_timer" in the runnable analysis of bug 1333984.
    • To name these timer callbacks, either have your nsISupports/nsIObserver class inherit from nsINamed, or call one of the named variants, nsITimer::InitWith(NamedFuncCallback|NameableFuncCallback), for your nsTimerCallbackFunc instance.