Platform/JSDebugv2

From MozillaWiki

This is a DRAFT.

Comments are welcome on dev-tech-js-engine(at)lists.mozilla.org; or, you can send them directly to me at jimb(at)mozilla.com.

Goals

  • SpiderMonkey's JavaScript debugging API must support close collaboration with sibling APIs for debugging other web technologies: not only DOM structure, CSS rules, and networking requests, but also upcoming tools like worker threads and local storage. These technologies were designed to interact with each other, and useful debugging tools must illuminate those interactions.
  • The debugging API must support the creation of robust debugging tools. Mozilla's current debugging tools are plagued with problems stemming from the debugger having unintended effects on the debuggee: because both run in the same process, they share an event loop, chrome, and (to some extent) JavaScript objects. Our design should strengthen the isolation between the two, making debugging more reliable.
  • The debugging API must be able to debug web worker threads. Web workers allow computational tasks to run concurrently with ordinary content JavaScript. A page can have many worker threads, and workers can spawn subworker threads. The debugging API should allow debuggers to enumerate worker threads and monitor their execution and interactions, just as it does for content JavaScript.
  • The debugging API must support remote debugging. Mobile devices often have restricted user interfaces; it should be possible for the debugger's user interface to run on a workstation or laptop, while inspecting a debuggee on the mobile device.
  • The debugging API should be prepared to support separate content processes, if Mozilla implements them.
  • The debugging API must support our evolving JavaScript implementation. With its bytecode interpreter, the TraceMonkey tracing just-in-time compiler, and now JägerMonkey, the method-at-a-time compiler, SpiderMonkey has three distinct ways of executing JavaScript code. We should be able to debug programs that have been compiled to machine code, and not force SpiderMonkey to revert to the slowest implementation technique.

Design Summary

(Even though it hasn't been implemented yet, this description uses the present tense, for clarity and ease of transition to summary documentation.)

Debugger user interfaces communicate with the application being debugged via a remote debugging protocol. The protocol is JSON-based, with clients and servers typically implemented in JavaScript. Each packet from the client is directed at a specific actor on the server, representing a thread, breakpoint, JavaScript object, or the like; each packet from the server comes from a specific actor.

Every server provides a root actor that can provide global information about the application ("I am a web browser"), and enumerate the potential debuggees present in the application—tabs, worker threads, chrome, and so on—each of which is represented by its own actor.
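The actor addressing described above can be sketched in a few lines. This is only an illustration, not the protocol definition: the real packets are JSON, and the Packet struct, its field names ("to", "from", "type"), the packet type strings, and the actor names "root", "tab1", and "tab2" are all assumptions made for the example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for a protocol packet. The real protocol is
// JSON-based; these field names are assumptions made for this sketch.
struct Packet {
    std::string to;                   // actor a client packet is directed at
    std::string from;                 // actor a server packet comes from
    std::string type;                 // kind of request or reply
    std::vector<std::string> actors;  // actor names carried by a reply
};

class Server {
public:
    using Handler = std::function<Packet(const Packet&)>;

    void addActor(const std::string& name, Handler handler) {
        actors_[name] = std::move(handler);
    }

    // Route a client packet to the actor it names. The reply is stamped with
    // that actor's name so the client can tell which actor it came from.
    Packet dispatch(const Packet& request) const {
        Packet reply;
        auto it = actors_.find(request.to);
        if (it == actors_.end())
            reply.type = "noSuchActor";
        else
            reply = it->second(request);
        reply.from = request.to;
        return reply;
    }

private:
    std::map<std::string, Handler> actors_;
};

// Build a server whose root actor enumerates two hypothetical tab actors.
Server makeExampleServer() {
    Server server;
    server.addActor("root", [](const Packet& request) {
        Packet reply;
        if (request.type == "listTabs") {
            reply.type = "tabList";
            reply.actors = {"tab1", "tab2"};
        } else {
            reply.type = "unrecognizedPacketType";
        }
        return reply;
    });
    return server;
}
```

A client would send a packet with to set to "root" and type set to "listTabs", then address later packets to the actor names returned in the reply.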

Actors representing individual JavaScript threads use the jsd2IDebuggerService Web IDL interface to inspect and manipulate the debuggee they represent. jsd2IDebuggerService is an alternative to the existing jsdIDebuggerService, implemented in terms of the js::dbg2 C++ interfaces.

The server interacts with debuggees running in other threads simply by passing entire JSON packets between the client and actor code running on those threads. Thus, all inter-thread communication is handled via the protocol, permitting thread actors and the interfaces they use to be single-threaded and simplifying their implementation. Communication with subprocesses can be handled the same way.

The jsd2IDebuggerService Web IDL interface presents js::dbg2's facilities to JavaScript.

The js::dbg2 interface provides functions to:

  • select code of interest to the developer (everything in a tab; a selected frame within a tab; chrome; and so on),
  • establish breakpoints, watchpoints, and other sorts of monitoring, and be notified when events of interest occur,
  • inspect and manipulate stack frames, scope chains, objects, and other such members of the JavaScript menagerie.

[Figure: Architecture-new.png]

Debugging Protocol

Remote debugging, in which the debugger's user interface can run in a separate process from the debuggee and communicates with the debuggee over a stream connection, addresses many of our goals at once:

  • A debugger running in a separate process from the debuggee is easier to make robust. The debugger's user interface and the debuggee need not share an event loop or a chrome DOM tree.
  • Remote debugging eases mobile development. The debugger could run on a desktop computer, and operate on a debuggee on a mobile device.
  • The remote protocol can handle almost all inter-thread communication. Each actor runs on the same thread as the debuggee it represents, so actor/debuggee interactions are intra-thread, and need not worry about synchronization or shared state. Actors and the application's main server interact only by exchanging protocol packets. The debugger user interface simply needs to be able to talk to more than one agent at a time.

    (Note that some operations are inherently cross-thread: enumerating currently running threads; thread creation notifications; the initial attachment of the debugger to a thread. But once a thread has been attached to, all subsequent communication can be via the remote protocol.)

The js::dbg2 Interfaces and jsd2IDebuggerService

The js::dbg2 interfaces, wrapped for JavaScript as the jsd2IDebuggerService, allow the debugger to select the code to debug, set breakpoints and watchpoints and otherwise express interest in debuggee behaviors, and inspect the debuggee's state.

The js::dbg2 interfaces operate at a higher level than jsd. Whereas jsd works in terms of the original SpiderMonkey bytecode interpreter—JSScript objects, bytecode offsets, JSStackFrame objects, and so on—the js::dbg2 interfaces operate at the JavaScript source code and value level, and avoid referring to specifics of the implementation. This makes it easier to support debugging of TraceMonkey- and JägerMonkey-compiled code: such code need not present its state in terms of an older intermediate representation that it doesn't use.

Like jsd, js::dbg2 provides grip objects that refer to values in the debuggee. The debugger can inspect the object's properties, their attributes, and so on via the grip without accidentally invoking getters or setters, making it easier to write secure and robust debuggers.
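To illustrate why inspecting through a grip is safer than reading properties directly, here is a minimal sketch. Every type in it (Property, DebuggeeObject, PropertyDescription, ObjectGrip) is hypothetical and far simpler than anything js::dbg2 would declare; the point is only that the grip reports what kind of property it found and never calls the getter.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical model of a debuggee property: either a plain data value or an
// accessor whose getter would run debuggee code if it were called.
struct Property {
    bool isAccessor = false;
    int value = 0;                // meaningful only for data properties
    std::function<int()> getter;  // meaningful only for accessor properties
};

Property dataProperty(int value) {
    Property property;
    property.value = value;
    return property;
}

Property accessorProperty(std::function<int()> getter) {
    Property property;
    property.isAccessor = true;
    property.getter = std::move(getter);
    return property;
}

struct DebuggeeObject {
    std::map<std::string, Property> properties;
};

// What the debugger sees: a description of the property, never the result of
// running its getter.
struct PropertyDescription {
    bool exists = false;
    bool isAccessor = false;
    bool hasValue = false;
    int value = 0;
};

class ObjectGrip {
public:
    explicit ObjectGrip(const DebuggeeObject& target) : target_(&target) {}

    PropertyDescription describe(const std::string& name) const {
        PropertyDescription description;
        auto it = target_->properties.find(name);
        if (it == target_->properties.end())
            return description;
        description.exists = true;
        description.isAccessor = it->second.isAccessor;
        if (!it->second.isAccessor) {
            description.hasValue = true;
            description.value = it->second.value;
        }
        return description;
    }

private:
    const DebuggeeObject* target_;
};
```

A debugger that wants an accessor's value would have to ask for the getter to be run explicitly; the grip never does so as a side effect of inspection.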

Also like jsd, js::dbg2 provides grip objects referring to JavaScript stack frames. However, there is no necessary correspondence between js::dbg2 stack frame grips and SpiderMonkey's internal JSStackFrame objects. SpiderMonkey's JITs are free to report the current function activations to js::dbg2 in whatever way is most convenient to them; they are not required to synthesize JSStackFrame objects, which must satisfy complex internal constraints.

Tasks and Estimates

Note that all estimates include time to write unit tests providing full code and branch coverage for new code.

JS_CopyScript JSAPI function (8 days)
Implement, document, and test a function that makes a fresh, deep copy of a JSScript object, suitable for execution in a thread or with a global object other than the original's.

For various reasons, SpiderMonkey is moving towards restricting each JSScript to be used with a single global object (the next task; see details there). Before we can impose this requirement, we must make it possible for embedders to comply with it by providing a function which copies a JSScript object.

Associate JSScripts with specific global objects (5 days)
Add a 'global' field to JSScript, and change JS_ExecuteScript to clone JSScript objects if necessary to match the global object passed.

This is needed to allow us to enumerate all the scripts in use by a particular global object, and it also serves several other current SpiderMonkey goals; see bug 563375#c4. We can accomplish this by having JS_ExecuteScript run a copy of any JSScript that is owned by a global other than the one passed to it.
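The clone-on-mismatch behavior can be sketched as follows. The Global and Script types and the functions copyScript and scriptForExecution are hypothetical stand-ins invented for this example; they are not the JSAPI, and the sketch makes a fresh copy on every mismatched call rather than caching copies.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct Global;

// Hypothetical stand-in for JSScript: bound to exactly one global.
struct Script {
    const Global* global;
    std::vector<unsigned char> bytecode;
};

// Hypothetical stand-in for a global object. It owns the copies made on its
// behalf, so they live as long as the global does.
struct Global {
    std::string name;
    std::vector<std::unique_ptr<Script>> clones;
};

// Deep copy: the clone gets its own bytecode storage and a new owner.
std::unique_ptr<Script> copyScript(const Script& original, const Global& newGlobal) {
    auto clone = std::make_unique<Script>(original);
    clone->global = &newGlobal;
    return clone;
}

// Return the script that should actually run against `global`: the original
// if it already belongs to that global, otherwise a fresh copy.
const Script& scriptForExecution(Global& global, const Script& script) {
    if (script.global == &global)
        return script;
    global.clones.push_back(copyScript(script, global));
    return *global.clones.back();
}
```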

Change JSRuntime::scriptFilenameTable to use js::HashMap (3 days)
Since subsequent tasks will involve changing the data structures used to store script source URLs, we should grant ourselves the benefits of strict typing provided by the new js::HashMap template.

Create name-to-script mapping (8 days)
Adapt the existing hash table of script names to also function as a map from script names to scripts. This entails adding links to JSScript objects, arranging for entries in scriptFilenameTable to head chains of scripts, and having garbage collection properly remove scripts from their names' lists.

Script URL enumeration (5 days)
Define a function to enumerate the URLs of all scripts associated with a given global object.

Debugger user interfaces need to be able to present the user with a list of the scripts in use by a particular page or origin, so that the user can browse their source code, set breakpoints, and so on. These lists should include only those scripts in use by the page or origin being debugged.
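A rough sketch of how the name-to-script mapping and URL enumeration tasks fit together is shown below. It uses std::unordered_map rather than js::HashMap, and the ScriptFilenameTable class and its member names are assumptions made for illustration. It also drops an entry as soon as its chain is empty, which ignores entries that might be kept live for other reasons.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

struct Global {
    std::string name;
};

// Hypothetical stand-in for JSScript: a source URL plus the owning global.
struct Script {
    std::string url;
    const Global* global;
};

class ScriptFilenameTable {
public:
    // Each URL entry heads a chain of the scripts compiled from that URL.
    void addScript(const Script* script) {
        table_[script->url].push_back(script);
    }

    // Called when a script is collected: unlink it from its URL's chain and
    // drop the entry once no script refers to it.
    void removeScript(const Script* script) {
        auto entry = table_.find(script->url);
        if (entry == table_.end())
            return;
        auto& chain = entry->second;
        chain.erase(std::remove(chain.begin(), chain.end(), script), chain.end());
        if (chain.empty())
            table_.erase(entry);
    }

    // URLs of all scripts associated with `global`, sorted for stable output.
    std::vector<std::string> urlsFor(const Global& global) const {
        std::vector<std::string> urls;
        for (const auto& entry : table_) {
            for (const Script* script : entry.second) {
                if (script->global == &global) {
                    urls.push_back(entry.first);
                    break;
                }
            }
        }
        std::sort(urls.begin(), urls.end());
        return urls;
    }

    std::size_t entryCount() const { return table_.size(); }

private:
    std::unordered_map<std::string, std::vector<const Script*>> table_;
};
```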

Draft C++ js::dbg2 breakpoint API (3 days)
Write a C++ API declaring:

  • A class representing a position at which a breakpoint can be set, expressed in terms of textual positions (URL, line, and column) or in terms of function names (a global object, a series of containing function names, and a final function name), or in terms of specific function objects.

    The API should permit the "grammar" of breakpoint locations to be extended in the future (to describe, say, function-valued properties in object literals).

    These should be designed such that, in normal, efficient use, no explicit storage management (new/delete) is required.

    URLs in breakpoint locations should be represented as entries in the runtime's scriptFilenameTable. This means that, given a breakpoint location, we have immediate access to the list of JSScripts derived from the source code to which the location refers.

    If possible, the URL/line/column variant of this type should be suitable for use by the js::dbg2 stack frame type to represent source positions; we should not need two distinct types that represent locations in source code.

  • A class representing a breakpoint, js::dbg2::Breakpoint, which can be inserted in or removed from a debugging sphere. This API will not be concerned with breakpoint conditions, ignore counts, and such; those behaviors must be implemented by the client of the js::dbg2 interface.
  • A stub js::dbg2::Sphere class, sufficient for bootstrapping, constructed from a given global object.
  • Debugging sphere member functions for enumerating the currently inserted breakpoints.
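One possible shape for these declarations is sketched below, under several assumptions: the class names Location, Breakpoint, and Sphere are placed in the global namespace rather than js::dbg2, URLs are plain strings instead of scriptFilenameTable entries, and the function-object variant of a location is omitted. Locations are ordinary value types, so callers need no new/delete.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Global {};  // hypothetical stand-in for a JavaScript global object

// A breakpoint location as a plain value type. A new Kind could be added
// later to extend the "grammar" of locations.
struct Location {
    enum class Kind { SourcePosition, FunctionName };

    Kind kind = Kind::SourcePosition;
    std::string url;                        // SourcePosition only
    unsigned line = 0;                      // SourcePosition only
    unsigned column = 0;                    // SourcePosition only
    const Global* global = nullptr;         // FunctionName only
    std::vector<std::string> functionPath;  // containing functions, then the target

    static Location atPosition(std::string url, unsigned line, unsigned column) {
        Location location;
        location.kind = Kind::SourcePosition;
        location.url = std::move(url);
        location.line = line;
        location.column = column;
        return location;
    }

    static Location atFunction(const Global& global, std::vector<std::string> path) {
        Location location;
        location.kind = Kind::FunctionName;
        location.global = &global;
        location.functionPath = std::move(path);
        return location;
    }
};

// No conditions or ignore counts: those belong to the client of the API.
class Breakpoint {
public:
    explicit Breakpoint(Location location) : location_(std::move(location)) {}
    const Location& location() const { return location_; }

private:
    Location location_;
};

// Stub debugging sphere: it only tracks which breakpoints are inserted.
class Sphere {
public:
    explicit Sphere(const Global& global) : global_(&global) {}
    const Global& global() const { return *global_; }

    void insert(const Breakpoint& breakpoint) {
        if (std::find(inserted_.begin(), inserted_.end(), &breakpoint) == inserted_.end())
            inserted_.push_back(&breakpoint);
    }

    void remove(const Breakpoint& breakpoint) {
        inserted_.erase(std::remove(inserted_.begin(), inserted_.end(), &breakpoint),
                        inserted_.end());
    }

    // Enumerate the currently inserted breakpoints.
    const std::vector<const Breakpoint*>& breakpoints() const { return inserted_; }

private:
    const Global* global_;
    std::vector<const Breakpoint*> inserted_;
};
```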

Implement Breakpoint Location Classes (5 days)
Implement the classes described above describing breakpoint locations. There may be some tricky work here, as we want to have entries in the scriptFilenameTable that are live because they are referred to by breakpoint location objects, not scripts, and have entries cleaned up as appropriate.

Implement js::dbg2::Breakpoint (15 days)
Implement the js::dbg2::Breakpoint class, including insertion and removal. This entails:

  • turning the various sorts of breakpoint locations into (JSScript, offset) pairs
  • searching JSScript lists to insert and remove traps
  • managing multiple breakpoints set at the same bytecode
  • inserting traps for existing breakpoints into newly loaded code (pending breakpoints)
  • coping with scripts being garbage collected
  • interlocking with JägerMonkey to ensure that breakpoints are never set in functions that have JM frames on the stack
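Two of the items above, multiple breakpoints at the same bytecode and scripts being garbage collected, can be handled by counting breakpoints per trap site. The sketch below shows one way to do that. The TrapTable class and its member names are hypothetical, and actually patching a trap into bytecode is reduced to a boolean return value.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

struct Script {
    int id;  // hypothetical stand-in for JSScript
};

// Counts how many breakpoints share each (script, offset) site, so the trap
// is installed once and removed only when the last breakpoint goes away.
class TrapTable {
public:
    // Returns true if this call is the one that must install the trap.
    bool addBreakpoint(const Script* script, std::size_t offset) {
        return ++counts_[script][offset] == 1;
    }

    // Returns true if this call is the one that must remove the trap.
    bool removeBreakpoint(const Script* script, std::size_t offset) {
        auto scriptEntry = counts_.find(script);
        if (scriptEntry == counts_.end())
            return false;
        auto& offsets = scriptEntry->second;
        auto site = offsets.find(offset);
        if (site == offsets.end())
            return false;
        if (--site->second > 0)
            return false;
        offsets.erase(site);
        if (offsets.empty())
            counts_.erase(scriptEntry);
        return true;
    }

    // Called when a script is garbage collected: forget all of its sites.
    void forgetScript(const Script* script) { counts_.erase(script); }

    std::size_t trapCount() const {
        std::size_t total = 0;
        for (const auto& entry : counts_)
            total += entry.second.size();
        return total;
    }

private:
    std::map<const Script*, std::map<std::size_t, unsigned>> counts_;
};
```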

Use function start positions when re-setting breakpoints (8 days)
When re-loading a previously loaded script, we should use our knowledge of function boundaries to improve our accuracy as we re-set breakpoints in the new script. If all changes to a script lie outside a given function's definition, then treating the breakpoint as if it were set relative to the function's start, rather than at an absolute line and column, will allow us to find a better location for it in the new script.
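As a worked example of the idea, suppose function f started at line 10 in the old script and a breakpoint sat at line 12. If an edit above f pushes its start to line 15, the breakpoint should move to line 17. A minimal sketch of that arithmetic follows. It handles lines only, and the FunctionStarts map and relocateLine function are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical: function name -> starting line, for one version of a script.
using FunctionStarts = std::map<std::string, unsigned>;

// Re-set a breakpoint in a reloaded script. If the enclosing function is
// known in both versions, keep the breakpoint at the same distance from the
// function's start; otherwise fall back to the absolute line.
unsigned relocateLine(const std::string& function, unsigned oldLine,
                      const FunctionStarts& oldStarts,
                      const FunctionStarts& newStarts) {
    auto oldStart = oldStarts.find(function);
    auto newStart = newStarts.find(function);
    if (oldStart == oldStarts.end() || newStart == newStarts.end() ||
        oldLine < oldStart->second) {
        return oldLine;
    }
    return newStart->second + (oldLine - oldStart->second);
}
```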
Expand source notes to carry column information (8 days)
Extend the source notes attached to JSScripts to carry both line and column information. This allows debugging of poorly-formatted code such as that produced by script compressors or obfuscators. The bytecode compiler already tracks column numbers; they're simply not recorded in the source notes.

Note that this need not imply any increase in the size of notes for normally formatted source code: the granularity of the features distinguished by the source annotations (that is, statements) need not change. Only if there were multiple statements or functions on the same line would column numbers be needed to distinguish them.
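The size argument can be made concrete with a toy encoding. This is not SpiderMonkey's source note format: the Position and SourceNote structs and the encodeNotes function are assumptions made for illustration. The encoder records a column only when a statement shares a line with the statement before it, so normally formatted code produces no column notes at all.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Position {
    unsigned line;
    unsigned column;
};

// Hypothetical note: a line delta, plus a column only when it is needed.
struct SourceNote {
    unsigned lineDelta;
    bool hasColumn;
    unsigned column;
};

// Encode statement positions, assumed sorted by line and then column. A
// column is recorded only for a statement that starts on the same line as
// the previous statement.
std::vector<SourceNote> encodeNotes(const std::vector<Position>& statements) {
    std::vector<SourceNote> notes;
    unsigned previousLine = 0;
    bool first = true;
    for (const Position& position : statements) {
        unsigned delta = position.line - previousLine;
        bool sharesLine = !first && delta == 0;
        notes.push_back(SourceNote{delta, sharesLine, sharesLine ? position.column : 0u});
        previousLine = position.line;
        first = false;
    }
    return notes;
}

std::size_t countColumnNotes(const std::vector<SourceNote>& notes) {
    return static_cast<std::size_t>(
        std::count_if(notes.begin(), notes.end(),
                      [](const SourceNote& note) { return note.hasColumn; }));
}
```

Three statements on three separate lines yield no column notes, while three statements packed onto one line, as a script compressor would produce, yield two.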

Links