Sfink/Draft - GC Pointer Handling: Difference between revisions

 
(One intermediate revision by the same user not shown)
Line 25: Line 25:
To deal with this, we have provided a new API for temporarily marking local variables as roots. The above code becomes:
To deal with this, we have provided a new API for temporarily marking local variables as roots. The above code becomes:


   JSRootedObject obj1(cx, JS_NewObject(...));
   Rooted<JSObject*> obj1(cx, JS_NewObject(...));
   JSRootedObject obj2(cx, JS_NewObject(...));
   Rooted<JSObject*> obj2(cx, JS_NewObject(...));
   JS_DefineProperty(obj1, ...);
   JS_DefineProperty(obj1, ...);


Note that the obj1 parameter of JS_DefineProperty has been modified to be of type 'JSHandleObject'.
Note that the obj1 parameter of JS_DefineProperty has been modified to be of type 'Handle<JSObject*>'.


A JSRootedObject is treated as a GC root until the function returns. A JSHandleObject is a reference to a rooted object. (JSRootedObject is like a rooted 'const JSObject &'.) Anything stack-allocated (local variables, parameters) needs to be rooted if it is live across a call that could trigger the garbage collector. You should generally assume that any JSAPI call could invoke the garbage collector unless labeled otherwise -- you might at first expect that only JSAPI calls that create things could GC, but in fact many operations will create gcthings internally. For example, anything that could throw a JS exception can GC, because the exception object will be allocated. (And just about anything that accepts a gcthing pointer as a parameter can throw a JS exception -- eg, if the object is a proxy for something in a different compartment, the compartment crossing may be disallowed, which would trigger an exception to be thrown.)
A Rooted<JSObject*> is treated as a GC root until the function returns. A Handle<JSObject*> is a reference to a rooted object. (Handle<JSObject*> is like a const ref to a rooted JSObject*.) Anything stack-allocated (local variables, parameters) needs to be rooted if it is live across a call that could trigger the garbage collector. You should generally assume that any JSAPI call could invoke the garbage collector unless labeled otherwise -- you might at first expect that only JSAPI calls that create things could GC, but in fact many operations will create gcthings internally. For example, anything that could throw a JS exception can GC, because the exception object will be allocated. (And just about anything that accepts a gcthing pointer as a parameter can throw a JS exception -- eg, if the object is a proxy for something in a different compartment, the compartment crossing may be disallowed, which would trigger an exception to be thrown.)


The rule to follow is "any gcthing pointer must be rooted if it is ever live across a call that might GC." Combined with the fact that just about any JSAPI call can GC, that is pretty close to "any gcthing pointer must be rooted if it is live across a JSAPI call." To root something, store it in a JSRooted(something) type and never use the bare pointer value for anything. If you need to declare a function that accepts gcthing pointers, declare them as JSHandle(something). For out or in/out parameters, use JSMutableHandle(something). The resulting values may be used as if they were the original bare pointers for almost all purposes.
The rule to follow is "any gcthing pointer must be rooted if it is ever live across a call that might GC." Combined with the fact that just about any JSAPI call can GC, that is pretty close to "any gcthing pointer must be rooted if it is live across a JSAPI call." To root something, store it in a Rooted<T> type and never use the bare pointer value for anything. If you need to declare a function that accepts gcthing pointers, declare them as Handle<T>. For out or in/out parameters, use MutableHandle<T>. The resulting values may be used as if they were the original bare pointers for almost all purposes.


The available types are:
The available types are:


   JSRootedObject / JSHandleObject / JSMutableHandleObject
   JS::Rooted<JSObject*> / JS::Handle<JSObject*> / JS::MutableHandle<JSObject*>
   JSRootedValue  / JSHandleValue  / JSMutableHandleValue
   JS::Rooted<JS::Value> / JS::Handle<JS::Value> / JS::MutableHandle<JS::Value>
   JSRootedString / JSHandleString
   JS::Rooted<JSString*> / JS::Handle<JSString*>
   JSRootedId    / JSHandleId
   JS::Rooted<jsid> / JS::Handle<jsid>


The canonical example is to convert something like:
The canonical example is to convert something like:
Line 54: Line 54:
to
to


   JSObject *foo(JSContext *cx, JSHandleObject obj, JSMutableHandleObject inout)
   JSObject *foo(JSContext *cx, Handle<JSObject*> obj, MutableHandle<JSObject*> inout)
   {
   {
     JSRootedObject obj1(cx, JS_Foo(obj));
     Rooted<JSObject*> obj1(cx, JS_Foo(obj));
     JSRootedObject obj2(cx, JS_Bar(obj));
     Rooted<JSObject*> obj2(cx, JS_Bar(obj));
     inout.set(obj1);
     inout.set(obj1);
     return obj2;
     return obj2;
Line 64: Line 64:
== Internals ==
== Internals ==


JSRootedObject is a typedef of Rooted<JSObject *>. Rooted<T> is an RAII class template that tells the given JSContext that the contained pointer is a root, and should be updated if a GC occurs that moves the object pointed to. The destructor will unregister that root. The registration is implemented as a simple stack (LIFO queue), which means that registrations and unregistrations must be properly nested. Given that this is an RAII class, this is almost automatic. The compiler will prevent you from using them for temporaries as in:
RootedObject is a typedef of Rooted<JSObject*>. Rooted<T> is an RAII class template that tells the given JSContext that the contained pointer is a root, and should be updated if a GC occurs that moves the object pointed to. The destructor will unregister that root. The registration is implemented as a simple stack (LIFO queue), which means that registrations and unregistrations must be properly nested. Given that this is an RAII class, this is almost automatic. The compiler will prevent you from using them for temporaries as in:


   JSRootedObject result(cx, JS_Foo(cx, JSRootedObject(cx, obj)));
   Rooted<JSObject*> result(cx, JS_Foo(cx, Rooted<JSObject*>(cx, obj)));


which would not work because the temporaries would be destroyed in the wrong order. You *can* still get into trouble by heap-allocating:
which would not work because the temporaries would be destroyed in the wrong order. You *can* still get into trouble by heap-allocating:


   JSRootedObject *obj1pointer;
   Rooted<JSObject*> *obj1pointer;
   {
   {
     JSRootedObject obj2(cx, JS_Foo());
     Rooted<JSObject*> obj2(cx, JS_Foo());
     obj1pointer = new JSRootedObject(cx, obj1);
     obj1pointer = new Rooted<JSObject*>(cx, obj1);
   }
   }
   delete obj1pointer;
   delete obj1pointer;


Heap-allocating Rooted<T> values is rarely desirable, but very occasionally necessary. If you do, you are required to maintain the LIFO ordering manually.
A separate static analysis will detect these errors, but you'll probably have to wait for that failure to be reported on tbpl to notice. Heap-allocating Rooted<T> values is rarely desirable, but very occasionally necessary. If you do, you are required to maintain the LIFO ordering manually.
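
To make the LIFO requirement concrete, here is a toy model of the registration stack. It is only a sketch: ToyContext, ToyObject, ToyRooted and nestedRootsDepth are made-up names for illustration, not the SpiderMonkey classes, and the real Rooted<T> bookkeeping differs in detail. The point is just that a scoped (stack-allocated) root always unregisters in reverse order of registration, which is exactly what a heap-allocated root can violate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real JSAPI types.
struct ToyObject { int payload = 0; };

struct ToyContext {
    // Addresses of the rooted pointer slots, in registration order.
    std::vector<ToyObject**> roots;
};

class ToyRooted {
  public:
    ToyRooted(ToyContext& cx, ToyObject* initial) : cx_(cx), ptr_(initial) {
        cx_.roots.push_back(&ptr_);   // register this slot as a root
    }
    ~ToyRooted() {
        // LIFO check: the slot being removed must be the newest one.
        assert(!cx_.roots.empty() && cx_.roots.back() == &ptr_);
        cx_.roots.pop_back();         // unregister
    }
    ToyRooted(const ToyRooted&) = delete;
    ToyRooted& operator=(const ToyRooted&) = delete;

    ToyObject* get() const { return ptr_; }

  private:
    ToyContext& cx_;
    ToyObject* ptr_;
};

// Returns the deepest root-stack size seen. Scoping guarantees that r2 is
// unregistered before r1, so the stack is empty again when this returns.
std::size_t nestedRootsDepth(ToyContext& cx) {
    ToyObject a, b;
    ToyRooted r1(cx, &a);
    std::size_t depth = cx.roots.size();
    {
        ToyRooted r2(cx, &b);
        depth = cx.roots.size();
    }   // r2 leaves scope first, as the stack requires
    return depth;
}
```

In this model, the heap-allocating snippet above would trip the assertion in the destructor: the scoped root is destroyed while the newer heap-allocated root is still on top of the stack.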


JSHandleObject is a typedef of Handle<JSObject*>. Handle<T> is another class template that serves as an additional layer of indirection on top of a (rooted) pointer, so that if the pointer is updated, users of the Handle will automatically use the new value. In other words, JSHandleObject is really just syntactic sugar for JSObject**. If a Handle<T> escapes an enclosing Rooted<T>'s scope, Bad Things will happen -- the pointed-to pointer may be on a reused address on the stack, and so may be overwritten at any time, or a GC may occur that moves the pointer and does not update it. As long as Handle<T> is not used as a return value type, it is difficult to cause this to happen.
HandleObject is a typedef of Handle<JSObject*>. Handle<T> is another class template that serves as an additional layer of indirection on top of a (rooted) pointer, so that if the pointer is updated, users of the Handle will automatically use the new value. In other words, HandleObject is really just a JSObject** with some compilation-time checking added. If a Handle<T> escapes an enclosing Rooted<T>'s scope, Bad Things will happen -- the pointed-to pointer may be on a reused address on the stack, and so may be overwritten at any time, or a GC may occur that moves the pointer and does not update it. As long as Handle<T> is not used as a return value type, it is difficult to cause this to happen. (Ok, you could do it with a pointer or reference outparam, too.)
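
The extra indirection can be illustrated with a small toy model. Everything here (ToyObject, ToyHandle, toyMoveObject, readAcrossMove) is hypothetical and only mimics the idea of a handle as a pointer to a rooted pointer; the relocation function is a stand-in for a moving GC, not how the real collector works.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not the real JSAPI types.
struct ToyObject { int payload; };

// A read-only view of a rooted pointer slot: a pointer to a pointer.
class ToyHandle {
  public:
    explicit ToyHandle(ToyObject* const* slot) : slot_(slot) {}
    ToyObject* get() const { return *slot_; }
    ToyObject* operator->() const { return *slot_; }

  private:
    ToyObject* const* slot_;
};

// Stand-in for a moving GC: copy the object to its new home, poison the
// old copy, and update the rooted slot to point at the new location.
void toyMoveObject(ToyObject** rootedSlot, ToyObject* newHome) {
    *newHome = **rootedSlot;
    (*rootedSlot)->payload = -1;
    *rootedSlot = newHome;
}

int readAcrossMove() {
    ToyObject oldHome{42};
    ToyObject newHome{0};
    ToyObject* slot = &oldHome;   // plays the role of the rooted pointer
    ToyObject* bare = slot;       // a bare copy taken before the "GC"
    ToyHandle handle(&slot);

    toyMoveObject(&slot, &newHome);

    // The bare copy still points at the poisoned old location...
    assert(bare->payload == -1);
    // ...while the handle reads through the updated slot.
    return handle->payload;
}
```

The bare copy is the hazard the rooting rule exists to prevent: nothing told the "GC" about it, so nothing updated it.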


A JSMutableHandleObject (really MutableHandle<JSObject*>) is also a pointer to a pointer to a gcthing, but it allows updating the gcthing pointer. It is essentially a pointer to a Rooted<T>, and will implicitly convert from that. Usage is:
A MutableHandle<JSObject*> is also a pointer to a pointer to a gcthing, but it allows updating the gcthing pointer. It is essentially a pointer to a Rooted<T>, and will implicitly convert from that. Usage is:


   JSRootedObject(cx, obj);
   Rooted<JSObject*> obj(cx);
   JS_UpdateParameter(&obj);
   JS_UpdateParameter(&obj);


where JS_UpdateParameter looks like:
where JS_UpdateParameter looks like:


   void JS_UpdateParameter(JSMutableHandle obj)
   void JS_UpdateParameter(MutableHandle<JSObject*> obj)
   {
   {
     obj.set(JS_Foo(obj));
     obj.set(JS_Foo(obj));
Line 99: Line 99:
   JSObject *objects[20];
   JSObject *objects[20];


will not be traced to keep the objects live, nor will those pointers get updated on a moving GC. For local (stack-allocated) vectors, the easiest fix is to use AutoValueVector, AutoIdVector, or AutoVectorRooter<JSObject*>, which register a GC callback to trace through the vector during a GC. For heap-allocated vectors, you should convert the vectors contents to patterns of chicken entrails splattered onto a plate glass window, and infer which gcthing was intended by meditating while staring at the window while it is illuminated only by a green strobe light.
will not be traced to keep the objects live, nor will those pointers get updated on a moving GC. For local (stack-allocated) vectors, the easiest fix is to use AutoValueVector, AutoIdVector, or AutoVectorRooter<JSObject*>, which register a GC callback to trace through the vector during a GC. For heap-allocated vectors, you should convert the vectors' contents to patterns of chicken entrails splattered onto a plate glass window, and infer which gcthing was intended by meditating while staring at the window while it is illuminated only by a green strobe light.
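
As a rough sketch of what "register a GC callback to trace through the vector" means, here is a toy vector rooter. The names (ToyContext, ToyVectorRooter, toyMovingGC, sumAfterGC) are invented for this example and the real AutoVectorRooter machinery is different; the sketch only shows a stack-allocated vector registering itself so that a collector can visit, and rewrite, every element.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real JSAPI types.
struct ToyObject { int payload; };

class ToyVectorRooter;

struct ToyContext {
    std::vector<ToyVectorRooter*> vectorRooters;   // registered in LIFO order
};

class ToyVectorRooter {
  public:
    explicit ToyVectorRooter(ToyContext& cx) : cx_(cx) {
        cx_.vectorRooters.push_back(this);
    }
    ~ToyVectorRooter() {
        assert(cx_.vectorRooters.back() == this);
        cx_.vectorRooters.pop_back();
    }
    ToyVectorRooter(const ToyVectorRooter&) = delete;
    ToyVectorRooter& operator=(const ToyVectorRooter&) = delete;

    void append(ToyObject* obj) { elems_.push_back(obj); }
    ToyObject* operator[](std::size_t i) const { return elems_[i]; }
    std::size_t length() const { return elems_.size(); }

    // Called by the toy GC; the visitor may overwrite each slot.
    template <typename Visitor>
    void trace(Visitor visit) {
        for (ToyObject*& slot : elems_) visit(&slot);
    }

  private:
    ToyContext& cx_;
    std::vector<ToyObject*> elems_;
};

// Toy moving GC: copy every element of every registered vector into
// toSpace and update the slot. std::deque keeps element addresses stable
// across push_back, so the updated pointers stay valid.
std::size_t toyMovingGC(ToyContext& cx, std::deque<ToyObject>& toSpace) {
    std::size_t moved = 0;
    for (ToyVectorRooter* rooter : cx.vectorRooters) {
        rooter->trace([&](ToyObject** slot) {
            toSpace.push_back(**slot);
            (*slot)->payload = -1;    // poison the old copy
            *slot = &toSpace.back();
            ++moved;
        });
    }
    return moved;
}

int sumAfterGC() {
    ToyContext cx;
    std::deque<ToyObject> fromSpace = {{1}, {2}, {3}};
    std::deque<ToyObject> toSpace;
    ToyVectorRooter objects(cx);
    for (ToyObject& o : fromSpace) objects.append(&o);

    toyMovingGC(cx, toSpace);

    int sum = 0;
    for (std::size_t i = 0; i < objects.length(); i++) sum += objects[i]->payload;
    return sum;
}
```

A plain `ToyObject *objects[20]` would have been invisible to toyMovingGC, and every element would have been left pointing at a poisoned copy.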


== Instance Methods ==
== Instance Methods ==
Line 107: Line 107:
   class JSObject {
   class JSObject {
     void foo() {
     void foo() {
       RootedObject tmp(cx, JS_NewObject(...));
       Rooted<JSObject*> tmp(cx, JS_NewObject(...));
       ...do something with data members...
       ...do something with data members...
     }
     }
Line 120: Line 120:
Using a rooted value adds a layer of indirection. Creating a rooted value does a little bit of pointer chasing and appends to a stack stored in JSContext. Handles are pointers to pointers, so passing them around is no added cost. Rooted<T> is a set of 3 pointers, but since they don't get passed around, that doesn't really matter.
Using a rooted value adds a layer of indirection. Creating a rooted value does a little bit of pointer chasing and appends to a stack stored in JSContext. Handles are pointers to pointers, so passing them around is no added cost. Rooted<T> is a set of 3 pointers, but since they don't get passed around, that doesn't really matter.


In hot code, the overhead of rooting can be avoided if you are careful. Remember that the only thing that needs to be avoided is holding a gcthing pointer live across a call that might GC. So if a function takes a gcthing pointer but never GCs, it is valid for it to declare a bare pointer as a parameter. Even if a function might GC, it is ok to pass in a bare pointer variable as long as that variable is dead after the call. The function will have to root the value itself, if needed, but it can declare that capability by accepting a bare pointer (or JSRawObject) instead of a Handle.
In hot code, the overhead of rooting can be avoided if you are careful. Remember that the only thing that needs to be avoided is holding a gcthing pointer live across a call that might GC. So if a function takes a gcthing pointer but never GCs, it is valid for it to declare a bare pointer as a parameter. Even if a function might GC, it is ok to pass in a bare pointer variable as long as that variable is dead after the call. The function will have to root the value itself, if needed, but it can declare that capability by accepting a bare pointer instead of a Handle.


== Persistent (heap-allocated) Pointers ==
== Persistent (heap-allocated) Pointers ==
Line 126: Line 126:
The rules are different for storing pointers on the heap (i.e., into structures that have been allocated with new or malloc). Once again, you must arrange for any pointers to be (1) discovered during tracing and (2) updated by the moving GC. For the most part, the same mechanism is used for both -- when tracing, the GC is handed an indirect pointer through which the gcthing pointer is updated if needed. Many exceptional cases may arise, however:
The rules are different for storing pointers on the heap (i.e., into structures that have been allocated with new or malloc). Once again, you must arrange for any pointers to be (1) discovered during tracing and (2) updated by the moving GC. For the most part, the same mechanism is used for both -- when tracing, the GC is handed an indirect pointer through which the gcthing pointer is updated if needed. Many exceptional cases may arise, however:


* The gcthing pointer value may be used to construct a hash code
* The gcthing pointer value may be used to construct a hash code
* The gcthing pointer value may be tagged
* The gcthing pointer value may be tagged
* The structure containing the pointer is sometimes allocated on the stack, sometimes on the heap
* The structure containing the pointer is sometimes allocated on the stack, sometimes on the heap
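
The hash code case is probably the easiest to picture. In the sketch below, a table is keyed on an object's address; when the object moves, updating the pointer is not enough, because the entry is still filed under the hash of the old address and has to be rekeyed. ToyObject, ToyTable, toyRekey and lookupAfterMove are made-up names, and std::unordered_map merely stands in for whatever hash table the engine actually uses.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-ins for illustration only; not the real JSAPI types.
struct ToyObject { int payload; };

// A table keyed on object address, so the hash depends on where the
// object currently lives.
using ToyTable = std::unordered_map<ToyObject*, int>;

// Move the entry for oldAddr so that it is filed under newAddr.
// Returns false if there was no entry for oldAddr.
bool toyRekey(ToyTable& table, ToyObject* oldAddr, ToyObject* newAddr) {
    auto it = table.find(oldAddr);
    if (it == table.end()) return false;
    int value = it->second;
    table.erase(it);
    table.emplace(newAddr, value);
    return true;
}

int lookupAfterMove() {
    ToyObject oldHome{7};
    ToyObject newHome{0};
    ToyTable table;
    table.emplace(&oldHome, 100);

    newHome = oldHome;   // stand-in for the GC relocating the object

    // Before rekeying, a lookup by the new address finds nothing.
    assert(table.find(&newHome) == table.end());

    if (!toyRekey(table, &oldHome, &newHome)) return -1;

    assert(table.count(&oldHome) == 0);
    return table.at(&newHome);
}
```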


Also, persistent pointer updates may be subject to write barriers for incremental and/or generational GC, where any modification must be monitored to maintain various invariants. Basically, you can't add, remove, or change a heap-stored gcthing pointer without informing the GC about it. See https://developer.mozilla.org/En/SpiderMonkey/Internals/Garbage_collection for details.
Also, persistent pointer updates may be subject to write barriers for incremental and/or generational GC, where any modification must be monitored to maintain various invariants. Basically, you can't add, remove, or change a heap-stored gcthing pointer without informing the GC about it. See https://developer.mozilla.org/En/SpiderMonkey/Internals/Garbage_collection for details.
Line 136: Line 136:
== Funky examples ==
== Funky examples ==


=== Double-rooting vs RawPointers ===
=== Double-rooting vs raw pointers ===


=== Rekeying ===
=== Rekeying ===