Jsctypes/api: Difference between revisions
Revision as of 15:21, 29 September 2009
js-ctypes is a library for calling C/C++ functions from JavaScript without having to write or generate any C/C++ "glue code".
js-ctypes is already in mozilla-central, but the API is subject to change. This page contains design proposals for the eventual js-ctypes API.
Proposal 1
1. opening a library and declaring a function
Cu.import("ctypes"); // imports the global ctypes object
// searches the path and opens "libmylib.so" on linux,
// "libmylib.dylib" on mac, and "mylib.dll" on windows
let mylib = ctypes.open("mylib", ctypes.SEARCH);
// declares the C prototype int32_t myfunc(int32_t)
// Int32 implies ctypes.Int32, shortened for brevity
let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32(), Int32());
let ret = myfunc(2); // calls myfunc
Note that for simple types (integers and strings), we will autoconvert the argument at call time - there's no need to pass in an Int32 object. The consumer should never need to instantiate such an object explicitly, unless they're using it to back a pointer - in which case we require explicit, strong typing. See later for examples.
2. declaring and passing a simple type (by object)
let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32, Int32);
let i = new Int32(); // instantiates an Int32 object with default value 0
let ret = myfunc(i);
An Int32 object, like all other type objects in ctypes, can be used for type specification when passed as an object, as above. declare() can look at the prototype JSObject* of its argument, and use this as a canonical JSObject representing the type, a pointer to which can be used for simple type equality comparisons. (This will work for user-defined types such as structs also - see later - though for pointer types we need to dig down to the underlying type.)
Int32() can have two modes depending on whether JS_IsConstructing(cx) is JS_TRUE ("new Int32()") or JS_FALSE ("Int32()"). Used as a function, we could perform a type conversion with range checking, for instance:
let n = Int32(4); // JSVAL_IS_INT(n) == JS_TRUE
n = Int32(4e16); // RangeError - out of bounds
n = Int32.max; // 2^31 - 1
// etc
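The function-call mode described above can be approximated in plain JavaScript. This is only a sketch, not ctypes code: `toInt32Checked` is a hypothetical helper name, and rejecting non-integer numbers is an assumption that the proposal does not spell out.

```javascript
// Hypothetical stand-in for calling Int32(value) as a function.
const INT32_MIN = -(2 ** 31);
const INT32_MAX = 2 ** 31 - 1;

function toInt32Checked(value) {
  // Assumption: only integer-valued numbers are accepted.
  if (typeof value !== "number" || !Number.isInteger(value)) {
    throw new TypeError("expected an integer-valued number");
  }
  if (value < INT32_MIN || value > INT32_MAX) {
    throw new RangeError(value + " does not fit in int32_t");
  }
  return value;
}

console.log(toInt32Checked(4)); // 4
try {
  toInt32Checked(4e16);
} catch (e) {
  console.log(e.name); // RangeError
}
```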
For the new constructor, the resulting object stores three pieces of information internally in reserved slots. |new Int32()| creates a JSObject which allocates sizeof(int32_t) and stores that pointer in a private slot. It also stores its type, as a JSObject* pointing to the canonical Int32 prototype, and can store a parent JSObject* in case it refers to an Int32 that happens to be part of another object. Thus the slot layout of i above would be
i object:
slot 1 (parent): JSObject* -> NULL (no parent object)
slot 2 (type) : JSObject* -> Int32 prototype
slot 3 (value) : void* -> binary blob from malloc(sizeof(int32_t))
Do we need to provide an explicit set() method, to allow for efficient modification? For instance,
i.set(5); // cheaper than i = new Int32(5);
3. declaring and passing a pointer
// C prototype: int32_t myfunc(int32_t* p)
let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32, Pointer(Int32));
let p = new Pointer(new Int32()); // instantiates an int and a pointer
let ret = myfunc(p); // the int is an outparam
let i = p.contents(); // i = *p (by reference)
let a = p.address(); // 0x...
// same thing, but with a named integer
let i = new Int32();
let p = new Pointer(i);
let ret = myfunc(p); // modifies i
// same thing, but with a pointer temporary
let i = new Int32();
let ret = myfunc(new Pointer(i)); // modifies i
// other examples
let q = new Pointer(); // instantiate a null pointer to a void type
q = new Pointer(5); // TypeError - require a ctypes type
Internally, a pointer requires a backing object (unless it's a null pointer). In the examples, the Pointer JSObject holds a reference to the Int32 JSObject for rooting purposes, and is laid out similarly to an Int32 object:
p object:
slot 1 (parent): JSObject* -> Int32 backing object
slot 2 (type) : JSObject* -> Pointer prototype
slot 3 (value) : void* -> pointer to binary int32_t blob inside backing object
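The backing-object relationship can be modeled in plain JavaScript. The sketch below is an illustration under stated assumptions, not the proposed implementation: `Int32Box`, `PointerBox`, and `fakeMyfunc` are hypothetical names, a one-element `Int32Array` stands in for the malloc'd blob, and throwing on `contents()` of a null pointer is an assumption.

```javascript
// Plain-JS model of the Proposal 1 slot layout; not the real ctypes API.
class Int32Box {
  constructor(initial = 0) {
    this.parent = null;             // slot 1: no parent object
    this.value = new Int32Array(1); // slot 3: stands in for the binary blob
    this.value[0] = initial;
  }
}

class PointerBox {
  constructor(target) {
    if (target !== undefined && !(target instanceof Int32Box)) {
      throw new TypeError("Pointer requires a ctypes-style object");
    }
    // slot 1: holding the backing object keeps it alive (rooting).
    this.parent = target === undefined ? null : target;
  }
  contents() {
    // Assumption: dereferencing a null pointer throws.
    if (this.parent === null) throw new TypeError("null pointer");
    return this.parent; // by reference, not a copy
  }
}

// Stand-in for a native function: int32_t myfunc(int32_t* p)
function fakeMyfunc(p) {
  p.contents().value[0] = 42; // writes through the pointer (outparam)
  return 0;
}

const i = new Int32Box();
const p = new PointerBox(i);
fakeMyfunc(p);
console.log(i.value[0]); // 42
```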
4. declaring a pointer to opaque struct
const FILE = ctypes.Struct(); // creates a Struct() type with no allocated binary storage, and no fields to access
let fopen = mylib.declare("fopen", DEFAULT_ABI, Pointer(FILE), String);
let file = fopen("foo"); // creates a new Pointer() object
file.contents(); // will throw - type is unknown
file.address(); // ok
5. declaring a struct
// C prototype: struct s_t { int32_t a; int64_t b; };
const s_t = Struct([{ a: Int32 }, { b: Int64 }]);
let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32, s_t);
let s = new s_t(10, 20);
This creates an s_t object which allocates binary space for both fields, creates getters and setters to access the binary fields via their offset, assigns the values 10 and 20 to the fields, and whose prototype is s_t:
s object:
slot 1 (parent): JSObject* -> NULL
slot 2 (type) : JSObject* -> s_t prototype
slot 3 (value) : void* -> pointer to binary blob from malloc()
slot 4 (fields): array of data for each field:
{ JSObject* parent; JSObject* type; ptrdiff_t offset; }
The array of field information allows each field to be dependent on another JSObject (only for the case where the field is a pointer), have an associated type, and have an offset into the binary blob for ease of access.
let c = s.b; // invokes the getter for |b| to create an Int64 object like so:
c object:
slot 1 (parent): JSObject* -> s backing object
slot 2 (type) : JSObject* -> Int64 prototype
slot 3 (value) : void* -> pointer to binary int64_t blob inside backing object
let i = myfunc(s); // checks the type of s by JSObject* prototype equality
6. pointers to struct fields
let p = new Pointer(s.b);
Once the Int64 representing s.b is constructed, the Pointer object references it directly:
p object:
slot 1 (parent): JSObject* -> Int64 backing object (which, in turn, is backed by s)
slot 2 (type) : JSObject* -> Pointer prototype
slot 3 (value) : void* -> pointer to binary int64_t blob inside backing object
7. nested structs
const u_t = Struct([{ x: Int64 }, { y: s_t }]);
let u = new u_t(5e4, s); // copies data from s into u.y - no references
let u_field = u.y; // creates an s_t object that points directly to the offset of u.y within u.
const v_t = Struct([{ x: Pointer(s_t) }, { y: Pointer(s_t) }]);
let v = new v_t(new Pointer(s), new Pointer(s));
In this case, each entry in the fields array will have its respective Pointer as the parent object, and both will point to the s binary blob.
Proposal 2
Types
A type maps JS values to C/C++ values and vice versa. Types are used when declaring functions. They can also be used to create and populate C/C++ data structures entirely from JS.
Built-in types
ctypes provides the following types:
ctypes.int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t, float32_t, float64_t - Primitive numeric types that behave the same way on all platforms (with the usual caveat that every platform has slightly different floating-point behavior in corner cases, and there's a limit to what we can realistically do about it).
- Since some 64-bit values are outside the range of the JavaScript number type, ctypes.int64_t and ctypes.uint64_t do not autoconvert to JS numbers.
ctypes.size_t, ssize_t, intptr_t, uintptr_t - Primitive types whose size depends on the platform. These types do not autoconvert to JavaScript numbers because on some platforms, there are values of these types that cannot be precisely represented as a JS number.
- (Open issue: Operator overloading will eventually come to JS. JS will likely have a 64-bit integer object type someday. The above non-autoconverting behavior prevents us from later autoconverting these CTypes to 64-bit values. Maybe we should autoconvert to a crummy 64-bit number type for now, with just valueOf and toString methods, so that in the future we can compatibly upgrade to a better one.)
ctypes.bool, short, unsigned_short, int, unsigned, unsigned_int, long, unsigned_long, float, double - Types that behave like the corresponding C types. Some or all of these might be aliases for the primitive types listed above. As in C, unsigned is always an alias for unsigned_int.
- (Open issue: Does long autoconvert to a JS number?)
ctypes.char, ctypes.signed_char, ctypes.unsigned_char - Character types that behave like the corresponding C types. (These are distinct from int8_t and uint8_t in details of conversion behavior. For example, js-ctypes autoconverts between C characters and one-character JavaScript strings.)
ctypes.string, ustring - String types. The C/C++ type for ctypes.string is const char *. C/C++ values of this type must be either null or pointers to null-terminated strings. ctypes.ustring is the same, but for const jschar *; that is, the code units of the string are uint16_t.
ctypes.void_t - The special C type void. This can be used as a return value type. (void is a keyword in JavaScript.)
ctypes.voidptr_t - The C type void *.
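The reason the 64-bit and pointer-sized types do not autoconvert can be seen with a value just past 2^53. The snippet below is a plain JavaScript illustration, not ctypes code; it uses BigInt (which postdates this proposal) only to hold the exact 64-bit value for comparison.

```javascript
// BigInt is used here only to hold an exact 64-bit value for comparison.
const exact = 2n ** 53n + 1n;   // fits easily in an int64_t
const asNumber = Number(exact); // rounds to the nearest double

console.log(exact.toString());               // "9007199254740993"
console.log(asNumber === 2 ** 53);           // true: the +1 was lost
console.log(Number.isSafeInteger(asNumber)); // false
```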
Starting from those builtin types, ctypes can create additional types:
new ctypes.PointerType(t) - If t is a ctypes type, return the type "pointer to t". If t is a string, instead return a new opaque pointer type named t. Otherwise throw a TypeError.
new ctypes.ArrayType(t) - Return an array type with unspecified length and element type t. If t is not a type or t.size is undefined, throw a TypeError.
new ctypes.ArrayType(t, n) - Return the array type t[n]. If t is not a type, or t.size is undefined, or n is not a nonnegative integer, throw a TypeError.
new ctypes.StructType(name, fields) - Create a new struct type with the given name and fields. fields is an array of field descriptors. js-ctypes calculates the offsets of the fields from its encyclopedic knowledge of the architecture's struct layout rules. If name is not a string, or fields contains a field descriptor with a type t such that t.size is undefined, throw a TypeError.
(Open issue: Specify a way to tell ctypes.StructType to use #pragma pack(n).)
(TODO: Finish specifying field descriptors.)
These constructors behave exactly the same way when called without the new keyword.
Examples:
const DWORD = ctypes.uint32_t;
const HANDLE = new ctypes.PointerType("HANDLE");
const HANDLES = new ctypes.ArrayType(HANDLE);
const FILE = new ctypes.PointerType("FILE *");
const IOBuf = new ctypes.ArrayType(ctypes.uint8_t, 4096);
const struct_tm = new ctypes.StructType('tm', [[ctypes.int, 'tm_sec'], ...]);
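The offset calculation that StructType is said to perform can be sketched as follows. This is a minimal illustration assuming natural alignment (each field aligned to its own alignment, plus tail padding), which is typical but not universal; for example, some 32-bit ABIs align int64_t to 4 bytes. The `{ size, align }` descriptors and the `layoutStruct` and `roundUp` names are hypothetical and not part of js-ctypes.

```javascript
// Assumed type descriptors: { size, align } in bytes, illustrative values only.
const int32 = { size: 4, align: 4 };
const int64 = { size: 8, align: 8 };

function roundUp(n, align) {
  return Math.ceil(n / align) * align;
}

// fields is an array of [type, name] pairs, as in the StructType examples.
function layoutStruct(fields) {
  let offset = 0;
  let maxAlign = 1;
  const offsets = {};
  for (const [type, name] of fields) {
    offset = roundUp(offset, type.align); // padding before the field
    offsets[name] = offset;
    offset += type.size;
    maxAlign = Math.max(maxAlign, type.align);
  }
  // Tail padding keeps arrays of the struct aligned. The result has
  // size and align, so it can itself be used as a field type.
  return { offsets, size: roundUp(offset, maxAlign), align: maxAlign };
}

const s = layoutStruct([[int32, "a"], [int64, "b"]]);
console.log(s.offsets.a, s.offsets.b, s.size); // 0 8 16
```

Because the returned layout carries its own size and alignment, nesting works without extra code: a struct of two 32-bit fields followed by a third 32-bit field comes out at 12 bytes under these assumptions.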
Properties of types
All the fields described here are read-only.
All types have these properties:
t.size - The C/C++ sizeof the type, in bytes.
- If t is an array type with unspecified length, t.size is undefined.
- ctypes.void_t.size is undefined.
t.name - A string, the type's name. It's intended that in ordinary use, this will be a C/C++ type expression, but it's not really meant to be machine-readable in all cases.
- For primitive types this is just the name of the corresponding C/C++ type, e.g. ctypes.int32_t.name == "int32_t" and ctypes.void_t.name == "void". But some of the builtin types are aliases for other types, so it might be that ctypes.unsigned_long.name == "uint32_t" (or something else). (Open issue: Is that too astonishing? Python ctypes does the same thing.)
- For struct types and opaque pointer types, this is simply the string that was passed to the constructor; e.g. FILE.name == "FILE *" and struct_tm.name == "tm". For other pointer types and array types this should try to generate valid C/C++ type expressions, which isn't exactly trivial.
- (Open issue: This conflicts with the usual meaning of .name for functions, and types are callable like functions.)
t.toString() - Returns "type " + t.name.
Pointer types also have:
t.targetType - Read-only. The pointed-to type, or null if t is an opaque pointer type.
Struct types also have:
t.fields - Read-only. A sealed array of field descriptors. (TODO: Details.)
Array types also have:
t.elementType - The type of the elements of an array of this type. E.g. IOBuf.elementType === ctypes.uint8_t.
t.length - The number of elements, a nonnegative integer.
Minutiae:
- ctypes.CType is the abstract-base-class constructor of all js-ctypes types. If called, it throws a TypeError. (This is exposed in order to expose ctypes.CType.prototype.)
- The [[Class]] of a ctypes type is "CType".
- The [[Class]] of the type constructors ctypes.{C,Array,Struct,Pointer}Type is "Function".
- Every CType has a read-only, permanent .prototype property. The type-constructors ctypes.{C,Pointer,Struct,Array}Type each have a read-only, permanent .prototype property as well.
- Types have a hierarchy of prototype objects. The prototype of ctypes.CType.prototype is Function.prototype. The prototype of ctypes.{Array,Struct,Pointer}Type.prototype and of all the builtin types except for the string types and ctypes.voidptr_t is ctypes.CType.prototype. The prototype of an array type is ctypes.ArrayType.prototype. The prototype of a struct type is ctypes.StructType.prototype. The prototype of a string type or pointer type is ctypes.PointerType.prototype.
- Every CType t has t.prototype.constructor === t; that is, its .prototype has a read-only, permanent, own .constructor property that refers to the type. The same is true of the four type constructors ctypes.{C,Array,Struct,Pointer}Type.
Calling types
CTypes are JavaScript constructors. That is, they are functions, and they can be called in various different ways. (CData objects are described in a separate section, below.)
new t or new t() or t() - Create a new CData object of type t.
- Without arguments, these allocate a new buffer of t.size bytes, populate it with zeroes, and return a new CData object referring to the complete object in that buffer.
- If t.size is undefined, this throws a TypeError.
new t(val) or t(val) - Convert val to type t according to the explicit conversion rules below, throwing a TypeError if the conversion is impossible. Allocate a new buffer of t.size bytes, populated with the converted value. Return a new CData object of type t referring to the complete object in that buffer. (When val is a CData object of type t, the behavior is like malloc followed by memcpy.)
- As a special case, if t is an array type of unspecified length and typeof val is 'number' and val is a nonnegative integer, allocate a new buffer of size val * t.elementType.size. Populate it with zeroes. Return a CData object of type t referring to the new array.
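The allocation behavior described above can be mimicked with typed arrays. The sketch below only models the buffers, not CData objects or conversions; `allocateZeroed`, `allocateArray`, and `copyBuffer` are hypothetical helper names, and `Uint8Array` stands in for raw memory.

```javascript
// Models "new t()": a zero-filled buffer of t.size bytes.
function allocateZeroed(size) {
  if (size === undefined) throw new TypeError("type has no defined size");
  return new Uint8Array(size); // typed arrays start zero-filled
}

// Models the special case for array types of unspecified length.
function allocateArray(count, elementSize) {
  if (typeof count !== "number" || !Number.isInteger(count) || count < 0) {
    throw new TypeError("length must be a nonnegative integer");
  }
  return new Uint8Array(count * elementSize);
}

// Models "new t(cdata)": fresh storage holding the same bytes,
// like malloc followed by memcpy.
function copyBuffer(source) {
  return source.slice();
}

const a = allocateZeroed(4);
a[0] = 7;
const b = copyBuffer(a);
b[0] = 9;
console.log(a[0], b[0]); // 7 9
```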
CData objects
A CData object represents a C/C++ value located in memory. The address of the C/C++ value can be taken (using the .address() method), and it can be assigned to (using the .assign() method).
All CData objects have these methods and properties:
cdata.address() - Return a new CData object of the pointer type ctypes.PointerType(cdata.constructor) whose value points to the C/C++ object referred to by cdata.
cdata.assign(val) - Convert val to the type of cdata using the implicit conversion rules. Store the converted value in the buffer location referred to by cdata.
cdata.constructor - Read-only. The type of cdata. (Implementation note: The prototype of cdata is an object that has a read-only constructor property, as detailed under "minutiae".)
cdata.toSource() - Return the string "t(arg)" where t and arg are implementation-defined JavaScript expressions (intended to represent the type of cdata and its value, respectively). The intent is that eval(cdata.toSource()) should ideally produce a new CData object containing a copy of cdata, but this can only work if the type of cdata happens to be bound to an appropriate name in scope.
cdata.toString() - Return the same string as cdata.toSource().
CData objects of struct types have getters and setters for each struct member:
cstruct.member - Let F be a CData object referring to the struct member. Return ConvertToJS(F).
cstruct.member = value - The value is converted to the type of the member using the implicit conversion rules. The converted value is stored in the buffer.
cstruct.addressOfField(name) - Return a new CData object of the appropriate pointer type, whose value points to the field of cstruct with the name name. If name is not a JavaScript string or does not name a field of cstruct, throw a TypeError.
These getters and setters can shadow the properties and methods described above. (Open issue: Can they really shadow .constructor? Maybe StructType should shoot you down if you try that one.)
Likewise, CData objects of array types have getters and setters for each element. Arrays additionally have a length property.
Note that these getters and setters are only present for integers i in the range 0 ≤ i < carray.length. (Open issue: can we arrange to throw an exception if i is out of range?)
carray[i] - Let E be a CData object referring to the element at index i. Return ConvertToJS(E).
carray[i] = val - Convert val to the type of the array element using the implicit conversion rules and store the result in the buffer.
carray.length - Read-only. The length of the array.
- (Open issue: Do we care about arrays eventually having a length greater than 2^53, i.e. not representable as a JS number? It's currently impossible even on 64-bit platforms.)
carray.addressOfElement(i) - Return a new CData object of the appropriate pointer type (ctypes.PointerType(carray.constructor.elementType)) whose value points to element i of carray. If i is not a JavaScript number that is a valid index of carray, throw a TypeError.
(TODO: Figure out if the type of new FooArray(30) is FooArray or ArrayType(Foo, 30).)
(TODO: Possibly, a way to get a CData object that acts like a view on a window of an array. E.g. carray.slice(start, stop). Then you could .assign one region of memory to another, effectively memcpy-ing.)
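For comparison, JavaScript typed arrays already offer this kind of windowed view. The snippet below is an analogy only, since typed arrays are not part of this proposal: `subarray()` returns a view on the same storage, and `set()` copies one region onto another much as memcpy would.

```javascript
const memory = new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8]);

// subarray() creates a view on the same storage, not a copy.
const source = memory.subarray(0, 3); // bytes 1, 2, 3
const target = memory.subarray(5, 8); // bytes 6, 7, 8

// set() copies the source elements into the target region.
target.set(source);

console.log(Array.from(memory)); // [1, 2, 3, 4, 5, 1, 2, 3]
```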
(TODO: Pointer types might need some properties of their own.)
It is possible for multiple CData objects to refer to the same memory. (In this way they are sort of like C++ references.) For example:
const Point = new ctypes.StructType(
"Point", [[ctypes.int32_t, 'x'], [ctypes.int32_t, 'y']]);
const Rect = new ctypes.StructType(
"Rect", [[Point, 'topLeft'], [Point, 'bottomRight']]);
var r = Rect(); // a new CData object of type Rect
r.topLeft.x = 100; // This works because r.topLeft is a CData object
// that refers to the topLeft member of r, not a copy.
r.toSource()
===> "Rect({topLeft: Point({x: 100, y: 0}), bottomRight: Point({x: 0, y: 0})})"
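The aliasing behavior in this example can be reproduced in plain JavaScript with a DataView over a shared ArrayBuffer. This is a rough model under stated assumptions, not the proposed implementation: `StructModel`, `make`, and the `int32_t` descriptor are hypothetical, fields are packed back to back (sufficient here because every field is 4 bytes), little-endian storage is assumed, and assigning a whole struct to a field is not modeled.

```javascript
// Hypothetical primitive descriptor: knows how to read and write itself.
const int32_t = {
  name: "int32_t",
  size: 4,
  read: (view, offset) => view.getInt32(offset, true),
  write: (view, offset, v) => view.setInt32(offset, v, true),
};

function StructModel(name, fields) {
  let size = 0;
  const layout = fields.map(([type, fieldName]) => {
    const entry = { type, fieldName, offset: size };
    size += type.size; // packed layout; fine for 4-byte fields
    return entry;
  });

  // With no arguments, allocates a fresh zeroed buffer; otherwise creates
  // an object that refers to existing memory at the given base offset.
  function make(view = new DataView(new ArrayBuffer(size)), base = 0) {
    const obj = {};
    for (const { type, fieldName, offset } of layout) {
      Object.defineProperty(obj, fieldName, {
        enumerable: true,
        get: () =>
          type.isStruct
            ? type.make(view, base + offset)   // reference into same buffer
            : type.read(view, base + offset),  // primitive value (a copy)
        set: (v) => {
          if (type.isStruct) throw new TypeError("struct assignment not modeled");
          type.write(view, base + offset, v);
        },
      });
    }
    return obj;
  }
  return { name, size, isStruct: true, make };
}

const Point = StructModel("Point", [[int32_t, "x"], [int32_t, "y"]]);
const Rect = StructModel("Rect", [[Point, "topLeft"], [Point, "bottomRight"]]);

const r = Rect.make();
r.topLeft.x = 100;            // writes through a view onto r's buffer
const corner = r.bottomRight; // another view onto the same buffer
corner.y = 7;
console.log(r.topLeft.x, r.bottomRight.y); // 100 7
```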
Minutiae:
- The [[Class]] of a CData object is "CData".
- The prototype of a CData object is the same as its type's .prototype property.
(Implementation notes: A CData object has a reserved slot that points to its type; a reserved slot that contains null if the object owns its own buffer, and otherwise points to the base CData object that owns the backing buffer where the data is stored; and a data pointer. The data pointer points to the actual location within the buffer of the C/C++ object to which the CData object refers. Since the data pointer might not be aligned to 2 bytes, PRIVATE_TO_JSVAL is insufficient; a custom JSClass.trace hook will be needed. If the object owns its own buffer, its finalizer frees it. Other CData objects that point into the buffer keep the base CData, and therefore the underlying buffer, alive.)
Conversions
The implicit conversion rules are applied whenever a JavaScript value of any kind is passed to a parameter of a ctypes-declared function, passed to cdata.assign(val), or assigned to an array element or struct member, as in carray[i] = val or cstruct.member = val. These rules are intended to lose precision only when there is no reasonable alternative. They generally do not coerce values of one type to another type.
(TODO: precise rules. Some of the properties we're shooting for here are: if val is a CData object of the right type, return its C/C++ value; applying the rules to a JS number is exactly the same as applying them to the corresponding C/C++ double; applying the rules to a JS boolean is exactly the same as applying them to the corresponding C/C++ bool; plain old JS Objects can implicitly convert to C/C++ structs; plain old JS Arrays can implicitly convert to C/C++ arrays.)
The explicit conversion rules are applied when a JavaScript value is passed as a parameter when calling a type, as in t(val) or new t(val). These rules are a bit more aggressive.
(TODO: precise rules. Properties we're shooting for: if implicit conversion produces a result, explicit conversion produces the same result; in some but not all cases where a C++ typename(value) function-like cast expression would work, explicit conversion also works.)
ConvertToJS(x) - This function is used to convert a CData object or a C/C++ return value to a JavaScript value. The intent is to return a simple JavaScript value whenever possible, and a CData object otherwise. The precise rules are:
- If the value is of type void, return undefined.
- If the value is of type bool, return the corresponding JavaScript boolean.
- If the value is of a number type other than the pointer-sized types and the 64-bit types, return the corresponding JavaScript number.
- If the value is of a string type and null, return null.
- If the value is of a string type and non-null, return a JavaScript string.
- Otherwise the value is of an array, struct, or pointer type. If the argument x is already a CData object, return it. Otherwise allocate a buffer containing a copy of the C/C++ value x, and return a CData object of the appropriate type referring to the object in the new buffer.
Note that we do not autoconvert null C/C++ pointers to the JavaScript null value.
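The rules above can be written out as a small dispatch function. This sketch makes several assumptions that are not in the specification: values are modeled as tagged records `{ kind, type, value, isCData }`, aggregate values are held in a `Uint8Array`, and the 64-bit and pointer-sized number types are treated like the final "otherwise" case (which matches the 64-bit examples later on this page but is not stated in the rules themselves).

```javascript
// Number types that stay wrapped rather than becoming JS numbers.
const WRAPPED_NUMBER_TYPES = new Set([
  "int64_t", "uint64_t", "size_t", "ssize_t", "intptr_t", "uintptr_t",
]);

function convertToJS(x) {
  if (x.kind === "void") return undefined;
  if (x.kind === "bool") return Boolean(x.value);
  if (x.kind === "number" && !WRAPPED_NUMBER_TYPES.has(x.type)) {
    return Number(x.value);
  }
  if (x.kind === "string") {
    return x.value === null ? null : String(x.value);
  }
  // Arrays, structs, pointers, and (by assumption) the wrapped number types.
  if (x.isCData) return x;
  // Not yet a CData object: copy the value into fresh storage and wrap it.
  const copy = x.value instanceof Uint8Array ? x.value.slice() : x.value;
  return { ...x, value: copy, isCData: true };
}

console.log(convertToJS({ kind: "number", type: "int32_t", value: 5 })); // 5
const nullPtr = convertToJS({ kind: "pointer", type: "void *", value: null });
console.log(nullPtr === null); // false: null pointers are not autoconverted
```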
Examples
Basic types:
let i = new ctypes.uint32_t(5); // allocate sizeof(uint32_t)==4 bytes, initialize to 5, and return a new CData object
const setint = ctypes.declare("setint", ctypes.abi.default, ctypes.void_t, ctypes.PointerType(ctypes.uint32_t));
setint(i); // implicitly passes the address of allocated buffer
- That was what I originally proposed, but now I think we should use the more explicit idiom from Python ctypes: setint(byref(i)) or (I think equivalently in this case) setint(pointer(i)). This being JavaScript I think it would be OK to change the syntax to something like setint(i.ptr()) or setint(i.address). In fact I think it's important to provide this as a property or method because we do not expect people to do the equivalent of from ctypes import *, and setint(ctypes.byref(i)) is a mess. --jorendorff 14:24, 29 September 2009 (UTC)
const getintp = ctypes.declare("getintp", ctypes.abi.default, ctypes.PointerType(ctypes.uint32_t));
let p = getintp(); // creates a ctypes pointer that holds the returned address
- ctypes pointers are gone. With the new language, this returns a CData object of type uint32_t *. --jorendorff 14:24, 29 September 2009 (UTC)
let q = ctypes.castPointer(ctypes.Pointer(ctypes.uint8_t), p); // cast to uint8_t... why isn't this a method on Pointer?
- Because it's a footgun. It shouldn't be right at your fingertips, you should have to dig around for it. I think we can call this ctypes.cast, which is what Python ctypes calls it. --jorendorff 14:24, 29 September 2009 (UTC)
let k = ctypes.pointerToUnsafeReference(q); // likewise?
- This is gone now, but I haven't looked at Python ctypes to steal a replacement for it yet. TODO. --jorendorff 14:24, 29 September 2009 (UTC)
Struct fields:
const u_t = new ctypes.StructType('u_t', [[ctypes.uint32_t, 'x'], [ctypes.uint32_t, 'y']]);
let u = new u_t(5, 10); // allocates 2*sizeof(uint32_t) bytes and creates a CData object
u.x = 7; // setter for u.x modifies field
let i = u.y; // getter for u.y returns ConvertToJS(reference to u.y) -> primitive value 10
i = 5; // doesn't touch u.y
const v_t = new ctypes.StructType('v_t', [[u_t, 'u'], [ctypes.uint32_t, 'z']]);
let v = new v_t; // allocates 12 bytes, zeroes them out, and creates a CData object
let w = v.u; // ConvertToJS(reference to v.u) returns reference
w.x = 3; // invokes setter
setint(v.u.x); // TypeError - primitive is not a reference or pointer
let p = ctypes.addressOfField(v.u, 'x'); // pointer to v.u.x
setint(p); // ok - manually pass address
let q = v.u.addressOfField('x'); // abbreviated syntax?
- That makes sense to me! --jorendorff 14:24, 29 September 2009 (UTC)
64-bit integers: (check me!)
// want to represent 64-bit ints as CData objects always, rather than
// autoconverting to an int/double primitive, to avoid loss of precision.
// use the same behavior for size_t and ptrdiff_t.
let i = new ctypes.int64_t(5);
let j = i;
j = 6; // invokes setter on i
- This setter trick can't be done. The user will have to do: j.assign(6); --jorendorff 14:24, 29 September 2009 (UTC)
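The reason the setter trick cannot work is that assigning to a variable in JavaScript only rebinds the name; no code belonging to the old value runs. The sketch below shows the difference using a hypothetical `Int64Cell` class as a stand-in for a 64-bit CData object, with BigInt (which postdates this proposal) holding the value.

```javascript
// Hypothetical stand-in for a CData object holding a 64-bit integer.
class Int64Cell {
  constructor(v = 0) {
    this.bits = BigInt.asIntN(64, BigInt(v));
  }
  assign(v) {
    this.bits = BigInt.asIntN(64, BigInt(v));
  }
}

const i = new Int64Cell(5);
let j = i;   // j and i now name the same object
j.assign(6); // mutates the shared object
console.log(i.bits); // 6n

j = 7;       // rebinds j only; nothing on i is invoked
console.log(i.bits); // 6n
```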
const m_t = new ctypes.StructType(
'm_t', [[ctypes.int64_t, 'x'], [ctypes.int64_t, 'y']]);
let m = new m_t;
const setint64 = ctypes.declare("setint64", ctypes.abi.default, ctypes.void_t, ctypes.Pointer(ctypes.int64_t));
setint64(m.x); // ok - unlike int32_t case, ConvertToJS returns a reference to the field m.x
setint64(ctypes.addressOfField(m, 'x')); // also works, per int32_t case
- Right. However I'm agitating to change this for future-compatibility; see hand-wringing near where int64_t and size_t are documented above. --jorendorff 14:24, 29 September 2009 (UTC)