Jsctypes/api
js-ctypes is a library for calling C/C++ functions from JavaScript without having to write or generate any C/C++ "glue code".
js-ctypes is already in mozilla-central, but the API is subject to change. This page contains design proposals for the eventual js-ctypes API.
Proposal 1
1. opening a library and declaring a function
Cu.import("ctypes"); // imports the global ctypes object
// searches the path and opens "libmylib.so" on linux,
// "libmylib.dylib" on mac, and "mylib.dll" on windows
let mylib = ctypes.open("mylib", ctypes.SEARCH);
// declares the C prototype int32_t myfunc(int32_t)
// Int32 implies ctypes.Int32, shortened for brevity
let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32(), Int32());
let ret = myfunc(2); // calls myfunc
Note that for simple types (integers and strings), we will autoconvert the argument at call time - there's no need to pass in an Int32 object. The consumer should never need to instantiate such an object explicitly, unless they're using it to back a pointer - in which case we require explicit, strong typing. See later for examples.
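The per-platform file names mentioned in the comments of the example above can be sketched in plain JavaScript. This is only an illustration: libraryFileName and its platform strings are hypothetical names, not part of the proposed API, and the real lookup would also search the library path.

```javascript
// Hypothetical helper, not part of any ctypes proposal: maps a bare library
// name to the per-platform file name described in the example above.
function libraryFileName(name, platform) {
  switch (platform) {
    case "linux":
      return "lib" + name + ".so";
    case "mac":
      return "lib" + name + ".dylib";
    case "windows":
      return name + ".dll";
    default:
      throw new Error("unknown platform: " + platform);
  }
}
```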
2. declaring and passing a simple type (by object)
let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32, Int32);
let i = new Int32(); // instantiates an Int32 object with default value 0
let ret = myfunc(i);
An Int32 object, like all other type objects in ctypes, can be used for type specification when passed as an object, as above. declare() can look at the prototype JSObject* of its argument, and use this as a canonical JSObject representing the type, a pointer to which can be used for simple type equality comparisons. (This will work for user-defined types such as structs also - see later - though for pointer types we need to dig down to the underlying type.)
Int32() can have two modes depending on whether JS_IsConstructing(cx) is JS_TRUE ("new Int32()") or JS_FALSE ("Int32()"). Used as a function, we could perform a type conversion with range checking, for instance:
let n = Int32(4); // JSVAL_IS_INT(n) == JS_TRUE
n = Int32(4e16); // RangeError - out of bounds
n = Int32.max; // 2^31 - 1
// etc
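The two calling modes can be approximated in script, where new.target plays the role that JS_IsConstructing(cx) plays in native code. The following is a rough sketch under that assumption; this Int32 is a stand-in written for illustration, and its valueOf method and Int32.min property are hypothetical additions rather than part of the proposal.

```javascript
// Hypothetical stand-in for the proposed ctypes.Int32, for illustration only.
function Int32(value = 0) {
  if (!Number.isInteger(value) || value < Int32.min || value > Int32.max) {
    throw new RangeError("value out of int32_t range: " + value);
  }
  if (new.target === undefined) {
    // Called as a function: range-checked conversion to a primitive.
    return value;
  }
  // Called with `new`: allocate sizeof(int32_t) bytes of backing storage.
  this.blob = new Int32Array(1);
  this.blob[0] = value;
}
Int32.min = -(2 ** 31); // assumption: not named in the proposal
Int32.max = 2 ** 31 - 1;

// Assumption: a way to read the stored value back out.
Int32.prototype.valueOf = function () {
  return this.blob[0];
};
```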
For the new constructor, the resulting object stores three pieces of information internally in reserved slots. |new Int32()| creates a JSObject which allocates sizeof(int32_t) and stores that pointer in a private slot. It also stores its type, as a JSObject* pointing to the canonical Int32 prototype, and can store a parent JSObject* in case it refers to an Int32 that happens to be part of another object. Thus the slot layout of i above would be
i object:
slot 1 (parent): JSObject* -> NULL (no parent object)
slot 2 (type) : JSObject* -> Int32 prototype
slot 3 (value) : void* -> binary blob from malloc(sizeof(int32_t))
Do we need to provide an explicit set() method, to allow for efficient modification? For instance,
i.set(5); // cheaper than i = new Int32(5);
3. declaring and passing a pointer
// C prototype: int32_t myfunc(int32_t* p)
let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32, Pointer(Int32));
let p = new Pointer(new Int32()); // instantiates an int and a pointer
let ret = myfunc(p); // the int is an outparam
let i = p.contents(); // i = *p (by reference)
let a = p.address(); // 0x...
// same thing, but with a named integer
let i = new Int32();
let p = new Pointer(i);
let ret = myfunc(p); // modifies i
// same thing, but with a pointer temporary
let i = new Int32();
let ret = myfunc(new Pointer(i)); // modifies i
// other examples
let q = new Pointer(); // instantiate a null pointer to a void type
q = new Pointer(5); // TypeError - require a ctypes type
Internally, a pointer requires a backing object (unless it's a null pointer). In the examples, the Pointer JSObject holds a reference to the Int32 JSObject for rooting purposes, and is laid out similarly to an Int32 object:
p object:
slot 1 (parent): JSObject* -> Int32 backing object
slot 2 (type) : JSObject* -> Pointer prototype
slot 3 (value) : void* -> pointer to binary int32_t blob inside backing object
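The rooting relationship in the layout above can be modelled in plain JavaScript. This is a minimal sketch, not the proposed implementation: both classes are hypothetical stand-ins, the type slot is represented by the ordinary prototype chain, and throwing on a null dereference is an assumption that the proposal does not state.

```javascript
// Hypothetical stand-ins for the proposed Int32 and Pointer objects.
class Int32 {
  constructor(value = 0) {
    this.parent = null;            // slot 1: no parent object
    this.blob = new Int32Array(1); // slot 3: sizeof(int32_t) bytes of storage
    this.blob[0] = value;
  }
  get value() { return this.blob[0]; }
  set value(v) { this.blob[0] = v; }
}

class Pointer {
  constructor(target) {
    if (target !== undefined && !(target instanceof Int32)) {
      throw new TypeError("Pointer requires a ctypes type");
    }
    // slot 1: holding the backing object keeps it alive while p is alive.
    this.parent = target === undefined ? null : target;
  }
  contents() {
    // Assumption: dereferencing a null pointer throws.
    if (this.parent === null) throw new TypeError("null pointer");
    return this.parent; // by reference, not a copy
  }
}

const i = new Int32();
const p = new Pointer(i);
p.contents().value = 42; // writes through to i, as an outparam would
```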
4. declaring a pointer to opaque struct
const FILE = ctypes.Struct(); // creates a Struct() type with no allocated binary storage, and no fields to access
let fopen = mylib.declare("fopen", DEFAULT_ABI, Pointer(FILE), String);
let file = fopen("foo"); // creates a new Pointer() object
file.contents(); // will throw - type is unknown
file.address(); // ok
5. declaring a struct
// C prototype: struct s_t { int32_t a; int64_t b; };
const s_t = Struct([{ a: Int32 }, { b: Int64 }]);
let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32, s_t);
let s = new s_t(10, 20);
This creates an s_t object which allocates binary space for both fields, creates getters and setters to access the binary fields via their offset, assigns the values 10 and 20 to the fields, and whose prototype is s_t:
s object:
slot 1 (parent): JSObject* -> NULL
slot 2 (type) : JSObject* -> s_t prototype
slot 3 (value) : void* -> pointer to binary blob from malloc()
slot 4 (fields): array of data for each field:
{ JSObject* parent; JSObject* type; ptrdiff_t offset; }
The array of field information allows each field to be dependent on another JSObject (only for the case where the field is a pointer), have an associated type, and have an offset into the binary blob for ease of access.
let c = s.b; // invokes the getter for |b| to create an Int64 object like so:
c object:
slot 1 (parent): JSObject* -> s backing object
slot 2 (type) : JSObject* -> Int64 prototype
slot 3 (value) : void* -> pointer to binary int64_t blob inside backing object
let i = myfunc(s); // checks the type of s by JSObject* prototype equality
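The offset-based getters and setters described in this section can be approximated with a DataView over an ArrayBuffer. The sketch below is an illustration only: the TYPES table, the string type names, and the natural-alignment rule are assumptions, and 64-bit fields are surfaced as BigInt values rather than as Int64 objects.

```javascript
// Hypothetical field-type table: size in bytes plus DataView accessors.
const TYPES = {
  Int32: {
    size: 4,
    get: (view, off) => view.getInt32(off, true),
    set: (view, off, x) => view.setInt32(off, x, true),
  },
  Int64: {
    size: 8,
    get: (view, off) => view.getBigInt64(off, true),
    set: (view, off, x) => view.setBigInt64(off, BigInt(x), true),
  },
};

function Struct(fields) {
  let offset = 0;
  let maxAlign = 1;
  const layout = fields.map((field) => {
    const [name, typeName] = Object.entries(field)[0];
    const type = TYPES[typeName];
    // Assumption: each field is aligned to its own size.
    offset = Math.ceil(offset / type.size) * type.size;
    const entry = { name, type, offset };
    offset += type.size;
    maxAlign = Math.max(maxAlign, type.size);
    return entry;
  });
  const size = Math.ceil(offset / maxAlign) * maxAlign;

  function StructInstance(...values) {
    const view = new DataView(new ArrayBuffer(size)); // zero-filled blob
    for (const { name, type, offset } of layout) {
      Object.defineProperty(this, name, {
        enumerable: true,
        get: () => type.get(view, offset),
        set: (x) => type.set(view, offset, x),
      });
    }
    // Assign constructor arguments to fields in declaration order.
    layout.forEach(({ name }, i) => {
      if (i < values.length) this[name] = values[i];
    });
  }
  StructInstance.size = size;
  StructInstance.layout = layout;
  return StructInstance;
}

const s_t = Struct([{ a: "Int32" }, { b: "Int64" }]);
const s = new s_t(10, 20);
```

Under these alignment assumptions, field b lands at offset 8 and the struct occupies 16 bytes, which matches what a typical 64-bit C compiler would produce for the prototype above.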
6. pointers to struct fields
let p = new Pointer(s.b);
Once the Int64 representing s.b is constructed, the Pointer object references it directly:
p object:
slot 1 (parent): JSObject* -> Int64 backing object (which, in turn, is backed by s)
slot 2 (type) : JSObject* -> Pointer prototype
slot 3 (value) : void* -> pointer to binary int64_t blob inside backing object
7. nested structs
const u_t = Struct([{ x: Int64 }, { y: s_t }]);
let u = new u_t(5e4, s); // copies data from s into u.y - no references
let u_field = u.y; // creates an s_t object that points directly to the offset of u.y within u.
const v_t = Struct([{ x: Pointer(s_t) }, { y: Pointer(s_t) }]);
let v = new v_t(new Pointer(s), new Pointer(s));
In this case, each entry in the fields array will have its respective Pointer as the parent object, and both pointers will point to the binary blob of s.
Proposal 2
Types
A type maps JS values to C/C++ values and vice versa. They're used when declaring functions. They can also be used to create and populate C/C++ data structures entirely from JS.
Built-in types
ctypes provides the following types:
ctypes.int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t, float32_t, float64_t - Primitive numeric types that behave the same way on all platforms (with the usual caveat that every platform has slightly different floating-point behavior in corner cases, and there's a limit to what we can realistically do about it).
- Since some 64-bit values are outside the range of the JavaScript number type, ctypes.int64_t and ctypes.uint64_t do not autoconvert to JS numbers.
ctypes.size_t, ssize_t, intptr_t, uintptr_t - Primitive types whose size depends on the platform. These types do not autoconvert to JavaScript numbers because on some platforms, there are values of these types that cannot be precisely represented as a JS number.
- (Open issue: Operator overloading will eventually come to JS. JS will likely have a 64-bit integer object type someday. The above non-autoconverting behavior prevents us from later autoconverting these CTypes to 64-bit values. Maybe we should autoconvert to a crummy 64-bit number type for now, with just valueOf and toString methods, so that in the future we can compatibly upgrade to a better one.)
ctypes.bool, short, unsigned_short, int, unsigned, unsigned_int, long, unsigned_long, float, double - Types that behave like the corresponding C types. Some or all of these might be aliases for the primitive types listed above. As in C, unsigned is always an alias for unsigned_int.
- (Open issue: Does long autoconvert to a JS number?)
ctypes.char, ctypes.signed_char, ctypes.unsigned_char - Character types that behave like the corresponding C types. (These are distinct from int8_t and uint8_t in details of conversion behavior. For example, js-ctypes autoconverts between C characters and one-character JavaScript strings.)
ctypes.string, ustring - String types. The C/C++ type for ctypes.string is const char *. C/C++ values of this type must be either null or pointers to null-terminated strings. ctypes.ustring is the same, but for const jschar *; that is, the code units of the string are uint16_t.
ctypes.void_t - The special C type void. This can be used as a return value type. (void is a keyword in JavaScript.)
ctypes.voidptr_t - The C type void *.
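The reason the 64-bit and pointer-sized types in the list above do not autoconvert is that a JavaScript number is an IEEE 754 double with a 53-bit significand. The short example below uses BigInt, which postdates this proposal, purely to demonstrate the loss of precision; it is not a suggestion about the ctypes API.

```javascript
// 2^53 + 1 fits comfortably in an int64_t but not in a double.
const big = 2n ** 53n + 1n;            // 9007199254740993n
const asNumber = Number(big);          // rounded to the nearest double
const roundTripped = BigInt(asNumber); // no longer equal to big
const lostPrecision = roundTripped !== big;
```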
Starting from those builtin types, ctypes can create additional types:
new ctypes.PointerType(t) - If t is a ctypes type, return the type "pointer to t". If t is a string, instead return a new opaque pointer type named t. Otherwise throw a TypeError.
new ctypes.ArrayType(t) - Return an array type with unspecified length and element type t. If t is not a type or t.size is undefined, throw a TypeError.
new ctypes.ArrayType(t, n) - Return the array type T[n]. If t is not a type, or t.size is undefined, or n is not a nonnegative integer, throw a TypeError.
new ctypes.StructType(name, fields) - Create a new struct type with the given name and fields. fields is an array of field descriptors. js-ctypes calculates the offsets of the fields from its encyclopedic knowledge of the architecture's struct layout rules. If name is not a string, or fields contains a field descriptor with a type t such that t.size is undefined, throw a TypeError.
(Open issue: Specify a way to tell ctypes.StructType to use #pragma pack(n).)
(TODO: Finish specifying field descriptors.)
These constructors behave exactly the same way when called without the new keyword.
Examples:
const DWORD = ctypes.uint32_t;
const HANDLE = new ctypes.PointerType("HANDLE");
const HANDLES = new ctypes.ArrayType(HANDLE);
const FILE = new ctypes.PointerType("FILE *");
const IOBuf = new ctypes.ArrayType(ctypes.uint8_t, 4096);
const struct_tm = new ctypes.StructType('tm', [[ctypes.int, 'tm_sec'], ...]);
Properties of types
All the fields described here are read-only.
All types have these properties:
t.size - The C/C++ sizeof the type, in bytes.
- If t is an array type with unspecified length, t.size is undefined.
- ctypes.void_t.size is undefined.
t.name - A string, the type's name. It's intended that in ordinary use, this will be a C/C++ type expression, but it's not really meant to be machine-readable in all cases.
- For primitive types this is just the name of the corresponding C/C++ type, e.g. ctypes.int32_t.name == "int32_t" and ctypes.void_t.name == "void". But some of the builtin types are aliases for other types, so it might be that ctypes.unsigned_long.name == "uint32_t" (or something else). (Open issue: Is that too astonishing?)
- For struct types and opaque pointer types, this is simply the string that was passed to the constructor; e.g. FILE.name == "FILE *" and struct_tm.name == "tm". For other pointer types and array types this should try to generate valid C/C++ type expressions, which isn't exactly trivial.
- (Open issue: This conflicts with the usual meaning of .name for functions, and types are functions.)
t.toString() - Returns "type " + t.name.
Pointer types also have:
t.targetType - The pointed-to type, or null if t is an opaque pointer type.
Struct types also have:
t.fields - A sealed array of field descriptors, details TBD.
Array types also have:
t.elementType - The type of the elements of an array of this type. E.g. IOBuf.elementType === ctypes.uint8_t.
t.length - The number of elements, a nonnegative integer.
Minutiae:
- ctypes.CType is the abstract-base-class constructor of all js-ctypes types. If called, it throws a TypeError. (This is exposed in order to expose ctypes.CType.prototype.)
- The [[Class]] of a ctypes type is "CType".
- The [[Class]] of the type constructors ctypes.{C,Array,Struct,Pointer}Type is "Function".
- Every CType has a read-only, permanent .prototype property. The type-constructors ctypes.{C,Pointer,Struct,Array}Type each have a read-only, permanent .prototype property as well.
- Types have a hierarchy of prototype objects. The prototype of ctypes.CType.prototype is Function.prototype. The prototype of ctypes.{Array,Struct,Pointer}Type.prototype and of all the builtin types except for the string types and ctypes.voidptr_t is ctypes.CType.prototype. The prototype of an array type is ctypes.ArrayType.prototype. The prototype of a struct type is ctypes.StructType.prototype. The prototype of a string type or pointer type is ctypes.PointerType.prototype.
- Every CType t has t.prototype.constructor === t; that is, its .prototype has a read-only, permanent, own .constructor property that refers to the type. The same is true of the four type constructors ctypes.{C,Array,Struct,Pointer}Type.
Calling types
CTypes are JavaScript constructors. That is, they are functions, and they can be called in various different ways. (CData objects are described in a separate section, below.)
new t or new t() or t() - Create a new CData object of type t.
- Without arguments, these allocate a new buffer of t.size bytes, populate it with zeroes, and return a new CData object referring to the complete object in that buffer.
- If t.size is undefined, this throws a TypeError.
new t(val) or t(val) - Convert val to type t according to the explicit conversion rules below, throwing a TypeError if the conversion is impossible. Allocate a new buffer of t.size bytes, populated with the converted value. Return a new CData object of type t referring to the complete object in that buffer. (When val is a CData object of type t, the behavior is like malloc followed by memcpy.)
- As a special case, if t is an array type of unspecified length and typeof val is 'number' and val is a nonnegative integer, allocate a new buffer of size val * t.elementType.size. Populate it with zeroes. Return a CData object of type t referring to the new array.
CData objects
A CData object represents a C/C++ value located in memory. The address of the C/C++ value can be taken, and it can be assigned to.
(TODO)
cdata.assign(val) - Convert val to the type of cdata using the implicit conversion rules. Store the converted value in the buffer location referred to by cdata.
cdata.constructor - Read-only. The type of cdata. (Implementation note: The prototype of cdata is an object that has a read-only constructor property, as detailed under "minutiae".)
CData objects of struct types have getters and setters for each struct member:
cstruct.member - Let F be a CData object referring to the struct member. Return ConvertToJS(F).
cstruct.member = value - The value is converted to the type of the member using the implicit conversion rules. The converted value is stored in the buffer.
These getters and setters can shadow the properties and methods described above. (Open issue: Can they really shadow .constructor? Maybe StructType should shoot you down if you try that one.)
Likewise, CData objects of array types have getters and setters for each element. Arrays additionally have a length property.
Note that these getters and setters are only present for integers i in the range 0 ≤ i < carray.length. (Open issue: can we arrange to throw an exception if i is out of range?)
carray[i] - Let E be a CData object referring to the element at index i. Return ConvertToJS(E).
carray[i] = val - Convert val to the type of the array element using the implicit conversion rules and store the result in the buffer.
carray.length - Read-only. The length of the array.
(Open issue: Do we care about arrays eventually having a length longer than 2^53, i.e. not representable as a JS number? It's currently impossible even on 64-bit platforms.)
(TODO: Figure out if the type of new FooArray(30) is FooArray or ArrayType(Foo, 30).)
(TODO: Possibly, a way to get a CData object that acts like a view on a window of an array. E.g. carray.slice(start, stop). Then you could .assign one region of memory to another, effectively memcpy-ing.)
(TODO: Pointer types might need some properties of their own.)
It is possible for multiple CData objects to refer to the same memory. (In this way they are sort of like C++ references.) For example:
const Point = new ctypes.StructType(
"Point", [[ctypes.int32_t, 'x'], [ctypes.int32_t, 'y']]);
const Rect = new ctypes.StructType(
"Rect", [[Point, 'topLeft'], [Point, 'bottomRight']]);
var r = Rect(); // a new CData object of type Rect
r.topLeft.x = 100; // This works because r.topLeft is a CData object
// that refers to the topLeft member of r, not a copy.
r.toSource()
===> "Rect({topLeft: Point({x: 100, y: 0}), bottomRight: Point({x: 0, y: 0})})"
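The aliasing shown in the Rect example can be imitated with typed-array views over one shared ArrayBuffer. This is only a sketch of the idea: pointView and the hand-written offsets are hypothetical, and real CData objects would derive the offsets from the struct type.

```javascript
// Backing buffer for one Rect: four int32_t values (x, y, x, y).
const rectBuffer = new ArrayBuffer(16);

// Hypothetical helper: a Point that views 8 bytes at byteOffset in a buffer.
function pointView(buffer, byteOffset) {
  const ints = new Int32Array(buffer, byteOffset, 2);
  return {
    get x() { return ints[0]; },
    set x(v) { ints[0] = v; },
    get y() { return ints[1]; },
    set y(v) { ints[1] = v; },
  };
}

// Each access builds a fresh view object, but every view shares rectBuffer.
const r = {
  get topLeft() { return pointView(rectBuffer, 0); },
  get bottomRight() { return pointView(rectBuffer, 8); },
};

r.topLeft.x = 100;       // writes through the view into rectBuffer
const alias = r.topLeft; // a second object referring to the same memory
```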
Minutiae:
- The [[Class]] of a CData object is "CData".
- The prototype of a CData object is the same as its type's .prototype property.
(Implementation notes: A CData object has a reserved slot that points to its type; a reserved slot that points to the base CData object that owns the backing buffer where the data is stored, or null if the object owns its own buffer; and a pointer to the actual referenced location within the buffer. Since the data pointer might not be aligned to 2 bytes, PRIVATE_TO_JSVAL is insufficient; a custom JSClass.trace hook will be needed. If the object owns its own buffer, its finalizer frees it; other CData objects that point into the buffer keep the buffer alive, thanks to the reserved slot.)
Pointers
(TODO - this needs an overhaul! please disregard it for now!)
js-ctypes pointers are very simple JavaScript objects that represent C/C++ pointers. Like C/C++ pointers, js-ctypes pointers represent a memory address. They may point to valid memory, but they may also point off the end of an array, to memory that has been freed, to uninitialized or unmapped memory, or to data of a different type.
Like C/C++ pointers, js-ctypes pointers never protect the data they point to from garbage collection.
It is hard to use (non-opaque) pointers safely, so js-ctypes is designed to support as many APIs as possible without requiring the use of pointers. For example, if a C/C++ function takes a parameter that is a pointer to a struct, you can just pass it a struct, and ctypes will quietly take its address. (The implicit conversion rules will handle this.)
These functions produce pointers:
ctypes.addressOf(ref) - Return a pointer to the object referenced by ref. If ref is not a ctypes reference, throw a TypeError.
(The rest of these strike me as targets of opportunity. Only certain unusual C APIs will need them.)
ctypes.addressOfField(ref, name) - Return a pointer to the named field of the struct referenced by ref. If ref is not a reference to a struct, or the struct does not have a field with the given name, throw a TypeError.
ctypes.addressOfElement(ref, i) - Return a pointer to element i of the array referenced by ref. If ref is not a reference to an array, or i is not a valid index into the array, throw a TypeError.
ctypes.castPointer(t, ptr) - Return a pointer of type t with the same bit-value as ptr. If t is not a pointer type or ptr is neither a pointer nor an integer, throw a TypeError.
ctypes.pointerAdd(ptr, nelements) - Like the C expression ptr + nelements. Return a pointer of the same type as ptr, adjusted by nelements * targetType.size bytes. If ptr is not a pointer or nelements is not an integer, throw a TypeError.
In js-ctypes, as in C/C++, pointers are totally unchecked. There is no guaranteed-safe way to dereference a pointer. However, if the application knows that the pointer is valid, it can access the pointed-to data using this function:
ctypes.pointerToUnsafeReference(ptr) - If ptr is not a pointer, or is a null pointer, throw a TypeError. Otherwise return a reference of ptr's target type pointing to the same location as ptr. The new reference is safe to use only as long as ptr is a valid pointer. Unlike ordinary references, unsafe references do not protect the referent from garbage collection.
Pointers have the following method:
ptr.toString() - Return a string of the form "(type) 0xhexdigits" where type is the name of ptr's target type and hexdigits consists of lowercase hexadecimal digits and is exactly 8 characters on 32-bit platforms and 16 characters on 64-bit platforms.
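The string format described for ptr.toString() can be sketched as follows. formatPointer is a hypothetical free function written for illustration; it takes the address as a BigInt so that 64-bit addresses stay exact, which is an assumption and not something the proposal specifies.

```javascript
// Hypothetical formatter for the "(type) 0xhexdigits" form described above.
function formatPointer(typeName, address, pointerBits) {
  const digits = pointerBits / 4; // 8 hex digits for 32-bit, 16 for 64-bit
  const hex = address.toString(16).padStart(digits, "0"); // lowercase digits
  return "(" + typeName + ") 0x" + hex;
}
```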
Minutiae: The [[Class]] of a pointer is "Pointer". ctypes.Pointer is a function that takes two arguments, a pointer type and a pointer or number, and returns a new pointer object. Its .prototype property is read-only. ctypes.Pointer.prototype is a pointer. Its value is NULL. ctypes.Pointer.prototype.constructor === ctypes.Pointer. The prototype of ctypes.Pointer.prototype is Object.prototype. The prototype of every other pointer is ctypes.Pointer.prototype.
Conversions
The implicit conversion rules are applied whenever a JavaScript value of any kind is passed to a parameter of a ctypes-declared function, passed to cdata.assign(val), or assigned to an array element or struct member, as in carray[i] = val or cstruct.member = val. These rules are intended to lose precision only when there is no reasonable alternative. They generally do not coerce values of one type to another type.
(TODO: precise rules. Some of the properties we're shooting for here are: if val is a CData object of the right type, return its C/C++ value; applying the rules to a JS number is exactly the same as applying them to the corresponding C/C++ double; applying the rules to a JS boolean is exactly the same as applying them to the corresponding C/C++ bool; plain old JS Objects can implicitly convert to C/C++ structs; plain old JS Arrays can implicitly convert to C/C++ arrays.)
The explicit conversion rules are applied when a JavaScript value is passed as a parameter when calling a type, as in t(val) or new t(val). These rules are a bit more aggressive.
(TODO: precise rules. Properties we're shooting for: if implicit conversion produces a result, explicit conversion produces the same result; in some but not all cases where a C++ typename(value) function-like cast expression would work, explicit conversion also works.)
ConvertToJS(x) - This function is used to convert a CData object or a C/C++ return value to a JavaScript value. The intent is to return a simple JavaScript value whenever possible, and a CData object otherwise. The precise rules are:
- If the value is of type void, return undefined.
- If the value is of type bool, return the corresponding JavaScript boolean.
- If the value is of a number type other than the pointer-sized types and the 64-bit types, return the corresponding JavaScript number.
- If the value is of a string type and null, return null.
- If the value is of a string type and non-null, return a JavaScript string.
- Otherwise the value is of an array, struct, or pointer type. If the argument x is already a CData object, return it. Otherwise allocate a new buffer of the appropriate size, populate it with the C/C++ value x, and return a ctypes reference to the complete object in the new buffer.
Note that we do not autoconvert null C/C++ pointers to the JavaScript null value.
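The ConvertToJS rules above can be read as a simple dispatch on the kind of type. The sketch below is one possible reading, not the specified algorithm: the CData class and the kind tags are hypothetical, and routing the 64-bit and pointer-sized integer types through the final branch is an assumption based on the statement that those types do not autoconvert.

```javascript
// Hypothetical stand-in for a CData object: a type descriptor plus a raw value.
class CData {
  constructor(type, value) {
    this.type = type;
    this.value = value;
  }
}

// Hypothetical type descriptors carry a `kind` tag: "void", "bool", "number",
// "string", or anything else (array, struct, pointer, and, by assumption,
// the 64-bit and pointer-sized integer types).
function convertToJS(type, x) {
  const value = x instanceof CData ? x.value : x;
  switch (type.kind) {
    case "void":
      return undefined;
    case "bool":
      return Boolean(value);
    case "number":
      return Number(value);
    case "string":
      return value === null ? null : String(value);
    default:
      // Never collapses to a primitive, so a null pointer stays a CData object.
      return x instanceof CData ? x : new CData(type, value);
  }
}
```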
Examples
Basic types:
let i = new ctypes.uint32_t(5); // allocate sizeof(uint32_t) bytes, initialize to 5, and return a ctypes reference
const setint = ctypes.declare("setint", ctypes.abi.default, ctypes.void_t, ctypes.PointerType(ctypes.uint32_t));
setint(i); // implicitly passes the address of allocated buffer
const getintp = ctypes.declare("getintp", ctypes.abi.default, ctypes.PointerType(ctypes.uint32_t));
let p = getintp(); // creates a ctypes pointer that holds the returned address
let q = ctypes.castPointer(ctypes.PointerType(ctypes.uint8_t), p); // cast to uint8_t*... why isn't this a method on Pointer?
let k = ctypes.pointerToUnsafeReference(q); // likewise?
Struct fields:
const u_t = new ctypes.StructType('u_t', [[ctypes.uint32_t, 'x'], [ctypes.uint32_t, 'y']]);
let u = new u_t(5, 10); // allocates 2 * sizeof(uint32_t) bytes and creates a ctypes reference
u.x = 7; // setter for u.x modifies field
let i = u.y; // getter for u.y returns ConvertToJS(reference to u.y) -> primitive value 10
i = 5; // doesn't touch u.y
const v_t = new ctypes.StructType('v_t', [[u_t, 'u'], [ctypes.uint32_t, 'z']]);
let v = new v_t(); // allocates a zero-filled v_t
let w = v.u; // ConvertToJS(reference to v.u) returns reference
w.x = 3; // invokes setter
setint(v.u.x); // TypeError - primitive is not a reference or pointer
let p = ctypes.addressOfField(v.u, 'x'); // pointer to v.u.x
setint(p); // ok - manually pass address
let q = v.u.addressOfField('x'); // abbreviated syntax?
64-bit integers: (check me!)
// want to represent 64-bit ints as references always, rather than
// autoconverting to an int/double primitive, to avoid loss of precision.
// use the same behavior for size_t and ptrdiff_t.
let i = new ctypes.int64_t(5);
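let j = i; // j and i refer to the same CData object
j.assign(6); // stores 6 in the buffer shared by i and j (a plain "j = 6" would only rebind j)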
const m_t = new ctypes.StructType('m_t', [[ctypes.int64_t, 'x'], [ctypes.int64_t, 'y']]);
let m = new m_t(7);
const setint64 = ctypes.declare("setint64", ctypes.abi.default, ctypes.void_t, ctypes.PointerType(ctypes.int64_t));
setint64(m.x); // ok - unlike int32_t case, ConvertToJS returns a reference to the field m.x
setint64(ctypes.addressOfField(m, 'x')); // also works, per int32_t case