Jsctypes/api: Difference between revisions
(→Int64: clarify that hex strings require 0x) |
Revision as of 19:23, 31 December 2009
js-ctypes is a library for calling C/C++ functions from JavaScript without having to write or generate any C/C++ "glue code".
js-ctypes is already in mozilla-central, but the API is subject to change. This page contains design proposals for the eventual js-ctypes API.
Libraries
ctypes.open(name) - Open a library. (TODO: all the details) This always returns a Library object or throws an exception.
Library objects have the following methods:
lib.declare(name, abi, rtype, [argtype1, ...]) - Declare a function. (TODO: all the details) This always returns a new function object or throws an exception.
- If rtype is an array type, this throws a TypeError.
- If any argtypeN is an array type, the result is the same as if it had been the corresponding pointer type, argtypeN.elementType.ptr. (Rationale: This is how C and C++ treat array types in function declarations.)
(TODO: Explain what happens when you call a declared function. In brief: It uses ImplicitConvert to convert the JavaScript arguments to C and ConvertToJS to convert the return value to JS.)
Types
A type maps JS values to C/C++ values and vice versa. They're used when declaring functions. They can also be used to create and populate C/C++ data structures entirely from JS.
(Types and their prototypes are extensible: scripts can add new properties to them. Rationale: This is how most JavaScript constructors behave.)
Built-in types
ctypes provides the following types:
ctypes.int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t, float32_t, float64_t - Primitive numeric types that behave the same way on all platforms (with the usual caveat that every platform has slightly different floating-point behavior in corner cases, and there's a limit to what we can realistically do about it).
- Since some 64-bit values are outside the range of the JavaScript number type, ctypes.int64_t and ctypes.uint64_t do not autoconvert to JavaScript numbers. Instead, they convert to objects of the wrapper types ctypes.Int64 and ctypes.UInt64 (which are JavaScript object types, not CTypes). See "64-bit integer objects" below.
ctypes.size_t, ssize_t, intptr_t, uintptr_t - Primitive types whose size depends on the platform. (These types do not autoconvert to JavaScript numbers. Instead they convert to wrapper objects, even on 32-bit platforms. See "64-bit integer objects" below. Rationale: On 64-bit platforms, there are values of these types that cannot be precisely represented as JS numbers. It will be easier to write code that works on multiple platforms if the builtin types autoconvert in the same way on all platforms.)
ctypes.bool, short, unsigned_short, int, unsigned, unsigned_int, long, unsigned_long, float, double - Types that behave like the corresponding C types. As in C, unsigned is always an alias for unsigned_int.
- (ctypes.long and ctypes.unsigned_long autoconvert to 64-bit integer objects on all platforms. The rest autoconvert to JavaScript numbers. Rationale: Some platforms have 64-bit long and some do not.)
ctypes.char, ctypes.signed_char, ctypes.unsigned_char - Character types that behave like the corresponding C types. (These are distinct from int8_t and uint8_t in details of conversion behavior. For example, js-ctypes autoconverts between C characters and one-character JavaScript strings.)
ctypes.jschar - A 16-bit unsigned character type. (This is distinct from uint16_t in details of conversion behavior. js-ctypes autoconverts C jschars to JavaScript strings of length 1.)
ctypes.void_t - The special C type void. This can be used as a return value type. (void is a keyword in JavaScript.)
ctypes.voidptr_t - The C type void *.
The wrapped integer types are the types int64_t, uint64_t, size_t, ssize_t, intptr_t, uintptr_t, long, and unsigned_long. These are the types that autoconvert to 64-bit integer objects rather than to primitive JavaScript numbers.
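The reason for wrapping can be seen in plain JavaScript, without js-ctypes. A JavaScript number is an IEEE-754 double, so integers above 2^53 are no longer all distinct. The following is a minimal sketch using BigInt as a stand-in for a 64-bit C value; the helper name roundTripsThroughNumber is hypothetical and not part of the proposed API.

```javascript
// Plain JavaScript model; no js-ctypes required.
// roundTripsThroughNumber is a hypothetical helper, not a ctypes function.
// It reports whether a 64-bit integer (given as a BigInt) survives
// conversion to a JavaScript number and back without change.
function roundTripsThroughNumber(n) {
  return BigInt(Number(n)) === n;
}

const results = {
  small: roundTripsThroughNumber(12345n),           // well inside the safe range
  limit: roundTripsThroughNumber(2n ** 53n),        // still exact
  beyond: roundTripsThroughNumber(2n ** 53n + 1n),  // rounds to 2^53
  int64Max: roundTripsThroughNumber(2n ** 63n - 1n) // rounds to 2^63
};
```

Values such as the maximum int64_t fail the round trip, which is why the wrapped integer types convert to Int64 and UInt64 objects instead.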
User-defined types
Starting from the builtin types above, these functions can be used to create additional types:
new ctypes.PointerType(t) - If t is a CType, return the type "pointer to t". The result is cached so that future requests for this pointer type produce the same CType object. If t is a string, instead return a new opaque pointer type named t. Otherwise throw a TypeError.
new ctypes.ArrayType(t) - Return an array type with unspecified length and element type t. If t is not a type or t.size is undefined, throw a TypeError.
new ctypes.ArrayType(t, n) - Return the array type t[n]. If t is not a type or t.size is undefined or n is not a size value (defined below), throw a TypeError. If the size of the resulting array type, in bytes, would not be exactly representable both as a size_t and as a JavaScript number, throw a RangeError.
- A size value is either a non-negative, integer-valued primitive number, an Int64 object with a non-negative value, or a UInt64 object.
- (Array types with 0 elements are allowed. Rationale: C/C++ allow them, and it is convenient to be able to pass an array to a foreign function, and have it autoconverted to a C array, without worrying about the special case where the array is empty.)
new ctypes.StructType(name, fields) - Create a new struct type with the given name and fields. fields is an array of field descriptors, of the format
[ { field1: type1 }, { field2: type2 }, ... ]
- where fieldn is a string denoting the name of the field, and typen is a ctypes type. js-ctypes calculates the offsets of the fields from its encyclopedic knowledge of the architecture's struct layout rules. If name is not a string, or any typen is such that typen.size is undefined, throw a TypeError. If the size of the struct, in bytes, would not be exactly representable both as a size_t and as a JavaScript number, throw a RangeError.
(Open issue: Specify a way to tell ctypes.StructType to use #pragma pack(n).)
These constructors behave exactly the same way when called without the new keyword.
Examples:
const DWORD = ctypes.uint32_t;
const HANDLE = new ctypes.PointerType("HANDLE");
const HANDLES = new ctypes.ArrayType(HANDLE);
const FILE = new ctypes.PointerType("FILE *");
const IOBuf = new ctypes.ArrayType(ctypes.uint8_t, 4096);
const struct_tm = new ctypes.StructType('tm', [{'tm_sec': ctypes.int}, ...]);
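How field offsets fall out of struct layout rules can be sketched in plain JavaScript. This sketch assumes the common "natural alignment" rule, in which each primitive field is aligned to its own size and the struct is padded to its largest alignment. Real ABIs differ in places, and both layoutStruct and the descriptor shape with explicit sizes are hypothetical, not the js-ctypes implementation.

```javascript
// Plain JavaScript model; layoutStruct and its input shape are hypothetical.
// Assumption: every field is a primitive whose alignment equals its size.
function layoutStruct(fields) {
  let offset = 0;
  let maxAlign = 1;
  const offsets = {};
  for (const { name, size } of fields) {
    const align = size;
    // Round the running offset up to the next multiple of the alignment.
    offset = Math.ceil(offset / align) * align;
    offsets[name] = offset;
    offset += size;
    maxAlign = Math.max(maxAlign, align);
  }
  // Pad the struct so that arrays of it keep every element aligned.
  const size = Math.ceil(offset / maxAlign) * maxAlign;
  return { offsets, size };
}

const layout = layoutStruct([
  { name: "flag", size: 1 },
  { name: "count", size: 4 },
  { name: "total", size: 8 },
  { name: "tag", size: 1 },
]);
// layout.offsets is { flag: 0, count: 4, total: 8, tag: 16 }; layout.size is 24.
```

A #pragma pack(n) option, mentioned above as an open issue, would amount to capping align at n in this model.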
Properties of types
All the fields described here are read-only.
All types have these properties and methods:
t.size - The C/C++ sizeof the type, in bytes. The result is a primitive number, not a UInt64 object.
- If t is an array type with unspecified length, t.size is undefined.
- ctypes.void_t.size is undefined.
t.name- A string, the type's name. It's intended that in ordinary use, this will be a C/C++ type expression, but it's not really meant to be machine-readable in all cases.
- For primitive types this is just the name of the corresponding C/C++ type.
- For struct types and opaque pointer types, this is simply the string that was passed to the constructor. For other pointer types and array types this should try to generate valid C/C++ type expressions, which isn't exactly trivial.
- (Open issue: This conflicts with the usual meaning of .name for functions, and types are callable like functions.)
ctypes.int32_t.name
===> "int32_t"
ctypes.void_t.name
===> "void"
ctypes.jschar.ptr.name
===> "jschar *"
const FILE = new ctypes.PointerType("FILE *");
FILE.name
===> "FILE *"
const struct_tm = new ctypes.StructType("tm", [{tm_sec: ctypes.int}, ...]);
struct_tm.name
===> "tm"
// Pointer-to-array types are not often used in C/C++.
// Such types have funny-looking names.
const ptrTo_ptrTo_arrayOf4_strings =
new ctypes.PointerType(
new ctypes.PointerType(
new ctypes.ArrayType(new ctypes.PointerType(ctypes.char), 4)));
ptrTo_ptrTo_arrayOf4_strings.name
===> "char *(**)[4]"
t.ptr - Return ctypes.PointerType(t).
t.array() - Return ctypes.ArrayType(t).
t.array(n) - Return ctypes.ArrayType(t, n).
- Thus a quicker (but still almost as confusing) way to write the type in the previous example would be:
const ptrTo_ptrTo_arrayOf4_strings = ctypes.char.ptr.array(4).ptr.ptr;
- (.array() requires parentheses but .ptr doesn't. Rationale: .array() has to be able to handle an optional parameter. Note that in C/C++, to write an array type requires brackets, optionally with a number in between: int [10] --> ctypes.int.array(10). Writing a pointer type does not require the brackets.)
t.toString() - Return "type " + t.name.
t.toSource() - Return a JavaScript expression that evaluates to a CType describing the same C/C++ type as t.
ctypes.uint32_t.toSource()
===> "ctypes.uint32_t"
ctypes.string.toSource()
===> "ctypes.string"
const charPtr = new ctypes.PointerType(ctypes.char);
charPtr.toSource()
===> "ctypes.char.ptr"
const Point = new ctypes.StructType(
"Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}]);
Point.toSource()
===> "ctypes.StructType("Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}])"
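Generating valid C/C++ type expressions for t.name, as described above, is not trivial because C declarators nest from the inside out. One possible approach is sketched below in plain JavaScript. The prim, ptr, arr, and typeName helpers are hypothetical models of types, not the real CType objects, and the algorithm is an illustration rather than the one js-ctypes uses.

```javascript
// Plain JavaScript model; these helpers are hypothetical, not ctypes APIs.
const prim = (name) => ({ kind: "prim", name });
const ptr = (target) => ({ kind: "pointer", target });
const arr = (element, length) => ({ kind: "array", element, length });

// Build the declarator from the outermost type inward. `inner` holds the
// part of the declarator produced by the enclosing types so far.
function typeName(type, inner = "") {
  switch (type.kind) {
    case "prim":
      return inner ? type.name + " " + inner : type.name;
    case "pointer": {
      let decl = "*" + inner;
      // A pointer to an array needs parentheses: (*)[4], not *[4].
      if (type.target.kind === "array") decl = "(" + decl + ")";
      return typeName(type.target, decl);
    }
    case "array": {
      const len = type.length === undefined ? "" : String(type.length);
      return typeName(type.element, inner + "[" + len + "]");
    }
    default:
      throw new TypeError("unknown kind of type");
  }
}

const charT = prim("char");
const simpleName = typeName(ptr(prim("jschar")));
const funnyName = typeName(ptr(ptr(arr(ptr(charT), 4))));
// simpleName is "jschar *" and funnyName is "char *(**)[4]".
```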
Pointer types also have:
t.targetType - Read-only. The pointed-to type, or null if t is an opaque pointer type.
Struct types also have:
t.fields - Read-only. A sealed array of field descriptors. (TODO: Details.)
Array types also have:
t.elementType - The type of the elements of an array of this type. E.g. IOBuf.elementType === ctypes.uint8_t.
t.length - The number of elements, a non-negative integer; or undefined if this is an array type with unspecified length. (The result, if not undefined, is a primitive number, not a UInt64 object. Rationale: Having .length produce anything other than a number is foreign to JS, and arrays of more than 2^53 elements are currently unheard-of.)
Minutiae:
- ctypes.CType is the abstract-base-class constructor of all js-ctypes types. If called, it throws a TypeError. (This is exposed in order to expose ctypes.CType.prototype.)
- The [[Class]] of a ctypes type is "CType".
- The [[Class]] of the type constructors ctypes.{C,Array,Struct,Pointer}Type is "Function".
- Every CType has a read-only, permanent .prototype property. The type-constructors ctypes.{C,Pointer,Struct,Array}Type each have a read-only, permanent .prototype property as well.
- Types have a hierarchy of prototype objects. The prototype of ctypes.CType.prototype is Function.prototype. The prototype of ctypes.{Array,Struct,Pointer}Type.prototype and of all the builtin types except ctypes.voidptr_t is ctypes.CType.prototype. The prototype of an array type is ctypes.ArrayType.prototype. The prototype of a struct type is ctypes.StructType.prototype. The prototype of a pointer type is ctypes.PointerType.prototype.
- Every CType t has t.prototype.constructor === t; that is, its .prototype has a read-only, permanent, own .constructor property that refers to the type. The same is true of the four type constructors ctypes.{C,Array,Struct,Pointer}Type.
Calling types
CTypes are JavaScript constructors. That is, they are functions, and they can be called to create new objects. (The objects they create are called CData objects, and they are described in the next section.)
new t or new t() or t() - Create a new CData object of type t.
- Without arguments, these allocate a new buffer of t.size bytes, populate it with zeroes, and return a new CData object referring to the complete object in that buffer.
- If t.size is undefined, this throws a TypeError.
new t(val) or t(val) - Create a new CData object as follows:
- If t.size is not undefined: Convert val to type t by calling ExplicitConvert(val, t), throwing a TypeError if the conversion is impossible. Allocate a new buffer of t.size bytes, populated with the converted value. Return a new CData object of type t referring to the complete object in that buffer. (When val is a CData object of type t, the behavior is like malloc followed by memcpy.)
- If t is an array type of unspecified length:
  - If val is a size value (defined above): Let u = ArrayType(t.elementType, val) and return new u.
  - If t.elementType is jschar and val is a string: Return a new CData object of type ArrayType(ctypes.jschar, val.length + 1) containing the contents of val followed by a null character.
  - If t.elementType is an 8-bit character type and val is a string: If val is not a well-formed UTF-16 string, throw a TypeError. Otherwise, let s = a sequence of bytes, the result of converting val from UTF-16 to UTF-8, and let n = the number of bytes in s. Return a new CData object of type ArrayType(t.elementType, n + 1) containing the bytes in s followed by a null character.
  - If val is a JavaScript array object and val.length is a nonnegative integer, let u = ArrayType(t.elementType, val.length) and return new u(val). (Array CData objects created in this way have cdata.constructor === u, not t. Rationale: For all CData objects, cdata.constructor.size gives the size in bytes, unless a struct field shadows cdata.constructor.)
  - Otherwise, throw a TypeError.
- Otherwise, t is void_t. Throw a TypeError.
let a_t = ctypes.ArrayType(ctypes.int32_t);
let a = new a_t(5);
a.length
===> 5
a.constructor.size
===> 20
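The rule for creating an 8-bit character array from a string (check for well-formed UTF-16, convert to UTF-8, append a null byte) can be modelled in plain JavaScript. This sketch uses the standard TextEncoder for the UTF-8 step; the function names are hypothetical, and a Uint8Array stands in for the resulting CData array.

```javascript
// Plain JavaScript model; function names are hypothetical, not ctypes APIs.
// Returns false if the string contains an unpaired surrogate code unit.
function isWellFormedUTF16(s) {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      // High surrogate: must be followed by a low surrogate.
      const d = s.charCodeAt(i + 1); // NaN past the end, so the test fails
      if (!(d >= 0xdc00 && d <= 0xdfff)) return false;
      i++;
    } else if (c >= 0xdc00 && c <= 0xdfff) {
      return false; // low surrogate with no preceding high surrogate
    }
  }
  return true;
}

// Model of calling an unspecified-length char array type with a string.
function charArrayFromString(val) {
  if (!isWellFormedUTF16(val)) {
    throw new TypeError("string is not well-formed UTF-16");
  }
  const s = new TextEncoder().encode(val); // UTF-8 bytes
  const out = new Uint8Array(s.length + 1); // n + 1 elements, zero-filled
  out.set(s);                               // last byte stays 0
  return out;
}

const bytes = charArrayFromString("h\u00e9"); // "hé": 1 + 2 bytes, plus the null
```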
CData objects
A CData object represents a C/C++ value located in memory. The address of the C/C++ value can be taken (using the .address() method), and it can be assigned to (using the .value property).
Every CData object has a type, the CType object that describes the type of the C/C++ value.
Minutiae:
- The [[Class]] of a CData object is "CData".
- The prototype of a CData object is the same as its type's .prototype property.
(Implementation notes: A CData object has a reserved slot that points to its type; a reserved slot that contains null if the object owns its own buffer, and otherwise points to the base CData object that owns the backing buffer where the data is stored; and a data pointer. The data pointer points to the actual location within the buffer of the C/C++ object to which the CData object refers. Since the data pointer might not be aligned to 2 bytes, PRIVATE_TO_JSVAL is insufficient; a custom JSClass.trace hook will be needed. If the object owns its own buffer, its finalizer frees it. Other CData objects that point into the buffer keep the base CData, and therefore the underlying buffer, alive.)
Properties and methods of CData objects
All CData objects have these methods and properties:
cdata.address() - Return a new CData object of the pointer type ctypes.PointerType(cdata.constructor) whose value points to the C/C++ object referred to by cdata.
- (Open issue: Does this pointer keep cdata alive? Currently not but we could easily change it. It is impossible to have all pointers keep their referents alive in a totally general way--consider pointers embedded in structs and arrays. But this special case would be pretty easy to hack: put a .contents property on the resulting pointer, referring back to cdata.)
cdata.constructor - Read-only. The type of cdata. (This is never void_t or an array type with unspecified length. Implementation note: The prototype of cdata is an object that has a read-only constructor property, as detailed under "minutiae".)
cdata.toSource() - Return the string "t(arg)" where t and arg are implementation-defined JavaScript expressions (intended to represent the type of cdata and its value, respectively). The intent is that eval(cdata.toSource()) should ideally produce a new CData object containing a copy of cdata, but this can only work if the type of cdata happens to be bound to an appropriate name in scope.
cdata.toString() - Return the same string as cdata.toSource().
The .value property has a getter and a setter:
cdata.value - Let x = ConvertToJS(cdata). If x === cdata, throw a TypeError. Otherwise return x.
cdata.value = val - Let cval = ImplicitConvert(val, cdata.constructor). If conversion fails, throw a TypeError. Otherwise assign the value cval to the C/C++ object referred to by cdata.
Structs
CData objects of struct types also have this method:
cstruct.addressOfField(name) - Return a new CData object of the appropriate pointer type, whose value points to the field of cstruct with the name name. If name is not a JavaScript string or does not name a member of cstruct, throw a TypeError.
They also have getters and setters for each struct member:
cstruct.member - Let F be a CData object referring to the struct member. Return ConvertToJS(F).
cstruct.member = val - Let cval = ImplicitConvert(val, the type of the member). If conversion fails, throw a TypeError. Otherwise store cval in the appropriate member of the struct.
These getters and setters can shadow the properties and methods described above.
Pointers
CData objects of pointer types also have this property:
cptr.contents - Let C be a CData object referring to the pointed-to contents of cptr. Return ConvertToJS(C).
cptr.contents = val - Let cval = ImplicitConvert(val, the base type of the pointer). If conversion fails, throw a TypeError. Otherwise store cval in the pointed-to contents of cptr.
Arrays
Likewise, CData objects of array types have getters and setters for each element. Arrays additionally have a length property.
Note that these getters and setters are only present for integers i in the range 0 ≤ i < carray.length. (Open issue: can we arrange to throw an exception if i is out of range?)
carray[i] - Let E be a CData object referring to the element at index i. Return ConvertToJS(E).
carray[i] = val - Let cval = ImplicitConvert(val, carray.elementType). If conversion fails, throw a TypeError. Otherwise store cval in element i of the array.
carray.length - Read-only. The length of the array as a JavaScript number. (The same as carray.constructor.length. This is not a UInt64 object. Rationale: Array CData objects should behave like other array-like objects for easy duck typing.)
carray.addressOfElement(i) - Return a new CData object of the appropriate pointer type (ctypes.PointerType(carray.constructor.elementType)) whose value points to element i of carray. If i is not a JavaScript number that is a valid index of carray, throw a TypeError.
(TODO: specify a way to read a C/C++ string and transcode it into a JS string.)
Aliasing
Note that it is possible for several CData objects to refer to the same or overlapping memory. (In this way CData objects are like C++ references.) For example:
const Point = new ctypes.StructType(
"Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}]);
const Rect = new ctypes.StructType(
"Rect", [{topLeft: Point}, {bottomRight: Point}]);
var r = Rect(); // a new CData object of type Rect
var p = r.topLeft; // refers to the topLeft member of r, not a copy
r.topLeft.x = 100; // This would not work if `r.topLeft` was a copy!
r.topLeft.x
===> 100 // It works...
p.x // and p refers to the same C/C++ object...
===> 100 // so it sees the change as well.
r.toSource()
===> "Rect({topLeft: {x: 100, y: 0}, bottomRight: {x: 0, y: 0}})"
p.x = 1.0e90; // Assigning a value out of range is an error.
**** TypeError
// The range checking is great, but it can have surprising
// consequences sometimes:
p.x = 0x7fffffff; // (the maximum int32_t value)
p.x++; // p.x = 0x7fffffff + 1, which is out of range...
**** TypeError // ...so this fails, leaving p.x unchanged.
// But JS code doesn't need to do that very often.
// To make this roll around to -0x80000000, you could write:
p.x = (p.x + 1) | 0; // In JS, `x|0` truncates a number to int32.
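The aliasing and range-checking behavior above has a rough analogue in standard JavaScript typed arrays, which can be run outside js-ctypes. In this sketch two Int32Array views over one ArrayBuffer play the roles of r and p, and the hypothetical helper checkedInt32 models the range check. Typed arrays themselves wrap silently rather than throwing, so the check is written out by hand.

```javascript
// Plain JavaScript model; typed arrays stand in for CData objects.
const buffer = new ArrayBuffer(16);           // room for four int32 values
const rect = new Int32Array(buffer);          // [topLeft.x, topLeft.y, bottomRight.x, bottomRight.y]
const topLeft = new Int32Array(buffer, 0, 2); // a second view of the first two values

rect[0] = 100;                       // write through one view...
const seenThroughAlias = topLeft[0]; // ...and read it through the other

// checkedInt32 is a hypothetical helper modelling the range check.
function checkedInt32(val) {
  if (!Number.isInteger(val) || val < -0x80000000 || val > 0x7fffffff) {
    throw new TypeError("value out of range for int32_t");
  }
  return val;
}

let x = checkedInt32(0x7fffffff);
let incrementFailed = false;
try {
  x = checkedInt32(x + 1); // out of range, so x is left unchanged
} catch (e) {
  incrementFailed = e instanceof TypeError;
}
const wrapped = checkedInt32((x + 1) | 0); // `|0` truncates to int32 first
```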
Casting
ctypes.cast(cdata, t) - Return a new CData object which points to the same memory block as cdata, but with type t. If t.size is undefined or larger than cdata.constructor.size, throw a TypeError. This is like a C cast or a C++ reinterpret_cast.
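The effect of such a cast can be approximated with typed arrays, which also allow one block of memory to be viewed under two types. The castView helper below is hypothetical and only models the size check and the shared storage; it is not how ctypes.cast is implemented.

```javascript
// Plain JavaScript model; castView is a hypothetical helper.
// Reinterpret the memory behind `view` as one element of TargetArray.
function castView(view, TargetArray) {
  if (TargetArray.BYTES_PER_ELEMENT > view.byteLength) {
    throw new TypeError("target type is larger than the source object");
  }
  return new TargetArray(view.buffer, view.byteOffset, 1);
}

const f = new Float32Array([1.0]);
const bits = castView(f, Uint32Array);
const bitsOfOne = bits[0]; // IEEE-754 single-precision encoding of 1.0

bits[0] = 0x40000000;      // the encoding of 2.0
const updatedFloat = f[0]; // the float view sees the change
```

As with reinterpret_cast, both views refer to the same bytes, so a write through either is visible through the other.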
Equality
According to the ECMAScript standard, if x and y are two different objects, then x === y and x == y are both false. This has consequences for code that uses js-ctypes pointers, pointer-sized integers, or 64-bit integers, because all these values are represented as JavaScript objects. In C/C++, the == operator would compare values of these types for equality. Not so in js-ctypes:
const HANDLE = new ctypes.PointerType("HANDLE");
const INVALID_HANDLE_VALUE = HANDLE(-1);
const kernel32 = ctypes.open("kernel32");
const CreateMutex = kernel32.declare("CreateMutex", ...);
var h = CreateMutex(null, false, null);
if (h == INVALID_HANDLE_VALUE) // BAD - always false
...
This comparison is always false because CreateMutex returns a new CData object, which of course will be a different object from the existing value of INVALID_HANDLE_VALUE.
(Python ctypes has the same issue. It isn't mentioned in the docs, but:
>>> from ctypes import *
>>> c_void_p(0) == c_void_p(0)
False
>>> c_int(33) == c_int(33)
False
We could overload operator== using the nonstandard hook JSExtendedClass.equality but it might not be worth it.)
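The pitfall is easy to reproduce in plain JavaScript with any pair of distinct objects. In the sketch below, FakePointer and samePointer are hypothetical stand-ins. This page does not define a pointer comparison function, so the value comparison only illustrates what such a helper would have to do.

```javascript
// Plain JavaScript model; FakePointer and samePointer are hypothetical.
class FakePointer {
  constructor(address) {
    this.address = BigInt(address); // BigInt models a pointer-sized integer
    Object.freeze(this);
  }
}

// Compare the wrapped values instead of the object identities.
function samePointer(a, b) {
  return a.address === b.address;
}

const INVALID = new FakePointer(-1);
const h = new FakePointer(-1);   // e.g. a fresh object returned from a call

const byIdentity = h == INVALID;           // false: two different objects
const byValue = samePointer(h, INVALID);   // true: same underlying value
```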
64-bit integer objects
Since JavaScript numbers are floating-point values, they cannot precisely represent all 64-bit integer values. Therefore 64-bit and pointer-sized C/C++ values of numeric types do not autoconvert to JavaScript numbers. Instead they autoconvert to JavaScript objects of type ctypes.Int64 and ctypes.UInt64.
Int64 and UInt64 objects are immutable.
It's not possible to do arithmetic on Int64 objects using the standard arithmetic operators, because JavaScript does not have operator overloading (yet). A few convenience functions are provided. (These types are intentionally feature-sparse so that they can be drop-in-replaced with a full-featured bignum type when JavaScript gets one.)
Int64
ctypes.Int64(n) or new ctypes.Int64(n) - If n is an integer-valued number such that -2^63 ≤ n < 2^63, return a sealed Int64 object with that value. Otherwise if n is a string consisting of an optional minus sign followed by either decimal digits or "0x" or "0X" and hexadecimal digits, and the string represents a number within range, convert the string to an integer and construct an Int64 object as above. Otherwise if n is an Int64 or UInt64 object, and represents a number within range, use the value to construct an Int64 object as above. Otherwise throw a TypeError.
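The string form accepted by the constructor can be modelled in plain JavaScript with a regular expression and BigInt. The function parseInt64String is hypothetical and returns a BigInt rather than an Int64 object; it is a sketch of the rule as stated above, not the actual parser.

```javascript
// Plain JavaScript model; parseInt64String is a hypothetical helper.
const INT64_MIN = -(2n ** 63n);
const INT64_MAX = 2n ** 63n - 1n;

function parseInt64String(s) {
  // Optional minus sign, then either 0x/0X plus hex digits, or decimal digits.
  const m = /^(-?)(0[xX][0-9a-fA-F]+|[0-9]+)$/.exec(s);
  if (!m) {
    throw new TypeError("not a decimal or 0x-prefixed hexadecimal string: " + s);
  }
  // BigInt() parses "123" and "0x7b" but rejects a signed hex string,
  // so the sign is applied separately.
  const magnitude = BigInt(m[2]);
  const value = m[1] ? -magnitude : magnitude;
  if (value < INT64_MIN || value > INT64_MAX) {
    throw new TypeError("value out of range for Int64: " + s);
  }
  return value;
}

const fromDecimal = parseInt64String("123");
const fromHex = parseInt64String("-0x10");
```

Note that a bare hexadecimal string such as "ff" is rejected, which is the point of the revision summarized at the top of this page.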
Int64 objects have the following methods:
i64.toString([radix]) - If radix is omitted, assume 10. Return a string representation of the value of i64 in base radix, consisting of a leading minus sign, if the value is negative, followed by one or more lowercase digits in base radix.
i64.toSource() - Return a string. (This is provided for debugging purposes, and programs should not rely on details of the resulting string, which may change in the future.)
The following functions are also provided:
ctypes.Int64.compare(a, b) - If a and b are both Int64 objects, return -1 if a < b, 0 if a = b, and 1 if a > b. Otherwise throw a TypeError.
ctypes.Int64.lo(a) - If a is an Int64 object, return the low 32 bits of its value. (The result is an integer in the range 0 ≤ result < 2^32.) Otherwise throw a TypeError.
ctypes.Int64.hi(a) - If a is an Int64 object, return the high 32 bits of its value (like a >> 32). Otherwise throw a TypeError.
ctypes.Int64.join(hi, lo) - If hi is an integer-valued number in the range -2^31 ≤ hi < 2^31 and lo is an integer-valued number in the range 0 ≤ lo < 2^32, return a sealed Int64 object whose value is hi × 2^32 + lo. Otherwise throw a TypeError.
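The relationship between these four functions can be checked with a BigInt model in plain JavaScript. The names lo, hi, join, and compare below are free-standing hypothetical functions operating on BigInt values assumed to be within the Int64 range. They mirror the definitions above but are not the ctypes.Int64 functions themselves.

```javascript
// Plain JavaScript model on BigInt values; not the ctypes.Int64 functions.
// Low 32 bits as an unsigned number, 0 <= result < 2^32.
const lo = (v) => Number(BigInt.asUintN(32, v));

// High 32 bits as a signed number; BigInt >> is an arithmetic shift.
const hi = (v) => Number(v >> 32n);

function join(hiPart, loPart) {
  const hiOk = Number.isInteger(hiPart) && hiPart >= -(2 ** 31) && hiPart < 2 ** 31;
  const loOk = Number.isInteger(loPart) && loPart >= 0 && loPart < 2 ** 32;
  if (!hiOk || !loOk) throw new TypeError("hi or lo out of range");
  return BigInt(hiPart) * 2n ** 32n + BigInt(loPart);
}

const compare = (a, b) => (a < b ? -1 : a > b ? 1 : 0);

const v = -5000000000n;
const rebuilt = join(hi(v), lo(v)); // splitting and rejoining gives v back
```

The asymmetry between the signed hi part and the unsigned lo part is what makes join(hi(v), lo(v)) reproduce v for negative values as well.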
UInt64
UInt64 objects are the same except that the hi values are in the range 0 ≤ hi < 2^32 and the .toString() method never produces a minus sign.
Conversions
These functions are not exactly JS functions or C/C++ functions. They're algorithms used elsewhere in the spec.
ConvertToJS(x) - This function is used to convert a CData object or a C/C++ return value to a JavaScript value. The intent is to return a simple JavaScript value whenever possible without loss of data or different behavior on different platforms, and a CData object otherwise. The precise rules are:
- If the type of x is void, return undefined.
- If the type of x is bool, return the corresponding JavaScript boolean.
- If x is of a number type but not a wrapped integer type, return the corresponding JavaScript number.
- If x is a signed wrapped integer type (long, int64_t, ssize_t, or intptr_t), return a ctypes.Int64 object with value x.
- If x is an unsigned wrapped integer type (unsigned long, uint64_t, size_t, or uintptr_t), return a ctypes.UInt64 object with value x.
- If x is of type jschar, return a JavaScript string of length 1 containing the value of x (like String.fromCharCode(x)).
- If x is of any other character type, select the corresponding Unicode character. (Open issue: Unicode conversions.) Convert the character to UTF-16. Return a JavaScript string containing the UTF-16 code units. (If the character type is 1 byte with each value mapping to a Unicode BMP character, the result is a one-character JavaScript string.) (Note: If we ever support wchar_t, it might be best to autoconvert it to a number. On platforms where wchar_t is 32 bits, values over 0x10ffff are not Unicode characters.)
- Otherwise x is of an array, struct, or pointer type. If the argument x is already a CData object, return it. Otherwise allocate a buffer containing a copy of the C/C++ value x, and return a CData object of the appropriate type referring to the object in the new buffer.
Note that null C/C++ pointers do not convert to the JavaScript null value. (Open issue: Should we? Is there any value in retaining the type of a particular null pointer?)
(Arrays of characters do not convert to JavaScript strings. Rationale: Suppose x is a CData object of a struct type with a member a of type char[10]. Then x.a[1] should return the character in element 1 of the array, even if x.a[0] is a null character. Likewise, x.a[0] = '\0'; should modify the contents of the array. Both are possible only if x.a is a CData object of array type, not a JavaScript string.)
ImplicitConvert(val, t) - Convert the JavaScript value val to a C/C++ value of type t. This is called whenever a JavaScript value of any kind is passed to a parameter of a ctypes-declared function, passed to cdata.value = val, or assigned to an array element or struct member, as in carray[i] = val or cstruct.member = val.
This function is intended to lose precision only when there is no reasonable alternative. It generally does not coerce values of one type to another type.
C/C++ values of all supported types round trip through ConvertToJS and ImplicitConvert without any loss of data. That is, for any C/C++ value v of type t, ImplicitConvert(ConvertToJS(v), t) produces a copy of v. (Note that not all JavaScript values can round-trip to C/C++ and back in an analogous way. JavaScript primitive numbers can round-trip to double on all current platforms, Int64 objects to int64_t, JavaScript booleans to bool, and so on. But some JavaScript values, such as functions, cannot be ImplicitConverted to any C/C++ type without loss of data.)
t must not be void or an array type with unspecified length. (Rationale: C/C++ variables and parameters cannot have such types. The parameter of a function declared int f(int x[]) is int *, not int[].)
- First, if val is a
CDataobject of type u andSameType(t, u), return the current value of the C/C++ object referred to by val. Otherwise the behavior depends on the target type t.
- If t is
ctypes.bool:
- If val is a boolean, return the corresponding C/C++ boolean value.
- If val is the number +0 or -0, return
false. - If val is the number 1, return
true. - Otherwise fail.
- If t is a numeric type:
- If val is a boolean, the result is a 0 or 1 of type t.
- If val is a
CDataobject of a numeric type, and every value of that type is precisely representable in type t, the result is a precise representation of the value of val in type t. (This is more conservative than the implicit integer conversions in C/C++ and more conservative than what we do if val is a JavaScript number. This is sensitive to the signedness of the two types.) - If val is a number that can be exactly represented as a value of type t, the result is that value.
- If val is an
Int64orUInt64object whose value can be exactly represented as a value of type t, the result is that value. - If val is a number and t is a floating-point type, the result is the
jsdoublerepresented by val, cast to type t. (This can implicitly lose bits of precision. The rationale is to allow the user to pass values like 1/3 tofloatparameters.) - Otherwise fail.
- If t is
ctypes.jschar:
- If val is a string of length 1, the result is the 16-bit unsigned value of the code unit in the string.
val.charCodeAt(0). - If val is a number that can be exactly represented as a value of type
jschar(that is, an integer in the range 0 ≤ val < 216), the result is that value. - Otherwise fail.
- If val is a string of length 1, the result is the 16-bit unsigned value of the code unit in the string.
- If t is any other character type:
- If val is a string:
- If the 16-bit elements of val are not the UTF-16 encoding of a single Unicode character, fail. (Open issue: If we support
wchar_twe may want to allow unpaired surrogate code points to pass through without error.) - If that Unicode character can be represented by a single character of type t, the result is that character. (Open issue: Unicode conversions.)
- Otherwise fail.
- If the 16-bit elements of val are not the UTF-16 encoding of a single Unicode character, fail. (Open issue: If we support
- If val is a number that can be exactly represented as a value of type t, the result is that value. (This is sensitive to the signedness of t.)
- Otherwise fail.
- If t is a pointer type:
- If val is
null, the result is a C/C++NULLpointer of type t. - If val is a
CDataobject of array type u and either t isctypes.voidptr_torSameType(t.targetType, u.elementType), return a pointer to the first element of the array. - If t is
ctypes.voidptr_tand val is aCDataobject of pointer type, return the value of the C/C++ pointer in val, cast tovoid *. - Otherwise fail. (Rationale: We don't convert strings to pointers yet; see the "Auto-converting strings" section below. We don't convert JavaScript arrays to pointers because this would have to allocate a C array implicitly, raising issues about who should deallocate it, and when, and how they know it's their responsibility.)
- If val is
- If t is an array type:
  - If val is a JavaScript string:
    - If t.elementType is jschar and t.length >= val.length, the result is an array of type t whose first val.length elements are the 16-bit elements of val. If t.length > val.length, then element val.length of the result is a null character. The values of the rest of the array elements are unspecified.
    - If t.elementType is an 8-bit character type:
      - If val is not well-formed UTF-16, fail.
      - Let s = a sequence of bytes, the result of converting val from UTF-16 to UTF-8.
      - Let n = the number of bytes in s.
      - If t.length < n, fail.
      - The result is an array of type t whose first n elements are the 8-bit values in s. If t.length > n, then element n of the result is 0. The values of the rest of the array elements are unspecified.
    - Otherwise fail.
  - If val is a JavaScript array object:
    - If val.length is not a nonnegative integer, fail.
    - If val.length !== t.length, fail.
    - Otherwise, the result is a C/C++ array of val.length elements of type t.elementType. Element i of the result is ImplicitConvert(val[i], t.elementType).
  - Otherwise fail.
- Otherwise t is a struct type.
  - If val is a JavaScript object that is not a CData object:
    - If the enumerable own properties of val are exactly the names of the members of the struct t, the result is a C/C++ struct of type t, each of whose members is ImplicitConvert(val[the member name], the type of the member).
    - Otherwise fail.
  - Otherwise fail.
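The string-to-array branch of ImplicitConvert for 8-bit character types can be modelled in plain JavaScript. This is only a sketch of the steps listed above, not js-ctypes code: the names isWellFormedUTF16 and stringToCharArray are hypothetical, a Uint8Array stands in for the C array, and TextEncoder is assumed to be available (as it is in current browsers and Node.js).

```javascript
// Hypothetical model of ImplicitConvert(val, t) where t is an array of an
// 8-bit character type. A Uint8Array stands in for the C array.

// True if every surrogate code unit in s belongs to a high/low pair.
function isWellFormedUTF16(s) {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xD800 && c <= 0xDBFF) {
      // High surrogate: must be followed by a low surrogate.
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next < 0xDC00 || next > 0xDFFF) return false;
      i++; // skip the low surrogate that was just validated
    } else if (c >= 0xDC00 && c <= 0xDFFF) {
      return false; // low surrogate with no preceding high surrogate
    }
  }
  return true;
}

// `length` plays the role of t.length.
function stringToCharArray(val, length) {
  if (!isWellFormedUTF16(val))
    throw new TypeError("string is not well-formed UTF-16");
  const s = new TextEncoder().encode(val); // UTF-16 -> UTF-8 bytes
  const n = s.length;
  if (length < n)
    throw new TypeError("array too short for converted string");
  // Typed arrays start zero-filled, so element n is 0 whenever length > n.
  // The text leaves the remaining elements unspecified; this model zeroes them.
  const result = new Uint8Array(length);
  result.set(s, 0);
  return result;
}
```

Note that when length equals n exactly, the result carries no terminating null, which matches the rule as written.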
ExplicitConvert(val, t) - Convert the JavaScript value val to a C/C++ value of type t, a little more forcefully than ImplicitConvert.
This is called when a JavaScript value is passed as a parameter when calling a type, as in t(val) or new t(val).
- If ImplicitConvert(val, t) succeeds, use that result. Otherwise:
- If t is ctypes.bool, the result is the C/C++ boolean value corresponding to ToBoolean(val), where the operator ToBoolean is as defined in the ECMAScript standard. (This is a bit less strict than the conversion behavior specified for numeric types below. This is just for convenience: the operators && and ||, which produce a boolean value in C/C++, do not always do so in JavaScript.)
- If t is an integer or character type and val is an infinity or NaN, the result is a 0 of type t.
- If t is an integer or character type and val is a finite number, the result is the same as casting the jsdouble value of val to type t with a C-style cast. (I think this basically means, start with val, discard the fractional part, convert the integer part to a bit-pattern, and mask off whatever doesn't fit in type t. But whatever C does is good enough for me. --jorendorff)
- If t is an integer or character type and val is an Int64 or UInt64 object, the result is the same as casting the int64_t or uint64_t value of val to type t with a C-style cast.
- If t is a pointer type and val is a number, Int64 object, or UInt64 object that can be exactly represented as an intptr_t or uintptr_t, the result is the same as casting that intptr_t or uintptr_t value to type t with a C-style cast.
- If t is a wrapped integer type, and val is a string consisting entirely of an optional minus sign, followed by the characters "0x" or "0X", followed by one or more hexadecimal digits, then the result is the same as casting the number named by val to type t with a C-style cast.
- Otherwise fail.
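The integer rules above can be illustrated with a small model. This is a sketch only: explicitConvertToInt is a hypothetical name, the parameters bits and signed stand in for the type t, and a BigInt stands in for the C integer. It skips the initial ImplicitConvert step and follows the "discard the fractional part, then mask" reading given in the parenthetical note, which is an assumption about what a C-style cast does rather than something the C standard guarantees for out-of-range values. The hex-string branch is shown for any width here, although the text applies it to wrapped integer types only.

```javascript
// Hypothetical model of ExplicitConvert(val, t) for integer types.
// `bits` and `signed` describe t; the result is a BigInt.
function explicitConvertToInt(val, bits, signed) {
  let n;
  if (typeof val === "number") {
    if (!Number.isFinite(val)) return 0n;   // infinity or NaN -> 0
    n = BigInt(Math.trunc(val));            // discard the fractional part
  } else if (typeof val === "string") {
    // Optional minus sign, then "0x" or "0X", then hexadecimal digits.
    const m = /^(-?)0[xX]([0-9a-fA-F]+)$/.exec(val);
    if (!m) throw new TypeError("cannot convert string");
    n = BigInt("0x" + m[2]);
    if (m[1]) n = -n;
  } else {
    throw new TypeError("cannot convert value");
  }
  // Keep only the low `bits` bits (two's complement for signed types).
  return signed ? BigInt.asIntN(bits, n) : BigInt.asUintN(bits, n);
}
```

For example, under this model 300.7 converted to an unsigned 8-bit type keeps only the low eight bits of 300.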
SameType(t, u) - True if t and u represent the same C/C++ type.
- If t and u represent the same built-in type, even void, return true.
- If they are both pointer types, return SameType(t.targetType, u.targetType).
- If they are both array types, return SameType(t.elementType, u.elementType) && t.length === u.length.
- If they are both struct types, return t === u.
- Otherwise return false.
(SameType(int, int32_t) is false. Rationale: As it stands, SameType behaves the same on all platforms. By making types match if they are typedef'd on the current platform, we could make e.g. ctypes.int.ptr and ctypes.int32_t.ptr compatible on platforms where we just have typedef int int32_t. But it was unclear how much that would matter in practice, balanced against cross-platform consistency. We might reverse this decision.)
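The SameType rules translate almost directly into a recursive function. The sketch below uses plain objects as hypothetical stand-ins for CType objects (the kind and name fields and the pointerTo and arrayOf helpers are assumptions made for illustration, not part of the proposed API).

```javascript
// Hypothetical type descriptors standing in for CType objects.
const int32 = { kind: "builtin", name: "int32_t" };
const int = { kind: "builtin", name: "int" };
const pointerTo = (targetType) => ({ kind: "pointer", targetType });
const arrayOf = (elementType, length) => ({ kind: "array", elementType, length });

function sameType(t, u) {
  if (t.kind !== u.kind) return false;
  switch (t.kind) {
    case "builtin":
      return t.name === u.name;          // int and int32_t stay distinct
    case "pointer":
      return sameType(t.targetType, u.targetType);
    case "array":
      return sameType(t.elementType, u.elementType) && t.length === u.length;
    case "struct":
      return t === u;                    // struct types match by identity only
    default:
      return false;
  }
}
```

Because built-in types are compared by name, sameType(pointerTo(int), pointerTo(int32)) is false on every platform, which is the cross-platform consistency the rationale above describes.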
Examples
Cu.import("ctypes"); // imports the global ctypes object
// searches the path and opens "libmylib.so" on linux,
// "libmylib.dylib" on mac, and "mylib.dll" on windows
let mylib = ctypes.open("mylib", ctypes.SEARCH);
// declares the C function:
// int32_t myfunc(int32_t);
let myfunc = mylib.declare("myfunc", ctypes.default_abi,
ctypes.int32_t, ctypes.int32_t);
let ret = myfunc(2); // calls myfunc
Note that for simple types (integers and characters), we will autoconvert the argument at call time - there's no need to pass in a ctypes.int32_t object. The consumer should never need to instantiate such an object explicitly, unless they're using it to back a pointer - in which case we require explicit, strong typing. See later for examples.
Here is how to create an object of type int32_t:
let i = new ctypes.int32_t; // new int32_t object with default value 0
This allocates a new C++ object of type int32_t (4 bytes of memory), zeroes it out, and returns a JS object that manages the allocated memory. Whenever the JS object is garbage-collected, the allocated memory will be automatically freed.
(Of course you don't normally need to do this, as js-ctypes will autoconvert JS numbers to various C/C++ types for you.)
let myfunc = mylib.declare("myfunc", ctypes.default_abi,
ctypes.int32_t, ctypes.int32_t);
let ret = myfunc(i);
print(typeof ret); // The result is a JavaScript number.
number
ctypes.int32_t is a CType. Like all other CTypes, it can be used for type specification when passed as an object, as above. (This will work for user-defined CTypes such as structs and pointers also - see later.)
The object i created above is a CData object; CData objects are described in detail in the "CData objects" section above.
Opaque pointers:
// A new opaque pointer type.
const FILE_ptr = new ctypes.PointerType("FILE *");
let fopen = mylib.declare("fopen", ctypes.default_abi,
FILE_ptr, ctypes.char.ptr, ctypes.char.ptr);
let file = fopen("foo", "r");
if (file.isNull())
throw "fopen failed";
file.contents(); // TypeError: type is unknown
(Open issue: fopen("foo", "r") does not work under js-ctypes as currently specified.)
Declaring a struct:
// C prototype: struct s_t { int32_t a; int64_t b; };
const s_t = new ctypes.StructType("s_t", [{ a: ctypes.int32_t }, { b: ctypes.int64_t }]);
let myfunc = mylib.declare("myfunc", ctypes.default_abi, ctypes.int32_t, s_t);
let s = new s_t(10, 20);
This creates an s_t object which allocates enough memory for the whole struct, creates getters and setters to access the binary fields via their offset, and assigns the values 10 and 20 to the fields. The new object's prototype is s_t.prototype.
let i = myfunc(s); // checks the type of s
Nested structs:
const u_t = new ctypes.StructType("u_t", [{ x: ctypes.int64_t }, { y: s_t }]);
let u = new u_t(5e4, s); // copies data from s into u.y - no references
let u_field = u.y; // creates an s_t object that points directly to
// the offset of u.y within u.
An out parameter:
// allocate sizeof(uint32_t)==4 bytes,
// initialize to 5, and return a new CData object
let i = new ctypes.uint32_t(5);
// Declare a C function with an out parameter.
const getint = mylib.declare("getint", ctypes.default_abi,
ctypes.void_t, ctypes.uint32_t.ptr);
getint(i.address()); // explicitly take the address of allocated buffer
(Python ctypes has byref(i) as an alternative to i.address(), but we do not expect users to do the equivalent of from ctypes import *, and getint(ctypes.byref(i)) is a bit much.)
Pointers:
// Declare a C function that returns a pointer.
const getintp = mylib.declare("getintp", ctypes.default_abi,
ctypes.uint32_t.ptr);
let p = getintp(); // A CData object that holds the returned uint32_t *
// cast from (uint32_t *) to (uint8_t *)
let q = ctypes.cast(ctypes.uint8_t.ptr, p);
// first byte of buffer
let b0 = q.contents(); // an integer, 0 <= b0 < 256
Struct fields:
const u_t = new ctypes.StructType('u_t',
[[ctypes.uint32_t, 'x'], [ctypes.uint32_t, 'y']]);
// allocates 2 * sizeof(uint32_t) == 8 bytes and creates a CData object
let u = new u_t(5, 10);
u.x = 7; // setter for u.x modifies field
let i = u.y; // getter for u.y returns ConvertToJS(reference to u.y)
print(i); // ...which is the primitive number 10
10
i = 5; // doesn't touch u.y
print(u.y);
10
const v_t = new ctypes.StructType('v_t',
[[u_t, 'u'], [ctypes.uint32_t, 'z']]);
// allocates 12 bytes, zeroes them out, and creates a CData object
let v = new v_t;
let w = v.u; // ConvertToJS(reference to v.u) returns CData object
w.x = 3; // invokes setter
getint(v.u.x); // TypeError: getint argument 1 expects type uint32_t *, got int
let p = v.u.addressOfField('x'); // pointer to v.u.x
getint(p); // ok - manually pass address
64-bit integers:
// Declare a function that returns a 64-bit unsigned int.
const getfilesize = mylib.declare("getfilesize", ctypes.default_abi,
ctypes.uint64_t, ctypes.char.ptr);
// This autoconverts to a UInt64 object, not a JS number, even though the
// file is presumably much smaller than 4GiB. Converting to a different type
// each time you call the function, depending on the result value, would be
// worse.
let s = getfilesize("/usr/share/dict/words");
print(s instanceof ctypes.UInt64);
true
print(s < 1000000); // Because s is an object, not a number,
false // JS lies to you.
print(s >= 1000000); // Neither of these is doing what you want,
false // as evidenced by the bizarre answers.
print(s); // It has a nice .toString() method at least!
931467
// There is no shortcut. To get an actual JS number out of a
// 64-bit integer, you have to use the ctypes.{Int64,UInt64}.{hi,lo}
// functions.
print(ctypes.UInt64.lo(s))
931467
// (OK, I lied. There is a shortcut. You can abuse the .toString() method.
// WARNING: This can lose precision!)
print(Number(s.toString()))
931467
let i = new ctypes.int64_t(5); // a new 8-byte buffer
let j = i; // another variable referring to the same CData object
j.value = 6; // invokes setter on i, auto-promotes 6 to Int64
print(typeof j.value) // but j.value is still an Int64 object
object
print(j.value instanceof ctypes.Int64)
true
print(j.value);
6
const m_t = new ctypes.StructType(
'm_t', [[ctypes.int64_t, 'x'], [ctypes.int64_t, 'y']]);
let m = new m_t;
const getint64 = mylib.declare("getint64", ctypes.default_abi,
ctypes.void_t, ctypes.int64_t.ptr);
getint64(m.x); // TypeError: getint64 argument 1 expected type int64_t *,
// got Int64 object
// (because m.x's getter autoconverts to an Int64 object)
getint64(m.addressOfField('x')); // works
(Open issue: As above, the implicit conversion from JS string to char * in getfilesize("/usr/share/dict/words") does not work in js-ctypes as specified.)
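The precision warning in the 64-bit example can be made concrete without js-ctypes. In the sketch below a BigInt models the 64-bit value, and hi and lo are hypothetical stand-ins that are assumed to return the high and low 32 bits as JS numbers, which is how the ctypes.UInt64.hi and ctypes.UInt64.lo functions mentioned above are expected to behave.

```javascript
// Hypothetical stand-ins for ctypes.UInt64.hi and ctypes.UInt64.lo,
// with a BigInt modelling the unsigned 64-bit value.
const hi = (v) => Number(BigInt.asUintN(64, v) >> 32n);
const lo = (v) => Number(BigInt.asUintN(64, v) & 0xFFFFFFFFn);

// Recombine the two halves into a JS number. A JS number is a double,
// so the result is exact only for values below 2^53.
const toNumber = (v) => hi(v) * 4294967296 + lo(v);

const small = 931467n;          // fits easily in a double
const big = 2n ** 53n + 1n;     // first integer a double cannot represent

console.log(toNumber(small));                 // 931467
console.log(toNumber(big) === 2 ** 53);       // true: the low bit was lost
console.log(Number(big.toString()) === 2 ** 53); // true: same loss via toString()
```

Either route to a JS number loses the low bit of the large value, so code that needs the full 64-bit range has to keep working with the two halves separately.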
(TODO - make this a real example:)
let i1 = ctypes.int32_t(5);
let i2 = ctypes.int32_t();
i2.value = i1; // i2 and i1 have separate binary storage; this is a memcpy
// You can copy the guts of one struct to another, etc.
Future directions
Callbacks
The libffi part of this is presumably not too bad. Issues:
Lifetimes. C/C++ makes it impossible to track an object pointer. Both JavaScript's GC and experience with C/C++ function pointers will tend to discourage users from caring about function lifetimes.
I think the best solution to this problem is to put the burden of keeping the function alive entirely on the client.
Finding the right context to use. If we burn the cx right into the libffi closure, it will crash when called from a different thread or after the cx is destroyed. If we take a context at random from some internal JSAPI structure, it might be thread-safe, but the context's options and global will be random, which sounds dangerous. Perhaps ctypes itself can create a context per thread, on demand, for the use of function pointers. In a typical application, that would only create one context, if any.
Converting strings
I think we want an explicit API for converting strings, very roughly:
CData objects of certain pointer and array types have methods for reading and writing Unicode strings. These methods are present if the target or element type is an 8-bit character or integer type.
cdata.readString([encoding[, length]]) - Read bytes from cdata and convert them to Unicode characters using the specified encoding, returning a string. Specifically:
- If cdata is an array, let p = a pointer to the first element. Otherwise cdata is a pointer; let p = the value of cdata.
- If encoding is undefined or omitted, the selected encoding is UTF-8. Otherwise, if encoding is a string naming a known character encoding, that encoding is selected. Otherwise throw a TypeError.
- If length is a size value, cdata is an array, and length > cdata.length, then throw a TypeError.
- Otherwise, if length is a size value, take exactly length bytes starting at p and convert them to Unicode characters according to the selected encoding. (Open issue: Error handling.) Return a JavaScript string containing the Unicode characters, represented in UTF-16. (The result may contain null characters.)
- Otherwise, if length is undefined or omitted, convert bytes starting at p to Unicode characters according to the selected encoding. Stop when the end of the array is reached (if cdata is an array) or when a null character (U+0000) is found. (Open issue: Error handling.) Return a JavaScript string containing the Unicode characters, represented in UTF-16. (If cdata is a pointer and there is no trailing null character, this can crash.)
- Otherwise throw a TypeError.
cdata.writeString(s, [encoding[, length]]) - Determine the starting pointer p as above. If s is not a well-formed UTF-16 string, throw a TypeError. (Open issue: Error handling.) Otherwise convert s to bytes in the specified encoding (default: UTF-8) and write at most length - 1 bytes, or all the converted bytes, if length is undefined or omitted, to memory starting at p. Write a converted null character after the data. Return the number of bytes of data written, not counting the terminating null character.
(Open issue: cdata.writeString(...) is awkward for the case where you want an autosized ctypes.char.array() to hold the converted data. If cdata happens to be too small for the resulting string, and you don't supply length, you crash; and if you do supply length, you don't know whether conversion was halted because the target array was of insufficient length.)
(Open issue: As proposed, these are not suitable for working with encodings where a zero byte might not indicate the end of text. For example, a string encoded in UTF-16 will typically contain a lot of zero bytes. Unfortunately, in the case of readString, the underlying library demands the length up front.)
(Open issue: These methods offer no error handling options, which is pretty weak. Real-world code often wants to allow a few characters to be garbled rather than fail. For now we will likely be limited to whatever the underlying codec library, nsIScriptableUnicodeConverter, can do.)
(Open issue: 16-bit versions too, for UTF-16?)
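The proposed readString and writeString behaviour can be approximated with the standard TextDecoder and TextEncoder classes. This is a rough model under several assumptions: a Uint8Array stands in for the memory at p (so only the array case is covered), the free functions readString and writeString are hypothetical, writeString omits the encoding parameter because TextEncoder produces UTF-8 only, and the well-formedness check on s is left out.

```javascript
// Hypothetical model: `bytes` is a Uint8Array standing in for the C array.
function readString(bytes, encoding = "utf-8", length) {
  let end;
  if (length === undefined) {
    // Stop at the first null byte, or at the end of the array.
    const nul = bytes.indexOf(0);
    end = nul === -1 ? bytes.length : nul;
  } else {
    if (length > bytes.length)
      throw new TypeError("length exceeds array length");
    end = length; // take exactly `length` bytes, null bytes included
  }
  return new TextDecoder(encoding).decode(bytes.subarray(0, end));
}

function writeString(bytes, s, length) {
  let data = new TextEncoder().encode(s); // UTF-8 only
  // Write at most length - 1 bytes when a length is given.
  if (length !== undefined)
    data = data.subarray(0, Math.max(0, length - 1));
  bytes.set(data, 0);      // throws RangeError if the array is too small
  bytes[data.length] = 0;  // terminating null (a no-op if out of range)
  return data.length;      // bytes written, not counting the null
}
```

Truncating at length - 1 bytes can cut a multi-byte UTF-8 sequence in half, and the caller cannot tell from the return value alone whether that happened, which is one face of the open issues listed above.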
isNull
If we do not convert NULL pointers to JS null (and I may have changed my mind about this) then we need:
cptr.isNull() - Return true if cptr's value is a null pointer, false otherwise.
Auto-converting strings
There are several issues:
Lifetimes. This problem arises when autoconverting from JS to C/C++ only.
When passing a string to a foreign function, like foo(s), what is the lifetime of the autoconverted pointer? We're comfortable with guaranteeing s for the duration of the call. But then there are situations like
TenStrings = char.ptr.array(10);
var arr = new TenStrings();
arr[0] = s; // What is the lifetime of the data arr[0] points to?
The more implicit conversion we allow, the greater a problem this is; it's a tough trade-off.
Non-null-terminated strings. This problem arises when autoconverting from C/C++ to JS only. It applies to C/C++ character arrays as well as pointers (but it's worse when dealing with pointers).
In C/C++, the type char * effectively promises nothing about the pointed-to data. Autoconverting would make it hard to use APIs that return non-null-terminated strings (or structs containing char * pointers that aren't logically strings). The workaround would be to declare them as a different type.
Unicode. This problem does not apply to conversions between JS strings and jschar arrays or pointers; only char arrays or pointers.
Converting both ways raises issues about what encoding should be assumed. We assume JS strings are UTF-16 and char strings are UTF-8, which is not the right thing on Windows. However Windows offers a lot of APIs that accept 16-bit strings, and for those jschar is the right thing.
Casting away const. This problem arises only when converting from a JS string to a C/C++ pointer type. The string data must not be modified, but the C/C++ types char * and jschar * suggest that the referent might be modified.