Jsctypes/api: Difference between revisions

From MozillaWiki
(Bug 1063962 renamed ctypes.jschar to ctypes.char16_t and, to maintain backwards compatibility, bug 1064935 added an alias for ctypes.jschar to ctypes.char16_t.)
 
(119 intermediate revisions by 4 users not shown)
js-ctypes is already in mozilla-central, but the API is subject to change. This page contains design proposals for the eventual js-ctypes API.


=Proposal 1=
= Libraries =


==== 1. opening a library and declaring a function ====
:'''<code>ctypes.open(''name'')</code>''' - Open a library. ''(TODO: all the details)'' This always returns a <code>Library</code> object or throws an exception.
<pre>Cu.import("ctypes"); // imports the global ctypes object


// searches the path and opens "libmylib.so" on linux,
<code>Library</code> objects have the following methods:
// "libmylib.dylib" on mac, and "mylib.dll" on windows
let mylib = ctypes.open("mylib", ctypes.SEARCH);


// declares the C prototype int32_t myfunc(int32_t)
:'''<code>''lib''.declare(''name'', ''abi'', ''rtype'', ''<nowiki>[argtype1, ...]</nowiki>'')</code>''' - Declare a function. ''(TODO: all the details)'' This always returns a new callable <code>CData</code> object representing a function pointer to ''name'', or throws an exception.
// Int32 implies ctypes.Int32, shortened for brevity
let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32(), Int32());


let ret = myfunc(2); // calls myfunc</pre>
:If ''rtype'' is an array type, this throws a <code>TypeError</code>.
Note that for simple types (integers and strings), we will autoconvert the argument at call time - there's no need to pass in an Int32 object. The consumer should never need to instantiate such an object explicitly, unless they're using it to back a pointer - in which case we require explicit, strong typing. See later for examples.


==== 2. declaring and passing a simple type (by object) ====
:If any ''argtypeN'' is an array type, the result is the same as if it had been the corresponding pointer type, <code>''argtypeN''.elementType.ptr</code>. ''(Rationale: This is how C and C++ treat array types in function declarations.)''
<pre>let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32, Int32);
let i = new Int32(); // instantiates an Int32 object with default value 0
let ret = myfunc(i);</pre>
An Int32 object, like all other type objects in ctypes, can be used for type specification when passed as an object, as above. declare() can look at the prototype JSObject* of its argument, and use this as a canonical JSObject representing the type, a pointer to which can be used for simple type equality comparisons. (This will work for user-defined types such as structs also - see later - though for pointer types we need to dig down to the underlying type.)


Int32() can have two modes depending on whether JS_IsConstructing(cx) is JS_TRUE ("new Int32()") or JS_FALSE ("Int32()"). Used as a function, we could perform a type conversion with range checking, for instance:
''(TODO: Explain what happens when you call a declared function. In brief: It uses <code>ImplicitConvert</code> to convert the JavaScript arguments to C and <code>ConvertToJS</code> to convert the return value to JS.)''
<pre>let n = Int32(4); // JSVAL_IS_INT(n) == JS_TRUE
n = Int32(4e16); // RangeError - out of bounds
n = Int32.max; // 2^31 - 1
// etc</pre>
For the new constructor, the resulting object stores three pieces of information internally in reserved slots. |new Int32()| creates a JSObject which allocates sizeof(int32_t) and stores that pointer in a private slot. It also stores its type, as a JSObject* pointing to the canonical Int32 prototype, and can store a parent JSObject* in case it refers to an Int32 that happens to be part of another object. Thus the slot layout of i above would be


i object:<br>&nbsp; slot 1 (parent): JSObject* -&gt; NULL (no parent object)<br>&nbsp; slot 2 (type) : JSObject* -&gt; Int32 prototype<br>&nbsp; slot 3 (value) : void* -&gt; binary blob from malloc(sizeof(int32_t))
= Types =


Do we need to provide an explicit set() method, to allow for efficient modification? For instance,
A ''type'' maps JS values to C/C++ values and vice versa. They're used when declaring functions. They can also be used to create and populate C/C++ data structures entirely from JS.
<pre>i.set(5); // cheaper than i = new Int32(5);</pre>
==== 3. declaring and passing a pointer ====
<pre>// C prototype: int32_t myfunc(int32_t* p)
let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32, Pointer(Int32));
let p = new Pointer(new Int32()); // instantiates an int and a pointer
let ret = myfunc(p); // the int is an outparam
let i = p.contents(); // i = *p (by reference)
let a = p.address(); // 0x...


// same thing, but with a named integer
''(Types and their prototypes are extensible: scripts can add new properties to them. Rationale: This is how most JavaScript constructors behave.)''
let i = new Int32();
let p = new Pointer(i);
let ret = myfunc(p); // modifies i


// same thing, but with a pointer temporary
== Built-in types ==
let i = new Int32();
let ret = myfunc(new Pointer(i)); // modifies i


// other examples
ctypes provides the following types:
let q = new Pointer(); // instantiate a null pointer to a void type
q = new Pointer(5); // TypeError - require a ctypes type</pre>
Internally, a pointer requires a backing object (unless it's a null pointer). In the examples, the Pointer JSObject holds a reference to the Int32 JSObject for rooting purposes, and is laid out similarly to an Int32 object:


p object:<br>&nbsp; slot 1 (parent): JSObject* -&gt; Int32 backing object<br>&nbsp; slot 2 (type) : JSObject* -&gt; Pointer prototype<br>&nbsp; slot 3 (value) : void* -&gt; pointer to binary int32_t blob inside backing object
:'''<code>ctypes.int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t, float32_t, float64_t</code>''' - Primitive numeric types that behave the same way on all platforms (with the usual caveat that every platform has slightly different floating-point behavior, in corner cases, and there's a limit to what we can realistically do about it).


==== 4. declaring a pointer to opaque struct ====
:Since some 64-bit values are outside the range of the JavaScript number type, <code>ctypes.int64_t</code> and <code>ctypes.uint64_t</code> do not autoconvert to JavaScript numbers. Instead, they convert to objects of the wrapper types <code>ctypes.Int64</code> and <code>ctypes.UInt64</code> (which are JavaScript object types, not <code>CType</code>s). See "64-bit integer objects" below.
<pre>const FILE = ctypes.Struct(); // creates a Struct() type with no allocated binary storage, and no fields to access
let fopen = mylib.declare("fopen", DEFAULT_ABI, Pointer(FILE), String);
let file = fopen("foo"); // creates a new Pointer() object
file.contents(); // will throw - type is unknown
file.address(); // ok</pre>
==== 5. declaring a struct ====
<pre>// C prototype: struct s_t { int32_t a; int64_t b; };
const s_t = Struct([{ a: Int32 }, { b: Int64 }]);
let myfunc = mylib.declare("myfunc", DEFAULT_ABI, Int32, s_t);


let s = new s_t(10, 20);</pre>
:'''<code>ctypes.size_t, ssize_t, intptr_t, uintptr_t</code>''' - Primitive types whose size depends on the platform. ''(These types do not autoconvert to JavaScript numbers. Instead they convert to wrapper objects, even on 32-bit platforms. See "64-bit integer objects" below. Rationale: On 64-bit platforms, there are values of these types that cannot be precisely represented as JS numbers. It will be easier to write code that works on multiple platforms if the builtin types autoconvert in the same way on all platforms.)''
This creates an s_t object which allocates binary space for both fields, creates getters and setters to access the binary fields via their offset, assigns the values 10 and 20 to the fields, and whose prototype is s_t:


s object:<br>&nbsp; slot 1 (parent): JSObject* -&gt; NULL<br>&nbsp; slot 2 (type) : JSObject* -&gt; s_t prototype<br>&nbsp; slot 3 (value) : void* -&gt; pointer to binary blob from malloc()<br>&nbsp; slot 4 (fields): array of data for each field:<br>&nbsp;&nbsp;&nbsp; { JSObject* parent; JSObject* type; ptrdiff_t offset; }
:'''<code>ctypes.bool, short, unsigned_short, int, unsigned, unsigned_int, long, unsigned_long, float, double</code>''' - Types that behave like the corresponding C types. As in C, <code>unsigned</code> is always an alias for <code>unsigned_int</code>.


The array of field information allows each field to be dependent on another JSObject (only for the case where the field is a pointer), have an associated type, and have an offset into the binary blob for ease of access.
:''(<code>ctypes.long</code> and <code>ctypes.unsigned_long</code> autoconvert to 64-bit integer objects on all platforms. The rest autoconvert to JavaScript numbers. Rationale: Some platforms have 64-bit <code>long</code> and some do not.)''
<pre>let c = s.b; // invokes the getter for |b| to create an Int64 object like so:</pre>
c object:<br>&nbsp; slot 1 (parent): JSObject* -&gt; s backing object<br>&nbsp; slot 2 (type) : JSObject* -&gt; Int64 prototype<br>&nbsp; slot 3 (value) : void* -&gt; pointer to binary int64_t blob inside backing object
<pre>let i = myfunc(s); // checks the type of s by JSObject* prototype equality</pre>
==== 6. pointers to struct fields ====
<pre>let p = new Pointer(s.b);</pre>
Once the Int64 representing s.b is constructed, the Pointer object references it directly:


p object:<br>&nbsp; slot 1 (parent): JSObject* -&gt; Int64 backing object (which, in turn, is backed by s)<br>&nbsp; slot 2 (type) : JSObject* -&gt; Pointer prototype<br>&nbsp; slot 3 (value) : void* -&gt; pointer to binary int64_t blob inside backing object
:'''<code>ctypes.char, ctypes.signed_char, ctypes.unsigned_char</code>''' - Character types that behave like the corresponding C types. (These are very much like <code>int8_t</code> and <code>uint8_t</code>, but they differ in some details of conversion. For example, <code>ctypes.char.array(30)(str)</code> converts the string ''str'' to UTF-8 and returns a new <code>CData</code> object of array type.)


==== 7. nested structs ====
:'''<code>ctypes.char16_t</code>''' - A 16-bit unsigned character type representing a UTF-16 code unit. (This is distinct from <code>uint16_t</code> in details of conversion behavior. js-ctypes autoconverts C <code>char16_t</code>s to JavaScript strings of length 1.) For backwards compatibility, <code>ctypes.jschar</code> is an alias for <code>char16_t</code>.
<pre>const u_t = Struct([{ x: Int64 }, { y: s_t }]);
let u = new u_t(5e4, s); // copies data from s into u.y - no references


let u_field = u.y; // creates an s_t object that points directly to the offset of u.y within u.
:'''<code>ctypes.void_t</code>''' - The special C type <code>void</code>. This can be used as a return value type. (<code>void</code> is a keyword in JavaScript.)


const v_t = Struct([{ x: Pointer(s_t) }, { y: Pointer(s_t) }]);
:'''<code>ctypes.voidptr_t</code>''' - The C type <code>void *</code>.
let v = new v_t(new Pointer(s), new Pointer(s));</pre>
In this case, the fields array will each have their respective Pointer as the parent object, and both will point to the s binary blob.<br>


= Proposal 2=
The ''wrapped integer types'' are the types <code>int64_t</code>, <code>uint64_t</code>, <code>size_t</code>, <code>ssize_t</code>, <code>intptr_t</code>, <code>uintptr_t</code>, <code>long</code>, and <code>unsigned_long</code>. These are the types that autoconvert to 64-bit integer objects rather than to primitive JavaScript numbers.
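The rationale for wrapping these types can be checked in plain JavaScript, without js-ctypes. The sketch below is illustrative only: it uses <code>BigInt</code> purely as a stand-in for an exact 64-bit value, and the helper <code>survivesRoundTrip</code> is a hypothetical name, not part of the js-ctypes API.

```javascript
// The largest value of a C uint64_t, held exactly as a BigInt.
// (BigInt is only a stand-in here; js-ctypes uses its own Int64/UInt64 objects.)
const UINT64_MAX = (1n << 64n) - 1n;

// Converting to a JavaScript number rounds to the nearest double.
const asNumber = Number(UINT64_MAX);

// Hypothetical helper: true if `value` (a BigInt) converts to a
// JavaScript number and back without changing.
function survivesRoundTrip(value) {
  return BigInt(Number(value)) === value;
}

console.log(survivesRoundTrip(UINT64_MAX)); // false
console.log(survivesRoundTrip(1n << 53n)); // true
console.log(Number.isSafeInteger(2 ** 53)); // false
console.log(2 ** 53 + 1 === 2 ** 53); // true
```

Because a value such as <code>UINT64_MAX</code> silently changes when stored in a number, autoconverting these types to wrapper objects avoids losing precision without warning.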


== Types ==
== User-defined types ==


A ''type'' maps JS values to C/C++ values and vice versa. They're used when declaring functions. They can also be used to create and populate C/C++ data structures entirely from JS.
Starting from the builtin types above, these functions can be used to create additional types:


=== The types provided by ctypes ===
:'''<code>new ctypes.PointerType(''t'')</code>''' - If ''t'' is a <code>CType</code>, return the type "pointer to ''t''". The result is cached so that future requests for this pointer type produce the same <code>CType</code> object. If ''t'' is a string, instead return a new opaque pointer type named ''t''. Otherwise throw a <code>TypeError</code>.


ctypes provides the following builtin types:
:'''<code>new ctypes.FunctionType(''abi'', ''rt'', [ ''at1'', ... ])</code>''' - Return a function pointer <code>CType</code> corresponding to the C type <code>rt (*) (at1, ...)</code>, where ''abi'' is a ctypes ABI type and ''rt'' and ''at1'', ... are <code>CType</code>s. Otherwise throw a <code>TypeError</code>.


:'''<code>ctypes.int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t, float32_t, float64_t</code>''' - Primitive numeric types that behave the same way on all platforms (with the usual caveat that every platform has slightly different floating-point behavior, in corner cases, and there's nothing we can realistically do about it).
:'''<code>new ctypes.ArrayType(''t'')</code>''' - Return an array type with unspecified length and element type ''t''.  If ''t'' is not a type or <code>''t''.size</code> is <code>undefined</code>, throw a <code>TypeError</code>.


:'''<code>ctypes.bool, short, unsigned_short, int, unsigned, unsigned_int, long, unsigned_long, float, double</code>''' - Types that behave like the corresponding C types. Some or all of these might be aliases for the primitive types listed above. As in C, <code>unsigned</code> is always an alias for <code>unsigned_int</code>.
:'''<code>new ctypes.ArrayType(''t'', ''n'')</code>''' - Return the array type ''t''[''n'']. If ''t'' is not a type or <code>''t''.size</code> is <code>undefined</code> or ''n'' is not a size value (defined below), throw a <code>TypeError</code>. If the size of the resulting array type, in bytes, would not be exactly representable both as a <code>size_t</code> and as a JavaScript number, throw a <code>RangeError</code>.
 
:'''<code>ctypes.char, ctypes.signed_char, ctypes.unsigned_char</code>''' - Character types that behave like the corresponding C types. (These are distinct from <code>int8_t</code> and <code>uint8_t</code> in details of conversion behavior. For example, js-ctypes autoconverts between C characters and one-character JavaScript strings.)
 
:'''<code>ctypes.string, ustring</code>''' - String types. The C/C++ type for <code>ctypes.string</code> is <code>const char *</code>. C/C++ values of this type must be either <code>null</code> or pointers to null-terminated strings. <code>ctypes.ustring</code> is the same, but for <code>const jschar *</code>; that is, the code units of the string are <code>uint16_t</code>.
 
:'''<code>ctypes.void_t</code>''' - The special C type <code>void</code>. This can be used as a return value type.  (<code>void</code> is a keyword in JavaScript.)
 
:'''<code>ctypes.voidptr_t</code>''' - The C type <code>void *</code>.


Starting from those builtin types, ctypes can create additional types:
:A ''size value'' is either a non-negative, integer-valued primitive number, an <code>Int64</code> object with a non-negative value, or a <code>UInt64</code> object.


:'''<code>new ctypes.PointerType(''t'')</code>''' - If ''t'' is a ctypes type, return the type "pointer to ''t''". If ''t'' is a string, instead return a new opaque pointer type named ''t''. Otherwise throw a <code>TypeError</code>.
:''(Array types with 0 elements are allowed. Rationale: C/C++ allow them, and it is convenient to be able to pass an array to a foreign function, and have it autoconverted to a C array, without worrying about the special case where the array is empty.)''
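As a rough illustration of the size value rule, the following sketch checks only the primitive-number case. The function name <code>isPrimitiveSizeValue</code> is hypothetical and not part of js-ctypes; <code>Int64</code> and <code>UInt64</code> objects are deliberately left out because they exist only inside js-ctypes.

```javascript
// Hypothetical helper, not part of js-ctypes: models only the
// primitive-number case of a "size value". Int64 and UInt64
// objects are omitted from this sketch.
function isPrimitiveSizeValue(n) {
  return typeof n === "number" && Number.isInteger(n) && n >= 0;
}

console.log(isPrimitiveSizeValue(4096)); // true
console.log(isPrimitiveSizeValue(0)); // true (zero-length arrays are allowed)
console.log(isPrimitiveSizeValue(-1)); // false
console.log(isPrimitiveSizeValue(1.5)); // false
console.log(isPrimitiveSizeValue("4")); // false (strings are not converted)
```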


:'''<code>new ctypes.ArrayType(''t'')</code>''' - Return an array type with unspecified length and element type ''t''.  If ''t'' is not a type or <code>''t''.size</code> is <code>undefined</code>, throw a <code>TypeError</code>.
:'''<code>new ctypes.StructType(''name'', ''fields'')</code>''' - Create a new struct type with the given ''name'' and ''fields''. ''fields'' is an array of field descriptors, of the format


:'''<code>new ctypes.ArrayType(''t'', ''n'')</code>''' - Return the array type ''t''[''n'']. If ''t'' is not a type, or <code>''t''.size</code> is <code>undefined</code>, or ''n'' is not a nonnegative integer, throw a <code>TypeError</code>.
:<code>[ { field1: type1 }, { field2: type2 }, ... ]</code>


:'''<code>new ctypes.StructType(''name'', ''fields'')</code>''' - Create a new struct type with the given ''name'' and ''fields''. ''fields'' is an array of field descriptors. js-ctypes calculates the offsets of the fields from its encyclopedic knowledge of the architecture's struct layout rules. If ''name'' is not a string, or ''fields'' contains a field descriptor with a type ''t'' such that <code>''t''.size</code> is <code>undefined</code>, throw a <code>TypeError</code>.
:where <code>field''n''</code> is a string denoting the name of the field, and <code>type''n''</code> is a ctypes type. js-ctypes calculates the offsets of the fields from its encyclopedic knowledge of the architecture's struct layout rules. If ''name'' is not a string, or any <code>type''n''</code> is such that <code>type''n''.size</code> is <code>undefined</code>, throw a <code>TypeError</code>. If the size of the struct, in bytes, would not be exactly representable both as a <code>size_t</code> and as a JavaScript number, throw a <code>RangeError</code>.


''(Open issue: Specify a way to tell <code>ctypes.StructType</code> to use <code>#pragma pack(n)</code>.)''
''(TODO: Finish specifying field descriptors.)''


These constructors behave exactly the same way when called without the <code>new</code> keyword.
  const HANDLE = new ctypes.PointerType("HANDLE");
  const HANDLES = new ctypes.ArrayType(HANDLE);
  const FILE = new ctypes.PointerType("FILE *");
  const FILE = new ctypes.StructType("FILE").ptr;
  const IOBuf = new ctypes.ArrayType(ctypes.uint8_t, 4096);
   
   
  const struct_tm = new ctypes.StructType('tm', [[ctypes.int, 'tm_sec'], ...]);
  const struct_tm = new ctypes.StructType('tm', [{'tm_sec': ctypes.int}, ...]);
const comparator_t = new ctypes.FunctionType(ctypes.default_abi, ctypes.int, [ ctypes.voidptr_t, ctypes.voidptr_t ]);


=== Properties of types ===
== Properties of types ==


All the fields described here are read-only.


All types have these properties:
All types have these properties and methods:


:'''<code>''t''.size</code>''' - The C/C++ <code>sizeof</code> the type, in bytes.
:'''<code>''t''.size</code>''' - The C/C++ <code>sizeof</code> the type, in bytes. The result is a primitive number, not a <code>UInt64</code> object.


:If ''t'' is an array type with unspecified length, <code>''t''.size</code> is <code>undefined</code>.
:'''<code>''t''.name</code>''' - A string, the type's name. It's intended that in ordinary use, this will be a C/C++ type expression, but it's not really meant to be machine-readable in all cases.


:For primitive types this is just the name of the corresponding C/C++ type, e.g. <code>ctypes.int32_t.name == "int32_t"</code> and <code>ctypes.void_t.name == "void"</code>. But some of the builtin types are aliases for other types, so it might be that <code>ctypes.unsigned_long.name == "uint32_t"</code> (or something else). ''(Open issue: Is that too astonishing?)''
:For primitive types this is just the name of the corresponding C/C++ type.
 
:For struct types and opaque pointer types, this is simply the string that was passed to the constructor. For function types, array types, and other pointer types, the name should be a valid C/C++ type expression; generating one is not trivial.
 
:''(Open issue: This conflicts with the usual meaning of .name for functions, and types are callable like functions.)''
 
ctypes.int32_t.name
  ===> "int32_t"
ctypes.void_t.name
  ===> "void"
ctypes.char16_t.ptr.name
  ===> "char16_t *"
const FILE = new ctypes.StructType("FILE").ptr;
FILE.name
  ===> "FILE*"
const fn_t = new ctypes.FunctionType(ctypes.stdcall, ctypes.int, [ ctypes.voidptr_t, ctypes.voidptr_t ]);
fn_t.name
  ===> "int (__stdcall *)(void*, void*)"
const struct_tm = new ctypes.StructType("tm", [{tm_sec: ctypes.int}, ...]);
struct_tm.name
  ===> "tm"
// Pointer-to-array types are not often used in C/C++.
// Such types have funny-looking names.
const ptrTo_ptrTo_arrayOf4_strings =
    new ctypes.PointerType(
      new ctypes.PointerType(
        new ctypes.ArrayType(new ctypes.PointerType(ctypes.char), 4)));
ptrTo_ptrTo_arrayOf4_strings.name
  ===> "char *(**)[4]"
 
:'''<code>''t''.ptr</code>''' - Return <code>ctypes.PointerType(''t'')</code>.
 
:'''<code>''t''.array()</code>''' - Return <code>ctypes.ArrayType(''t'')</code>.
 
:'''<code>''t''.array(''n'')</code>''' - Return <code>ctypes.ArrayType(''t'', ''n'')</code>.
 
:Thus a quicker (but still almost as confusing) way to write the type in the previous example would be:
 
const ptrTo_ptrTo_arrayOf4_strings = ctypes.char.ptr.array(4).ptr.ptr;


:For struct types and opaque pointer types, this is simply the string that was passed to the constructor; e.g. <code>FILE.name == "FILE *"</code> and <code>struct_tm.name == "tm"</code>. For other pointer types and array types this should try to generate valid C/C++ type expressions, which isn't exactly trivial.
:''(<code>.array()</code> requires parentheses but <code>.ptr</code> doesn't. Rationale: <code>.array()</code> has to be able to handle an optional parameter. Note that in C/C++, to write an array type requires brackets, optionally with a number in between:  <code>int [10]</code> --> <code>ctypes.int.array(10)</code>. Writing a pointer type does not require the brackets.)''


:''(Open issue: This conflicts with the usual meaning of .name for functions, and types are functions.)''
:'''<code>''t''.toString()</code>''' - Return <code>"type " + ''t''.name</code>.


:'''<code>''t''.toString()</code>''' - Returns <code>"type " + ''t''.name</code>.
:'''<code>''t''.toSource()</code>''' - Return a JavaScript expression that evaluates to a <code>CType</code> describing the same C/C++ type as ''t''.
 
ctypes.uint32_t.toSource()
  ===> "ctypes.uint32_t"
ctypes.string.toSource()
  ===> "ctypes.string"
const charPtr = new ctypes.PointerType(ctypes.char);
charPtr.toSource()
  ===> "ctypes.char.ptr"
const Point = new ctypes.StructType(
    "Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}]);
Point.toSource()
  ===> "ctypes.StructType("Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}])"


Pointer types also have:


:'''<code>''t''.targetType</code>''' - The pointed-to type, or <code>null</code> if ''t'' is an opaque pointer type.
:'''<code>''t''.targetType</code>''' - Read-only. The pointed-to type, or <code>null</code> if ''t'' is an opaque pointer type.
 
Function types also have:
 
:'''<code>''t''.abi</code>''' - Read-only. The ABI of the function; one of the ctypes ABI objects.
 
:'''<code>''t''.returnType</code>''' - Read-only. The return type.
 
:'''<code>''t''.argTypes</code>''' - Read-only. A sealed array of argument types.


Struct types also have:


:'''<code>''t''.fields</code>''' - A sealed array of field descriptors, details TBD.
:'''<code>''t''.fields</code>''' - Read-only. A sealed array of field descriptors. ''(TODO: Details.)''


Array types also have:
:'''<code>''t''.elementType</code>''' - The type of the elements of an array of this type.  E.g. <code>IOBuf.elementType === ctypes.uint8_t</code>.


:'''<code>''t''.length</code>''' - The number of elements, a nonnegative integer.
:'''<code>''t''.length</code>''' - The number of elements, a non-negative integer; or <code>undefined</code> if this is an array type with unspecified length. ''(The result, if not <code>undefined</code>, is a primitive number, not a <code>UInt64</code> object. Rationale: Having <code>.length</code> produce anything other than a number is foreign to JS, and arrays of more than 2<sup>53</sup> elements are currently unheard-of.)''


Minutiae:


:The <nowiki>[[Class]]</nowiki> of a ctypes type is <code>"Function"</code>.  ''(Open issue: Subject to possible implementation difficulties.)''
:'''<code>ctypes.CType</code>''' is the abstract-base-class constructor of all js-ctypes types. If called, it throws a <code>TypeError</code>. (This is exposed in order to expose <code>ctypes.CType.prototype</code>.)
 
:The <nowiki>[[Class]]</nowiki> of a ctypes type is <code>"CType"</code>.
 
:The <nowiki>[[Class]]</nowiki> of the type constructors <code>ctypes.{C,Array,Struct,Pointer}Type</code> is <code>"Function"</code>.
 
:Every <code>CType</code> has a read-only, permanent <code>.prototype</code> property. The type-constructors <code>ctypes.{C,Pointer,Struct,Array}Type</code> each have a read-only, permanent <code>.prototype</code> property as well.


:Every ctypes type has a read-only, permanent <code>.prototype</code> property. The type-constructors <code>ctypes.{Pointer,Struct,Array,}Type</code> each have a read-only, permanent <code>.prototype</code> property as well.
:Types have a hierarchy of prototype objects. The prototype of <code>ctypes.CType.prototype</code> is <code>Function.prototype</code>. The prototype of <code>ctypes.{Array,Struct,Pointer,Function}Type.prototype</code> and of all the builtin types except <code>ctypes.voidptr_t</code> is <code>ctypes.CType.prototype</code>. The prototype of an array type is <code>ctypes.ArrayType.prototype</code>. The prototype of a struct type is <code>ctypes.StructType.prototype</code>. The prototype of a pointer type is <code>ctypes.PointerType.prototype</code>. The prototype of a function type is <code>ctypes.FunctionType.prototype</code>.


:Types have a hierarchy of prototype objects. The prototype of <code>ctypes.Type.prototype</code> is <code>Function.prototype</code>. The prototype of <code>ctypes.{Array,Struct,Pointer}Type.prototype</code> and of all the builtin types except for the string types and <code>ctypes.voidptr_t</code> is <code>ctypes.Type.prototype</code>. The prototype of an array type is <code>ctypes.ArrayType.prototype</code>. The prototype of a struct type is <code>ctypes.StructType.prototype</code>. The prototype of a string type or pointer type is <code>ctypes.PointerType.prototype</code>.
:Every <code>CType</code> ''t'' has <code>''t''.prototype.constructor === ''t''</code>; that is, its <code>.prototype</code> has a read-only, permanent, own <code>.constructor</code> property that refers to the type. The same is true of the five type constructors <code>ctypes.{C,Array,Struct,Pointer,Function}Type</code>.


:Every ctypes type ''t'' has <code>''t''.prototype.constructor === ''t''</code>; that is, its <code>.prototype</code> has a read-only, permanent, own <code>.constructor</code> property that refers to the type. Also:
== Calling types ==
:<code>ctypes.Type.prototype.constructor === ctypes.Type</code>
:<code>ctypes.ArrayType.prototype.constructor === ctypes.ArrayType</code>
:<code>ctypes.StructType.prototype.constructor === ctypes.StructType</code>
:<code>ctypes.PointerType.prototype.constructor === ctypes.PointerType</code>


=== Calling types ===
<code>CType</code>s are JavaScript constructors. That is, they are functions, and they can be called to create new objects.  (The objects they create are called <code>CData</code> objects, and they are described in the next section.)


js-ctypes types are JavaScript constructors. That is, they are functions, and they can be called in various different ways.  (js-ctypes buffers and references are to be described in later sections.)
:'''<code>new ''t''</code>''' or '''<code>new ''t''()</code>''' or '''<code>''t''()</code>''' - Create a new <code>CData</code> object of type ''t''.


:'''<code>new ''t''</code>''' or '''<code>new ''t''()</code>''' - Without arguments, these allocate a new buffer of <code>''t''.size</code> bytes, populate it with zeroes, and return a new ctypes reference to the complete object in that buffer.
:Without arguments, these allocate a new buffer of <code>''t''.size</code> bytes, populate it with zeroes, and return a new <code>CData</code> object referring to the complete object in that buffer.


:If <code>''t''.size</code> is <code>undefined</code>, this throws a <code>TypeError</code>.


:'''<code>''t''()</code>''' - Return <code>ConvertToJS(new t)</code>. (<code>ConvertToJS</code> is defined below. The result is that <code>ctypes.bool() === false</code>, for number types <code>''t''() === 0</code>, for character types <code>''t''() === '\0'</code>, and for string and pointer types <code>''t''() === null</code> on platforms where the null pointer is all zeroes. For all other types the result is the same as <code>new ''t''</code>.)
:'''<code>new ''t''(''val'')</code>''' or '''<code>''t''(''val'')</code>''' - Create a new <code>CData</code> object as follows:
 
:* If <code>''t''.size</code> is not <code>undefined</code>: Convert ''val'' to type ''t'' by calling <code>ExplicitConvert(''val'', ''t'')</code>, throwing a <code>TypeError</code> if the conversion is impossible. Allocate a new buffer of <code>''t''.size</code> bytes, populated with the converted value. Return a new <code>CData</code> object of type ''t'' referring to the complete object in that buffer. (When ''val'' is a <code>CData</code> object of type ''t'', the behavior is like <code>malloc</code> followed by <code>memcpy</code>.)
 
:* If ''t'' is an array type of unspecified length:
 
::* If ''val'' is a size value (defined above): Let ''u'' = <code>ArrayType(''t''.elementType, ''val'')</code> and return <code>new ''u''</code>.
 
::* If <code>''t''.elementType</code> is <code>char16_t</code> and ''val'' is a string: Return a new <code>CData</code> object of type <code>ArrayType(ctypes.char16_t, ''val''.length&nbsp;+&nbsp;1)</code> containing the contents of ''val'' followed by a null character.
 
::* If <code>''t''.elementType</code> is an 8-bit character type and ''val'' is a string: If ''val'' is not a well-formed UTF-16 string, throw a <code>TypeError</code>. Otherwise, let ''s'' = a sequence of bytes, the result of converting ''val'' from UTF-16 to UTF-8, and let ''n'' = the number of bytes in ''s''. Return a new <code>CData</code> object of type <code>ArrayType(''t''.elementType, ''n'' + 1)</code> containing the bytes in ''s'' followed by a null character.
 
::* If ''val'' is a JavaScript array object and <code>''val''.length</code> is a nonnegative integer, let ''u'' = <code>ArrayType(''t''.elementType, ''val''.length)</code> and return <code>new ''u''(''val'')</code>. ''(Array <code>CData</code> objects created in this way have <code>''cdata''.constructor === ''u''</code>, not ''t''. Rationale: For all <code>CData</code> objects, <code>cdata.constructor.size</code> gives the size in bytes, unless a struct field shadows <code>cdata.constructor</code>.)''
 
::* Otherwise, throw a <code>TypeError</code>.
 
:* Otherwise, ''t'' is <code>void_t</code>. Throw a <code>TypeError</code>.
 
let a_t = ctypes.ArrayType(ctypes.int32_t);
let a = new a_t(5);
a.length
  ===> 5
a.constructor.size
  ===> 20
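
The length rule for 8-bit character arrays can be sketched in plain JavaScript, without js-ctypes. The helper names <code>isWellFormedUtf16</code> and <code>charArrayLengthFor</code> are hypothetical and only model the rule above: reject malformed UTF-16, convert to UTF-8, and reserve one extra element for the null character.

```javascript
// Hypothetical helpers, not part of js-ctypes. They model how long the
// array created by `new t(val)` would be when t is an unspecified-length
// array of an 8-bit character type and val is a string.
function isWellFormedUtf16(str) {
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // A high surrogate must be followed by a low surrogate.
      const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) return false;
      i++; // skip the low surrogate that was just validated
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      return false; // a low surrogate with no preceding high surrogate
    }
  }
  return true;
}

function charArrayLengthFor(val) {
  if (typeof val !== "string" || !isWellFormedUtf16(val)) {
    throw new TypeError("expected a well-formed UTF-16 string");
  }
  const bytes = new TextEncoder().encode(val); // UTF-8 bytes of val
  return bytes.length + 1; // one extra element for the null character
}
```

Under this sketch a three-letter ASCII string needs four elements, while a string containing a single non-BMP character needs five.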
 
= CData objects =
 
A <code>CData</code> object represents a C/C++ value located in memory. The address of the C/C++ value can be taken (using the <code>.address()</code> method), and it can be assigned to (using the <code>.value</code> property).
 
Every <code>CData</code> object has a ''type'', the <code>CType</code> object that describes the type of the C/C++ value.
 
Minutiae:
 
:The <nowiki>[[Class]]</nowiki> of a <code>CData</code> object is <code>"CData"</code>.
 
:The prototype of a <code>CData</code> object is the same as its type's <code>.prototype</code> property.
 
''(Implementation notes: A <code>CData</code> object has a reserved slot that points to its type; a reserved slot that contains <code>null</code> if the object owns its own buffer, and otherwise points to the base <code>CData</code> object that owns the backing buffer where the data is stored; and a data pointer. The data pointer points to the actual location within the buffer of the C/C++ object to which the <code>CData</code> object refers. Since the data pointer might not be aligned to 2 bytes, PRIVATE_TO_JSVAL is insufficient; a custom JSClass.trace hook will be needed. If the object owns its own buffer, its finalizer frees it. Other <code>CData</code> objects that point into the buffer keep the base <code>CData</code>, and therefore the underlying buffer, alive.)''
 
== Properties and methods of CData objects ==
 
All <code>CData</code> objects have these methods and properties:
 
:'''<code>''cdata''.address()</code>''' - Return a new <code>CData</code> object of the pointer type <code>ctypes.PointerType(cdata.constructor)</code> whose value points to the C/C++ object referred to by ''cdata''.
 
:''(Open issue: Does this pointer keep ''cdata'' alive? Currently not but we could easily change it. It is impossible to have all pointers keep their referents alive in a totally general way--consider pointers embedded in structs and arrays. But this special case would be pretty easy to hack: put a <code>.contents</code> property on the resulting pointer, referring back to ''cdata''.)''
 
:'''<code>''cdata''.constructor</code>''' - Read-only. The type of ''cdata''. ''(This is never <code>void_t</code> or an array type with unspecified length. Implementation note: The prototype of ''cdata'' is an object that has a read-only <code>constructor</code> property, as detailed under "minutiae".)''


:'''<code>new ''t''(''val'')</code>''' - Convert ''val'' to type ''t'' according to the explicit conversion rules below, throwing a <code>TypeError</code> if the conversion is impossible. Create a new buffer and reference as above, populating the new buffer with the converted value instead of zeroing it out. (When ''val'' is a reference of type ''t'', the behavior is like <code>malloc</code> followed by <code>memcpy</code>.)
:'''<code>''cdata''.toSource()</code>''' - Return the string "''t''(''arg'')" where ''t'' and ''arg'' are implementation-defined JavaScript expressions (intended to represent the type of <code>''cdata''</code> and its value, respectively). The intent is that <code>eval(''cdata''.toSource())</code> should ideally produce a new <code>CData</code> object containing a copy of ''cdata'', but this can only work if the type of <code>''cdata''</code> happens to be bound to an appropriate name in scope.


:As a special case, if ''t'' is an array type of unspecified length and ''val'' is a nonnegative integer, allocate a new buffer of size <code>''val'' * ''t''.elementType.size</code>. Populate it with zeroes. Return a reference to the new array.
:'''<code>''cdata''.toString()</code>''' - Return the same string as <code>''cdata''.toSource()</code>.


:'''<code>''t''(''val'')</code>''' - Return <code>ConvertToJS(new ''t''(''val''))</code>.
The <code>.value</code> property has a getter and a setter:


== References ==
:'''<code>''cdata''.value</code>''' - Let ''x'' = <code>ConvertToJS(''cdata'')</code>. If <code>''x'' === ''cdata''</code>, throw a <code>TypeError</code>. Otherwise return ''x''.


js-ctypes references are like C++ references: they have a type, they refer to a C++ value, their address can be taken, and they can be assigned to. However, the JavaScript syntax is very different from the C++ syntax.
:'''<code>''cdata''.value = ''val''</code>''' - Let ''cval'' = <code>ImplicitConvert(''val'', ''cdata''.constructor)</code>. If conversion fails, throw a <code>TypeError</code>. Otherwise assign the value ''cval'' to the C/C++ object referred to by ''cdata''.


''(TODO)''
== Structs ==


:'''<code>''ref''.assign(''val'')</code>''' - Convert ''val'' to the type of ''ref'' using the implicit conversion rules. Store the converted value in the buffer location referred to by ''ref''.
<code>CData</code> objects of struct types also have this method:


:'''<code>''ref''.constructor</code>''' - Read-only. The type of the reference. ''(Implementation note: The prototype of ''ref'' is an object that has a read-only <code>constructor</code> property, as detailed under "minutiae".)''
:'''<code>''cstruct''.addressOfField(''name'')</code>''' - Return a new <code>CData</code> object of the appropriate pointer type, whose value points to the field of ''cstruct'' with the name ''name''. If ''name'' is not a JavaScript string or does not name a member of ''cstruct'', throw a <code>TypeError</code>.


Struct references have getters and setters for each struct member:
They also have getters and setters for each struct member:


:'''<code>''cstruct''.''member''</code>''' - Let ''R'' be a reference to the struct member. Return <code>ConvertToJS(''R'')</code>.
:'''<code>''cstruct''.''member''</code>''' - Let ''F'' be a <code>CData</code> object referring to the struct member. Return <code>ConvertToJS(''F'')</code>.


:'''<code>''cstruct''.''member'' = ''value''</code>''' - The value is converted to the type of the member using the implicit conversion rules. The converted value is stored in the buffer.
:'''<code>''cstruct''.''member'' = ''val''</code>''' - Let ''cval'' = <code>ImplicitConvert(''val'', the type of the member)</code>. If conversion fails, throw a <code>TypeError</code>. Otherwise store ''cval'' in the appropriate member of the struct.


These getters and setters can shadow the properties and methods described above.


Likewise, array references have getters and setters for each element. Arrays additionally have a <code>length</code> property.
== Pointers ==
 
<code>CData</code> objects of pointer types also have this property:
 
:'''<code>''cptr''.contents</code>''' - Let ''C'' be a <code>CData</code> object referring to the pointed-to contents of ''cptr''. Return <code>ConvertToJS(''C'')</code>.
 
:'''<code>''cptr''.contents = ''val''</code>''' - Let ''cval'' = <code>ImplicitConvert(''val'', the base type of the pointer)</code>. If conversion fails, throw a <code>TypeError</code>. Otherwise store ''cval'' in the pointed-to contents of ''cptr''.
 
== Functions ==
 
<code>CData</code> objects of function types are callable:
 
:'''<code>''let result = cfn(arg''1'', ...)''</code>''' - For each argument, let ''carg''n = <code>ImplicitConvert(''arg''n, the type of the corresponding parameter)</code>. If any conversion fails, throw a <code>TypeError</code>. Otherwise call the C function ''cfn'' with the arguments ''(carg''1'', ...)'' and store its return value in a <code>CData</code> object ''cresult''. Finally, let ''result'' = <code>ConvertToJS(''cresult'')</code>.
 
== Arrays ==
 
Likewise, <code>CData</code> objects of array types have getters and setters for each element. Arrays additionally have a <code>length</code> property.


Note that these getters and setters are only present for integers ''i'' in the range 0 &le; i &lt; <code>''carray''.length</code>.  ''(Open issue: can we arrange to throw an exception if ''i'' is out of range?)''


:'''<code>''carray''[''i'']</code>''' - Let ''R'' be a reference to the element at index ''i''. Return <code>ConvertToJS(''R'')</code>.
:'''<code>''carray''[''i'']</code>''' - Let ''E'' be a <code>CData</code> object referring to the element at index ''i''. Return <code>ConvertToJS(''E'')</code>.
 
:'''<code>''carray''[''i''] = ''val''</code>''' - Let ''cval'' = <code>ImplicitConvert(''val'', ''carray''.elementType)</code>. If conversion fails, throw a <code>TypeError</code>. Otherwise store ''cval'' in element ''i'' of the array.
 
:'''<code>''carray''.length</code>''' - Read-only. The length of the array as a JavaScript number.  ''(The same as <code>carray.constructor.length</code>. This is not a <code>UInt64</code> object. Rationale: Array <code>CData</code> objects should behave like other array-like objects for easy duck typing.)''
 
:'''<code>''carray''.addressOfElement(''i'')</code>''' - Return a new <code>CData</code> object of the appropriate pointer type (<code>ctypes.PointerType(''carray''.constructor.elementType)</code>) whose value points to element ''i'' of ''carray''. If ''i'' is not a JavaScript number that is a valid index of ''carray'', throw a <code>TypeError</code>.
 
''(TODO: specify a way to read a C/C++ string and transcode it into a JS string.)''
 
== Aliasing ==
 
Note that it is possible for several <code>CData</code> objects to refer to the same or overlapping memory. (In this way <code>CData</code> objects are like C++ references.) For example:
 
const Point = new ctypes.StructType(
    "Point", [[ctypes.int32_t, 'x'], [ctypes.int32_t, 'y']]);
const Rect = new ctypes.StructType(
    "Rect", [[Point, 'topLeft'], [Point, 'bottomRight']]);
var r = Rect();    // a new CData object of type Rect
var p = r.topLeft;  // refers to the topLeft member of r, not a copy
r.topLeft.x = 100;  // This would not work if `r.topLeft` was a copy!
r.topLeft.x
  ===> 100          // It works...
p.x                // and p refers to the same C/C++ object...
  ===> 100          // so it sees the change as well.
r.toSource()
  ===> "Rect({topLeft: {x: 100, y: 0}, bottomRight: {x: 0, y: 0}})"
p.x = 1.0e90;      // Assigning a value out of range is an error.
  **** TypeError
// The range checking is great, but it can have surprising
// consequences sometimes:
p.x = 0x7fffffff;  // (the maximum int32_t value)
p.x++;              // p.x = 0x7fffffff + 1, which is out of range...
  **** TypeError    // ...so this fails, leaving p.x unchanged.
// But JS code doesn't need to do that very often.
// To make this roll around to -0x80000000, you could write:
p.x = (p.x + 1) | 0; // In JS, `x|0` truncates a number to int32.
 
== Casting ==
 
:'''<code>ctypes.cast(''cdata'', ''t'')</code>''' - Return a new <code>CData</code> object which points to the same memory block as ''cdata'', but with type ''t''. If <code>''t''.size</code> is undefined or larger than <code>''cdata''.constructor.size</code>, throw a <code>TypeError</code>. This is like a C cast or a C++ <code>reinterpret_cast</code>.
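
As an analogy only, standard JavaScript typed arrays show what "same memory, different type" means. The sketch below does not use js-ctypes; the views stand in for <code>CData</code> objects, and <code>checkCastSizes</code> is a hypothetical helper that mirrors the size rule stated above.

```javascript
// Analogy only: typed-array views stand in for CData objects here.
const buffer = new ArrayBuffer(4);          // one 4-byte block of memory
const asSigned = new Int32Array(buffer);    // an "int32_t" view
const asUnsigned = new Uint32Array(buffer); // a "uint32_t" view of the same bytes

asSigned[0] = -1;
// The unsigned view reads the same bits under a different type.
const reinterpreted = asUnsigned[0]; // 4294967295 (0xffffffff)

// Hypothetical check mirroring the rule in the text: the target type must
// have a defined size that is no larger than the source object's size.
function checkCastSizes(sourceSize, targetSize) {
  if (targetSize === undefined || targetSize > sourceSize) {
    throw new TypeError("cannot cast to a larger or unsized type");
  }
}
```

No bytes are copied in either case: writing through one view changes what the other view reads.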


:'''<code>''carray''[''i''] = ''val''</code>''' - Convert ''val'' to the type of the array element using the implicit conversion rules and store the result in the buffer.
== Equality ==


:'''<code>''carray''.length</code>''' - Read-only. The length of the array.
According to the ECMAScript standard, if ''x'' and ''y'' are two different objects, then <code>x === y</code> and <code>x == y</code> are both false. This has consequences for code that uses js-ctypes pointers, pointer-sized integers, or 64-bit integers, because all these values are represented as JavaScript objects. In C/C++, the <code>==</code> operator would compare values of these types for equality. Not so in js-ctypes:


''(TODO: Figure out if the type of <code>new FooArray(30)</code> is <code>FooArray</code> or <code>ArrayType(Foo, 30)</code>.)''
const HANDLE = new ctypes.PointerType("HANDLE");
const INVALID_HANDLE_VALUE = HANDLE(-1);
const kernel32 = ctypes.open("kernel32");
const CreateMutex = kernel32.declare("CreateMutex", ...);
var h = CreateMutex(null, false, null);
if (h == INVALID_HANDLE_VALUE)   // BAD - always false
    ...


''(TODO: Possibly, a way to get a reference that acts like a view on a window of an array. E.g. ''carray''.slice(start, stop). Then you could <code>.assign</code> one region of memory to another, effectively memcpy-ing.)''
This comparison is always false because <code>CreateMutex</code> returns a new <code>CData</code> object, which of course will be a different object from the existing value of <code>INVALID_HANDLE_VALUE</code>.


Minutiae:
''(Python ctypes has the same issue. It isn't mentioned in the docs, but:''
 
>>> from ctypes import *
>>> c_void_p(0) == c_void_p(0)
False
>>> c_int(33) == c_int(33)
False
 
''We could overload operator== using the nonstandard hook <code>JSExtendedClass.equality</code> but it might not be worth it.)''
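
The identity-versus-value distinction can be reproduced in plain JavaScript. In this sketch, <code>FakePointer</code> and <code>samePointerValue</code> are hypothetical stand-ins, not js-ctypes API; they only illustrate why <code>==</code> fails and what a value comparison would have to look like.

```javascript
// Hypothetical stand-in for a pointer-valued CData object.
class FakePointer {
  constructor(address) {
    this.address = BigInt(address);
  }
}

// Hypothetical helper: compare two wrappers by the value they hold.
function samePointerValue(a, b) {
  return a.address === b.address;
}

const INVALID = new FakePointer(0xffffffffn);
const h = new FakePointer(0xffffffffn); // same value, different object

const identityEqual = h == INVALID;              // false: two distinct objects
const valueEqual = samePointerValue(h, INVALID); // true: same address value
```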
 
= 64-bit integer objects =
 
Since JavaScript numbers are floating-point values, they cannot precisely represent all 64-bit integer values. Therefore 64-bit and pointer-sized C/C++ values of numeric types do not autoconvert to JavaScript numbers. Instead they autoconvert to JavaScript objects of type <code>ctypes.Int64</code> and <code>ctypes.UInt64</code>.
 
<code>Int64</code> and <code>UInt64</code> objects are immutable.
 
It's not possible to do arithmetic on <code>Int64</code> and <code>UInt64</code> objects using the standard arithmetic operators, because JavaScript does not have operator overloading (yet). A few convenience functions are provided. (These types are intentionally feature-sparse so that they can be drop-in-replaced with a full-featured bignum type when JavaScript gets one.)
 
== Int64 ==
 
:'''<code>ctypes.Int64(''n'')</code>''' or '''<code>new ctypes.Int64(''n'')</code>''' - If ''n'' is an integer-valued number such that -2<sup>63</sup> &le; ''n'' &lt; 2<sup>63</sup>, return a sealed <code>Int64</code> object with that value. Otherwise if ''n'' is a string consisting of an optional minus sign followed by either decimal digits or <code>"0x"</code> or <code>"0X"</code> and hexadecimal digits, and the string represents a number within range, convert the string to an integer and construct an <code>Int64</code> object as above. Otherwise if ''n'' is an <code>Int64</code> or <code>UInt64</code> object, and represents a number within range, use the value to construct an <code>Int64</code> object as above. Otherwise throw a <code>TypeError</code>.
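
The number and string cases of this rule can be modeled with the standard <code>BigInt</code> type. The function <code>toInt64Value</code> is hypothetical: it returns the integer an <code>Int64</code> would hold rather than a real <code>Int64</code> object, and it leaves out the case where ''n'' is already an <code>Int64</code> or <code>UInt64</code> object.

```javascript
const INT64_MIN = -(2n ** 63n);
const INT64_MAX = 2n ** 63n - 1n;

// Hypothetical model of the Int64 constructor's argument handling.
function toInt64Value(n) {
  let value;
  if (typeof n === "number") {
    if (!Number.isInteger(n)) {
      throw new TypeError("not an integer-valued number");
    }
    value = BigInt(n);
  } else if (typeof n === "string") {
    // Optional minus sign, then decimal digits or 0x/0X plus hex digits.
    const m = /^(-?)(0[xX][0-9a-fA-F]+|[0-9]+)$/.exec(n);
    if (!m) {
      throw new TypeError("not a decimal or hexadecimal integer string");
    }
    // BigInt() parses "0x..." but rejects a minus sign in front of it,
    // so the sign is applied separately.
    value = BigInt(m[2]);
    if (m[1] === "-") value = -value;
  } else {
    throw new TypeError("unsupported argument type in this sketch");
  }
  if (value < INT64_MIN || value > INT64_MAX) {
    throw new TypeError("value out of range for Int64");
  }
  return value;
}
```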
 
<code>Int64</code> objects have the following methods:
 
:'''<code>''i64''.toString(''[radix]'')</code>''' - If ''radix'' is omitted, assume 10.  Return a string representation of ''i64'' in base ''radix'', consisting of a leading minus sign, if the value is negative, followed by one or more lowercase digits in base ''radix''.
 
:'''<code>''i64''.toSource()</code>''' - Return a string.  ''(This is provided for debugging purposes, and programs should not rely on details of the resulting string, which may change in the future.)''
 
The following functions are also provided:
 
:'''<code>ctypes.Int64.compare(''a'', ''b'')</code>''' - If ''a'' and ''b'' are both <code>Int64</code> objects, return <code>-1</code> if ''a'' &lt; ''b'', <code>0</code> if ''a'' = ''b'', and <code>1</code> if ''a'' &gt; ''b''. Otherwise throw a <code>TypeError</code>.
 
:'''<code>ctypes.Int64.lo(''a'')</code>''' - If ''a'' is an <code>Int64</code> object, return the low 32 bits of its value. (The result is an integer in the range 0 &le; ''result'' &lt; 2<sup>32</sup>.) Otherwise throw a <code>TypeError</code>.
 
:'''<code>ctypes.Int64.hi(''a'')</code>''' - If ''a'' is an <code>Int64</code> object, return the high 32 bits of its value (like <code>''a'' &gt;&gt; 32</code>). Otherwise throw a <code>TypeError</code>.
 
:'''<code>ctypes.Int64.join(''hi'', ''lo'')</code>''' - If ''hi'' is an integer-valued number in the range -2<sup>31</sup> &le; ''hi'' &lt; 2<sup>31</sup> and ''lo'' is an integer-valued number in the range 0 &le; ''lo'' &lt; 2<sup>32</sup>, return a sealed <code>Int64</code> object whose value is ''hi'' &times; 2<sup>32</sup> + ''lo''. Otherwise throw a <code>TypeError</code>.
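
The arithmetic behind <code>compare</code>, <code>lo</code>, <code>hi</code>, and <code>join</code> can be checked with <code>BigInt</code>. The functions below are hypothetical models that take and return <code>BigInt</code> values instead of <code>Int64</code> objects; they are meant to show that <code>join(hi(a), lo(a))</code> reproduces ''a'', including for negative values.

```javascript
// Hypothetical BigInt-based models of the Int64 helper functions.
function int64Compare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

function int64Lo(a) {
  // Low 32 bits as an unsigned number in the range 0 <= result < 2^32.
  return Number(BigInt.asUintN(32, a));
}

function int64Hi(a) {
  // High 32 bits, like a >> 32 with sign extension.
  return Number(a >> 32n);
}

function int64Join(hi, lo) {
  const hiOk = Number.isInteger(hi) && hi >= -(2 ** 31) && hi < 2 ** 31;
  const loOk = Number.isInteger(lo) && lo >= 0 && lo < 2 ** 32;
  if (!hiOk || !loOk) {
    throw new TypeError("hi or lo out of range");
  }
  return BigInt(hi) * 2n ** 32n + BigInt(lo);
}
```

For the value -1, this sketch gives <code>hi</code> = -1 and <code>lo</code> = 4294967295, and joining them yields -1 again.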
 
== UInt64 ==
 
<code>UInt64</code> objects are the same except that the ''hi'' values are in the range 0 &le; ''hi'' &lt; 2<sup>32</sup> and the <code>.toString()</code> method never produces a minus sign.
 
= Conversions =
 
These functions are not exactly JS functions or C/C++ functions. They're algorithms used elsewhere in the spec.
 
'''<code>ConvertToJS(''x'')</code>''' - This function is used to convert a <code>CData</code> object or a C/C++ return value to a JavaScript value. The intent is to return a simple JavaScript value whenever possible without loss of data or different behavior on different platforms, and a <code>CData</code> object otherwise. The precise rules are:
 
* If the type of ''x'' is <code>void</code>, return <code>undefined</code>.
 
* If the type of ''x'' is <code>bool</code>, return the corresponding JavaScript boolean.
 
* If ''x'' is of a number type but not a wrapped integer type, return the corresponding JavaScript number.
 
* If ''x'' is a signed wrapped integer type (<code>long</code>, <code>int64_t</code>, <code>ssize_t</code>, or <code>intptr_t</code>), return a <code>ctypes.Int64</code> object with value ''x''.
 
* If ''x'' is an unsigned wrapped integer type (<code>unsigned long</code>, <code>uint64_t</code>, <code>size_t</code>, or <code>uintptr_t</code>), return a <code>ctypes.UInt64</code> object with value ''x''.
 
* If ''x'' is of type <code>char16_t</code>, return a JavaScript string of length 1 containing the value of ''x'' (like <code>String.fromCharCode(x)</code>).
 
* If ''x'' is of any other character type, return the JavaScript number equal to its integer value. (This is sensitive to the signedness of the character type. Also, we assume no character types are so wide that they don't fit into a JavaScript number.)
 
* Otherwise ''x'' is of an array, struct, or pointer type. If the argument ''x'' is already a <code>CData</code> object, return it. Otherwise allocate a  buffer containing a copy of the C/C++ value ''x'', and return a <code>CData</code> object of the appropriate type referring to the object in the new buffer.
 
Note that null C/C++ pointers do not convert to the JavaScript <code>null</code> value.  ''(Open issue: Should we? Is there any value in retaining the type of a particular null pointer?)''
 
''(Arrays of characters do not convert to JavaScript strings. Rationale: Suppose <code>x</code> is a <code>CData</code> object of a struct type with a member <code>a</code> of type <code>char[10]</code>. Then <code>x.a[1]</code> should return the character in element 1 of the array, even if <code>x.a[0]</code> is a null character.  Likewise, <code>x.a[0] = '\0';</code> should modify the contents of the array. Both are possible only if <code>x.a</code> is a <code>CData</code> object of array type, not a JavaScript string.)''
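
The dispatch described by these rules can be sketched as a plain JavaScript function. Everything here is a hypothetical model: the <code>kind</code> strings, the <code>{ kind, value }</code> objects standing in for <code>Int64</code> and <code>UInt64</code>, and the <code>cdata</code> parameter standing in for an existing <code>CData</code> object are all assumptions made for illustration.

```javascript
// Hypothetical model of ConvertToJS. `type.kind` names the category of the
// C/C++ type, `raw` is the C/C++ value, and `cdata` is the CData object
// (if any) that already refers to it.
function convertToJS(type, raw, cdata) {
  switch (type.kind) {
    case "void":
      return undefined;
    case "bool":
      return Boolean(raw);
    case "number": // numeric types that fit a JavaScript number exactly
      return Number(raw);
    case "wrappedSigned": // long, int64_t, ssize_t, intptr_t
      return { kind: "Int64", value: BigInt(raw) };
    case "wrappedUnsigned": // unsigned long, uint64_t, size_t, uintptr_t
      return { kind: "UInt64", value: BigInt(raw) };
    case "char16":
      return String.fromCharCode(raw);
    case "char": // any other character type converts to its integer value
      return Number(raw);
    default: // array, struct, or pointer: the result stays a CData object
      return cdata;
  }
}
```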
 
<code>'''ImplicitConvert(''val'', ''t'')'''</code> - Convert the JavaScript value ''val'' to a C/C++ value of type ''t''.  This is called whenever a JavaScript value of any kind is passed to a parameter of a ctypes-declared function, passed to <code>''cdata''.value = ''val''</code>, or assigned to an array element or struct member, as in <code>''carray''[''i''] = ''val''</code> or <code>''cstruct''.''member'' = ''val''</code>.
 
This function is intended to lose precision only when there is no reasonable alternative. It generally does not coerce values of one type to another type.
 
C/C++ values of all supported types round trip through <code>ConvertToJS</code> and <code>ImplicitConvert</code> without any loss of data. That is, for any C/C++ value ''v'' of type ''t'', <code>ImplicitConvert(ConvertToJS(''v''),&nbsp;''t'')&nbsp;</code> produces a copy of ''v''.  ''(Note that not all JavaScript values can round-trip to C/C++ and back in an analogous way. JavaScript primitive numbers can round-trip to <code>double</code> on all current platforms, <code>Int64</code> objects to <code>int64_t</code>, JavaScript booleans to <code>bool</code>, and so on. But some JavaScript values, such as functions, cannot be <code>ImplicitConvert</code>ed to any C/C++ type without loss of data.)''
 
''t'' must not be <code>void</code> or an array type with unspecified length.  ''(Rationale: C/C++ variables and parameters cannot have such types. The parameter of a function declared <code>int f(int x[])</code> is <code>int *</code>, not <code>int[]</code>.)''
 
* First, if ''val'' is a <code>CData</code> object of type ''u'' and <code>SameType(''t'', ''u'')</code>, return the current value of the C/C++ object referred to by ''val''. Otherwise the behavior depends on the target type ''t''.
 
* If ''t'' is <code>ctypes.bool</code>:
:* If ''val'' is a boolean, return the corresponding C/C++ boolean value.
:* If ''val'' is the number +0 or -0, return <code>false</code>.
:* If ''val'' is the number 1, return <code>true</code>.
:* Otherwise fail.
 
* If ''t'' is a numeric type:
:* If ''val'' is a boolean, the result is a 0 or 1 of type ''t''.
:* If ''val'' is a <code>CData</code> object of a numeric type, and every value of that type is precisely representable in type ''t'', the result is a precise representation of the value of ''val'' in type ''t''.  (This is more conservative than the implicit integer conversions in C/C++ and more conservative than what we do if ''val'' is a JavaScript number. This is sensitive to the signedness of the two types.)
:* If ''val'' is a number that can be exactly represented as a value of type ''t'', the result is that value.
:* If ''val'' is an <code>Int64</code> or <code>UInt64</code> object whose value can be exactly represented as a value of type ''t'', the result is that value.
:* If ''val'' is a number and ''t'' is a floating-point type, the result is the <code>jsdouble</code> represented by ''val'', cast to type ''t''.  (This can implicitly lose bits of precision. The rationale is to allow the user to pass values like 1/3 to <code>float</code> parameters.)
:* Otherwise fail.
 
* If ''t'' is <code>ctypes.char16_t</code>:
:* If ''val'' is a string of length 1, the result is the 16-bit unsigned value of the code unit in the string, that is, <code>''val''.charCodeAt(0)</code>.
:* If ''val'' is a number that can be exactly represented as a value of type <code>char16_t</code> (that is, an integer in the range 0 &le; ''val'' &lt; 2<sup>16</sup>), the result is that value.
:* Otherwise fail.


:The <nowiki>[[Class]]</nowiki> of a ctypes reference is <code>"Reference"</code>.
* If ''t'' is any other character type:
:* If ''val'' is a string:
::* If the 16-bit elements of ''val'' are not the UTF-16 encoding of a single Unicode character, fail.  ''(Open issue: If we support <code>wchar_t</code> we may want to allow unpaired surrogate code points to pass through without error.)''
::* If that Unicode character can be represented by a single character of type ''t'', the result is that character. ''(Open issue: Unicode conversions.)''
::* Otherwise fail.
:* If ''val'' is a number that can be exactly represented as a value of type ''t'', the result is that value.  (This is sensitive to the signedness of ''t''.)
:* Otherwise fail.


:The prototype of a reference is the same as its type's <code>.prototype</code> property.
* If ''t'' is a pointer type:
:* If ''val'' is <code>null</code>, the result is a C/C++ <code>NULL</code> pointer of type ''t''.
:* If ''val'' is a <code>CData</code> object of array type ''u'' and either ''t'' is <code>ctypes.voidptr_t</code> or <code>SameType(''t''.targetType, ''u''.elementType)</code>, return a pointer to the first element of the array.
:* If ''t'' is <code>ctypes.voidptr_t</code> and ''val'' is a <code>CData</code> object of pointer type, return the value of the C/C++ pointer in ''val'', cast to <code>void *</code>.
:* Otherwise fail.  ''(Rationale: We don't convert strings to pointers yet; see the "Auto-converting strings" section below. We don't convert JavaScript arrays to pointers because this would have to allocate a C array implicitly, raising issues about who should deallocate it, and when, and how they know it's their responsibility.)''


''(Implementation notes: A ctypes reference is a JSObject; it has a reserved slot that points to its type, a reserved slot that points to the backing buffer, and a pointer to the actual referenced location within the buffer. Since the data pointer might not be aligned to 2 bytes, PRIVATE_TO_JSVAL is insufficient; a custom JSClass.trace hook will be needed. A ctypes buffer is a separate JSObject that has a pointer to a malloc'd buffer where the C++ data is stored. It doesn't have a pointer to a ctypes type. It has a finalizer that frees the buffer; references that point into the buffer keep the buffer alive, thanks to the reserved slot.)''
* If ''t'' is an array type:
:* If ''val'' is a JavaScript string:
::* If <code>''t''.elementType</code> is <code>char16_t</code> and <code>''t''.length &gt;= ''val''.length</code>, the result is an array of type ''t'' whose first <code>''val''.length</code> elements are the 16-bit elements of ''val''. If <code>''t''.length &gt; ''val''.length</code>, then element <code>''val''.length</code> of the result is a null character. The values of the rest of the array elements are unspecified.
::* If <code>''t''.elementType</code> is an 8-bit character type:
:::* If ''val'' is not well-formed UTF-16, fail.
:::* Let ''s'' = a sequence of bytes, the result of converting ''val'' from UTF-16 to UTF-8.
:::* Let ''n'' = the number of bytes in ''s''.
:::* If <code>''t''.length &lt; ''n''</code>, fail.
:::* The result is an array of type ''t'' whose first ''n'' elements are the 8-bit values in ''s''. If <code>''t''.length &gt; ''n''</code>, then element ''n'' of the result is 0. The values of the rest of the array elements are unspecified.
::* Otherwise fail.


== Pointers ==
:* If ''val'' is a JavaScript array object:
::* If <code>''val''.length</code> is not a nonnegative integer, fail.
::* If <code>''val''.length !== ''t''.length</code>, fail.
::* Otherwise, the result is a C/C++ array of <code>''val''.length</code> elements of type <code>''t''.elementType</code>. Element ''i'' of the result is <code>ImplicitConvert(''val''[''i''], ''t''.elementType)</code>.
:* Otherwise fail. ''(Rationale: The clause "If ''val'' is a JavaScript array object" requires some justification. If we allowed arbitrary JavaScript objects that resemble arrays, that would include CData objects of array type. Consequently, <code>arr1.value = arr2</code> where <code>arr1</code> is of type <code>ctypes.uint8_t.array(30)</code> and <code>arr2</code> is of type <code>ctypes.int.array(30)</code> would work as long as the values in <code>arr2</code> are small enough. We considered this conversion too astonishing and too error-prone.)''


js-ctypes pointers are very simple JavaScript objects that represent C/C++ pointers.  Like C/C++ pointers, js-ctypes pointers represent a memory address. They may point to valid memory, but they may also point off the end of an array, to memory that has been freed, to uninitialized or unmapped memory, or to data of a different type.
* Otherwise ''t'' is a struct type.
:* If ''val'' is a JavaScript object that is not a <code>CData</code> object:
::* If the enumerable own properties of ''val'' are exactly the names of the members of the struct ''t'', the result is a C/C++ struct of type ''t'', each of whose members is <code>ImplicitConvert(''val''[''the member name''], ''the type of the member'')</code>.
::* Otherwise fail.
:* Otherwise fail.
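
The <code>bool</code> and integer cases of <code>ImplicitConvert</code> can be modeled in a few lines of plain JavaScript. This is a minimal sketch under stated assumptions: the type descriptors <code>BOOL</code>, <code>INT32</code>, and <code>UINT8</code> are hypothetical objects, only primitive booleans and numbers are handled, and "fail" is modeled by throwing a <code>TypeError</code>.

```javascript
// Hypothetical type descriptors; integer types carry their inclusive range.
const BOOL = { kind: "bool" };
const INT32 = { kind: "int", min: -(2 ** 31), max: 2 ** 31 - 1 };
const UINT8 = { kind: "int", min: 0, max: 255 };

// Hypothetical model of the bool and integer rules of ImplicitConvert.
function implicitConvert(val, t) {
  if (t.kind === "bool") {
    if (typeof val === "boolean") return val;
    if (val === 0) return false; // `=== 0` matches both +0 and -0
    if (val === 1) return true;
    throw new TypeError("cannot implicitly convert to bool");
  }
  if (t.kind === "int") {
    if (typeof val === "boolean") return val ? 1 : 0;
    if (typeof val === "number" && Number.isInteger(val) &&
        val >= t.min && val <= t.max) {
      return val === 0 ? 0 : val; // normalize -0 to 0
    }
    throw new TypeError("value is not exactly representable");
  }
  throw new TypeError("type not covered by this sketch");
}
```

In this model, <code>implicitConvert(0x7fffffff + 1, INT32)</code> throws, which matches the <code>p.x++</code> behavior shown in the aliasing example.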


Like C/C++ pointers, js-ctypes pointers never protect the data they point to from garbage collection.
<code>'''ExplicitConvert(''val'', ''t'')'''</code> - Convert the JavaScript value ''val'' to a C/C++ value of type ''t'', a little more forcefully than <code>ImplicitConvert</code>.


It is hard to use (non-opaque) pointers safely, so js-ctypes is designed to support as many APIs as possible without requiring the use of pointers.  For example, if a C/C++ function takes a parameter that is a pointer to a struct, you can just pass it a struct, and ctypes will quietly take its address. ''(The implicit conversion rules will handle this.)''
This is called when a JavaScript value is passed as a parameter when calling a type, as in <code>''t''(''val'')</code> or <code>new ''t''(''val'')</code>.


These functions produce pointers:
* If <code>ImplicitConvert(''val'', ''t'')</code> succeeds, use that result. Otherwise:


:'''<code>ctypes.addressOf(''ref'')</code>''' - Return a pointer to the object referenced by ''ref''. If ''ref'' is not a ctypes reference, throw a <code>TypeError</code>.
* If ''t'' is <code>ctypes.bool</code>, the result is the C/C++ boolean value corresponding to <code>ToBoolean(''val'')</code>, where the operator <code>ToBoolean</code> is as defined in the ECMAScript standard.  ''(This is a bit less strict than the conversion behavior specified for numeric types below. This is just for convenience: the operators <code>&&</code> and <code>||</code>, which produce a boolean value in C/C++, do not always do so in JavaScript.)''


''(The rest of these strike me as targets of opportunity. Only certain unusual C APIs will need them.)''
* If ''t'' is an integer or character type and ''val'' is an infinity or NaN, the result is a 0 of type ''t''.


:'''<code>ctypes.addressOfField(''ref'', ''name'')</code>''' - Return a pointer to the named field of the struct referenced by ''ref''. If ''ref'' is not a reference to a struct, or the struct does not have a field with the given name, throw a <code>TypeError</code>.
* If ''t'' is an integer or character type and ''val'' is a finite number, the result is the same as casting the <code>jsdouble</code> value of ''val'' to type ''t'' with a C-style cast. ''(I think this basically means, start with ''val'', discard the fractional part, convert the integer part to a bit-pattern, and mask off whatever doesn't fit in type ''t''. But whatever C does is good enough for me. --jorendorff)''


:'''<code>ctypes.addressOfElement(''ref'', ''i'')</code>''' - Return a pointer to element ''i'' of the array referenced by ''ref''. If ''ref'' is not a reference to an array, or ''i'' is not a valid index into the array, throw a <code>TypeError</code>.
* If ''t'' is an integer or character type and ''val'' is an <code>Int64</code> or <code>UInt64</code> object, the result is the same as casting the <code>int64_t</code> or <code>uint64_t</code> value of ''val'' to type ''t'' with a C-style cast.


:'''<code>ctypes.castPointer(''t'', ''ptr'')</code>''' - Return a pointer of type ''t'' with the same bit-value as ''ptr''.  If ''t'' is not a pointer type or ''ptr'' is neither a pointer nor an integer, throw a <code>TypeError</code>.
* If ''t'' is a pointer type and ''val'' is a number, <code>Int64</code> object, or <code>UInt64</code> object that can be exactly represented as an <code>intptr_t</code> or <code>uintptr_t</code>, the result is the same as casting that <code>intptr_t</code> or <code>uintptr_t</code> value to type ''t'' with a C-style cast.


:'''<code>ctypes.pointerAdd(''ptr'', ''nelements'')</code>''' - Like the C expression <code>''ptr'' + ''nelements''</code>. Return a pointer of the same type as ''ptr'', adjusted by ''nelements'' * ''targetType.size'' bytes. If ''ptr'' is not a pointer or ''nelements'' is not an integer, throw a <code>TypeError</code>.
* If ''t'' is an integer type (not a character type) and ''val'' is a string consisting entirely of an optional minus sign, followed by either one or more decimal digits or the characters "0x" or "0X" and one or more hexadecimal digits, then the result is the same as casting the integer named by ''val'' to type ''t'' with a C-style cast.


In js-ctypes, as in C/C++, pointers are totally unchecked. There is no guaranteed-safe way to dereference a pointer. However, if the application knows that the pointer is valid, it can access the pointed-to data using this function:
* Otherwise fail.
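
The integer rules above can be approximated in plain JavaScript with <code>BigInt</code>. This is only a sketch: the helper names <code>parseIntegerString</code>, <code>castToIntegerType</code>, and <code>castNumber</code> are made up for illustration, and it assumes that a C-style cast to a narrower integer type keeps the low bits and wraps, which is what the parenthetical note above guesses rather than something this page specifies.

```javascript
// Hypothetical helpers; none of these names are part of js-ctypes.

// Accepts an optional minus sign followed by decimal digits,
// or by "0x"/"0X" and one or more hexadecimal digits.
const INTEGER_STRING = /^(-?)(?:(\d+)|0[xX]([0-9a-fA-F]+))$/;

function parseIntegerString(s) {
  const m = INTEGER_STRING.exec(s);
  if (m === null) throw new TypeError("not an integer string: " + s);
  const magnitude = m[2] !== undefined ? BigInt(m[2]) : BigInt("0x" + m[3]);
  return m[1] === "-" ? -magnitude : magnitude;
}

// Assumed cast semantics: keep the low `bits` bits, then reinterpret
// them as a signed or unsigned value.
function castToIntegerType(value, bits, signed) {
  return signed ? BigInt.asIntN(bits, value) : BigInt.asUintN(bits, value);
}

// Finite JS number: discard the fractional part, then cast as above.
function castNumber(val, bits, signed) {
  if (!Number.isFinite(val)) throw new TypeError("not a finite number");
  return castToIntegerType(BigInt(Math.trunc(val)), bits, signed);
}
```

Under these assumptions, <code>castToIntegerType(parseIntegerString("300"), 8, false)</code> wraps to 44, the value a <code>uint8_t</code> would hold.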


:'''<code>ctypes.pointerToUnsafeReference(''ptr'')</code>''' - If ''ptr'' is not a pointer, or is a null pointer, throw a <code>TypeError</code>. Otherwise return a reference of type ''t'' pointing to the same location as ''ptr''. The new reference is safe to use only as long as ''ptr'' is a valid pointer. Unlike ordinary references, unsafe references do not protect the referent from garbage collection.
'''<code>SameType(''t'', ''u'')</code>''' - True if ''t'' and ''u'' represent the same C/C++ type.
*If ''t'' and ''u'' represent the same built-in type, even <code>void</code>, return true.
*If they are both pointer types, return <code>SameType(''t''.targetType, ''u''.targetType)</code>.
*If they are both array types, return <code>SameType(''t''.elementType, ''u''.elementType) &amp;&amp; ''t''.length === ''u''.length</code>.
*If they are both struct types, return <code>''t'' === ''u''</code>.
*Otherwise return false.
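
The <code>SameType</code> rules above translate almost directly into a recursive function. The sketch below uses plain objects as hypothetical stand-ins for <code>CType</code> objects (the <code>kind</code> field and the constructor helpers are assumptions, not part of the proposal), and it leaves out opaque pointer types, which the rules do not address.

```javascript
// Hypothetical type descriptors standing in for CType objects.
const builtin = (name) => ({ kind: "builtin", name });
const pointerTo = (targetType) => ({ kind: "pointer", targetType });
const arrayOf = (elementType, length) => ({ kind: "array", elementType, length });
const struct = (name) => ({ kind: "struct", name });

function sameType(t, u) {
  if (t.kind !== u.kind) return false;
  switch (t.kind) {
    case "builtin":
      // Built-in types match by name only, so int and int32_t differ.
      return t.name === u.name;
    case "pointer":
      return sameType(t.targetType, u.targetType);
    case "array":
      return sameType(t.elementType, u.elementType) && t.length === u.length;
    case "struct":
      // Struct types match only if they are the very same object.
      return t === u;
    default:
      return false;
  }
}
```

Comparing built-in types by name keeps the result identical on every platform, which is the trade-off the rationale for <code>SameType(int, int32_t)</code> describes.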


Pointers have the following method:
''(<code>SameType(int, int32_t)</code> is false. Rationale: As it stands, <code>SameType</code> behaves the same on all platforms. By making types match if they are typedef'd on the current platform, we could make e.g. <code>ctypes.int.ptr</code> and <code>ctypes.int32_t.ptr</code> compatible on platforms where we just have <code>typedef int int32_t</code>. But it was unclear how much that would matter in practice, balanced against cross-platform consistency. We might reverse this decision.)''


:'''<code>''ptr''.toString()</code>''' - Return a string of the form <code>"(''type'') 0x''hexdigits''"</code> where ''type'' is the name of ''ptr''<nowiki>'</nowiki>s target type and ''hexdigits'' consists of lowercase hexadecimal digits and is exactly 8 characters on 32-bit platforms and 16 characters on 64-bit platforms.
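
The fixed-width formatting described for <code>''ptr''.toString()</code> can be sketched as follows. The function name <code>formatPointer</code> and its parameters are hypothetical; the address is passed as a <code>BigInt</code> so that 64-bit values keep full precision.

```javascript
// Hypothetical helper, not part of js-ctypes.
// typeName:    name of the pointer's target type
// address:     the pointer value as a BigInt
// pointerBits: 32 or 64, depending on the platform
function formatPointer(typeName, address, pointerBits) {
  // BigInt.prototype.toString(16) produces lowercase hex digits.
  const digits = address.toString(16).padStart(pointerBits / 4, "0");
  return "(" + typeName + ") 0x" + digits;
}
```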
= Examples =
Cu.import("ctypes"); // imports the global ctypes object
// searches the path and opens "libmylib.so" on linux,
// "libmylib.dylib" on mac, and "mylib.dll" on windows
let mylib = ctypes.open("mylib", ctypes.SEARCH);
// declares the C function:
//    int32_t myfunc(int32_t);
let myfunc = mylib.declare("myfunc", ctypes.default_abi,
    ctypes.int32_t, ctypes.int32_t);
let ret = myfunc(2); // calls myfunc


Minutiae: The <nowiki>[[Class]]</nowiki> of a pointer is <code>"Pointer"</code>. <code>ctypes.Pointer</code> is a function that takes two arguments, a pointer type and a pointer or number, and returns a new pointer object. Its <code>.prototype</code> property is read-only. <code>ctypes.Pointer.prototype</code> is a pointer. Its value is <code>NULL</code>. <code>ctypes.Pointer.prototype.constructor === ctypes.Pointer</code>. The prototype of <code>ctypes.Pointer.prototype</code> is <code>Object.prototype</code>. The prototype of every other pointer is <code>ctypes.Pointer.prototype</code>.
Note that for simple types (integers and characters), we will autoconvert the argument at call time - there's no need to pass in a <code>ctypes.int32_t</code> object. The consumer should never need to instantiate such an object explicitly, unless they're using it to back a pointer - in which case we require explicit, strong typing. See later for examples.


== Conversions ==
Here is how to create an object of type <code>int32_t</code>:


The '''implicit conversion rules''' are applied whenever a JavaScript value of any kind is passed to a parameter of a ctypes-declared function, passed to <code>''ref''.assign(''val'')</code>, or assigned to an array element or struct member via a reference, as in <code>''arrayref''[''i''] = ''val''</code> or <code>''structref''.''member'' = ''val''</code>. These rules are intended to lose precision only when there is no reasonable alternative. They generally do not coerce values of one type to another type.
let i = new ctypes.int32_t; // new int32_t object with default value 0


''(TODO: precise rules.)''
This allocates a new C++ object of type <code>int32_t</code> (4 bytes of memory), zeroes it out, and returns a JS object that manages the allocated memory. Whenever the JS object is garbage-collected, the allocated memory will be automatically freed.


The '''explicit conversion rules''' are applied when a JavaScript value is passed as a parameter when calling a type, as in <code>''t''(''val'')</code> or <code>new ''t''(''val'')</code>. These rules are a bit more aggressive.
Of course you don't normally need to do this, as js-ctypes will autoconvert JS numbers to various C/C++ types for you:


''(TODO: precise rules.)''
let myfunc = mylib.declare("myfunc", ctypes.default_abi,
    ctypes.int32_t, ctypes.int32_t);
let ret = myfunc(i);
print(typeof ret); // The result is a JavaScript number.
'''number'''


'''<code>ConvertToJS(''x'')</code>''' - This function is used to convert a ctypes reference or a C++ return value to a JavaScript value. The intent is to return a primitive value or ctypes pointer whenever possible, and a ctypes reference otherwise. The precise rules are:
<code>ctypes.int32_t</code> is a <code>CType</code>. Like all other CTypes, it can be used for type specification when passed as an object, as above. (This will work for user-defined <code>CTypes</code> such as structs and pointers also - see later.)


* If the value is of type <code>void</code>, return <code>undefined</code>.
The object created by <code>new ctypes.int32_t</code> is called a <code>CData</code> object, and they are described in detail in the "<code>CData</code> objects" section above.


* If the value is of type <code>bool</code>, return the corresponding JavaScript boolean.
Opaque pointers:


* If the value is of a number type, return the corresponding JavaScript number. (In the case of 64-bit integer types, this can result in a loss of precision.)
// A new opaque pointer type.
FILE_ptr = new ctypes.StructType("FILE").ptr;
let fopen = mylib.declare("fopen", ctypes.default_abi,
    FILE_ptr, ctypes.char.ptr, ctypes.char.ptr);
let file = fopen("foo", "r");
if (file.isNull())
    throw "fopen failed";
file.contents(); // TypeError: type is unknown


* If the value is a null pointer (of a pointer or string type), return <code>null</code>.
''(Open issue: <code>fopen("foo", "r")</code> does not work under js-ctypes as currently specified.)''


* If the value is of a string type and non-null, return a JavaScript string.
Declaring a struct:


* If the value is of any other pointer type and non-null, return a ctypes pointer with the appropriate type and value.
// C prototype: struct s_t { int32_t a; int64_t b; };
const s_t = new ctypes.StructType("s_t", [{ a: ctypes.int32_t }, { b: ctypes.int64_t }]);
let myfunc = mylib.declare("myfunc", ctypes.default_abi, ctypes.int32_t, s_t);
let s = new s_t(10, 20);


* Otherwise the value is of an array or struct type. If the argument ''x'' is a ctypes reference, return it. Otherwise allocate a new buffer of the appropriate size, populate it with the C++ value ''x'', and return a ctypes reference to the complete object in the new buffer.
This creates an s_t object which allocates enough memory for the whole struct, creates getters and setters to access the binary fields via their offset, and assigns the values 10 and 20 to the fields. The new object's prototype is <code>s_t.prototype</code>.


== Examples ==
let i = myfunc(0, s); // checks the type of s


Basic types:
Nested structs:


  let i = new ctypes.uint32_t(5); // allocate sizeof(uint32_t) bytes, initialize to 5, and return a ctypes reference
  const u_t = ctypes.StructType("u_t", [{ x: ctypes.int64_t }, { y: s_t }]);
  const setint = ctypes.declare("setint", ctypes.abi.default, ctypes.void_t, ctypes.PointerType(ctypes.uint32_t));
  let u = new u_t(5e4, s); // copies data from s into u.y - no references
  setint(i); // implicitly passes the address of allocated buffer
   
let u_field = u.y; // creates an s_t object that points directly to
                    // the offset of u.y within u.


  const getintp = ctypes.declare("getintp", ctypes.abi.default, ctypes.PointerType(ctypes.uint32_t));
An out parameter:
  let p = getintp(); // creates a ctypes pointer that holds the returned address
 
  let q = ctypes.castPointer(ctypes.Pointer(ctypes.uint8_t), p); // cast to uint8_t... why isn't this a method on Pointer?
// allocate sizeof(uint32_t)==4 bytes,
  let k = ctypes.pointerToUnsafeReference(q); // likewise?
// initialize to 5, and return a new CData object
let i = new ctypes.uint32_t(5);
// Declare a C function with an out parameter.
  const getint = mylib.declare("getint", ctypes.default_abi,
    ctypes.void_t, ctypes.uint32_t.ptr);
getint(i.address()); // explicitly take the address of allocated buffer
 
(Python ctypes has <code>byref(i)</code> as an alternative to <code>i.address()</code>, but we do not expect users to do the equivalent of <code>from ctypes import *</code>, and <code>setint(ctypes.byref(i))</code> is a bit much.)
 
Pointers:
 
// Declare a C function that returns a pointer.
const getintp = mylib.declare("getintp", ctypes.default_abi,
    ctypes.uint32_t.ptr);
  let p = getintp(); // A CData object that holds the returned uint32_t *
// cast from (uint32_t *) to (uint8_t *)
  let q = ctypes.cast(p, ctypes.uint8_t.ptr);
// first byte of buffer
  let b0 = q.contents(); // an integer, 0 <= b0 < 256


Struct fields:


  const u_t = new ctypes.StructType('u_t', [[ctypes.uint32_t, 'x'], [ctypes.uint32_t, 'y']]);
  const u_t = new ctypes.StructType('u_t',
  let u = new u_t(5, 10); // allocates sizeof(2*uint32_t) and creates ctypes reference
    [[ctypes.uint32_t, 'x'], [ctypes.uint32_t, 'y']]);
// allocates sizeof(2*uint32_t) and creates a CData object
  let u = new u_t(5, 10);
  u.x = 7; // setter for u.x modifies field
  let i = u.y; // getter for u.y returns ConvertToJS(reference to u.y) -> primitive value 10
  let i = u.y; // getter for u.y returns ConvertToJS(reference to u.y)
print(i);    // ...which is the primitive number 10
'''10'''
  i = 5; // doesn't touch u.y
 
print(u.y);
  const v_t = new ctypes.StructType('v_t', [[u_t, 'u'], [ctypes.uint32_t, 'z']]);
'''10'''
  let w = v.u; // ConvertToJS(reference to v.u) returns reference
  const v_t = new ctypes.StructType('v_t',
    [[u_t, 'u'], [ctypes.uint32_t, 'z']]);
// allocates 12 bytes, zeroes them out, and creates a CData object
let v = new v_t;
  let w = v.u; // ConvertToJS(reference to v.u) returns CData object
  w.x = 3; // invokes setter
  setint(v.u.x); // TypeError - primitive is not a reference or pointer
  setint(v.u.x); // TypeError: setint argument 1 expects type uint32_t *, got int
  let p = ctypes.addressOfField(v.u, 'x'); // pointer to v.u.x
  let p = v.u.addressOfField('x'); // pointer to v.u.x
  setint(p); // ok - manually pass address
let q = v.u.addressOfField('x'); // abbreviated syntax?


64-bit integers: (check me!)
64-bit integers:
 
// Declare a function that returns a 64-bit unsigned int.
const getfilesize = mylib.declare("getfilesize", ctypes.default_abi,
    ctypes.uint64_t, ctypes.char.ptr);
// This autoconverts to a UInt64 object, not a JS number, even though the
// file is presumably much smaller than 4GiB. Converting to a different type
// each time you call the function, depending on the result value, would be
// worse.
let s = getfilesize("/usr/share/dict/words");
print(s instanceof ctypes.UInt64);
'''true'''
print(s < 1000000);    // Because s is an object, not a number,
'''false'''            // JS lies to you.
print(s >= 1000000);  // Neither of these is doing what you want,
'''false'''            // as evidenced by the bizarre answers.
print(s);              // It has a nice .toString() method at least!
'''931467'''
// There is no shortcut. To get an actual JS number out of a
// 64-bit integer, you have to use the ctypes.{Int64,UInt64}.{hi,lo}
// functions.
print(ctypes.UInt64.lo(s))
'''931467'''
// (OK, I lied. There is a shortcut. You can abuse the .toString() method.
// WARNING: This can lose precision!)
print(Number(s.toString()))
'''931467'''
let i = new ctypes.int64_t(5);  // a new 8-byte buffer
let j = i;  // another variable referring to the same CData object
j.value = 6; // invokes setter on i, auto-promotes 6 to Int64
print(typeof j.value)  // but j.value is still an Int64 object
'''object'''
print(j.value instanceof ctypes.Int64)
'''true'''
print(j.value);
'''6'''
const m_t = new ctypes.StructType(
    'm_t', [[ctypes.int64_t, 'x'], [ctypes.int64_t, 'y']]);
let m = new m_t;
const getint64 = mylib.declare("getint64", ctypes.default_abi,
    ctypes.void_t, ctypes.int64_t.ptr);
getint64(m.x); // TypeError: getint64 argument 1 expected type int64_t *,
                // got Int64 object
                // (because m.x's getter autoconverts to an Int64 object)
getint64(ctypes.addressOfField(m, 'x')); // works
 
''(Open issue: As above, the implicit conversion from JS string to <code>char *</code> in <code>getfilesize("/usr/share/dict/words")</code> does not work in js-ctypes as specified.)''
 
''(TODO - make this a real example:)''
let i1 = ctypes.int32_t(5);
let i2 = ctypes.int32_t();
i2.value = i1  // i2 and i1 have separate binary storage, this is memcpy
//you can copy the guts of one struct to another, etc.
 
=Future directions=
 
==Callbacks==
 
The libffi part of this is presumably not too bad. Issues:
 
'''Lifetimes.''' C/C++ makes it impossible to track an object pointer. Both JavaScript's GC and experience with C/C++ function pointers will tend to discourage users from caring about function lifetimes.
 
I think the best solution to this problem is to put the burden of keeping the function alive entirely on the client.
 
'''Finding the right context to use.''' If we burn the cx right into the libffi closure, it will crash when called from a different thread or after the cx is destroyed. If we take a context at random from some internal JSAPI structure, it might be thread-safe, but the context's options and global will be random, which sounds dangerous. Perhaps ctypes itself can create a context per thread, on demand, for the use of function pointers. In a typical application, that would only create one context, if any.
 
==Converting strings==
 
I think we want an explicit API for converting strings, very roughly:
 
<code>CData</code> objects of certain pointer and array types have methods for reading and writing Unicode strings. These methods are present if the target or element type is an 8-bit character or integer type.
 
'''<code>''cdata''.readString(''[encoding[, length]]'')</code>''' - Read bytes from ''cdata'' and convert them to Unicode characters using the specified ''encoding'', returning a string. Specifically:
* If ''cdata'' is an array, let ''p'' = a pointer to the first element. Otherwise ''cdata'' is a pointer; let ''p'' = the value of ''cdata''.
* If ''encoding'' is <code>undefined</code> or omitted, the selected encoding is UTF-8. Otherwise, if ''encoding'' is a string naming a known character encoding, that encoding is selected. Otherwise throw a <code>TypeError</code>.
* If ''length'' is a size value, ''cdata'' is an array, and <code>''length'' &gt; ''cdata''.length</code>, then throw a <code>TypeError</code>.
* Otherwise, if ''length'' is a size value, take exactly ''length'' bytes starting at ''p'' and convert them to Unicode characters according to the selected encoding. ''(Open issue: Error handling.)'' Return a JavaScript string containing the Unicode characters, represented in UTF-16.  ''(The result may contain null characters.)''
* Otherwise, if ''length'' is <code>undefined</code> or omitted, convert bytes starting at ''p'' to Unicode characters according to the selected encoding. Stop when the end of the array is reached (if ''cdata'' is an array) or when a null character (U+0000) is found. ''(Open issue: Error handling.)'' Return a JavaScript string containing the Unicode characters, represented in UTF-16.  ''(If ''cdata'' is a pointer and there is no trailing null character, this can crash.)''
* Otherwise throw a <code>TypeError</code>.
 
'''<code>''cdata''.writeString(''s'', ''[encoding[, length]]'')</code>''' - Determine the starting pointer ''p'' as above. If ''s'' is not a well-formed UTF-16 string, throw a <code>TypeError</code>.  ''(Open issue: Error handling.)'' Otherwise convert ''s'' to bytes in the specified ''encoding'' (default: UTF-8) and write at most ''length'' - 1 bytes, or all the converted bytes, if ''length'' is <code>undefined</code> or omitted, to memory starting at ''p''. Write a converted null character after the data. Return the number of bytes of data written, not counting the terminating null character.
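
The behavior proposed for <code>readString</code> and <code>writeString</code> can be imitated on an ordinary <code>Uint8Array</code> using the standard <code>TextDecoder</code> and <code>TextEncoder</code> classes. This is a rough sketch under several assumptions: the buffer stands in for a <code>CData</code> array, only UTF-8 is handled, the well-formedness check on ''s'' is skipped, a supplied ''length'' for writing must be at least 1, and an overflow raises a <code>RangeError</code> instead of corrupting memory.

```javascript
// Hypothetical free functions modelling the proposed methods on a Uint8Array.

function readString(bytes, length) {
  let end;
  if (length === undefined) {
    // Stop at the first null byte, or at the end of the array.
    const nul = bytes.indexOf(0);
    end = nul === -1 ? bytes.length : nul;
  } else {
    if (!Number.isInteger(length) || length < 0 || length > bytes.length) {
      throw new TypeError("length is not a valid size for this array");
    }
    end = length;
  }
  return new TextDecoder("utf-8").decode(bytes.subarray(0, end));
}

function writeString(bytes, s, length) {
  if (length !== undefined && (!Number.isInteger(length) || length < 1)) {
    throw new TypeError("length must be a positive integer");
  }
  const encoded = new TextEncoder().encode(s);
  // Write at most length - 1 data bytes, leaving room for the terminator.
  const count = length === undefined
    ? encoded.length
    : Math.min(encoded.length, length - 1);
  if (count + 1 > bytes.length) {
    throw new RangeError("target array is too small");
  }
  bytes.set(encoded.subarray(0, count));
  bytes[count] = 0; // terminating null byte
  return count;     // number of data bytes, not counting the terminator
}
```

Note that truncating at ''length'' - 1 bytes can cut a multi-byte UTF-8 sequence in half, which is one concrete form of the error-handling gap raised in the open issues.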
 
''(Open issue: ''<code>''cdata''.writeString(...)</code>'' is awkward for the case where you want an autosized <code>ctypes.char.array()</code> to hold the converted data. If <code>''cdata''</code> happens to be too small for the resulting string, and you don't supply ''length'', you crash; and if you do supply ''length'', you don't know whether conversion was halted because the target array was of insufficient length.)''
 
''(Open issue: As proposed, these are not suitable for working with encodings where a zero byte might not indicate the end of text. For example, a string encoded in UTF-16 will typically contain a lot of zero bytes. Unfortunately, in the case of readString, the underlying library demands the length up front.)''
 
''(Open issue: These methods offer no error handling options, which is pretty weak. Real-world code often wants to allow a few characters to be garbled rather than fail. For now we will likely be limited to whatever the underlying codec library, <code>nsIScriptableUnicodeConverter</code>, can do.)''
 
''(Open issue: 16-bit versions too, for UTF-16?)''
 
==isNull==
 
If we do not convert NULL pointers to JS <code>null</code> (and I may have changed my mind about this) then we need:
 
'''<code>''cptr''.isNull()</code>''' - Return <code>true</code> if ''cptr''<nowiki>'</nowiki>s value is a null pointer, <code>false</code> otherwise.
 
==Auto-converting strings==
 
There are several issues:
 
'''Lifetimes.''' This problem arises when autoconverting from JS to C/C++ only.
 
When passing a string to a foreign function, like <code>foo(s)</code>, what is the lifetime of the autoconverted pointer? We're comfortable with guaranteeing <code>s</code> for the duration of the call. But then there are situations like
 
TenStrings = char.ptr.array(10);
var arr = new TenStrings();
arr[0] = s;  // What is the lifetime of the data arr[0] points to?
 
The more implicit conversion we allow, the greater a problem this is; it's a tough trade-off.
 
'''Non-null-terminated strings.''' This problem arises when autoconverting from C/C++ to JS only. It applies to C/C++ character arrays as well as pointers (but it's worse when dealing with pointers).
 
In C/C++, the type <code>char *</code> effectively promises nothing about the pointed-to data. Autoconverting would make it hard to use APIs that return non-null-terminated strings (or structs containing <code>char *</code> pointers that aren't logically strings). The workaround would be to declare them as a different type.
 
'''Unicode.''' This problem does not apply to conversions between JS strings and <code>char16_t</code> arrays or pointers; only <code>char</code> arrays or pointers.
 
Converting both ways raises issues about what encoding should be assumed. We assume JS strings are UTF-16 and <code>char</code> strings are UTF-8, which is not the right thing on Windows. However Windows offers a lot of APIs that accept 16-bit strings and, for those, <code>char16_t</code> is the right thing.


// want to represent 64-bit ints as references always, rather than
'''Casting away const.''' This problem arises only when converting from a JS string to a C/C++ pointer type. The string data must not be modified, but the C/C++ types <code>char *</code> and <code>char16_t *</code> suggest that the referent might be modified.
// autoconverting to an int/double primitive, to avoid loss of precision.
// use the same behavior for size_t and ptrdiff_t.
let i = new ctypes.int64_t(5);
let j = i;
j = 6; // invokes setter on i
const m_t = new ctypes.StructType('m_t', [[ctypes.int64_t, 'x']]);
let m = new m_t(7);
const setint64 = ctypes.declare("setint64", ctypes.abi.default, ctypes.void_t, ctypes.Pointer(ctypes.int64_t));
setint64(m.x); // ok - unlike int32_t case, ConvertToJS returns a reference to the field m.x
  setint64(ctypes.addressOfField(m, 'x')); // also works, per int32_t case

Latest revision as of 04:23, 30 September 2014

js-ctypes is a library for calling C/C++ functions from JavaScript without having to write or generate any C/C++ "glue code".

js-ctypes is already in mozilla-central, but the API is subject to change. This page contains design proposals for the eventual js-ctypes API.

Libraries

ctypes.open(name) - Open a library. (TODO: all the details) This always returns a Library object or throws an exception.

Library objects have the following methods:

lib.declare(name, abi, rtype, [argtype1, ...]) - Declare a function. (TODO: all the details) This always returns a new callable CData object representing a function pointer to name, or throws an exception.
If rtype is an array type, this throws a TypeError.
If any argtypeN is an array type, the result is the same as if it had been the corresponding pointer type, argtypeN.elementType.ptr. (Rationale: This is how C and C++ treat array types in function declarations.)

(TODO: Explain what happens when you call a declared function. In brief: It uses ImplicitConvert to convert the JavaScript arguments to C and ConvertToJS to convert the return value to JS.)

Types

Types map JS values to C/C++ values and vice versa. They're used when declaring functions. They can also be used to create and populate C/C++ data structures entirely from JS.

(Types and their prototypes are extensible: scripts can add new properties to them. Rationale: This is how most JavaScript constructors behave.)

Built-in types

ctypes provides the following types:

ctypes.int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t, float32_t, float64_t - Primitive numeric types that behave the same way on all platforms (with the usual caveat that every platform has slightly different floating-point behavior, in corner cases, and there's a limit to what we can realistically do about it).
Since some 64-bit values are outside the range of the JavaScript number type, ctypes.int64_t and ctypes.uint64_t do not autoconvert to JavaScript numbers. Instead, they convert to objects of the wrapper types ctypes.Int64 and ctypes.UInt64 (which are JavaScript object types, not CTypes). See "64-bit integer objects" below.
ctypes.size_t, ssize_t, intptr_t, uintptr_t - Primitive types whose size depends on the platform. (These types do not autoconvert to JavaScript numbers. Instead they convert to wrapper objects, even on 32-bit platforms. See "64-bit integer objects" below. Rationale: On 64-bit platforms, there are values of these types that cannot be precisely represented as JS numbers. It will be easier to write code that works on multiple platforms if the builtin types autoconvert in the same way on all platforms.)
ctypes.bool, short, unsigned_short, int, unsigned, unsigned_int, long, unsigned_long, float, double - Types that behave like the corresponding C types. As in C, unsigned is always an alias for unsigned_int.
(ctypes.long and ctypes.unsigned_long autoconvert to 64-bit integer objects on all platforms. The rest autoconvert to JavaScript numbers. Rationale: Some platforms have 64-bit long and some do not.)
ctypes.char, ctypes.signed_char, ctypes.unsigned_char - Character types that behave like the corresponding C types. (These are very much like int8_t and uint8_t, but they differ in some details of conversion. For example, ctypes.char.array(30)(str) converts the string str to UTF-8 and returns a new CData object of array type.)
ctypes.char16_t - A 16-bit unsigned character type representing a UTF-16 code unit. (This is distinct from uint16_t in details of conversion behavior. js-ctypes autoconverts C char16_ts to JavaScript strings of length 1.) For backwards compatibility, ctypes.jschar is an alias for char16_t.
ctypes.void_t - The special C type void. This can be used as a return value type. (void is a keyword in JavaScript.)
ctypes.voidptr_t - The C type void *.

The wrapped integer types are the types int64_t, uint64_t, size_t, ssize_t, intptr_t, uintptr_t, long, and unsigned_long. These are the types that autoconvert to 64-bit integer objects rather than to primitive JavaScript numbers.
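
The reason these types do not autoconvert to primitive numbers can be seen in plain JavaScript. The sketch below uses BigInt to stand in for a 64-bit C value; the helpers hi32 and lo32 are hypothetical names showing how such a value can be split into two halves that each fit in a JavaScript number.

```javascript
// 2^53 + 1 fits easily in a uint64_t but not in a JavaScript number.
const big = 9007199254740993n;
const roundTripped = BigInt(Number(big)); // the low bit is lost on the way

// Hypothetical helpers: split a 64-bit value into 32-bit halves,
// each of which is exactly representable as a JavaScript number.
function hi32(v) {
  return Number(BigInt.asUintN(64, v) >> 32n);
}

function lo32(v) {
  return Number(BigInt.asUintN(64, v) & 0xffffffffn);
}

console.log(roundTripped === big); // false
console.log(hi32(big), lo32(big)); // 2097152 1
```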

User-defined types

Starting from the builtin types above, these functions can be used to create additional types:

new ctypes.PointerType(t) - If t is a CType, return the type "pointer to t". The result is cached so that future requests for this pointer type produce the same CType object. If t is a string, instead return a new opaque pointer type named t. Otherwise throw a TypeError.
new ctypes.FunctionType(abi, rt, [ at1, ... ]) - Return a function pointer CType corresponding to the C type rt (*) (at1, ...), where abi is a ctypes ABI type and rt and at1, ... are CTypes. Otherwise throw a TypeError.
new ctypes.ArrayType(t) - Return an array type with unspecified length and element type t. If t is not a type or t.size is undefined, throw a TypeError.
new ctypes.ArrayType(t, n) - Return the array type t[n]. If t is not a type or t.size is undefined or n is not a size value (defined below), throw a TypeError. If the size of the resulting array type, in bytes, would not be exactly representable both as a size_t and as a JavaScript number, throw a RangeError.
A size value is either a non-negative, integer-valued primitive number, an Int64 object with a non-negative value, or a UInt64 object.
(Array types with 0 elements are allowed. Rationale: C/C++ allow them, and it is convenient to be able to pass an array to a foreign function, and have it autoconverted to a C array, without worrying about the special case where the array is empty.)
new ctypes.StructType(name, fields) - Create a new struct type with the given name and fields. fields is an array of field descriptors, of the format
[ { field1: type1 }, { field2: type2 }, ... ]
where fieldn is a string denoting the name of the field, and typen is a ctypes type. js-ctypes calculates the offsets of the fields from its encyclopedic knowledge of the architecture's struct layout rules. If name is not a string, or any typen is such that typen.size is undefined, throw a TypeError. If the size of the struct, in bytes, would not be exactly representable both as a size_t and as a JavaScript number, throw a RangeError.

(Open issue: Specify a way to tell ctypes.StructType to use #pragma pack(n).)

These constructors behave exactly the same way when called without the new keyword.
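
As an illustration of the field-offset calculation that ctypes.StructType performs, here is a minimal sketch. It assumes the caller supplies each field's size and alignment, and it applies the common rule of rounding each offset up to the field's alignment; real ABIs differ in the alignment they assign to each type (for example, 64-bit integers on some 32-bit platforms), so the numbers are illustrative only. The name layoutStruct is hypothetical.

```javascript
// Hypothetical helper, not part of js-ctypes.
// fields: array of { name, size, align } descriptors, in declaration order.
function layoutStruct(fields) {
  const roundUp = (n, align) => Math.ceil(n / align) * align;
  const offsets = {};
  let offset = 0;
  let maxAlign = 1;
  for (const { name, size, align } of fields) {
    offset = roundUp(offset, align); // insert padding before the field
    offsets[name] = offset;
    offset += size;
    maxAlign = Math.max(maxAlign, align);
  }
  // Tail padding so that arrays of the struct stay aligned.
  return { offsets, size: roundUp(offset, maxAlign), align: maxAlign };
}

// struct s_t { int32_t a; int64_t b; }, assuming int64_t is 8-byte aligned
const s_t = layoutStruct([
  { name: "a", size: 4, align: 4 },
  { name: "b", size: 8, align: 8 },
]);
console.log(s_t.offsets.b, s_t.size); // 8 16
```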

Examples:

const DWORD = ctypes.uint32_t;
const HANDLE = new ctypes.PointerType("HANDLE");
const HANDLES = new ctypes.ArrayType(HANDLE);
const FILE = new ctypes.StructType("FILE").ptr;
const IOBuf = new ctypes.ArrayType(ctypes.uint8_t, 4096);

const struct_tm = new ctypes.StructType('tm', [{'tm_sec': ctypes.int}, ...]);

const comparator_t = new ctypes.FunctionType(ctypes.default_abi, ctypes.int, [ ctypes.voidptr_t, ctypes.voidptr_t ]);
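
The "size value" test and the RangeError condition described for ctypes.ArrayType(t, n) can be sketched as below. The function names are hypothetical, only primitive numbers are accepted as size values (Int64 and UInt64 objects are left out), and the width of size_t is passed in explicitly because it depends on the platform.

```javascript
// Hypothetical helpers, not part of js-ctypes.

// A non-negative, integer-valued primitive number.
function isSizeValue(v) {
  return typeof v === "number" && Number.isInteger(v) && v >= 0;
}

// Size in bytes of an array of n elements, or a RangeError if that size
// is not exactly representable both as a size_t and as a JS number.
function arrayByteSize(elementSize, n, sizeTBits) {
  if (!isSizeValue(n)) throw new TypeError("n is not a size value");
  const total = BigInt(elementSize) * BigInt(n);
  const fitsSizeT = total < (1n << BigInt(sizeTBits));
  const fitsNumber = BigInt(Number(total)) === total;
  if (!fitsSizeT || !fitsNumber) {
    throw new RangeError("array size is not representable");
  }
  return Number(total);
}

console.log(arrayByteSize(1, 4096, 32)); // 4096, as for IOBuf above
```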

Properties of types

All the fields described here are read-only.

All types have these properties and methods:

t.size - The C/C++ sizeof the type, in bytes. The result is a primitive number, not a UInt64 object.
If t is an array type with unspecified length, t.size is undefined.
ctypes.void_t.size is undefined.
t.name - A string, the type's name. It's intended that in ordinary use, this will be a C/C++ type expression, but it's not really meant to be machine-readable in all cases.
For primitive types this is just the name of the corresponding C/C++ type.
For struct types and opaque pointer types, this is simply the string that was passed to the constructor. For other function, pointer, and array types this should try to generate valid C/C++ type expressions, which isn't exactly trivial.
(Open issue: This conflicts with the usual meaning of .name for functions, and types are callable like functions.)
ctypes.int32_t.name
  ===> "int32_t"
ctypes.void_t.name
  ===> "void"
ctypes.char16_t.ptr.name
  ===> "char16_t *"

const FILE = new ctypes.StructType("FILE").ptr;
FILE.name
  ===> "FILE*"

const fn_t = new ctypes.FunctionType(ctypes.stdcall, ctypes.int, [ ctypes.voidptr_t, ctypes.voidptr_t ]);
fn_t.name
  ===> "int (__stdcall *)(void*, void*)"

const struct_tm = new ctypes.StructType("tm", [{tm_sec: ctypes.int}, ...]);
struct_tm.name
  ===> "tm"

// Pointer-to-array types are not often used in C/C++.
// Such types have funny-looking names.
const ptrTo_ptrTo_arrayOf4_strings =
    new ctypes.PointerType(
      new ctypes.PointerType(
        new ctypes.ArrayType(new ctypes.PointerType(ctypes.char), 4)));
ptrTo_ptrTo_arrayOf4_strings.name
  ===> "char *(**)[4]"
t.ptr - Return ctypes.PointerType(t).
t.array() - Return ctypes.ArrayType(t).
t.array(n) - Return ctypes.ArrayType(t, n).
Thus a quicker (but still almost as confusing) way to write the type in the previous example would be:
const ptrTo_ptrTo_arrayOf4_strings = ctypes.char.ptr.array(4).ptr.ptr;
(.array() requires parentheses but .ptr doesn't. Rationale: .array() has to be able to handle an optional parameter. Note that in C/C++, to write an array type requires brackets, optionally with a number in between: int [10] --> ctypes.int.array(10). Writing a pointer type does not require the brackets.)
t.toString() - Return "type " + t.name.
t.toSource() - Return a JavaScript expression that evaluates to a CType describing the same C/C++ type as t.
ctypes.uint32_t.toSource()
  ===> "ctypes.uint32_t"
ctypes.string.toSource()
  ===> "ctypes.string"

const charPtr = new ctypes.PointerType(ctypes.char);
charPtr.toSource()
  ===> "ctypes.char.ptr"

const Point = new ctypes.StructType(
    "Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}]);
Point.toSource()
  ===> "ctypes.StructType("Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}])"
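
Generating t.name for nested pointer and array types follows the C declarator rules, which is why it "isn't exactly trivial". The sketch below is one possible approach, using hypothetical plain-object descriptors rather than real CTypes, and covering only pointers and arrays (not function types). It walks from the outermost type inward, adding parentheses whenever an array suffix follows a pointer prefix.

```javascript
// Hypothetical descriptors standing in for CType objects.
const base = (name) => ({ kind: "base", name });
const ptr = (target) => ({ kind: "pointer", target });
const array = (element, length) => ({ kind: "array", element, length });

function typeName(t) {
  let declarator = "";
  let lastWasPointer = false;
  while (t.kind !== "base") {
    if (t.kind === "pointer") {
      declarator = "*" + declarator;
      lastWasPointer = true;
      t = t.target;
    } else {
      // [] binds tighter than *, so a pointer declarator needs parentheses
      // before an array suffix is attached to it.
      if (lastWasPointer) declarator = "(" + declarator + ")";
      declarator += "[" + (t.length === undefined ? "" : t.length) + "]";
      lastWasPointer = false;
      t = t.element;
    }
  }
  return declarator === "" ? t.name : t.name + " " + declarator;
}

// Equivalent of ctypes.char.ptr.array(4).ptr.ptr
console.log(typeName(ptr(ptr(array(ptr(base("char")), 4))))); // char *(**)[4]
```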

Pointer types also have:

t.targetType - Read-only. The pointed-to type, or null if t is an opaque pointer type.

Function types also have:

t.abi - Read-only. The ABI of the function; one of the ctypes ABI objects.
t.returnType - Read-only. The return type.
t.argTypes - Read-only. A sealed array of argument types.

Struct types also have:

t.fields - Read-only. A sealed array of field descriptors. (TODO: Details.)

Array types also have:

t.elementType - The type of the elements of an array of this type. E.g. IOBuf.elementType === ctypes.uint8_t.
t.length - The number of elements, a non-negative integer; or undefined if this is an array type with unspecified length. (The result, if not undefined, is a primitive number, not a UInt64 object. Rationale: Having .length produce anything other than a number is foreign to JS, and arrays of more than 2^53 elements are currently unheard-of.)

Minutiae:

ctypes.CType is the abstract-base-class constructor of all js-ctypes types. If called, it throws a TypeError. (This is exposed in order to expose ctypes.CType.prototype.)
The [[Class]] of a ctypes type is "CType".
The [[Class]] of the type constructors ctypes.{C,Array,Struct,Pointer,Function}Type is "Function".
Every CType has a read-only, permanent .prototype property. The type-constructors ctypes.{C,Array,Struct,Pointer,Function}Type each have a read-only, permanent .prototype property as well.
Types have a hierarchy of prototype objects. The prototype of ctypes.CType.prototype is Function.prototype. The prototype of ctypes.{Array,Struct,Pointer,Function}Type.prototype and of all the builtin types except ctypes.voidptr_t is ctypes.CType.prototype. The prototype of an array type is ctypes.ArrayType.prototype. The prototype of a struct type is ctypes.StructType.prototype. The prototype of a pointer type is ctypes.PointerType.prototype. The prototype of a function type is ctypes.FunctionType.prototype.
Every CType t has t.prototype.constructor === t; that is, its .prototype has a read-only, permanent, own .constructor property that refers to the type. The same is true of the five type constructors ctypes.{C,Array,Struct,Pointer,Function}Type.

Calling types

CTypes are JavaScript constructors. That is, they are functions, and they can be called to create new objects. (The objects they create are called CData objects, and they are described in the next section.)

new t or new t() or t() - Create a new CData object of type t.
Without arguments, these allocate a new buffer of t.size bytes, populate it with zeroes, and return a new CData object referring to the complete object in that buffer.
If t.size is undefined, this throws a TypeError.
new t(val) or t(val) - Create a new CData object as follows:
  • If t.size is not undefined: Convert val to type t by calling ExplicitConvert(val, t), throwing a TypeError if the conversion is impossible. Allocate a new buffer of t.size bytes, populated with the converted value. Return a new CData object of type t referring to the complete object in that buffer. (When val is a CData object of type t, the behavior is like malloc followed by memcpy.)
  • If t is an array type of unspecified length:
      • If val is a size value (defined above): Let u = ArrayType(t.elementType, val) and return new u.
      • If t.elementType is char16_t and val is a string: Return a new CData object of type ArrayType(ctypes.char16_t, val.length + 1) containing the contents of val followed by a null character.
      • If t.elementType is an 8-bit character type and val is a string: If val is not a well-formed UTF-16 string, throw a TypeError. Otherwise, let s = a sequence of bytes, the result of converting val from UTF-16 to UTF-8, and let n = the number of bytes in s. Return a new CData object of type ArrayType(t.elementType, n + 1) containing the bytes in s followed by a null character.
      • If val is a JavaScript array object and val.length is a nonnegative integer, let u = ArrayType(t.elementType, val.length) and return new u(val). (Array CData objects created in this way have cdata.constructor === u, not t. Rationale: For all CData objects, cdata.constructor.size gives the size in bytes, unless a struct field shadows cdata.constructor.)
      • Otherwise, throw a TypeError.
  • Otherwise, t is void_t. Throw a TypeError.
let a_t = ctypes.ArrayType(ctypes.int32_t);
let a = new a_t(5);
a.length
  ===> 5
a.constructor.size
  ===> 20
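
The string cases above fix the length of the resulting array before anything is allocated. A minimal sketch of that length computation in plain JavaScript follows; arrayLengthForString is a hypothetical helper written for illustration, not part of js-ctypes, and elementBits (16 for char16_t, 8 for an 8-bit character type) is an assumed parameter.

```javascript
// Hypothetical helper: the element count of the array that new t(val) would
// allocate when t has unspecified length and val is a JavaScript string.
function arrayLengthForString(val, elementBits) {
  if (elementBits === 16) {
    return val.length + 1; // UTF-16 code units plus a null terminator
  }
  let bytes = 0;
  for (let i = 0; i < val.length; i++) {
    const c = val.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      // A high surrogate must be followed by a low surrogate.
      const next = i + 1 < val.length ? val.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) {
        throw new TypeError("not a well-formed UTF-16 string");
      }
      bytes += 4; // a surrogate pair encodes to four UTF-8 bytes
      i++;
    } else if (c >= 0xdc00 && c <= 0xdfff) {
      throw new TypeError("not a well-formed UTF-16 string");
    } else if (c < 0x80) {
      bytes += 1;
    } else if (c < 0x800) {
      bytes += 2;
    } else {
      bytes += 3;
    }
  }
  return bytes + 1; // UTF-8 bytes plus a null terminator
}
```

The well-formedness check is done by hand because the spec requires a TypeError for unpaired surrogates rather than a silent replacement character.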

CData objects

A CData object represents a C/C++ value located in memory. The address of the C/C++ value can be taken (using the .address() method), and it can be assigned to (using the .value property).

Every CData object has a type, the CType object that describes the type of the C/C++ value.

Minutiae:

The [[Class]] of a CData object is "CData".
The prototype of a CData object is the same as its type's .prototype property.

(Implementation notes: A CData object has a reserved slot that points to its type; a reserved slot that contains null if the object owns its own buffer, and otherwise points to the base CData object that owns the backing buffer where the data is stored; and a data pointer. The data pointer points to the actual location within the buffer of the C/C++ object to which the CData object refers. Since the data pointer might not be aligned to 2 bytes, PRIVATE_TO_JSVAL is insufficient; a custom JSClass.trace hook will be needed. If the object owns its own buffer, its finalizer frees it. Other CData objects that point into the buffer keep the base CData, and therefore the underlying buffer, alive.)

Properties and methods of CData objects

All CData objects have these methods and properties:

cdata.address() - Return a new CData object of the pointer type ctypes.PointerType(cdata.constructor) whose value points to the C/C++ object referred to by cdata.
(Open issue: Does this pointer keep cdata alive? Currently not but we could easily change it. It is impossible to have all pointers keep their referents alive in a totally general way--consider pointers embedded in structs and arrays. But this special case would be pretty easy to hack: put a .contents property on the resulting pointer, referring back to cdata.)
cdata.constructor - Read-only. The type of cdata. (This is never void_t or an array type with unspecified length. Implementation note: The prototype of cdata is an object that has a read-only constructor property, as detailed under "minutiae".)
cdata.toSource() - Return the string "t(arg)" where t and arg are implementation-defined JavaScript expressions (intended to represent the type of cdata and its value, respectively). The intent is that eval(cdata.toSource()) should ideally produce a new CData object containing a copy of cdata, but this can only work if the type of cdata happens to be bound to an appropriate name in scope.
cdata.toString() - Return the same string as cdata.toSource().

The .value property has a getter and a setter:

cdata.value - Let x = ConvertToJS(cdata). If x === cdata, throw a TypeError. Otherwise return x.
cdata.value = val - Let cval = ImplicitConvert(val, cdata.constructor). If conversion fails, throw a TypeError. Otherwise assign the value cval to the C/C++ object referred to by cdata.

Structs

CData objects of struct types also have this method:

cstruct.addressOfField(name) - Return a new CData object of the appropriate pointer type, whose value points to the field of cstruct with the name name. If name is not a JavaScript string or does not name a member of cstruct, throw a TypeError.

They also have getters and setters for each struct member:

cstruct.member - Let F be a CData object referring to the struct member. Return ConvertToJS(F).
cstruct.member = val - Let cval = ImplicitConvert(val, the type of the member). If conversion fails, throw a TypeError. Otherwise store cval in the appropriate member of the struct.

These getters and setters can shadow the properties and methods described above.

Pointers

CData objects of pointer types also have this property:

cptr.contents - Let C be a CData object referring to the pointed-to contents of cptr. Return ConvertToJS(C).
cptr.contents = val - Let cval = ImplicitConvert(val, the base type of the pointer). If conversion fails, throw a TypeError. Otherwise store cval in the pointed-to contents of cptr.

Functions

CData objects of function types are callable:

let result = cfn(arg1, ...) - For each argument, let cargn = ImplicitConvert(argn, the type of the argument); if any conversion fails, throw a TypeError. Call the C function cfn with the arguments (carg1, ...) and store its return value in a CData object cresult. Let result = ConvertToJS(cresult).

Arrays

Likewise, CData objects of array types have getters and setters for each element. Arrays additionally have a length property.

Note that these getters and setters are only present for integers i in the range 0 ≤ i < carray.length. (Open issue: can we arrange to throw an exception if i is out of range?)

carray[i] - Let E be a CData object referring to the element at index i. Return ConvertToJS(E).
carray[i] = val - Let cval = ImplicitConvert(val, carray.elementType). If conversion fails, throw a TypeError. Otherwise store cval in element i of the array.
carray.length - Read-only. The length of the array as a JavaScript number. (The same as carray.constructor.length. This is not a UInt64 object. Rationale: Array CData objects should behave like other array-like objects for easy duck typing.)
carray.addressOfElement(i) - Return a new CData object of the appropriate pointer type (ctypes.PointerType(carray.constructor.elementType)) whose value points to element i of carray. If i is not a JavaScript number that is a valid index of carray, throw a TypeError.

(TODO: specify a way to read a C/C++ string and transcode it into a JS string.)

Aliasing

Note that it is possible for several CData objects to refer to the same or overlapping memory. (In this way CData objects are like C++ references.) For example:

const Point = new ctypes.StructType(
    "Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}]);
const Rect = new ctypes.StructType(
    "Rect", [{topLeft: Point}, {bottomRight: Point}]);

var r = Rect();     // a new CData object of type Rect
var p = r.topLeft;  // refers to the topLeft member of r, not a copy
r.topLeft.x = 100;  // This would not work if `r.topLeft` was a copy!
r.topLeft.x
  ===> 100          // It works...
p.x                 // and p refers to the same C/C++ object...
  ===> 100          // so it sees the change as well.

r.toSource()
  ===> "Rect({topLeft: {x: 100, y: 0}, bottomRight: {x: 0, y: 0}})"

p.x = 1.0e90;       // Assigning a value out of range is an error.
  **** TypeError

// The range checking is great, but it can have surprising
// consequences sometimes:
p.x = 0x7fffffff;   // (the maximum int32_t value)
p.x++;              // p.x = 0x7fffffff + 1, which is out of range...
  **** TypeError    // ...so this fails, leaving p.x unchanged.
// But JS code doesn't need to do that very often.
// To make this wrap around to -0x80000000, you could write:
p.x = (p.x + 1) | 0; // In JS, `x|0` truncates a number to int32.
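
The aliasing and range-checking behaviour above can be imitated in plain JavaScript with typed arrays that share one ArrayBuffer. This is only an analogy: Int32Array stands in for the struct's int32_t fields, and storeInt32 is a hypothetical helper, not a js-ctypes function.

```javascript
// Model Rect {topLeft: {x, y}, bottomRight: {x, y}} as four int32 slots.
const buffer = new ArrayBuffer(16);
const rect = new Int32Array(buffer);          // the whole struct
const topLeft = new Int32Array(buffer, 0, 2); // aliases the first two slots

rect[0] = 100;            // write through one view...
const seen = topLeft[0];  // ...and read the same memory through the other

// Hypothetical range-checked store, mirroring the TypeError described above.
function storeInt32(view, index, val) {
  if (!Number.isInteger(val) || val < -0x80000000 || val > 0x7fffffff) {
    throw new TypeError("value out of range for int32_t");
  }
  view[index] = val;
}

storeInt32(topLeft, 0, 0x7fffffff);
let overflowRejected = false;
try {
  storeInt32(topLeft, 0, topLeft[0] + 1); // out of range, so it is rejected
} catch (e) {
  overflowRejected = e instanceof TypeError;
}
storeInt32(topLeft, 0, (topLeft[0] + 1) | 0); // explicit wrap-around succeeds
const wrapped = rect[0];
```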

Casting

ctypes.cast(cdata, t) - Return a new CData object which points to the same memory block as cdata, but with type t. If t.size is undefined or larger than cdata.constructor.size, throw a TypeError. This is like a C cast or a C++ reinterpret_cast.
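
As a rough analogy in plain JavaScript, a cast can be pictured as a second typed-array view over the same bytes, with the same size check. castView is a hypothetical stand-in for ctypes.cast; the typed-array constructor plays the role of t and count is an assumed parameter giving the number of target elements.

```javascript
// Hypothetical stand-in for ctypes.cast: reinterpret the bytes of `data`
// through another element type without copying them.
function castView(data, TargetView, count) {
  const targetSize = TargetView.BYTES_PER_ELEMENT * count;
  if (targetSize > data.byteLength) {
    throw new TypeError("target type is larger than the source object");
  }
  return new TargetView(data.buffer, data.byteOffset, count);
}

const word = new Uint32Array([0x01010101]);
const bytes = castView(word, Uint8Array, 4); // same memory, viewed as 4 bytes
bytes[0] = 0xff;                             // the write is visible through `word`
```

Every byte of the sample value is 0x01, so reading it back does not depend on the platform's byte order.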

Equality

According to the ECMAScript standard, if x and y are two different objects, then x === y and x == y are both false. This has consequences for code that uses js-ctypes pointers, pointer-sized integers, or 64-bit integers, because all these values are represented as JavaScript objects. In C/C++, the == operator would compare values of these types for equality. Not so in js-ctypes:

const HANDLE = new ctypes.PointerType("HANDLE");
const INVALID_HANDLE_VALUE = HANDLE(-1);
const kernel32 = ctypes.open("kernel32");
const CreateMutex = kernel32.declare("CreateMutex", ...);

var h = CreateMutex(null, false, null);
if (h == INVALID_HANDLE_VALUE)   // BAD - always false
    ...

This comparison is always false because CreateMutex returns a new CData object, which of course will be a different object from the existing value of INVALID_HANDLE_VALUE.

(Python ctypes has the same issue. It isn't mentioned in the docs, but:

>>> from ctypes import *
>>> c_void_p(0) == c_void_p(0)
False
>>> c_int(33) == c_int(33)
False

We could overload operator== using the nonstandard hook JSExtendedClass.equality but it might not be worth it.)
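
The identity semantics can be reproduced without js-ctypes. In the sketch below, BoxedPointer is a hypothetical class standing in for a pointer-valued CData object; comparing a canonical string form is one possible workaround and is an assumption here, not something this spec prescribes.

```javascript
// Hypothetical stand-in for a pointer-valued CData object.
class BoxedPointer {
  constructor(address) {
    this.address = BigInt(address);
  }
  toString() {
    // Render as an unsigned 64-bit hexadecimal address.
    return "0x" + BigInt.asUintN(64, this.address).toString(16);
  }
}

const INVALID = new BoxedPointer(-1);
const returned = new BoxedPointer(-1);

const identityEqual = returned == INVALID;                     // two distinct objects
const valueEqual = returned.toString() === INVALID.toString(); // same address
```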

64-bit integer objects

Since JavaScript numbers are floating-point values, they cannot precisely represent all 64-bit integer values. Therefore 64-bit and pointer-sized C/C++ values of numeric types do not autoconvert to JavaScript numbers. Instead they autoconvert to JavaScript objects of type ctypes.Int64 and ctypes.UInt64.

Int64 and UInt64 objects are immutable.

It's not possible to do arithmetic on Int64 objects using the standard arithmetic operators. JavaScript does not have operator overloading (yet). A few convenience functions are provided. (These types are intentionally feature-sparse so that they can be drop-in-replaced with a full-featured bignum type when JavaScript gets one.)

Int64

ctypes.Int64(n) or new ctypes.Int64(n) - If n is an integer-valued number such that -2^63 ≤ n < 2^63, return a sealed Int64 object with that value. Otherwise if n is a string consisting of an optional minus sign followed by either decimal digits or "0x" or "0X" and hexadecimal digits, and the string represents a number within range, convert the string to an integer and construct an Int64 object as above. Otherwise if n is an Int64 or UInt64 object, and represents a number within range, use the value to construct an Int64 object as above. Otherwise throw a TypeError.

Int64 objects have the following methods:

i64.toString([radix]) - If radix is omitted, assume 10. Return a string representation of the value of i64 in base radix, consisting of a leading minus sign, if the value is negative, followed by one or more lowercase digits in base radix.
i64.toSource() - Return a string. (This is provided for debugging purposes, and programs should not rely on details of the resulting string, which may change in the future.)
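
The string form of the constructor and the toString rule can be sketched with BigInt standing in for the 64-bit value. parseInt64 and int64ToString are hypothetical names used only for this illustration; real Int64 objects are not BigInts.

```javascript
const INT64_MIN = -(2n ** 63n);
const INT64_MAX = 2n ** 63n - 1n;

// Accept an optional minus sign followed by decimal digits, or by "0x"/"0X"
// and hexadecimal digits; reject everything else and anything out of range.
function parseInt64(text) {
  const match = /^(-?)(0[xX][0-9a-fA-F]+|[0-9]+)$/.exec(text);
  if (!match) {
    throw new TypeError("not an integer literal: " + text);
  }
  const magnitude = BigInt(match[2].toLowerCase());
  const value = match[1] ? -magnitude : magnitude;
  if (value < INT64_MIN || value > INT64_MAX) {
    throw new TypeError("value out of range for Int64");
  }
  return value;
}

// Leading minus sign if negative, then lowercase digits in the given base.
function int64ToString(value, radix = 10) {
  return value.toString(radix);
}
```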

The following functions are also provided:

ctypes.Int64.compare(a, b) - If a and b are both Int64 objects, return -1 if a < b, 0 if a = b, and 1 if a > b. Otherwise throw a TypeError.
ctypes.Int64.lo(a) - If a is an Int64 object, return the low 32 bits of its value. (The result is an integer in the range 0 ≤ result < 2^32.) Otherwise throw a TypeError.
ctypes.Int64.hi(a) - If a is an Int64 object, return the high 32 bits of its value (like a >> 32). Otherwise throw a TypeError.
ctypes.Int64.join(hi, lo) - If hi is an integer-valued number in the range -2^31 ≤ hi < 2^31 and lo is an integer-valued number in the range 0 ≤ lo < 2^32, return a sealed Int64 object whose value is hi × 2^32 + lo. Otherwise throw a TypeError.
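
The arithmetic behind lo, hi and join can be checked with BigInt as a stand-in for the Int64 value. The functions below are illustrative models of the rules above, not the ctypes.Int64 functions themselves.

```javascript
const TWO_32 = 2n ** 32n;

// Low 32 bits as an unsigned number, 0 <= result < 2^32.
function lo(a) {
  return Number(BigInt.asUintN(32, a));
}

// High 32 bits as a signed number, like a >> 32.
function hi(a) {
  return Number(BigInt.asIntN(64, a) >> 32n);
}

// Recombine the halves: hi * 2^32 + lo, with the range checks from the spec.
function join(hiPart, loPart) {
  if (!Number.isInteger(hiPart) || hiPart < -(2 ** 31) || hiPart >= 2 ** 31) {
    throw new TypeError("hi out of range");
  }
  if (!Number.isInteger(loPart) || loPart < 0 || loPart >= 2 ** 32) {
    throw new TypeError("lo out of range");
  }
  return BigInt(hiPart) * TWO_32 + BigInt(loPart);
}
```

For any value a in the Int64 range, join(hi(a), lo(a)) gives back a, which is why a signed hi and an unsigned lo are paired.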

UInt64

UInt64 objects are the same except that the hi values are in the range 0 ≤ hi < 2^32 and the .toString() method never produces a minus sign.

Conversions

These functions are not exactly JS functions or C/C++ functions. They're algorithms used elsewhere in the spec.

ConvertToJS(x) - This function is used to convert a CData object or a C/C++ return value to a JavaScript value. The intent is to return a simple JavaScript value whenever possible without loss of data or different behavior on different platforms, and a CData object otherwise. The precise rules are:

  • If the type of x is void, return undefined.
  • If the type of x is bool, return the corresponding JavaScript boolean.
  • If x is of a number type but not a wrapped integer type, return the corresponding JavaScript number.
  • If x is a signed wrapped integer type (long, int64_t, ssize_t, or intptr_t), return a ctypes.Int64 object with value x.
  • If x is an unsigned wrapped integer type (unsigned long, uint64_t, size_t, or uintptr_t), return a ctypes.UInt64 object with value x.
  • If x is of type char16_t, return a JavaScript string of length 1 containing the value of x (like String.fromCharCode(x)).
  • If x is of any other character type, return the JavaScript number equal to its integer value. (This is sensitive to the signedness of the character type. Also, we assume no character types are so wide that they don't fit into a JavaScript number.)
  • Otherwise x is of an array, struct, or pointer type. If the argument x is already a CData object, return it. Otherwise allocate a buffer containing a copy of the C/C++ value x, and return a CData object of the appropriate type referring to the object in the new buffer.
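
The dispatch in these rules can be summarised as a lookup from C type name to the kind of JavaScript value produced. convertToJSKind is a hypothetical helper for illustration; the wrapped-integer sets come from the rules above, while the lists of other numeric and character types are assumed, partial examples.

```javascript
const SIGNED_WRAPPED = new Set(["long", "int64_t", "ssize_t", "intptr_t"]);
const UNSIGNED_WRAPPED = new Set(["unsigned long", "uint64_t", "size_t", "uintptr_t"]);
// Assumed, incomplete samples of the remaining numeric and character types.
const OTHER_NUMERIC = new Set(["int8_t", "uint8_t", "int16_t", "uint16_t",
                               "int32_t", "uint32_t", "int", "float", "double"]);
const OTHER_CHAR = new Set(["char", "signed char", "unsigned char"]);

// Name the kind of JavaScript value ConvertToJS would produce for a C type.
function convertToJSKind(typeName) {
  if (typeName === "void") return "undefined";
  if (typeName === "bool") return "boolean";
  if (SIGNED_WRAPPED.has(typeName)) return "ctypes.Int64";
  if (UNSIGNED_WRAPPED.has(typeName)) return "ctypes.UInt64";
  if (typeName === "char16_t") return "string of length 1";
  if (OTHER_NUMERIC.has(typeName) || OTHER_CHAR.has(typeName)) return "number";
  return "CData"; // arrays, structs and pointers stay CData objects
}
```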

Note that null C/C++ pointers do not convert to the JavaScript null value. (Open issue: Should we? Is there any value in retaining the type of a particular null pointer?)

(Arrays of characters do not convert to JavaScript strings. Rationale: Suppose x is a CData object of a struct type with a member a of type char[10]. Then x.a[1] should return the character in element 1 of the array, even if x.a[0] is a null character. Likewise, x.a[0] = '\0'; should modify the contents of the array. Both are possible only if x.a is a CData object of array type, not a JavaScript string.)

ImplicitConvert(val, t) - Convert the JavaScript value val to a C/C++ value of type t. This is called whenever a JavaScript value of any kind is passed to a parameter of a ctypes-declared function, passed to cdata.value = val, or assigned to an array element or struct member, as in carray[i] = val or cstruct.member = val.

This function is intended to lose precision only when there is no reasonable alternative. It generally does not coerce values of one type to another type.

C/C++ values of all supported types round trip through ConvertToJS and ImplicitConvert without any loss of data. That is, for any C/C++ value v of type t, ImplicitConvert(ConvertToJS(v), t) produces a copy of v. (Note that not all JavaScript values can round-trip to C/C++ and back in an analogous way. JavaScript primitive numbers can round-trip to double on all current platforms, Int64 objects to int64_t, JavaScript booleans to bool, and so on. But some JavaScript values, such as functions, cannot be ImplicitConverted to any C/C++ type without loss of data.)

t must not be void or an array type with unspecified length. (Rationale: C/C++ variables and parameters cannot have such types. The parameter of a function declared int f(int x[]) is int *, not int[].)

  • First, if val is a CData object of type u and SameType(t, u), return the current value of the C/C++ object referred to by val. Otherwise the behavior depends on the target type t.
  • If t is ctypes.bool:
      • If val is a boolean, return the corresponding C/C++ boolean value.
      • If val is the number +0 or -0, return false.
      • If val is the number 1, return true.
      • Otherwise fail.
  • If t is a numeric type:
      • If val is a boolean, the result is a 0 or 1 of type t.
      • If val is a CData object of a numeric type, and every value of that type is precisely representable in type t, the result is a precise representation of the value of val in type t. (This is more conservative than the implicit integer conversions in C/C++ and more conservative than what we do if val is a JavaScript number. This is sensitive to the signedness of the two types.)
      • If val is a number that can be exactly represented as a value of type t, the result is that value.
      • If val is an Int64 or UInt64 object whose value can be exactly represented as a value of type t, the result is that value.
      • If val is a number and t is a floating-point type, the result is the jsdouble represented by val, cast to type t. (This can implicitly lose bits of precision. The rationale is to allow the user to pass values like 1/3 to float parameters.)
      • Otherwise fail.
  • If t is ctypes.char16_t:
      • If val is a string of length 1, the result is the 16-bit unsigned value of the code unit in the string, val.charCodeAt(0).
      • If val is a number that can be exactly represented as a value of type char16_t (that is, an integer in the range 0 ≤ val < 2^16), the result is that value.
      • Otherwise fail.
  • If t is any other character type:
      • If val is a string:
          • If the 16-bit elements of val are not the UTF-16 encoding of a single Unicode character, fail. (Open issue: If we support wchar_t we may want to allow unpaired surrogate code points to pass through without error.)
          • If that Unicode character can be represented by a single character of type t, the result is that character. (Open issue: Unicode conversions.)
          • Otherwise fail.
      • If val is a number that can be exactly represented as a value of type t, the result is that value. (This is sensitive to the signedness of t.)
      • Otherwise fail.
  • If t is a pointer type:
      • If val is null, the result is a C/C++ NULL pointer of type t.
      • If val is a CData object of array type u and either t is ctypes.voidptr_t or SameType(t.targetType, u.elementType), return a pointer to the first element of the array.
      • If t is ctypes.voidptr_t and val is a CData object of pointer type, return the value of the C/C++ pointer in val, cast to void *.
      • Otherwise fail. (Rationale: We don't convert strings to pointers yet; see the "Auto-converting strings" section below. We don't convert JavaScript arrays to pointers because this would have to allocate a C array implicitly, raising issues about who should deallocate it, and when, and how they know it's their responsibility.)
  • If t is an array type:
      • If val is a JavaScript string:
          • If t.elementType is char16_t and t.length >= val.length, the result is an array of type t whose first val.length elements are the 16-bit elements of val. If t.length > val.length, then element val.length of the result is a null character. The values of the rest of the array elements are unspecified.
          • If t.elementType is an 8-bit character type:
              • If val is not well-formed UTF-16, fail.
              • Let s = a sequence of bytes, the result of converting val from UTF-16 to UTF-8.
              • Let n = the number of bytes in s.
              • If t.length < n, fail.
              • The result is an array of type t whose first n elements are the 8-bit values in s. If t.length > n, then element n of the result is 0. The values of the rest of the array elements are unspecified.
          • Otherwise fail.
      • If val is a JavaScript array object:
          • If val.length is not a nonnegative integer, fail.
          • If val.length !== t.length, fail.
          • Otherwise, the result is a C/C++ array of val.length elements of type t.elementType. Element i of the result is ImplicitConvert(val[i], t.elementType).
      • Otherwise fail. (Rationale: The clause "If val is a JavaScript array object" requires some justification. If we allowed arbitrary JavaScript objects that resemble arrays, that would include CData objects of array type. Consequently, arr1.value = arr2 where arr1 is of type ctypes.uint8_t.array(30) and arr2 is of type ctypes.int.array(30) would work as long as the values in arr2 are small enough. We considered this conversion too astonishing and too error-prone.)
  • Otherwise t is a struct type.
      • If val is a JavaScript object that is not a CData object:
          • If the enumerable own properties of val are exactly the names of the members of the struct t, the result is a C/C++ struct of type t, each of whose members is ImplicitConvert(val[the member name], the type of the member).
          • Otherwise fail.
      • Otherwise fail.
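
The bool and integer cases for primitive JavaScript values can be sketched as follows. implicitToBool and implicitToInteger are hypothetical helpers; the bits and signed parameters are assumptions used to describe the target integer type, and the sketch only covers widths up to 32 bits and primitive inputs (no CData, Int64 or UInt64 arguments).

```javascript
// ImplicitConvert to ctypes.bool: booleans, +0, -0 and 1 only.
function implicitToBool(val) {
  if (typeof val === "boolean") return val;
  if (val === 0) return false; // === treats +0 and -0 alike
  if (val === 1) return true;
  throw new TypeError("cannot implicitly convert to bool");
}

// ImplicitConvert to an integer type of the given width (at most 32 bits).
function implicitToInteger(val, bits, signed) {
  if (typeof val === "boolean") return val ? 1 : 0;
  const min = signed ? -(2 ** (bits - 1)) : 0;
  const max = signed ? 2 ** (bits - 1) - 1 : 2 ** bits - 1;
  if (typeof val === "number" && Number.isInteger(val) && val >= min && val <= max) {
    return val; // exactly representable, so no precision is lost
  }
  throw new TypeError("value is not exactly representable in the target type");
}
```

Values such as 1.5 or 256 (for an 8-bit unsigned target) fail rather than being rounded or truncated, which is the conservative behaviour the rules describe.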

ExplicitConvert(val, t) - Convert the JavaScript value val to a C/C++ value of type t, a little more forcefully than ImplicitConvert.

This is called when a JavaScript value is passed as a parameter when calling a type, as in t(val) or new t(val).

  • If ImplicitConvert(val, t) succeeds, use that result. Otherwise:
  • If t is ctypes.bool, the result is the C/C++ boolean value corresponding to ToBoolean(val), where the operator ToBoolean is as defined in the ECMAScript standard. (This is a bit less strict than the conversion behavior specified for numeric types below. This is just for convenience: the operators && and ||, which produce a boolean value in C/C++, do not always do so in JavaScript.)
  • If t is an integer or character type and val is an infinity or NaN, the result is a 0 of type t.
  • If t is an integer or character type and val is a finite number, the result is the same as casting the jsdouble value of val to type t with a C-style cast. (I think this basically means, start with val, discard the fractional part, convert the integer part to a bit-pattern, and mask off whatever doesn't fit in type t. But whatever C does is good enough for me. --jorendorff)
  • If t is an integer or character type and val is an Int64 or UInt64 object, the result is the same as casting the int64_t or uint64_t value of val to type t with a C-style cast.
  • If t is a pointer type and val is a number, Int64 object, or UInt64 object that can be exactly represented as an intptr_t or uintptr_t, the result is the same as casting that intptr_t or uintptr_t value to type t with a C-style cast.
  • If t is an integer type (not a character type) and val is a string consisting entirely of an optional minus sign, followed by either one or more decimal digits or the characters "0x" or "0X" and one or more hexadecimal digits, then the result is the same as casting the integer named by val to type t with a C-style cast.
  • Otherwise fail.
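
The number-to-integer rules can be sketched following the informal reading given above: discard the fractional part, then keep only the bits that fit. explicitToInteger is a hypothetical helper, the bits and signed parameters are assumptions, and the sketch is limited to widths of at most 32 bits; it models that reading rather than any particular C compiler's behaviour.

```javascript
// ExplicitConvert of a JavaScript number to an integer type of `bits` width.
function explicitToInteger(val, bits, signed) {
  if (typeof val !== "number") {
    throw new TypeError("this sketch handles primitive numbers only");
  }
  if (!Number.isFinite(val)) {
    return 0; // infinities and NaN become 0
  }
  const truncated = BigInt(Math.trunc(val)); // drop the fractional part
  const masked = signed
    ? BigInt.asIntN(bits, truncated)   // keep the low bits, signed
    : BigInt.asUintN(bits, truncated); // keep the low bits, unsigned
  return Number(masked);
}
```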

SameType(t, u) - True if t and u represent the same C/C++ type.

  • If t and u represent the same built-in type, even void, return true.
  • If they are both pointer types, return SameType(t.targetType, u.targetType).
  • If they are both array types, return SameType(t.elementType, u.elementType) && t.length === u.length.
  • If they are both struct types, return t === u.
  • Otherwise return false.
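
A minimal sketch of these rules over plain descriptor objects follows. The descriptor shape ({kind, name, targetType, elementType, length}) is invented for this illustration and is not how CType objects are represented; opaque pointers, whose targetType is null, are not modelled.

```javascript
// Structural comparison for built-ins, pointers and arrays; identity for structs.
function sameType(t, u) {
  if (t.kind === "builtin" && u.kind === "builtin") {
    return t.name === u.name;
  }
  if (t.kind === "pointer" && u.kind === "pointer") {
    return sameType(t.targetType, u.targetType);
  }
  if (t.kind === "array" && u.kind === "array") {
    return sameType(t.elementType, u.elementType) && t.length === u.length;
  }
  if (t.kind === "struct" && u.kind === "struct") {
    return t === u; // two structs with identical fields are still different types
  }
  return false;
}

const int = { kind: "builtin", name: "int" };
const int32 = { kind: "builtin", name: "int32_t" };
const intPtrA = { kind: "pointer", targetType: int };
const intPtrB = { kind: "pointer", targetType: int };
const pointA = { kind: "struct", name: "Point" };
const pointB = { kind: "struct", name: "Point" };
```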

(SameType(int, int32_t) is false. Rationale: As it stands, SameType behaves the same on all platforms. By making types match if they are typedef'd on the current platform, we could make e.g. ctypes.int.ptr and ctypes.int32_t.ptr compatible on platforms where we just have typedef int int32_t. But it was unclear how much that would matter in practice, balanced against cross-platform consistency. We might reverse this decision.)

Examples

Components.utils.import("resource://gre/modules/ctypes.jsm"); // imports the global ctypes object

// searches the path and opens "libmylib.so" on linux,
// "libmylib.dylib" on mac, and "mylib.dll" on windows
let mylib = ctypes.open("mylib", ctypes.SEARCH);

// declares the C function:
//     int32_t myfunc(int32_t);
let myfunc = mylib.declare("myfunc", ctypes.default_abi,
    ctypes.int32_t, ctypes.int32_t);

let ret = myfunc(2); // calls myfunc

Note that for simple types (integers and characters), we will autoconvert the argument at call time - there's no need to pass in a ctypes.int32_t object. The consumer should never need to instantiate such an object explicitly, unless they're using it to back a pointer - in which case we require explicit, strong typing. See later for examples.

Here is how to create an object of type int32_t:

let i = new ctypes.int32_t; // new int32_t object with default value 0

This allocates a new C++ object of type int32_t (4 bytes of memory), zeroes it out, and returns a JS object that manages the allocated memory. Whenever the JS object is garbage-collected, the allocated memory will be automatically freed.

Of course you don't normally need to do this, as js-ctypes will autoconvert JS numbers to various C/C++ types for you:

let myfunc = mylib.declare("myfunc", ctypes.default_abi,
    ctypes.int32_t, ctypes.int32_t);
let ret = myfunc(i);
print(typeof ret); // The result is a JavaScript number.
number

ctypes.int32_t is a CType. Like all other CTypes, it can be used for type specification when passed as an object, as above. (This will work for user-defined CTypes such as structs and pointers also - see later.)

The object created by new ctypes.int32_t is called a CData object. CData objects are described in detail in the "CData objects" section above.

Opaque pointers:

// A new opaque pointer type.
const FILE_ptr = new ctypes.StructType("FILE").ptr;

let fopen = mylib.declare("fopen", ctypes.default_abi,
    FILE_ptr, ctypes.char.ptr, ctypes.char.ptr);
let file = fopen("foo", "r");
if (file.isNull())
    throw "fopen failed";
file.contents; // TypeError: type is unknown

(Open issue: fopen("foo", "r") does not work under js-ctypes as currently specified.)

Declaring a struct:

// C prototype: struct s_t { int32_t a; int64_t b; };
const s_t = new ctypes.StructType("s_t", [{ a: ctypes.int32_t }, { b: ctypes.int64_t }]);
let myfunc = mylib.declare("myfunc", ctypes.default_abi, ctypes.int32_t, s_t);

let s = new s_t(10, 20);

This creates an s_t object which allocates enough memory for the whole struct, creates getters and setters to access the binary fields via their offset, and assigns the values 10 and 20 to the fields. The new object's prototype is s_t.prototype.

let i = myfunc(s); // checks the type of s

Nested structs:

const u_t = new ctypes.StructType("u_t", [{ x: ctypes.int64_t }, { y: s_t }]);
let u = new u_t(5e4, s); // copies data from s into u.y - no references

let u_field = u.y; // creates an s_t object that points directly to
                   // the offset of u.y within u.

An out parameter:

// allocate sizeof(uint32_t)==4 bytes,
// initialize to 5, and return a new CData object
let i = new ctypes.uint32_t(5);

// Declare a C function with an out parameter.
const getint = mylib.declare("getint", ctypes.default_abi,
    ctypes.void_t, ctypes.uint32_t.ptr);

getint(i.address()); // explicitly take the address of allocated buffer

(Python ctypes has byref(i) as an alternative to i.address(), but we do not expect users to do the equivalent of from ctypes import *, and getint(ctypes.byref(i)) is a bit much.)

Pointers:

// Declare a C function that returns a pointer.
const getintp = mylib.declare("getintp", ctypes.default_abi,
    ctypes.uint32_t.ptr);
let p = getintp(); // A CData object that holds the returned uint32_t *

// cast from (uint32_t *) to (uint8_t *)
let q = ctypes.cast(p, ctypes.uint8_t.ptr);

// first byte of buffer
let b0 = q.contents; // an integer, 0 <= b0 < 256

Struct fields:

const u_t = new ctypes.StructType('u_t',
    [{x: ctypes.uint32_t}, {y: ctypes.uint32_t}]);
// allocates 2 * sizeof(uint32_t) bytes and creates a CData object
let u = new u_t(5, 10);
u.x = 7; // setter for u.x modifies field
let i = u.y; // getter for u.y returns ConvertToJS(reference to u.y)
print(i);    // ...which is the primitive number 10
10

i = 5; // doesn't touch u.y
print(u.y);
10

const v_t = new ctypes.StructType('v_t',
    [{u: u_t}, {z: ctypes.uint32_t}]);
// allocates 12 bytes, zeroes them out, and creates a CData object
let v = new v_t;
let w = v.u; // ConvertToJS(reference to v.u) returns CData object
w.x = 3; // invokes setter
getint(v.u.x); // TypeError: getint argument 1 expects type uint32_t *, got int
let p = v.u.addressOfField('x'); // pointer to v.u.x
getint(p); // ok - manually pass address

64-bit integers:

// Declare a function that returns a 64-bit unsigned int.
const getfilesize = mylib.declare("getfilesize", ctypes.default_abi,
    ctypes.uint64_t, ctypes.char.ptr);

// This autoconverts to a UInt64 object, not a JS number, even though the
// file is presumably much smaller than 4GiB. Converting to a different type
// each time you call the function, depending on the result value, would be
// worse.
let s = getfilesize("/usr/share/dict/words");
print(s instanceof ctypes.UInt64);
true
print(s < 1000000);    // Because s is an object, not a number,
false            // JS lies to you.
print(s >= 1000000);   // Neither of these is doing what you want,
false            // as evidenced by the bizarre answers.
print(s);              // It has a nice .toString() method at least!
931467

// There is no shortcut. To get an actual JS number out of a
// 64-bit integer, you have to use the ctypes.{Int64,UInt64}.{hi,lo}
// functions.
print(ctypes.UInt64.lo(s))
931467
// (OK, I lied. There is a shortcut. You can abuse the .toString() method.
// WARNING: This can lose precision!)
print(Number(s.toString()))
931467

let i = new ctypes.int64_t(5);  // a new 8-byte buffer
let j = i;  // another variable referring to the same CData object
j.value = 6; // invokes setter on i, auto-promotes 6 to Int64
print(typeof j.value)  // but j.value is still an Int64 object
object
print(j.value instanceof ctypes.Int64)
true
print(j.value);
6

const m_t = new ctypes.StructType(
    'm_t', [{x: ctypes.int64_t}, {y: ctypes.int64_t}]);
let m = new m_t;
const getint64 = mylib.declare("getint64", ctypes.default_abi,
    ctypes.void_t, ctypes.int64_t.ptr);
getint64(m.x); // TypeError: getint64 argument 1 expected type int64_t *,
               // got Int64 object
               // (because m.x's getter autoconverts to an Int64 object)
getint64(m.addressOfField('x')); // works

(Open issue: As above, the implicit conversion from JS string to char * in getfilesize("/usr/share/dict/words") does not work in js-ctypes as specified.)
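
Reading only the low half, as in the example above, silently drops anything at or above 2^32. A slightly safer conversion combines both halves and refuses values that a JavaScript number cannot hold exactly. uint64PartsToNumber is a hypothetical helper; with js-ctypes its arguments would presumably come from ctypes.UInt64.hi(s) and ctypes.UInt64.lo(s).

```javascript
// Combine the halves of a 64-bit unsigned value into a JavaScript number,
// throwing if the result is not exactly representable (above 2^53 - 1).
function uint64PartsToNumber(hiPart, loPart) {
  const value = hiPart * 2 ** 32 + loPart;
  if (!Number.isSafeInteger(value)) {
    throw new RangeError("value does not fit in a JavaScript number exactly");
  }
  return value;
}
```

Unlike Number(s.toString()), this reports an error instead of losing precision.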

(TODO - make this a real example:)

let i1 = ctypes.int32_t(5);
let i2 = ctypes.int32_t();
i2.value = i1;  // i2 and i1 have separate binary storage; this is a memcpy
// You can copy the guts of one struct to another, etc.

Future directions

Callbacks

The libffi part of this is presumably not too bad. Issues:

Lifetimes. C/C++ makes it impossible to track an object pointer. Both JavaScript's GC and experience with C/C++ function pointers will tend to discourage users from caring about function lifetimes.

I think the best solution to this problem is to put the burden of keeping the function alive entirely on the client.

Finding the right context to use. If we burn the cx right into the libffi closure, it will crash when called from a different thread or after the cx is destroyed. If we take a context at random from some internal JSAPI structure, it might be thread-safe, but the context's options and global will be random, which sounds dangerous. Perhaps ctypes itself can create a context per thread, on demand, for the use of function pointers. In a typical application, that would only create one context, if any.

Converting strings

I think we want an explicit API for converting strings, very roughly:

CData objects of certain pointer and array types have methods for reading and writing Unicode strings. These methods are present if the target or element type is an 8-bit character or integer type.

cdata.readString([encoding[, length]]) - Read bytes from cdata and convert them to Unicode characters using the specified encoding, returning a string. Specifically:

  • If cdata is an array, let p = a pointer to the first element. Otherwise cdata is a pointer; let p = the value of cdata.
  • If encoding is undefined or omitted, the selected encoding is UTF-8. Otherwise, if encoding is a string naming a known character encoding, that encoding is selected. Otherwise throw a TypeError.
  • If length is a size value, cdata is an array, and length > cdata.length, then throw a TypeError.
  • Otherwise, if length is a size value, take exactly length bytes starting at p and convert them to Unicode characters according to the selected encoding. (Open issue: Error handling.) Return a JavaScript string containing the Unicode characters, represented in UTF-16. (The result may contain null characters.)
  • Otherwise, if length is undefined or omitted, convert bytes starting at p to Unicode characters according to the selected encoding. Stop when the end of the array is reached (if cdata is an array) or when a null character (U+0000) is found. (Open issue: Error handling.) Return a JavaScript string containing the Unicode characters, represented in UTF-16. (If cdata is a pointer and there is no trailing null character, this can crash.)
  • Otherwise throw a TypeError.
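
The steps above can be sketched in plain JavaScript for the array case only, with a Uint8Array standing in for a ctypes array of 8-bit elements. This is an illustration under stated assumptions, not the js-ctypes implementation: readStringSketch is a hypothetical name, the standard TextDecoder stands in for the underlying codec library, and malformed input is replaced with U+FFFD because the proposal leaves error handling open. The pointer case cannot be modelled this way, since a pointer carries no length.

```javascript
// Hypothetical sketch of the readString steps for the array case.
// `bytes` is a Uint8Array standing in for a ctypes array of 8-bit elements.
function readStringSketch(bytes, encoding, length) {
  // Select the encoding: UTF-8 by default, otherwise a known encoding name.
  let decoder;
  if (encoding === undefined) {
    decoder = new TextDecoder("utf-8");
  } else if (typeof encoding === "string") {
    try {
      decoder = new TextDecoder(encoding);
    } catch (e) {
      throw new TypeError("unknown encoding: " + encoding);
    }
  } else {
    throw new TypeError("encoding must be a string");
  }

  let end;
  if (length === undefined) {
    // No length: stop at the first zero byte or at the end of the array.
    // (For UTF-8, a zero byte always means U+0000.)
    const nul = bytes.indexOf(0);
    end = nul === -1 ? bytes.length : nul;
  } else if (Number.isInteger(length) && length >= 0) {
    // Explicit length: take exactly that many bytes, embedded nulls included.
    if (length > bytes.length) {
      throw new TypeError("length exceeds array length");
    }
    end = length;
  } else {
    throw new TypeError("length must be a size value");
  }
  return decoder.decode(bytes.subarray(0, end));
}

const buf = new Uint8Array([0x68, 0x69, 0x00, 0x21]);   // "hi", NUL, "!"
console.log(readStringSketch(buf));                     // stops at the NUL
console.log(readStringSketch(buf, "utf-8", 4).length);  // keeps the embedded NUL
```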

cdata.writeString(s[, encoding[, length]]) - Determine the starting pointer p as above. If s is not a well-formed UTF-16 string, throw a TypeError. (Open issue: Error handling.) Otherwise convert s to bytes in the specified encoding (default: UTF-8). If length is undefined or omitted, write all the converted bytes to memory starting at p; otherwise write at most length - 1 bytes. Write a converted null character after the data. Return the number of bytes of data written, not counting the terminating null character.
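
A minimal sketch of the writeString behaviour, again in plain JavaScript with a Uint8Array standing in for the target memory. The names writeStringSketch and isWellFormedUTF16 are hypothetical. The sketch handles UTF-8 only, because the standard TextEncoder supports no other encoding, and it adds a bounds check that real ctypes memory would not have.

```javascript
// Hypothetical helper: true if every surrogate in `s` is correctly paired.
function isWellFormedUTF16(s) {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xD800 && c <= 0xDBFF) {
      // A high surrogate must be followed by a low surrogate.
      const next = s.charCodeAt(i + 1);
      if (!(next >= 0xDC00 && next <= 0xDFFF)) return false;
      i++;
    } else if (c >= 0xDC00 && c <= 0xDFFF) {
      return false;   // low surrogate with no preceding high surrogate
    }
  }
  return true;
}

// Hypothetical sketch of writeString; `target` is a Uint8Array.
function writeStringSketch(target, s, length) {
  if (!isWellFormedUTF16(s)) {
    throw new TypeError("string is not well-formed UTF-16");
  }
  let data = new TextEncoder().encode(s);   // UTF-8 bytes
  if (length !== undefined) {
    // Write at most length - 1 bytes, leaving room for the terminator.
    // Note that this can cut a multi-byte sequence in half.
    data = data.subarray(0, Math.max(0, length - 1));
  }
  if (data.length + 1 > target.length) {
    // Assumption of this sketch: real ctypes memory has no such guard.
    throw new RangeError("target too small");
  }
  target.set(data);
  target[data.length] = 0;   // terminating null byte
  return data.length;        // bytes written, terminator not counted
}

const out = new Uint8Array(16);
console.log(writeStringSketch(out, "hello"));      // bytes written
console.log(writeStringSketch(out, "hello", 4));   // truncated to length - 1
```

The return value alone does not say whether truncation happened, which is the awkwardness the first open issue below describes.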

(Open issue: cdata.writeString(...) is awkward for the case where you want an autosized ctypes.char.array() to hold the converted data. If cdata happens to be too small for the resulting string, and you don't supply length, you crash; and if you do supply length, you don't know whether conversion was halted because the target array was of insufficient length.)

(Open issue: As proposed, these are not suitable for working with encodings where a zero byte might not indicate the end of text. For example, a string encoded in UTF-16 will typically contain a lot of zero bytes. Unfortunately, in the case of readString, the underlying library demands the length up front.)
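
To make the zero-byte problem concrete, here is a small plain-JavaScript illustration (utf16leBytes is a hypothetical helper, not part of any proposed API) that encodes a short ASCII string as UTF-16LE by hand and looks for the first zero byte.

```javascript
// Hypothetical helper: encode a JS string as UTF-16LE bytes, low byte first.
function utf16leBytes(s) {
  const out = new Uint8Array(s.length * 2);
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);   // one UTF-16 code unit
    out[2 * i] = unit & 0xff;       // low byte
    out[2 * i + 1] = unit >> 8;     // high byte
  }
  return out;
}

const bytes = utf16leBytes("hi");
console.log(Array.from(bytes));   // every second byte is zero for ASCII text
console.log(bytes.indexOf(0));    // a byte-wise null scan stops here
```

A scan for a zero byte would stop after the first byte of the first character, so a null-terminated read cannot work byte by byte for such encodings.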

(Open issue: These methods offer no error handling options, which is pretty weak. Real-world code often wants to allow a few characters to be garbled rather than fail. For now we will likely be limited to whatever the underlying codec library, nsIScriptableUnicodeConverter, can do.)

(Open issue: 16-bit versions too, for UTF-16?)

isNull

If we do not convert NULL pointers to JS null (and I may have changed my mind about this) then we need:

cptr.isNull() - Return true if cptr's value is a null pointer, false otherwise.

Auto-converting strings

There are several issues:

Lifetimes. This problem arises when autoconverting from JS to C/C++ only.

When passing a string to a foreign function, like foo(s), what is the lifetime of the autoconverted pointer? We're comfortable with guaranteeing s for the duration of the call. But then there are situations like

const TenStrings = ctypes.char.ptr.array(10);
var arr = new TenStrings();
arr[0] = s;  // What is the lifetime of the data arr[0] points to?

The more implicit conversion we allow, the greater a problem this is; it's a tough trade-off.

Non-null-terminated strings. This problem arises when autoconverting from C/C++ to JS only. It applies to C/C++ character arrays as well as pointers (but it's worse when dealing with pointers).

In C/C++, the type char * effectively promises nothing about the pointed-to data. Autoconverting would make it hard to use APIs that return non-null-terminated strings (or structs containing char * pointers that aren't logically strings). The workaround would be to declare them as a different type.

Unicode. This problem does not apply to conversions between JS strings and char16_t arrays or pointers; only char arrays or pointers.

Converting both ways raises the question of which encoding to assume. We assume JS strings are UTF-16 and char strings are UTF-8, which is not the right thing on Windows. However, Windows offers a lot of APIs that accept 16-bit strings and, for those, char16_t is the right thing.

Casting away const. This problem arises only when converting from a JS string to a C/C++ pointer type. The string data must not be modified, but the C/C++ types char * and char16_t * suggest that the referent might be modified.