Jsctypes/api: Difference between revisions

10,220 bytes added, 30 September 2014
Bug 1063962 renamed ctypes.jschar to ctypes.char16_t and, to maintain backwards compatibility, bug 1064935 added an alias for ctypes.jschar to ctypes.char16_t.
(make builtin types int and int32_t distinct on all platforms, and likewise short and int16_t etc.; also, SameType fixes)
(47 intermediate revisions by 4 users not shown)
Line 9: Line 9:
<code>Library</code> objects have the following methods:
<code>Library</code> objects have the following methods:


:'''<code>''lib''.declare(''name'', ''abi'', ''rtype'', ''<nowiki>[argtype1, ...]</nowiki>'')</code>''' - Declare a function. ''(TODO: all the details)'' This always returns a new function object or throws an exception.
:'''<code>''lib''.declare(''name'', ''abi'', ''rtype'', ''<nowiki>[argtype1, ...]</nowiki>'')</code>''' - Declare a function. ''(TODO: all the details)'' This always returns a new callable <code>CData</code> object representing a function pointer to ''name'', or throws an exception.


:If ''rtype'' is an array type, this throws a <code>TypeError</code>.
:If ''rtype'' is an array type, this throws a <code>TypeError</code>.


:If any ''argtypeN'' is an array type, the result is the same as if it had been the corresponding pointer type, <code>''argtypeN''.elementType.ptr</code>.  ''(Rationale: This is how C and C++ treat array types in function declarations.)''
:If any ''argtypeN'' is an array type, the result is the same as if it had been the corresponding pointer type, <code>''argtypeN''.elementType.ptr</code>.  ''(Rationale: This is how C and C++ treat array types in function declarations.)''
''(TODO: Explain what happens when you call a declared function. In brief: It uses <code>ImplicitConvert</code> to convert the JavaScript arguments to C and <code>ConvertToJS</code> to convert the return value to JS.)''


= Types =
= Types =
Line 29: Line 31:
:Since some 64-bit values are outside the range of the JavaScript number type, <code>ctypes.int64_t</code> and <code>ctypes.uint64_t</code> do not autoconvert to JavaScript numbers. Instead, they convert to objects of the wrapper types <code>ctypes.Int64</code> and <code>ctypes.UInt64</code> (which are JavaScript object types, not <code>CType</code>s). See "64-bit integer objects" below.
:Since some 64-bit values are outside the range of the JavaScript number type, <code>ctypes.int64_t</code> and <code>ctypes.uint64_t</code> do not autoconvert to JavaScript numbers. Instead, they convert to objects of the wrapper types <code>ctypes.Int64</code> and <code>ctypes.UInt64</code> (which are JavaScript object types, not <code>CType</code>s). See "64-bit integer objects" below.


:'''<code>ctypes.size_t, ssize_t, intptr_t, uintptr_t</code>''' - Primitive types whose size depends on the platform. These types do not autoconvert to JavaScript numbers because on some platforms, there are values of these types that cannot be precisely represented as a JS number. Instead they convert to wrapper objects (on all platforms). See "64-bit integer objects" below.
:'''<code>ctypes.size_t, ssize_t, intptr_t, uintptr_t</code>''' - Primitive types whose size depends on the platform. ''(These types do not autoconvert to JavaScript numbers. Instead they convert to wrapper objects, even on 32-bit platforms. See "64-bit integer objects" below. Rationale: On 64-bit platforms, there are values of these types that cannot be precisely represented as JS numbers. It will be easier to write code that works on multiple platforms if the builtin types autoconvert in the same way on all platforms.)''


:'''<code>ctypes.bool, short, unsigned_short, int, unsigned, unsigned_int, long, unsigned_long, float, double</code>''' - Types that behave like the corresponding C types. As in C, <code>unsigned</code> is always an alias for <code>unsigned_int</code>.
:'''<code>ctypes.bool, short, unsigned_short, int, unsigned, unsigned_int, long, unsigned_long, float, double</code>''' - Types that behave like the corresponding C types. As in C, <code>unsigned</code> is always an alias for <code>unsigned_int</code>.


:<code>ctypes.long</code> and <code>ctypes.unsigned_long</code> autoconvert to 64-bit integer objects (on all platforms). ''(Rationale: Some platforms have 64-bit <code>long</code> and some do not.)'' The rest autoconvert to JavaScript numbers.
:''(<code>ctypes.long</code> and <code>ctypes.unsigned_long</code> autoconvert to 64-bit integer objects on all platforms. The rest autoconvert to JavaScript numbers. Rationale: Some platforms have 64-bit <code>long</code> and some do not.)''
 
:'''<code>ctypes.char, ctypes.signed_char, ctypes.unsigned_char</code>''' - Character types that behave like the corresponding C types. (These are distinct from <code>int8_t</code> and <code>uint8_t</code> in details of conversion behavior. For example, js-ctypes autoconverts between C characters and one-character JavaScript strings.)


:'''<code>ctypes.jschar</code>''' - A 16-bit unsigned character type. (This is distinct from <code>uint8_t</code> in details of conversion behavior. js-ctypes autoconverts C <code>jschar</code>s to JavaScript strings of length 1.)
:'''<code>ctypes.char, ctypes.signed_char, ctypes.unsigned_char</code>''' - Character types that behave like the corresponding C types. (These are very much like <code>int8_t</code> and <code>uint8_t</code>, but they differ in some details of conversion. For example, <code>ctypes.char.array(30)(str)</code> converts the string ''str'' to UTF-8 and returns a new <code>CData</code> object of array type.)


:'''<code>ctypes.string, ustring</code>''' - String types. The C/C++ type for <code>ctypes.string</code> is <code>const char *</code>. C/C++ values of this type must be either <code>null</code> or pointers to null-terminated strings. <code>ctypes.ustring</code> is the same, but for <code>const jschar *</code>; that is, the code units of the string are <code>uint16_t</code>.
:'''<code>ctypes.char16_t</code>''' - A 16-bit unsigned character type representing a UTF-16 code unit. (This is distinct from <code>uint16_t</code> in details of conversion behavior. js-ctypes autoconverts C <code>char16_t</code>s to JavaScript strings of length 1.) For backwards compatibility, <code>ctypes.jschar</code> is an alias for <code>char16_t</code>.


:'''<code>ctypes.void_t</code>''' - The special C type <code>void</code>. This can be used as a return value type.  (<code>void</code> is a keyword in JavaScript.)
:'''<code>ctypes.void_t</code>''' - The special C type <code>void</code>. This can be used as a return value type.  (<code>void</code> is a keyword in JavaScript.)


:'''<code>ctypes.voidptr_t</code>''' - The C type <code>void *</code>.
:'''<code>ctypes.voidptr_t</code>''' - The C type <code>void *</code>.
The ''wrapped integer types'' are the types <code>int64_t</code>, <code>uint64_t</code>, <code>size_t</code>, <code>ssize_t</code>, <code>intptr_t</code>, <code>uintptr_t</code>, <code>long</code>, and <code>unsigned_long</code>. These are the types that autoconvert to 64-bit integer objects rather than to primitive JavaScript numbers.
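
The reason these types are wrapped can be seen in plain JavaScript, without js-ctypes. The sketch below only illustrates the precision limit of JavaScript numbers. It uses <code>BigInt</code> as a stand-in for a 64-bit integer object; that is an assumption made for illustration and says nothing about how <code>ctypes.Int64</code> is implemented.

```javascript
// JavaScript numbers are IEEE 754 doubles: integers are exact only up to 2^53.
const limit = 2 ** 53;

// Above the limit, distinct integers collapse to the same number.
const collapses = (limit + 1) === limit;

// BigInt (a stand-in here for a 64-bit wrapper object) keeps every value distinct.
const maxInt64 = 2n ** 63n - 1n;

// Converting the largest int64_t value to a number and back loses precision.
const roundTrips = BigInt(Number(maxInt64)) === maxInt64;

console.log(collapses, roundTrips);
```

Because a value such as the largest <code>int64_t</code> does not survive the round trip, autoconverting to a number would silently change data.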


== User-defined types ==
== User-defined types ==
Line 50: Line 52:


:'''<code>new ctypes.PointerType(''t'')</code>''' - If ''t'' is a <code>CType</code>, return the type "pointer to ''t''". The result is cached so that future requests for this pointer type produce the same <code>CType</code> object. If ''t'' is a string, instead return a new opaque pointer type named ''t''. Otherwise throw a <code>TypeError</code>.
:'''<code>new ctypes.PointerType(''t'')</code>''' - If ''t'' is a <code>CType</code>, return the type "pointer to ''t''". The result is cached so that future requests for this pointer type produce the same <code>CType</code> object. If ''t'' is a string, instead return a new opaque pointer type named ''t''. Otherwise throw a <code>TypeError</code>.
:'''<code>new ctypes.FunctionType(''abi'', ''rt'', [ ''at1'', ... ])</code>''' - If ''abi'' is a ctypes ABI object and ''rt'' and ''at1'', ... are <code>CType</code>s, return a function pointer <code>CType</code> corresponding to the C type <code>rt (*) (at1, ...)</code>. Otherwise throw a <code>TypeError</code>.


:'''<code>new ctypes.ArrayType(''t'')</code>''' - Return an array type with unspecified length and element type ''t''.  If ''t'' is not a type or <code>''t''.size</code> is <code>undefined</code>, throw a <code>TypeError</code>.
:'''<code>new ctypes.ArrayType(''t'')</code>''' - Return an array type with unspecified length and element type ''t''.  If ''t'' is not a type or <code>''t''.size</code> is <code>undefined</code>, throw a <code>TypeError</code>.
Line 59: Line 63:
:''(Array types with 0 elements are allowed. Rationale: C/C++ allow them, and it is convenient to be able to pass an array to a foreign function, and have it autoconverted to a C array, without worrying about the special case where the array is empty.)''
:''(Array types with 0 elements are allowed. Rationale: C/C++ allow them, and it is convenient to be able to pass an array to a foreign function, and have it autoconverted to a C array, without worrying about the special case where the array is empty.)''


:'''<code>new ctypes.StructType(''name'', ''fields'')</code>''' - Create a new struct type with the given ''name'' and ''fields''. ''fields'' is an array of field descriptors. js-ctypes calculates the offsets of the fields from its encyclopedic knowledge of the architecture's struct layout rules. If ''name'' is not a string, or ''fields'' contains a field descriptor with a type ''t'' such that <code>''t''.size</code> is <code>undefined</code>, throw a <code>TypeError</code>. If the size of the struct, in bytes, would not be exactly representable both as a <code>size_t</code> and as a JavaScript number, throw a <code>RangeError</code>.
:'''<code>new ctypes.StructType(''name'', ''fields'')</code>''' - Create a new struct type with the given ''name'' and ''fields''. ''fields'' is an array of field descriptors, of the format
 
:<code>[ { field1: type1 }, { field2: type2 }, ... ]</code>
 
:where <code>field''n''</code> is a string denoting the name of the field, and <code>type''n''</code> is a ctypes type. js-ctypes calculates the offsets of the fields from its encyclopedic knowledge of the architecture's struct layout rules. If ''name'' is not a string, or any <code>type''n''</code> is such that <code>type''n''.size</code> is <code>undefined</code>, throw a <code>TypeError</code>. If the size of the struct, in bytes, would not be exactly representable both as a <code>size_t</code> and as a JavaScript number, throw a <code>RangeError</code>.
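
As a rough illustration of the offset calculation, here is a minimal sketch in plain JavaScript. It assumes natural alignment (each field aligned to its own alignment, with tail padding up to the largest field alignment), which is what many C ABIs do. The <code>types</code> table, <code>alignUp</code>, and <code>layoutStruct</code> are hypothetical names for this example; they are not part of js-ctypes, and real layout rules vary by architecture.

```javascript
// Hypothetical type table: sizes and alignments in bytes for a typical ABI.
// These numbers are assumptions for illustration, not values read from js-ctypes.
const types = {
  int8_t:  { size: 1, align: 1 },
  int16_t: { size: 2, align: 2 },
  int32_t: { size: 4, align: 4 },
  double:  { size: 8, align: 8 },
};

// Round n up to the next multiple of align.
function alignUp(n, align) {
  return Math.ceil(n / align) * align;
}

// fields uses the descriptor format above: [ { name1: type1 }, { name2: type2 }, ... ]
function layoutStruct(fields) {
  let offset = 0;
  let structAlign = 1;
  const offsets = {};
  for (const descriptor of fields) {
    const names = Object.keys(descriptor);
    // Assumption: one property per descriptor, as the format above suggests.
    if (names.length !== 1) {
      throw new TypeError("each field descriptor must have exactly one property");
    }
    const name = names[0];
    const type = descriptor[name];
    if (!type || type.size === undefined) {
      throw new TypeError("field " + name + " has a type of undefined size");
    }
    offset = alignUp(offset, type.align); // insert padding before the field
    offsets[name] = offset;
    offset += type.size;
    structAlign = Math.max(structAlign, type.align);
  }
  // Tail padding so that arrays of the struct keep every element aligned.
  return { offsets, size: alignUp(offset, structAlign), align: structAlign };
}

const layout = layoutStruct([
  { tag: types.int8_t },
  { value: types.double },
  { count: types.int16_t },
]);
console.log(layout);
```

Under these assumptions the <code>double</code> field is pushed to offset 8 and the struct is padded to 24 bytes, which shows why field order affects the size.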


''(Open issue: Specify a way to tell <code>ctypes.StructType</code> to use <code>#pragma pack(n)</code>.)''
''(Open issue: Specify a way to tell <code>ctypes.StructType</code> to use <code>#pragma pack(n)</code>.)''
''(TODO: Finish specifying field descriptors.)''


These constructors behave exactly the same way when called without the <code>new</code> keyword.
These constructors behave exactly the same way when called without the <code>new</code> keyword.
Line 72: Line 78:
  const HANDLE = new ctypes.PointerType("HANDLE");
  const HANDLE = new ctypes.PointerType("HANDLE");
  const HANDLES = new ctypes.ArrayType(HANDLE);
  const HANDLES = new ctypes.ArrayType(HANDLE);
  const FILE = new ctypes.PointerType("FILE *");
  const FILE = new ctypes.StructType("FILE").ptr;
  const IOBuf = new ctypes.ArrayType(ctypes.uint8_t, 4096);
  const IOBuf = new ctypes.ArrayType(ctypes.uint8_t, 4096);
   
   
  const struct_tm = new ctypes.StructType('tm', [[ctypes.int, 'tm_sec'], ...]);
  const struct_tm = new ctypes.StructType('tm', [{'tm_sec': ctypes.int}, ...]);
  const comparator_t = new ctypes.FunctionType(ctypes.default_abi, ctypes.int, [ ctypes.voidptr_t, ctypes.voidptr_t ]);


== Properties of types ==
== Properties of types ==
Line 91: Line 99:
:'''<code>''t''.name</code>''' - A string, the type's name. It's intended that in ordinary use, this will be a C/C++ type expression, but it's not really meant to be machine-readable in all cases.
:'''<code>''t''.name</code>''' - A string, the type's name. It's intended that in ordinary use, this will be a C/C++ type expression, but it's not really meant to be machine-readable in all cases.


:For primitive types this is just the name of the corresponding C/C++ type. Note that some of the builtin types are aliases for other types, so it might be that <code>ctypes.unsigned_long.name == "uint32_t"</code> (or something else). ''(Open issue: Is that too astonishing? Python ctypes does the same thing.)''
:For primitive types this is just the name of the corresponding C/C++ type.


:For struct types and opaque pointer types, this is simply the string that was passed to the constructor. For other pointer types and array types this should try to generate valid C/C++ type expressions, which isn't exactly trivial.
:For struct types and opaque pointer types, this is simply the string that was passed to the constructor. For other function, pointer, and array types this should try to generate valid C/C++ type expressions, which isn't exactly trivial.


:''(Open issue: This conflicts with the usual meaning of .name for functions, and types are callable like functions.)''
:''(Open issue: This conflicts with the usual meaning of .name for functions, and types are callable like functions.)''
Line 101: Line 109:
  ctypes.void_t.name
  ctypes.void_t.name
   ===> "void"
   ===> "void"
  ctypes.ustring.name
  ctypes.char16_t.ptr.name
   ===> "const jschar *"
   ===> "char16_t *"
   
   
  const FILE = new ctypes.PointerType("FILE *");
  const FILE = new ctypes.StructType("FILE").ptr;
  FILE.name
  FILE.name
   ===> "FILE *"
   ===> "FILE*"
   
   
  const struct_tm = new ctypes.StructType("tm", {tm_sec: ctypes.int, ...});
  const fn_t = new ctypes.FunctionType(ctypes.stdcall_abi, ctypes.int, [ ctypes.voidptr_t, ctypes.voidptr_t ]);
  fn_t.name
   ===> "int (__stdcall *)(void*, void*)"
   
  const struct_tm = new ctypes.StructType("tm", [{tm_sec: ctypes.int}, ...]);
  struct_tm.name
  struct_tm.name
   ===> "tm"
   ===> "tm"
Line 117: Line 129:
     new ctypes.PointerType(
     new ctypes.PointerType(
       new ctypes.PointerType(
       new ctypes.PointerType(
         new ctypes.ArrayType(ctypes.string, 4)));
         new ctypes.ArrayType(new ctypes.PointerType(ctypes.char), 4)));
  ptrTo_ptrTo_arrayOf4_strings.name
  ptrTo_ptrTo_arrayOf4_strings.name
   ===> "const char *(**)[4]"
   ===> "char *(**)[4]"


:'''<code>''t''.ptr</code>''' - Return <code>ctypes.PointerType(''t'')</code>.
:'''<code>''t''.ptr</code>''' - Return <code>ctypes.PointerType(''t'')</code>.
Line 129: Line 141:
:Thus a quicker (but still almost as confusing) way to write the type in the previous example would be:
:Thus a quicker (but still almost as confusing) way to write the type in the previous example would be:


  const ptrTo_ptrsTo_arrayOf4_strings = ctypes.string.array(4).ptr.ptr;
  const ptrTo_ptrTo_arrayOf4_strings = ctypes.char.ptr.array(4).ptr.ptr;


:''(<code>.array()</code> requires parentheses but <code>.ptr</code> doesn't. Rationale: <code>.array()</code> has to be able to handle an optional parameter. Note that in C/C++, to write an array type requires brackets, optionally with a number in between:  <code>int [10]</code> --> <code>ctypes.int.array(10)</code>. Writing a pointer type does not require the brackets.)''
:''(<code>.array()</code> requires parentheses but <code>.ptr</code> doesn't. Rationale: <code>.array()</code> has to be able to handle an optional parameter. Note that in C/C++, to write an array type requires brackets, optionally with a number in between:  <code>int [10]</code> --> <code>ctypes.int.array(10)</code>. Writing a pointer type does not require the brackets.)''
Line 144: Line 156:
  const charPtr = new ctypes.PointerType(ctypes.char);
  const charPtr = new ctypes.PointerType(ctypes.char);
  charPtr.toSource()
  charPtr.toSource()
   ===> "ctypes.PointerType(ctypes.char)"
   ===> "ctypes.char.ptr"
   
   
  const Point = new ctypes.StructType(
  const Point = new ctypes.StructType(
     "Point", {x: ctypes.int32_t, y: ctypes.int32_t});
     "Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}]);
  Point.toSource()
  Point.toSource()
   ===> "ctypes.StructType("Point", {x: ctypes.int32_t, y: ctypes.int32_t})"
   ===> "ctypes.StructType("Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}])"


Pointer types also have:
Pointer types also have:


:'''<code>''t''.targetType</code>''' - Read-only. The pointed-to type, or <code>null</code> if ''t'' is an opaque pointer type.
:'''<code>''t''.targetType</code>''' - Read-only. The pointed-to type, or <code>null</code> if ''t'' is an opaque pointer type.
Function types also have:
:'''<code>''t''.abi</code>''' - Read-only. The ABI of the function; one of the ctypes ABI objects.
:'''<code>''t''.returnType</code>''' - Read-only. The return type.
:'''<code>''t''.argTypes</code>''' - Read-only. A sealed array of argument types.


Struct types also have:
Struct types also have:
Line 175: Line 195:
:Every <code>CType</code> has a read-only, permanent <code>.prototype</code> property.  The type-constructors <code>ctypes.{C,Pointer,Struct,Array}Type</code> each have a read-only, permanent <code>.prototype</code> property as well.
:Every <code>CType</code> has a read-only, permanent <code>.prototype</code> property.  The type-constructors <code>ctypes.{C,Pointer,Struct,Array}Type</code> each have a read-only, permanent <code>.prototype</code> property as well.


:Types have a hierarchy of prototype objects. The prototype of <code>ctypes.CType.prototype</code> is <code>Function.prototype</code>. The prototype of <code>ctypes.{Array,Struct,Pointer}Type.prototype</code> and of all the builtin types except for the string types and <code>ctypes.voidptr_t</code> is <code>ctypes.CType.prototype</code>. The prototype of an array type is <code>ctypes.ArrayType.prototype</code>. The prototype of a struct type is <code>ctypes.StructType.prototype</code>. The prototype of a string type or pointer type is <code>ctypes.PointerType.prototype</code>.
:Types have a hierarchy of prototype objects. The prototype of <code>ctypes.CType.prototype</code> is <code>Function.prototype</code>. The prototype of <code>ctypes.{Array,Struct,Pointer,Function}Type.prototype</code> and of all the builtin types except <code>ctypes.voidptr_t</code> is <code>ctypes.CType.prototype</code>. The prototype of an array type is <code>ctypes.ArrayType.prototype</code>. The prototype of a struct type is <code>ctypes.StructType.prototype</code>. The prototype of a pointer type is <code>ctypes.PointerType.prototype</code>. The prototype of a function type is <code>ctypes.FunctionType.prototype</code>.


:Every <code>CType</code> ''t'' has <code>''t''.prototype.constructor === ''t''</code>; that is, its <code>.prototype</code> has a read-only, permanent, own <code>.constructor</code> property that refers to the type. The same is true of the four type constructors <code>ctypes.{C,Array,Struct,Pointer}Type</code>.
:Every <code>CType</code> ''t'' has <code>''t''.prototype.constructor === ''t''</code>; that is, its <code>.prototype</code> has a read-only, permanent, own <code>.constructor</code> property that refers to the type. The same is true of the five type constructors <code>ctypes.{C,Array,Struct,Pointer,Function}Type</code>.


== Calling types ==
== Calling types ==
Line 189: Line 209:
:If <code>''t''.size</code> is <code>undefined</code>, this throws a <code>TypeError</code>.
:If <code>''t''.size</code> is <code>undefined</code>, this throws a <code>TypeError</code>.


:'''<code>new ''t''(''val'')</code>''' or '''<code>''t''(''val'')</code>''' - If <code>''t''.size</code> is not <code>undefined</code>: Convert ''val'' to type ''t'' by calling <code>ExplicitConvert(''val'', ''t'')</code>, throwing a <code>TypeError</code> if the conversion is impossible. Allocate a new buffer of <code>''t''.size</code> bytes, populated with the converted value. Return a new <code>CData</code> object of type ''t'' referring to the complete object in that buffer. (When ''val'' is a <code>CData</code> object of type ''t'', the behavior is like <code>malloc</code> followed by <code>memcpy</code>.)
:'''<code>new ''t''(''val'')</code>''' or '''<code>''t''(''val'')</code>''' - Create a new <code>CData</code> object as follows:
 
:* If <code>''t''.size</code> is not <code>undefined</code>: Convert ''val'' to type ''t'' by calling <code>ExplicitConvert(''val'', ''t'')</code>, throwing a <code>TypeError</code> if the conversion is impossible. Allocate a new buffer of <code>''t''.size</code> bytes, populated with the converted value. Return a new <code>CData</code> object of type ''t'' referring to the complete object in that buffer. (When ''val'' is a <code>CData</code> object of type ''t'', the behavior is like <code>malloc</code> followed by <code>memcpy</code>.)
 
:* If ''t'' is an array type of unspecified length:
 
::* If ''val'' is a size value (defined above): Let ''u'' = <code>ArrayType(''t''.elementType, ''val'')</code> and return <code>new ''u''</code>.
 
::* If <code>''t''.elementType</code> is <code>char16_t</code> and ''val'' is a string: Return a new <code>CData</code> object of type <code>ArrayType(ctypes.char16_t, ''val''.length&nbsp;+&nbsp;1)</code> containing the contents of ''val'' followed by a null character.


:If ''t'' is an array type of unspecified length and ''val'' is a size value (defined above): Let ''u'' = <code>ArrayType(''t''.elementType, ''val'')</code> and return <code>new ''u''</code>.
::* If <code>''t''.elementType</code> is an 8-bit character type and ''val'' is a string: If ''val'' is not a well-formed UTF-16 string, throw a <code>TypeError</code>. Otherwise, let ''s'' = a sequence of bytes, the result of converting ''val'' from UTF-16 to UTF-8, and let ''n'' = the number of bytes in ''s''. Return a new <code>CData</code> object of type <code>ArrayType(''t''.elementType, ''n'' + 1)</code> containing the bytes in ''s'' followed by a null character.


:If ''t'' is an array type of unspecified length and ''val'' is not a size value: If ''val'' is not an object, or it does not have a <code>.length</code> property whose value is a nonnegative integer, throw a <code>TypeError</code>. Otherwise, let ''u'' = <code>ArrayType(''t''.elementType, ''val''.length)</code> and return <code>new ''u''(''val'')</code>.
::* If ''val'' is a JavaScript array object and <code>''val''.length</code> is a nonnegative integer, let ''u'' = <code>ArrayType(''t''.elementType, ''val''.length)</code> and return <code>new ''u''(''val'')</code>. ''(Array <code>CData</code> objects created in this way have <code>''cdata''.constructor === ''u''</code>, not ''t''. Rationale: For all <code>CData</code> objects, <code>cdata.constructor.size</code> gives the size in bytes, unless a struct field shadows <code>cdata.constructor</code>.)''


:''(Array <code>CData</code> objects created in this way have <code>''cdata''.constructor === ''u''</code>, not ''t''. Rationale: For all <code>CData</code> objects, <code>cdata.constructor.size</code> gives the size in bytes, unless a struct field shadows <code>cdata.constructor</code>.)''
::* Otherwise, throw a <code>TypeError</code>.


:Otherwise, ''t'' is <code>void_t</code>. Throw a <code>TypeError</code>.
:* Otherwise, ''t'' is <code>void_t</code>. Throw a <code>TypeError</code>.
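
The rules for array types of unspecified length can be modelled in plain JavaScript. The sketch below computes only the length of the resulting array type, not the <code>CData</code> object itself. <code>isWellFormedUTF16</code>, <code>inferredArrayLength</code>, and the <code>elementKind</code> tags are hypothetical names for this example, and a "size value" is simplified here to a nonnegative integer number.

```javascript
// Return true if str contains no unpaired surrogate code units.
function isWellFormedUTF16(str) {
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xD800 && unit <= 0xDBFF) {
      // A high surrogate must be followed by a low surrogate.
      const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next < 0xDC00 || next > 0xDFFF) return false;
      i++; // skip the low surrogate
    } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
      return false; // low surrogate without a preceding high surrogate
    }
  }
  return true;
}

// elementKind is a hypothetical tag: "char16", "char8", or "other".
function inferredArrayLength(elementKind, val) {
  if (typeof val === "number" && Number.isInteger(val) && val >= 0) {
    return val; // size value: the caller asked for exactly this many elements
  }
  if (typeof val === "string" && elementKind === "char16") {
    return val.length + 1; // UTF-16 code units plus a null terminator
  }
  if (typeof val === "string" && elementKind === "char8") {
    if (!isWellFormedUTF16(val)) {
      throw new TypeError("string is not well-formed UTF-16");
    }
    // UTF-8 bytes plus a null terminator.
    return new TextEncoder().encode(val).length + 1;
  }
  if (Array.isArray(val)) {
    return val.length; // one element per JavaScript array element
  }
  throw new TypeError("cannot infer an array length from this value");
}

console.log(inferredArrayLength("char8", "h\u00e9llo"));  // UTF-8 length plus terminator
console.log(inferredArrayLength("char16", "h\u00e9llo")); // UTF-16 length plus terminator
```

The same string yields different lengths for 8-bit and 16-bit element types because the 8-bit case counts UTF-8 bytes rather than UTF-16 code units.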


  let a_t = ctypes.ArrayType(ctypes.int32_t);
  let a_t = ctypes.ArrayType(ctypes.int32_t);
Line 253: Line 281:


These getters and setters can shadow the properties and methods described above.
These getters and setters can shadow the properties and methods described above.
== Pointers ==
<code>CData</code> objects of pointer types also have this property:
:'''<code>''cptr''.''contents''</code>''' - Let ''C'' be a <code>CData</code> object referring to the pointed-to contents of ''cptr''. Return <code>ConvertToJS(''C'')</code>.
:'''<code>''cptr''.''contents'' = ''val''</code>''' - Let ''cval'' = <code>ImplicitConvert(''val'', the base type of the pointer)</code>. If conversion fails, throw a <code>TypeError</code>. Otherwise store ''cval'' in the pointed-to contents of ''cptr''.
== Functions ==
<code>CData</code> objects of function types are callable:
:'''<code>''result'' = ''cfn''(''arg1'', ...)</code>''' - Call the C function ''cfn''. For each argument, let ''cargN'' = <code>ImplicitConvert(''argN'', the type of the corresponding parameter)</code>; if any conversion fails, throw a <code>TypeError</code>. Call the C function with the arguments ''(carg1, ...)'' and let ''cresult'' be its return value. Return <code>ConvertToJS(''cresult'')</code>.


== Arrays ==
== Arrays ==
Line 267: Line 309:


:'''<code>''carray''.addressOfElement(''i'')</code>''' - Return a new <code>CData</code> object of the appropriate pointer type (<code>ctypes.PointerType(''carray''.constructor.elementType)</code>) whose value points to element ''i'' of ''carray''. If ''i'' is not a JavaScript number that is a valid index of ''carray'', throw a <code>TypeError</code>.
:'''<code>''carray''.addressOfElement(''i'')</code>''' - Return a new <code>CData</code> object of the appropriate pointer type (<code>ctypes.PointerType(''carray''.constructor.elementType)</code>) whose value points to element ''i'' of ''carray''. If ''i'' is not a JavaScript number that is a valid index of ''carray'', throw a <code>TypeError</code>.
''(TODO: specify a way to read a C/C++ string and transcode it into a JS string.)''


== Aliasing ==
== Aliasing ==
Line 302: Line 346:
== Casting ==
== Casting ==


:'''<code>ctypes.cast(''cdata'', ''t'')</code>''' - Return a new <code>CData</code> object which points to the same memory block as ''cdata'', but with type ''t''. If <code>''t''.size</code> is undefined or less than <code>''cdata''.constructor.size</code>, throw a <code>TypeError</code>. This is like a C cast or a C++ <code>reinterpret_cast</code>.
:'''<code>ctypes.cast(''cdata'', ''t'')</code>''' - Return a new <code>CData</code> object which points to the same memory block as ''cdata'', but with type ''t''. If <code>''t''.size</code> is undefined or larger than <code>''cdata''.constructor.size</code>, throw a <code>TypeError</code>. This is like a C cast or a C++ <code>reinterpret_cast</code>.
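
A loose analogy in plain JavaScript is two typed arrays that share one <code>ArrayBuffer</code>. The sketch below is not js-ctypes code; <code>castView</code> is a hypothetical helper that applies the same size rule (the target type must not be larger than the source data) and returns a view onto the same bytes without copying.

```javascript
// Reinterpret the bytes behind `view` as another typed-array type.
// Mirrors the size rule above: the target element must not be larger
// than the memory being reinterpreted.
function castView(view, TargetArray) {
  if (TargetArray.BYTES_PER_ELEMENT > view.byteLength) {
    throw new TypeError("target type is larger than the source data");
  }
  // Same buffer, same offset: no bytes are copied.
  // (The typed-array constructor itself requires a suitably aligned offset.)
  return new TargetArray(view.buffer, view.byteOffset, 1);
}

const asFloat = new Float32Array([1.0]);
const asBits = castView(asFloat, Uint32Array);
console.log(asBits[0].toString(16)); // bit pattern of the float 1.0

// Writing through one view is visible through the other.
asBits[0] = 0x40000000;
console.log(asFloat[0]);
```

Because both views alias the same memory, a write through one is observed through the other, which is the property that distinguishes a cast from a conversion.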


== Equality ==
== Equality ==
Line 339: Line 383:
== Int64 ==
== Int64 ==


:'''<code>ctypes.Int64(''n'')</code>''' or '''<code>new ctypes.Int64(''n'')</code>''' - If ''n'' is an integer-valued number such that -2<sup>63</sup> &le; ''n'' &lt; 2<sup>63</sup>, return a sealed <code>Int64</code> object with that value. Otherwise throw a <code>TypeError</code>.
:'''<code>ctypes.Int64(''n'')</code>''' or '''<code>new ctypes.Int64(''n'')</code>''' - If ''n'' is an integer-valued number such that -2<sup>63</sup> &le; ''n'' &lt; 2<sup>63</sup>, return a sealed <code>Int64</code> object with that value. Otherwise if ''n'' is a string consisting of an optional minus sign followed by either decimal digits or <code>"0x"</code> or <code>"0X"</code> and hexadecimal digits, and the string represents a number within range, convert the string to an integer and construct an <code>Int64</code> object as above. Otherwise if ''n'' is an <code>Int64</code> or <code>UInt64</code> object, and represents a number within range, use the value to construct an <code>Int64</code> object as above. Otherwise throw a <code>TypeError</code>.
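
The argument handling described above can be sketched in plain JavaScript. This is a model only: it returns a <code>BigInt</code> rather than an <code>Int64</code> object, and a <code>BigInt</code> argument stands in for an existing <code>Int64</code> or <code>UInt64</code> object. <code>toInt64Value</code>, <code>INT64_MIN</code>, and <code>INT64_MAX</code> are hypothetical names for this example.

```javascript
const INT64_MIN = -(2n ** 63n);
const INT64_MAX = 2n ** 63n - 1n;

// Return the value as a BigInt, or throw a TypeError, following the rules above.
function toInt64Value(n) {
  let value;
  if (typeof n === "number") {
    if (!Number.isInteger(n)) {
      throw new TypeError("not an integer-valued number");
    }
    value = BigInt(n);
  } else if (typeof n === "string") {
    // Optional minus sign, then either "0x"/"0X" plus hex digits or decimal digits.
    const match = /^(-?)(?:0[xX]([0-9a-fA-F]+)|([0-9]+))$/.exec(n);
    if (!match) {
      throw new TypeError("not a decimal or hexadecimal integer string");
    }
    const magnitude = match[2] !== undefined
      ? BigInt("0x" + match[2])
      : BigInt(match[3]);
    value = match[1] === "-" ? -magnitude : magnitude;
  } else if (typeof n === "bigint") {
    value = n; // stand-in for an existing Int64 or UInt64 object
  } else {
    throw new TypeError("unsupported argument type");
  }
  if (value < INT64_MIN || value > INT64_MAX) {
    throw new TypeError("value out of range for Int64");
  }
  return value;
}

console.log(toInt64Value("-0x10"));
console.log(toInt64Value("9223372036854775807"));
```

Strings matter here because values above 2<sup>53</sup> cannot be written exactly as number literals, so a string is the only way to pass them without loss.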


<code>Int64</code> objects have the following methods:
<code>Int64</code> objects have the following methods:
Line 365: Line 409:
These functions are not exactly JS functions or C/C++ functions. They're algorithms used elsewhere in the spec.
These functions are not exactly JS functions or C/C++ functions. They're algorithms used elsewhere in the spec.


'''<code>ConvertToJS(''x'')</code>''' - This function is used to convert a <code>CData</code> object or a C/C++ return value to a JavaScript value. The intent is to return a simple JavaScript value whenever possible, and a <code>CData</code> object otherwise. The precise rules are:
'''<code>ConvertToJS(''x'')</code>''' - This function is used to convert a <code>CData</code> object or a C/C++ return value to a JavaScript value. The intent is to return a simple JavaScript value whenever possible without loss of data or different behavior on different platforms, and a <code>CData</code> object otherwise. The precise rules are:


* If the type of ''x'' is <code>void</code>, return <code>undefined</code>.
* If the type of ''x'' is <code>void</code>, return <code>undefined</code>.
Line 371: Line 415:
* If the type of ''x'' is <code>bool</code>, return the corresponding JavaScript boolean.
* If the type of ''x'' is <code>bool</code>, return the corresponding JavaScript boolean.


* If ''x'' is of a number type other than <code>long</code>, <code>unsigned long</code>, the pointer-sized types, and the 64-bit types, return the corresponding JavaScript number.
* If ''x'' is of a number type but not a wrapped integer type, return the corresponding JavaScript number.


* If ''x'' is of type <code>long</code> or a signed pointer-sized or 64-bit numeric type (such as <code>int64_t</code>, <code>ssize_t</code>, or <code>intptr_t</code>), return a <code>ctypes.Int64</code> object with value ''x''.
* If ''x'' is a signed wrapped integer type (<code>long</code>, <code>int64_t</code>, <code>ssize_t</code>, or <code>intptr_t</code>), return a <code>ctypes.Int64</code> object with value ''x''.


* If ''x'' is of type <code>unsigned long</code> or an unsigned pointer-sized or 64-bit numeric type (such as <code>uint64_t</code>, <code>size_t</code>, or <code>uintptr_t</code>), return a <code>ctypes.UInt64</code> object with value ''x''.
* If ''x'' is an unsigned wrapped integer type (<code>unsigned long</code>, <code>uint64_t</code>, <code>size_t</code>, or <code>uintptr_t</code>), return a <code>ctypes.UInt64</code> object with value ''x''.


* If ''x'' is of type <code>jschar</code>, return a JavaScript string of length 1 containing the value of ''x'' (like <code>String.fromCharCode(x)</code>).
* If ''x'' is of type <code>char16_t</code>, return a JavaScript string of length 1 containing the value of ''x'' (like <code>String.fromCharCode(x)</code>).


* If ''x'' is of any other character type, select the corresponding Unicode character. ''(Open issue: Unicode conversions.)'' Convert the character to UTF-16. Return a JavaScript string containing the UTF-16 code units. (If the character type is 1 or 2 bytes, as it is on all platforms we care about, the result is a one-character JavaScript string.)
* If ''x'' is of any other character type, return the JavaScript number equal to its integer value. (This is sensitive to the signedness of the character type. Also, we assume no character types are so wide that they don't fit into a JavaScript number.)


* If ''x'' is of a string type and is <code>NULL</code>, return <code>null</code>.
* Otherwise ''x'' is of an array, struct, or pointer type. If the argument ''x'' is already a <code>CData</code> object, return it. Otherwise allocate a  buffer containing a copy of the C/C++ value ''x'', and return a <code>CData</code> object of the appropriate type referring to the object in the new buffer.


* If ''x'' is of type <code>cstring</code> and is non-null, transcode it to UTF-16 and return a JavaScript string containing the UTF-16 code units.  ''(Open issue: Unicode conversions.)''
Note that null C/C++ pointers do not convert to the JavaScript <code>null</code> value.  ''(Open issue: Should we? Is there any value in retaining the type of a particular null pointer?)''


* If ''x'' is of type <code>ustring</code> and is non-null, return a JavaScript string containing the same sequence of 16-bit characters.
''(Arrays of characters do not convert to JavaScript strings. Rationale: Suppose <code>x</code> is a <code>CData</code> object of a struct type with a member <code>a</code> of type <code>char[10]</code>. Then <code>x.a[1]</code> should return the character in element 1 of the array, even if <code>x.a[0]</code> is a null character.  Likewise, <code>x.a[0] = '\0';</code> should modify the contents of the array. Both are possible only if <code>x.a</code> is a <code>CData</code> object of array type, not a JavaScript string.)''


* Otherwise ''x'' is of an array, struct, or pointer type. If the argument ''x'' is already a <code>CData</code> object, return it. Otherwise allocate a buffer containing a copy of the C/C++ value ''x'', and return a <code>CData</code> object of the appropriate type referring to the object in the new buffer.
<code>'''ImplicitConvert(''val'', ''t'')'''</code> - Convert the JavaScript value ''val'' to a C/C++ value of type ''t''.  This is called whenever a JavaScript value of any kind is passed to a parameter of a ctypes-declared function, passed to <code>''cdata''.value = ''val''</code>, or assigned to an array element or struct member, as in <code>''carray''[''i''] = ''val''</code> or <code>''cstruct''.''member'' = ''val''</code>.


Note that we do not autoconvert null C/C++ pointers to the JavaScript <code>null</code> value.
This function is intended to lose precision only when there is no reasonable alternative. It generally does not coerce values of one type to another type.  


<code>'''ImplicitConvert(''val'', ''t'')'''</code> - Convert the JavaScript value ''val'' to a C/C++ value of type ''t''. This is called whenever a JavaScript value of any kind is passed to a parameter of a ctypes-declared function, passed to <code>''cdata''.value = ''val''</code>, or assigned to an array element or struct member, as in <code>''carray''[''i''] = ''val''</code> or <code>''cstruct''.''member'' = ''val''</code>. This function is intended to lose precision only when there is no reasonable alternative. It generally does not coerce values of one type to another type.
C/C++ values of all supported types round trip through <code>ConvertToJS</code> and <code>ImplicitConvert</code> without any loss of data. That is, for any C/C++ value ''v'' of type ''t'', <code>ImplicitConvert(ConvertToJS(''v''),&nbsp;''t'')&nbsp;</code> produces a copy of ''v''. ''(Note that not all JavaScript values can round-trip to C/C++ and back in an analogous way. JavaScript primitive numbers can round-trip to <code>double</code> on all current platforms, <code>Int64</code> objects to <code>int64_t</code>, JavaScript booleans to <code>bool</code>, and so on. But some JavaScript values, such as functions, cannot be <code>ImplicitConvert</code>ed to any C/C++ type without loss of data.)''


''t'' must not be <code>void</code> or an array type with unspecified length.  ''(Rationale: C/C++ variables and parameters cannot have such types. The parameter of a function declared <code>int f(int x[])</code> is <code>int *</code>, not <code>int[]</code>.)''
''t'' must not be <code>void</code> or an array type with unspecified length.  ''(Rationale: C/C++ variables and parameters cannot have such types. The parameter of a function declared <code>int f(int x[])</code> is <code>int *</code>, not <code>int[]</code>.)''
Line 411: Line 455:
:* Otherwise fail.
:* Otherwise fail.


* If ''t'' is <code>ctypes.jschar</code>:
* If ''t'' is <code>ctypes.char16_t</code>:
:* If ''val'' is a string of length 1, the result is the 16-bit unsigned value of the code unit in the string, that is, <code>''val''.charCodeAt(0)</code>.
:* If ''val'' is a string of length 1, the result is the 16-bit unsigned value of the code unit in the string, that is, <code>''val''.charCodeAt(0)</code>.
:* If ''val'' is a number that can be exactly represented as a value of type <code>jschar</code> (that is, an integer in the range 0 &le; ''val'' &lt; 2<sup>16</sup>), the result is that value.
:* If ''val'' is a number that can be exactly represented as a value of type <code>char16_t</code> (that is, an integer in the range 0 &le; ''val'' &lt; 2<sup>16</sup>), the result is that value.
:* Otherwise fail.
:* Otherwise fail.
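
The <code>char16_t</code> rules above can be sketched in plain JavaScript. This is only an illustration of the three cases, not the js-ctypes implementation: the function name <code>implicitConvertToChar16</code> is hypothetical, the C value is modeled as a JavaScript number, and "fail" is modeled as throwing a <code>TypeError</code>.

```javascript
// Hypothetical model of ImplicitConvert(val, ctypes.char16_t).
// Returns the 16-bit code unit as a number; throws TypeError to model "fail".
function implicitConvertToChar16(val) {
  // Case 1: a string of length 1 converts to its single UTF-16 code unit.
  if (typeof val === "string" && val.length === 1) {
    return val.charCodeAt(0);
  }
  // Case 2: a number converts only if it is an integer with 0 <= val < 2^16.
  if (typeof val === "number" && Number.isInteger(val) &&
      val >= 0 && val < 0x10000) {
    return val;
  }
  // Case 3: everything else fails.
  throw new TypeError("cannot convert value to char16_t");
}
```

Under this model, <code>implicitConvertToChar16("A")</code> gives 65, while a two-character string, a fractional number, or 65536 all fail rather than being truncated.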


* If ''t'' is any other character type:
* If ''t'' is any other character type:
:* If ''val'' is a string:
:* If ''val'' is a string:
::* If the 16-bit elements of ''val'' are not the UTF-16 encoding of a single Unicode character, fail.
::* If the 16-bit elements of ''val'' are not the UTF-16 encoding of a single Unicode character, fail. ''(Open issue: If we support <code>wchar_t</code> we may want to allow unpaired surrogate code points to pass through without error.)''
::* If that Unicode character can be represented by a single character of type ''t'', the result is that character. ''(Open issue: Unicode conversions.)''
::* If that Unicode character can be represented by a single character of type ''t'', the result is that character. ''(Open issue: Unicode conversions.)''
::* Otherwise fail.
::* Otherwise fail.
:* If ''val'' is a number that can be exactly represented as a value of type ''t'', the result is that value.  (This is sensitive to the signedness of ''t''.)
:* If ''val'' is a number that can be exactly represented as a value of type ''t'', the result is that value.  (This is sensitive to the signedness of ''t''.)
:* Otherwise fail.
* If ''t'' is a string type:
:* If ''val'' is <code>null</code>, the result is a C/C++ <code>NULL</code> pointer of type ''t''.
:* If ''val'' is a string and ''t'' is <code>ustring</code>, the result is a pointer to the first character of ''val''. The resulting pointer will remain valid as long as the string ''val'' remains reachable. (Note that if the characters of ''val'' are modified in any way, the consequences can be arbitrarily bad.)
:* If ''val'' is a string and ''t'' is <code>string</code>, convert the string to type ''t'' in an implementation-defined way. The resulting pointer will remain valid as long as the string ''val'' remains reachable. ''(Open issue: Unicode conversions.)''
:* Otherwise fail.
:* Otherwise fail.


Line 434: Line 472:
:* If ''val'' is a <code>CData</code> object of array type ''u'' and either ''t'' is <code>ctypes.voidptr_t</code> or <code>SameType(''t''.targetType, ''u''.elementType)</code>, return a pointer to the first element of the array.
:* If ''val'' is a <code>CData</code> object of array type ''u'' and either ''t'' is <code>ctypes.voidptr_t</code> or <code>SameType(''t''.targetType, ''u''.elementType)</code>, return a pointer to the first element of the array.
:* If ''t'' is <code>ctypes.voidptr_t</code> and ''val'' is a <code>CData</code> object of pointer type, return the value of the C/C++ pointer in ''val'', cast to <code>void *</code>.
:* If ''t'' is <code>ctypes.voidptr_t</code> and ''val'' is a <code>CData</code> object of pointer type, return the value of the C/C++ pointer in ''val'', cast to <code>void *</code>.
:* Otherwise fail.  ''(Rationale: We don't convert strings to pointers yet partly because we're lazy and partly because it would implicitly cast away const. We don't convert JavaScript arrays to pointers because this would have to allocate a C array implicitly, raising issues about who should deallocate it, and when, and how they know it's their responsibility.)''
:* Otherwise fail.  ''(Rationale: We don't convert strings to pointers yet; see the "Auto-converting strings" section below. We don't convert JavaScript arrays to pointers because this would have to allocate a C array implicitly, raising issues about who should deallocate it, and when, and how they know it's their responsibility.)''


* If ''t'' is an array type:
* If ''t'' is an array type:
:* If ''val'' is not a JavaScript object, fail. ''(We could reasonably convert JS strings to arrays of char types, but let's skip it for now.)''
:* If ''val'' is a JavaScript string:
:* If <code>''val''.length</code> is not a nonnegative integer, fail.
::* If <code>''t''.elementType</code> is <code>char16_t</code> and <code>''t''.length &gt;= ''val''.length</code>, the result is an array of type ''t'' whose first <code>''val''.length</code> elements are the 16-bit elements of ''val''. If <code>''t''.length &gt; ''val''.length</code>, then element <code>''val''.length</code> of the result is a null character. The values of the rest of the array elements are unspecified.
:* If <code>''val''.length !== ''t''.length</code>, fail.
::* If <code>''t''.elementType</code> is an 8-bit character type:
:* Otherwise, the result is a C/C++ array of <code>''val''.length</code> elements of type <code>''t''.elementType</code>. Element ''i'' of the result is <code>ImplicitConvert(''val''[''i''], ''t''.elementType)</code>.
:::* If ''val'' is not well-formed UTF-16, fail.
:::* Let ''s'' = a sequence of bytes, the result of converting ''val'' from UTF-16 to UTF-8.
:::* Let ''n'' = the number of bytes in ''s''.
:::* If <code>''t''.length &lt; ''n''</code>, fail.
:::* The result is an array of type ''t'' whose first ''n'' elements are the 8-bit values in ''s''. If <code>''t''.length &gt; ''n''</code>, then element ''n'' of the result is 0. The values of the rest of the array elements are unspecified.
::* Otherwise fail.
 
:* If ''val'' is a JavaScript array object:
::* If <code>''val''.length</code> is not a nonnegative integer, fail.
::* If <code>''val''.length !== ''t''.length</code>, fail.
::* Otherwise, the result is a C/C++ array of <code>''val''.length</code> elements of type <code>''t''.elementType</code>. Element ''i'' of the result is <code>ImplicitConvert(''val''[''i''], ''t''.elementType)</code>.
:* Otherwise fail. ''(Rationale: The clause "If ''val'' is a JavaScript array object" requires some justification. If we allowed arbitrary JavaScript objects that resemble arrays, that would include CData objects of array type. Consequently, <code>arr1.value = arr2</code> where <code>arr1</code> is of type <code>ctypes.uint8_t.array(30)</code> and <code>arr2</code> is of type <code>ctypes.int.array(30)</code> would work as long as the values in <code>arr2</code> are small enough. We considered this conversion too astonishing and too error-prone.)''
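
The string-to-8-bit-array case can be sketched as follows. This is a rough model under stated assumptions, not the actual implementation: the name <code>stringToChar8Array</code> is hypothetical, the C array is modeled as a <code>Uint8Array</code>, the standard <code>TextEncoder</code> stands in for the UTF-16 to UTF-8 conversion, and the "unspecified" trailing elements happen to be zero here because typed arrays are zero-filled.

```javascript
// Hypothetical model of converting a JS string to an 8-bit character
// array type of length arrayLength. Throws TypeError to model "fail".
function stringToChar8Array(val, arrayLength) {
  // Reject strings that are not well-formed UTF-16 (unpaired surrogates).
  for (let i = 0; i < val.length; i++) {
    const c = val.charCodeAt(i);
    if (c >= 0xD800 && c <= 0xDBFF) {
      const next = i + 1 < val.length ? val.charCodeAt(i + 1) : 0;
      if (next < 0xDC00 || next > 0xDFFF) {
        throw new TypeError("unpaired high surrogate");
      }
      i++; // skip the low surrogate that completes the pair
    } else if (c >= 0xDC00 && c <= 0xDFFF) {
      throw new TypeError("unpaired low surrogate");
    }
  }
  // s = the UTF-8 bytes of val; n = the number of bytes in s.
  const s = new TextEncoder().encode(val);
  const n = s.length;
  if (arrayLength < n) {
    throw new TypeError("array too small for converted string");
  }
  // The first n elements are the bytes of s. If arrayLength > n, element n
  // is 0 because the buffer starts out zero-filled.
  const result = new Uint8Array(arrayLength);
  result.set(s);
  return result;
}
```

Note that the length check is against the UTF-8 byte count, not <code>''val''.length</code>: a one-character string such as <code>"\u00e9"</code> needs an array of at least two elements.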


* Otherwise ''t'' is a struct type.
* Otherwise ''t'' is a struct type.
Line 464: Line 513:
* If ''t'' is a pointer type and ''val'' is a number, <code>Int64</code> object, or <code>UInt64</code> object that can be exactly represented as an <code>intptr_t</code> or <code>uintptr_t</code>, the result is the same as casting that <code>intptr_t</code> or <code>uintptr_t</code> value to type ''t'' with a C-style cast.
* If ''t'' is a pointer type and ''val'' is a number, <code>Int64</code> object, or <code>UInt64</code> object that can be exactly represented as an <code>intptr_t</code> or <code>uintptr_t</code>, the result is the same as casting that <code>intptr_t</code> or <code>uintptr_t</code> value to type ''t'' with a C-style cast.


* If ''t'' is a pointer type, pointer-sized type, or 64-bit type, and ''val'' is a string consisting entirely of an optional minus sign, followed by the characters "0x" or "0X", followed by one or more hexadecimal digits, then the result is the same as casting the number named by ''val'' to type ''t'' with a C-style cast.
* If ''t'' is an integer type (not a character type) and ''val'' is a string consisting entirely of an optional minus sign, followed by either one or more decimal digits or the characters "0x" or "0X" and one or more hexadecimal digits, then the result is the same as casting the integer named by ''val'' to type ''t'' with a C-style cast.


* Otherwise fail.
* Otherwise fail.
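
The rule for strings that name integers (an optional minus sign followed by decimal digits, or by "0x"/"0X" and hexadecimal digits) can be sketched with <code>BigInt</code>. The function names are hypothetical, and modeling the C-style cast as two's-complement wraparound via <code>BigInt.asIntN</code>/<code>BigInt.asUintN</code> is an assumption about the target platform, not something the text above specifies.

```javascript
// Hypothetical parser for the accepted string syntax.
// Returns a BigInt; throws TypeError to model "fail".
function parseIntegerString(val) {
  // The whole string must match: optional "-", then hex or decimal digits.
  const m = /^(-?)(?:0[xX]([0-9a-fA-F]+)|([0-9]+))$/.exec(val);
  if (m === null) {
    throw new TypeError("not an integer string");
  }
  const magnitude = m[2] !== undefined ? BigInt("0x" + m[2]) : BigInt(m[3]);
  return m[1] === "-" ? -magnitude : magnitude;
}

// Hypothetical model of the cast to an integer type of the given width.
// Assumes two's-complement wraparound for out-of-range values.
function explicitConvertStringToInt(val, bits, signed) {
  const n = parseIntegerString(val);
  return signed ? BigInt.asIntN(bits, n) : BigInt.asUintN(bits, n);
}
```

For example, under these assumptions <code>explicitConvertStringToInt("-1", 32, false)</code> yields <code>4294967295n</code>, whereas strings with whitespace, a bare "0x", or trailing non-digits fail.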
Line 471: Line 520:
*If ''t'' and ''u'' represent the same built-in type, even <code>void</code>, return true.
*If ''t'' and ''u'' represent the same built-in type, even <code>void</code>, return true.
*If they are both pointer types, return <code>SameType(''t''.targetType, ''u''.targetType)</code>.
*If they are both pointer types, return <code>SameType(''t''.targetType, ''u''.targetType)</code>.
*If they are both array types, return <code>SameType(''t''.elementType, ''u''.targetType) &amp;&amp; ''t''.length === ''u''.length</code>.
*If they are both array types, return <code>SameType(''t''.elementType, ''u''.elementType) &amp;&amp; ''t''.length === ''u''.length</code>.
*If they are both struct types, return <code>''t'' === ''u''</code>.
*If they are both struct types, return <code>''t'' === ''u''</code>.
*Otherwise return false.
*Otherwise return false.
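
As a sketch, the <code>SameType</code> rules can be written as a short recursive function. The type representation here (plain objects with a <code>kind</code> field, built by the helpers <code>builtin</code>, <code>pointer</code>, <code>array</code>, and <code>struct</code>) is purely hypothetical and stands in for real <code>CType</code> objects.

```javascript
// Hypothetical stand-ins for CType objects.
const builtin = (name) => ({ kind: "builtin", name });
const pointer = (targetType) => ({ kind: "pointer", targetType });
const array = (elementType, length) => ({ kind: "array", elementType, length });
const struct = (name) => ({ kind: "struct", name });

// Model of SameType(t, u), following the rules in order.
function sameType(t, u) {
  if (t.kind === "builtin" && u.kind === "builtin") {
    return t.name === u.name; // same built-in type, including void
  }
  if (t.kind === "pointer" && u.kind === "pointer") {
    return sameType(t.targetType, u.targetType);
  }
  if (t.kind === "array" && u.kind === "array") {
    return sameType(t.elementType, u.elementType) && t.length === u.length;
  }
  if (t.kind === "struct" && u.kind === "struct") {
    return t === u; // struct types are compared by identity only
  }
  return false;
}
```

In this model two separately created struct types with the same name are not the same type, while two separately built pointer-to-<code>int32_t</code> types are.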
Line 491: Line 540:
  let ret = myfunc(2); // calls myfunc
  let ret = myfunc(2); // calls myfunc


Note that for simple types (integers and strings), we will autoconvert the argument at call time - there's no need to pass in a <code>ctypes.int32_t</code> object. The consumer should never need to instantiate such an object explicitly, unless they're using it to back a pointer - in which case we require explicit, strong typing. See later for examples.
Note that for simple types (integers and characters), we will autoconvert the argument at call time - there's no need to pass in a <code>ctypes.int32_t</code> object. The consumer should never need to instantiate such an object explicitly, unless they're using it to back a pointer - in which case we require explicit, strong typing. See later for examples.


Here is how to create an object of type <code>int32_t</code>:
Here is how to create an object of type <code>int32_t</code>:
Line 499: Line 548:
This allocates a new C++ object of type <code>int32_t</code> (4 bytes of memory), zeroes it out, and returns a JS object that manages the allocated memory. Whenever the JS object is garbage-collected, the allocated memory will be automatically freed.
This allocates a new C++ object of type <code>int32_t</code> (4 bytes of memory), zeroes it out, and returns a JS object that manages the allocated memory. Whenever the JS object is garbage-collected, the allocated memory will be automatically freed.


(Of course you don't normally need to do this, as js-ctypes will autoconvert JS numbers to various C/C++ types for you.)
Of course you don't normally need to do this, as js-ctypes will autoconvert JS numbers to various C/C++ types for you:


  let myfunc = mylib.declare("myfunc", ctypes.default_abi,
  let myfunc = mylib.declare("myfunc", ctypes.default_abi,
Line 509: Line 558:
<code>ctypes.int32_t</code> is a <code>CType</code>. Like all other CTypes, it can be used for type specification when passed as an object, as above. (This will work for user-defined <code>CTypes</code> such as structs and pointers also - see later.)
<code>ctypes.int32_t</code> is a <code>CType</code>. Like all other CTypes, it can be used for type specification when passed as an object, as above. (This will work for user-defined <code>CTypes</code> such as structs and pointers also - see later.)


This kind of object is called a <code>CData</code> object, and they are described in detail in the "<code>CData</code> objects" section above.
The object created by <code>new ctypes.int32_t</code> is called a <code>CData</code> object; such objects are described in detail in the "<code>CData</code> objects" section above.


Opaque pointers:
Opaque pointers:


  // A new opaque pointer type.
  // A new opaque pointer type.
  FILE_ptr = new ctypes.PointerType("FILE *");
  FILE_ptr = new ctypes.StructType("FILE").ptr;
   
   
  let fopen = mylib.declare("fopen", ctypes.default_abi,
  let fopen = mylib.declare("fopen", ctypes.default_abi,
     FILE_ptr, ctypes.string, ctypes.string);
     FILE_ptr, ctypes.char.ptr, ctypes.char.ptr);
  let file = fopen("foo", "r");
  let file = fopen("foo", "r");
  if (file.isNull())
  if (file.isNull())
     throw "fopen failed";
     throw "fopen failed";
  file.contents(); // TypeError: type is unknown
  file.contents(); // TypeError: type is unknown
''(Open issue: <code>fopen("foo", "r")</code> does not work under js-ctypes as currently specified.)''


Declaring a struct:
Declaring a struct:
Line 565: Line 616:
   
   
  // cast from (uint32_t *) to (uint8_t *)
  // cast from (uint32_t *) to (uint8_t *)
  let q = ctypes.cast(ctypes.uint8_t.ptr, p);
  let q = ctypes.cast(p, ctypes.uint8_t.ptr);
   
   
  // first byte of buffer
  // first byte of buffer
Line 599: Line 650:
  // Declare a function that returns a 64-bit unsigned int.
  // Declare a function that returns a 64-bit unsigned int.
  const getfilesize = mylib.declare("getfilesize", ctypes.default_abi,
  const getfilesize = mylib.declare("getfilesize", ctypes.default_abi,
     ctypes.uint64_t, ctypes.string);
     ctypes.uint64_t, ctypes.char.ptr);
   
   
  // This autoconverts to a UInt64 object, not a JS number, even though the
  // This autoconverts to a UInt64 object, not a JS number, even though the
Line 644: Line 695:
                 // (because m.x's getter autoconverts to an Int64 object)
                 // (because m.x's getter autoconverts to an Int64 object)
  getint64(ctypes.addressOfField(m, 'x')); // works
  getint64(ctypes.addressOfField(m, 'x')); // works
''(Open issue: As above, the implicit conversion from JS string to <code>char *</code> in <code>getfilesize("/usr/share/dict/words")</code> does not work in js-ctypes as specified.)''


''(TODO - make this a real example:)''
''(TODO - make this a real example:)''
Line 652: Line 705:


=Future directions=
=Future directions=
* Callbacks (JITting native wrappers that conform to a given C/C++ function-pointer type and call a JS function. Finding the right cx to use will be tricky.)
* Array slices, a way to get a <code>CData</code> object that acts like a view on a window of an array. E.g. ''carray''.slice(start, stop). Assigning one slice to another would memcpy.


=Implementation notes=
==Callbacks==
'''The ctypes instance.''' Currently, via ctypes.jsm, we instantiate a fresh ctypes instance each time the module is imported - the global 'ctypes' property will be a different object each time. This doesn't matter right now, because ctypes doesn't have any prototype objects of its own - just object constants and functions. With the API proposal above, it will have a CType object that serves as prototype for the other types. With this in mind, do we want to have 1) a single ctypes instance, which we stash in a C++ static global and hand out on demand; or 2) a separate ctypes instance per import?
 
The libffi part of this is presumably not too bad. Issues:
 
'''Lifetimes.''' C/C++ makes it impossible to track an object pointer. Both JavaScript's GC and experience with C/C++ function pointers will tend to discourage users from caring about function lifetimes.
 
I think the best solution to this problem is to put the burden of keeping the function alive entirely on the client.
 
'''Finding the right context to use.''' If we burn the cx right into the libffi closure, it will crash when called from a different thread or after the cx is destroyed. If we take a context at random from some internal JSAPI structure, it might be thread-safe, but the context's options and global will be random, which sounds dangerous. Perhaps ctypes itself can create a context per thread, on demand, for the use of function pointers. In a typical application, that would only create one context, if any.
 
==Converting strings==
 
I think we want an explicit API for converting strings, very roughly:
 
<code>CData</code> objects of certain pointer and array types have methods for reading and writing Unicode strings. These methods are present if the target or element type is an 8-bit character or integer type.
 
'''<code>''cdata''.readString(''[encoding[, length]]'')</code>''' - Read bytes from ''cdata'' and convert them to Unicode characters using the specified ''encoding'', returning a string. Specifically:
* If ''cdata'' is an array, let ''p'' = a pointer to the first element. Otherwise ''cdata'' is a pointer; let ''p'' = the value of ''cdata''.
* If ''encoding'' is <code>undefined</code> or omitted, the selected encoding is UTF-8. Otherwise, if ''encoding'' is a string naming a known character encoding, that encoding is selected. Otherwise throw a <code>TypeError</code>.
* If ''length'' is a size value, ''cdata'' is an array, and <code>''length'' &gt; ''cdata''.length</code>, then throw a <code>TypeError</code>.
* Otherwise, if ''length'' is a size value, take exactly ''length'' bytes starting at ''p'' and convert them to Unicode characters according to the selected encoding. ''(Open issue: Error handling.)'' Return a JavaScript string containing the Unicode characters, represented in UTF-16.  ''(The result may contain null characters.)''
* Otherwise, if ''length'' is <code>undefined</code> or omitted, convert bytes starting at ''p'' to Unicode characters according to the selected encoding. Stop when the end of the array is reached (if ''cdata'' is an array) or when a null character (U+0000) is found. ''(Open issue: Error handling.)'' Return a JavaScript string containing the Unicode characters, represented in UTF-16.  ''(If ''cdata'' is a pointer and there is no trailing null character, this can crash.)''
* Otherwise throw a <code>TypeError</code>.
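
The array case of these steps can be sketched in plain JavaScript. This is an approximation under several assumptions: the standalone function <code>readString</code> and its <code>bytes</code> parameter (a <code>Uint8Array</code> standing in for a character array <code>CData</code>) are hypothetical, the standard <code>TextDecoder</code> stands in for the codec library, malformed input is silently replaced rather than reported, and the scan stops at the first zero byte, which matches "a null character" only for encodings such as UTF-8. The pointer case cannot be modeled this way and is omitted.

```javascript
// Hypothetical model of cdata.readString(encoding, length) for arrays.
function readString(bytes, encoding, length) {
  // Select the encoding: UTF-8 by default, otherwise a known encoding name.
  let decoder;
  if (encoding === undefined) {
    decoder = new TextDecoder("utf-8");
  } else if (typeof encoding === "string") {
    try {
      decoder = new TextDecoder(encoding);
    } catch (e) {
      throw new TypeError("unknown encoding: " + encoding);
    }
  } else {
    throw new TypeError("encoding must be a string");
  }

  let end;
  if (length === undefined) {
    // No length: stop at the first zero byte or at the end of the array.
    const nul = bytes.indexOf(0);
    end = nul === -1 ? bytes.length : nul;
  } else if (Number.isInteger(length) && length >= 0) {
    // Explicit length: take exactly that many bytes, zero bytes included.
    if (length > bytes.length) {
      throw new TypeError("length exceeds array length");
    }
    end = length;
  } else {
    throw new TypeError("length must be a size value");
  }
  return decoder.decode(bytes.subarray(0, end));
}
```

With the bytes <code>[104, 105, 0, 120]</code>, omitting ''length'' returns <code>"hi"</code>, while passing a ''length'' of 4 returns a four-character string that contains the embedded null character.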
 
'''<code>''cdata''.writeString(''s'', ''[encoding[, length]]'')</code>''' - Determine the starting pointer ''p'' as above. If ''s'' is not a well-formed UTF-16 string, throw a <code>TypeError</code>.  ''(Open issue: Error handling.)'' Otherwise convert ''s'' to bytes in the specified ''encoding'' (default: UTF-8) and write at most ''length'' - 1 bytes, or all the converted bytes, if ''length'' is <code>undefined</code> or omitted, to memory starting at ''p''. Write a converted null character after the data. Return the number of bytes of data written, not counting the terminating null character.
 
''(Open issue: ''<code>''cdata''.writeString(...)</code>'' is awkward for the case where you want an autosized <code>ctypes.char.array()</code> to hold the converted data. If <code>''cdata''</code> happens to be too small for the resulting string, and you don't supply ''length'', you crash; and if you do supply ''length'', you don't know whether conversion was halted because the target array was of insufficient length.)''
 
''(Open issue: As proposed, these are not suitable for working with encodings where a zero byte might not indicate the end of text. For example, a string encoded in UTF-16 will typically contain a lot of zero bytes. Unfortunately, in the case of readString, the underlying library demands the length up front.)''
 
''(Open issue: These methods offer no error handling options, which is pretty weak. Real-world code often wants to allow a few characters to be garbled rather than fail. For now we will likely be limited to whatever the underlying codec library, <code>nsIScriptableUnicodeConverter</code>, can do.)''
 
''(Open issue: 16-bit versions too, for UTF-16?)''
 
==isNull==
 
If we do not convert NULL pointers to JS <code>null</code> (and I may have changed my mind about this) then we need:
 
'''<code>''cptr''.isNull()</code>''' - Return <code>true</code> if ''cptr''<nowiki>'</nowiki>s value is a null pointer, <code>false</code> otherwise.
 
==Auto-converting strings==
 
There are several issues:
 
'''Lifetimes.''' This problem arises when autoconverting from JS to C/C++ only.
 
When passing a string to a foreign function, like <code>foo(s)</code>, what is the lifetime of the autoconverted pointer? We're comfortable with guaranteeing <code>s</code> for the duration of the call. But then there are situations like
 
TenStrings = char.ptr.array(10);
var arr = new TenStrings();
arr[0] = s;  // What is the lifetime of the data arr[0] points to?
 
The more implicit conversion we allow, the greater a problem this is; it's a tough trade-off.
 
'''Non-null-terminated strings.''' This problem arises when autoconverting from C/C++ to JS only. It applies to C/C++ character arrays as well as pointers (but it's worse when dealing with pointers).
 
In C/C++, the type <code>char *</code> effectively promises nothing about the pointed-to data. Autoconverting would make it hard to use APIs that return non-null-terminated strings (or structs containing <code>char *</code> pointers that aren't logically strings). The workaround would be to declare them as a different type.
 
'''Unicode.''' This problem does not apply to conversions between JS strings and <code>char16_t</code> arrays or pointers; only <code>char</code> arrays or pointers.


* With 1), I'm not sure if there would be issues with having a single ctypes object span multiple JSRuntimes or somesuch. If so, we might be stuck. If not, we will need to carefully seal the ctypes object and its children, and it won't be able to have a __parent__. Threadsafety should be trivial; at most we will need to make the ctypes init function safe against reentrance.
Converting both ways raises issues about what encoding should be assumed. We assume JS strings are UTF-16 and <code>char</code> strings are UTF-8, which is not the right thing on Windows. However Windows offers a lot of APIs that accept 16-bit strings and, for those, <code>char16_t</code> is the right thing.


* With 2), we will have different CType proto objects per ctypes instance. Will this be an issue for code wanting to pass ctypes objects between module instances? ctypes does not internally depend on prototype object equality, only on JSClass equality, so it will work. Consumers will have trouble if they want to compare prototype objects, however. I'm not sure if this is a big deal. Also, doing this means that we need to stash the CType proto object somewhere, so it's accessible from C++. I don't think we can depend on having the 'ctypes' object hanging off the global object - unless we make it readonly - in which case we need to have the CType proto object hanging off each object we create. (Which we should get for free, as long as all the type object __proto__'s are readonly!) So, either we make the 'ctypes' property readonly, or we make {every type object}.__proto__ readonly, or both.
'''Casting away const.''' This problem arises only when converting from a JS string to a C/C++ pointer type. The string data must not be modified, but the C/C++ types <code>char *</code> and <code>char16_t *</code> suggest that the referent might be modified.