Jsctypes/api: Difference between revisions

2,273 bytes added, 20 October 2009
Remove the string types. Auto-convert from JS strings to C/C++ array-of-char types, but not vice versa.
Line 38: Line 38:


:'''<code>ctypes.jschar</code>''' - A 16-bit unsigned character type. (This is distinct from <code>uint8_t</code> in details of conversion behavior. js-ctypes autoconverts C <code>jschar</code>s to JavaScript strings of length 1.)
:'''<code>ctypes.string, ustring</code>''' - String types. The C/C++ type for <code>ctypes.string</code> is <code>const char *</code>. C/C++ values of this type must be either <code>null</code> or pointers to null-terminated strings. <code>ctypes.ustring</code> is the same, but for <code>const jschar *</code>; that is, the code units of the string are <code>uint16_t</code>.


:'''<code>ctypes.void_t</code>''' - The special C type <code>void</code>. This can be used as a return value type.  (<code>void</code> is a keyword in JavaScript.)
Line 101: Line 99:
  ctypes.void_t.name
   ===> "void"
  ctypes.ustring.name
  ctypes.jschar.ptr.name
   ===> "const jschar *"
   ===> "jschar *"
   
   
  const FILE = new ctypes.PointerType("FILE *");
Line 117: Line 115:
     new ctypes.PointerType(
       new ctypes.PointerType(
         new ctypes.ArrayType(ctypes.string, 4)));
         new ctypes.ArrayType(new ctypes.PointerType(ctypes.char), 4)));
  ptrTo_ptrTo_arrayOf4_strings.name
   ===> "const char *(**)[4]"
   ===> "char *(**)[4]"


:'''<code>''t''.ptr</code>''' - Return <code>ctypes.PointerType(''t'')</code>.
Line 129: Line 127:
:Thus a quicker (but still almost as confusing) way to write the type in the previous example would be:


  const ptrTo_ptrsTo_arrayOf4_strings = ctypes.string.array(4).ptr.ptr;
  const ptrTo_ptrTo_arrayOf4_strings = ctypes.char.ptr.array(4).ptr.ptr;


:''(<code>.array()</code> requires parentheses but <code>.ptr</code> doesn't. Rationale: <code>.array()</code> has to be able to handle an optional parameter. Note that in C/C++, to write an array type requires brackets, optionally with a number in between:  <code>int [10]</code> --> <code>ctypes.int.array(10)</code>. Writing a pointer type does not require the brackets.)''
Line 175: Line 173:
:Every <code>CType</code> has a read-only, permanent <code>.prototype</code> property.  The type-constructors <code>ctypes.{C,Pointer,Struct,Array}Type</code> each have a read-only, permanent <code>.prototype</code> property as well.


:Types have a hierarchy of prototype objects. The prototype of <code>ctypes.CType.prototype</code> is <code>Function.prototype</code>. The prototype of <code>ctypes.{Array,Struct,Pointer}Type.prototype</code> and of all the builtin types except for the string types and <code>ctypes.voidptr_t</code> is <code>ctypes.CType.prototype</code>. The prototype of an array type is <code>ctypes.ArrayType.prototype</code>. The prototype of a struct type is <code>ctypes.StructType.prototype</code>. The prototype of a string type or pointer type is <code>ctypes.PointerType.prototype</code>.
:Types have a hierarchy of prototype objects. The prototype of <code>ctypes.CType.prototype</code> is <code>Function.prototype</code>. The prototype of <code>ctypes.{Array,Struct,Pointer}Type.prototype</code> and of all the builtin types except <code>ctypes.voidptr_t</code> is <code>ctypes.CType.prototype</code>. The prototype of an array type is <code>ctypes.ArrayType.prototype</code>. The prototype of a struct type is <code>ctypes.StructType.prototype</code>. The prototype of a pointer type is <code>ctypes.PointerType.prototype</code>.


:Every <code>CType</code> ''t'' has <code>''t''.prototype.constructor === ''t''</code>; that is, its <code>.prototype</code> has a read-only, permanent, own <code>.constructor</code> property that refers to the type. The same is true of the four type constructors <code>ctypes.{C,Array,Struct,Pointer}Type</code>.
Line 267: Line 265:


:'''<code>''carray''.addressOfElement(''i'')</code>''' - Return a new <code>CData</code> object of the appropriate pointer type (<code>ctypes.PointerType(''carray''.constructor.elementType)</code>) whose value points to element ''i'' of ''carray''. If ''i'' is not a JavaScript number that is a valid index of ''carray'', throw a <code>TypeError</code>.
''(TODO: specify a way to read a C/C++ string and transcode it into a JS string.)''


== Aliasing ==
Line 379: Line 379:
* If ''x'' is of type <code>jschar</code>, return a JavaScript string of length 1 containing the value of ''x'' (like <code>String.fromCharCode(x)</code>).


* If ''x'' is of any other character type, select the corresponding Unicode character. ''(Open issue: Unicode conversions.)'' Convert the character to UTF-16. Return a JavaScript string containing the UTF-16 code units. (If the character type is 1 or 2 bytes, as it is on all platforms we care about, the result is a one-character JavaScript string.)
* If ''x'' is of any other character type, select the corresponding Unicode character. ''(Open issue: Unicode conversions.)'' Convert the character to UTF-16. Return a JavaScript string containing the UTF-16 code units. (If the character type is 1 or 2 bytes, as it is on all platforms we care about, the result is a one-character JavaScript string.) ''(Note: If we ever support <code>wchar_t</code>, it might be best to autoconvert it to a number. On platforms where <code>wchar_t</code> is 32 bits, values over <code>0x10ffff</code> are not Unicode characters.)''
 
* If ''x'' is of a string type and is <code>NULL</code>, return <code>null</code>.
 
* If ''x'' is of type <code>cstring</code> and is non-null, transcode it to UTF-16 and return a JavaScript string containing the UTF-16 code units.  ''(Open issue: Unicode conversions.)''
 
* If ''x'' is of type <code>ustring</code> and is non-null, return a JavaScript string containing the same sequence of 16-bit characters.


* Otherwise ''x'' is of an array, struct, or pointer type. If the argument ''x'' is already a <code>CData</code> object, return it. Otherwise allocate a  buffer containing a copy of the C/C++ value ''x'', and return a <code>CData</code> object of the appropriate type referring to the object in the new buffer.
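
As a rough illustration of the character cases above, the following plain-JavaScript sketch maps a numeric C character value to a JavaScript string. It is not js-ctypes code: <code>charValueToJSString</code> is a hypothetical helper, and it assumes the character value is read directly as a Unicode code point, which the open issue on Unicode conversions leaves undecided.

```javascript
// Hypothetical helper, not part of js-ctypes.
// Assumption: the C character value is interpreted as a Unicode code point.
function charValueToJSString(value) {
  // Values above 0x10ffff (possible with a 32-bit wchar_t) are not Unicode characters.
  if (!Number.isInteger(value) || value < 0 || value > 0x10ffff) {
    throw new TypeError("not a Unicode code point: " + value);
  }
  // String.fromCodePoint emits one UTF-16 code unit for values up to 0xffff
  // and a surrogate pair (two code units) for larger values.
  return String.fromCodePoint(value);
}

console.log(charValueToJSString(0x41));           // "A"
console.log(charValueToJSString(0x1f600).length); // 2
```

Under this assumption a 1- or 2-byte character value always yields a string of length 1; only a wider type could produce a surrogate pair or an out-of-range value.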
Line 418: Line 412:
* If ''t'' is any other character type:
:* If ''val'' is a string:
::* If the 16-bit elements of ''val'' are not the UTF-16 encoding of a single Unicode character, fail.
::* If the 16-bit elements of ''val'' are not the UTF-16 encoding of a single Unicode character, fail. ''(Open issue: If we support <code>wchar_t</code> we may want to allow unpaired surrogate code points to pass through without error.)''
::* If that Unicode character can be represented by a single character of type ''t'', the result is that character. ''(Open issue: Unicode conversions.)''
::* Otherwise fail.
:* If ''val'' is a number that can be exactly represented as a value of type ''t'', the result is that value.  (This is sensitive to the signedness of ''t''.)
:* Otherwise fail.
* If ''t'' is a string type:
:* If ''val'' is <code>null</code>, the result is a C/C++ <code>NULL</code> pointer of type ''t''.
:* If ''val'' is a string and ''t'' is <code>ustring</code>, the result is a pointer to the first character of ''val''. The resulting pointer will remain valid as long as the string ''val'' remains reachable. (Note that if the characters of ''val'' are modified in any way, the consequences can be arbitrarily bad.)
:* If ''val'' is a string and ''t'' is <code>string</code>, convert the string to type ''t'' in an implementation-defined way. The resulting pointer will remain valid as long as the string ''val'' remains reachable. ''(Open issue: Unicode conversions.)''
:* Otherwise fail.
:* Otherwise fail.
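
The string-to-character rules above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: <code>stringToCharValue</code> is a hypothetical helper, the target type is taken to be an unsigned character type of <code>bits</code> bits, and a Unicode character is taken to be representable when its code point fits in that many bits (the Unicode conversion itself is an open issue).

```javascript
// Hypothetical helper, not part of js-ctypes.
// Assumptions: the target type is unsigned and `bits` wide, and it represents
// a character by its Unicode code point.
function stringToCharValue(val, bits) {
  if (typeof val !== "string" || val.length === 0 || val.length > 2) {
    throw new TypeError("not a single Unicode character");
  }
  const codePoint = val.codePointAt(0);
  // codePointAt returns a lone surrogate unchanged; that is not a character.
  if (codePoint >= 0xd800 && codePoint <= 0xdfff) {
    throw new TypeError("unpaired surrogate");
  }
  // One code unit for the BMP, two for supplementary characters; anything
  // left over means the string holds more than one character.
  const expectedLength = codePoint > 0xffff ? 2 : 1;
  if (val.length !== expectedLength) {
    throw new TypeError("not a single Unicode character");
  }
  if (codePoint >= 2 ** bits) {
    throw new TypeError("character does not fit in the target type");
  }
  return codePoint;
}

console.log(stringToCharValue("A", 8)); // 65
```

Strings such as <code>"ab"</code> or a lone <code>"\ud83d"</code> fail in this sketch, matching the "fail" branches of the list.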


Line 434: Line 422:
:* If ''val'' is a <code>CData</code> object of array type ''u'' and either ''t'' is <code>ctypes.voidptr_t</code> or <code>SameType(''t''.targetType, ''u''.elementType)</code>, return a pointer to the first element of the array.
:* If ''t'' is <code>ctypes.voidptr_t</code> and ''val'' is a <code>CData</code> object of pointer type, return the value of the C/C++ pointer in ''val'', cast to <code>void *</code>.
:* Otherwise fail.  ''(Rationale: We don't convert strings to pointers yet partly because we're lazy and partly because it would implicitly cast away const. We don't convert JavaScript arrays to pointers because this would have to allocate a C array implicitly, raising issues about who should deallocate it, and when, and how they know it's their responsibility.)''
:* Otherwise fail.  ''(Rationale: We don't convert strings to pointers yet; see the "Auto-converting strings" section below. We don't convert JavaScript arrays to pointers because this would have to allocate a C array implicitly, raising issues about who should deallocate it, and when, and how they know it's their responsibility.)''


* If ''t'' is an array type:
:* If ''val'' is not a JavaScript object, fail. ''(We could reasonably convert JS strings to arrays of char types, but let's skip it for now.)''
:* If <code>''val''.length</code> is not a nonnegative integer, fail.
:* If <code>''val''.length !== ''t''.length</code>, fail.
:* Otherwise, the result is a C/C++ array of <code>''val''.length</code> elements of type <code>''t''.elementType</code>. Element ''i'' of the result is <code>ImplicitConvert(''val''[''i''], ''t''.elementType)</code>.
:* If ''val'' is a JavaScript string:
::* If <code>''t''.elementType</code> is <code>jschar</code> and <code>''t''.length &gt;= ''val''.length</code>, the result is an array of type ''t'' whose first <code>''val''.length</code> elements are the 16-bit elements of ''val''. If <code>''t''.length &gt; ''val''.length</code>, then element <code>''val''.length</code> of the result is a null character. The values of the rest of the array elements are unspecified.
::* If <code>''t''.elementType</code> is an 8-bit character type:
:::* If ''val'' is not well-formed UTF-16, fail.
:::* Let ''s'' = a sequence of bytes, the result of converting ''val'' from UTF-16 to UTF-8.
:::* Let ''n'' = the number of bytes in ''s''.
:::* If <code>''t''.length &lt; ''n''</code>, fail.
:::* The result is an array of type ''t'' whose first ''n'' elements are the 8-bit values in ''s''. If <code>''t''.length &gt; ''n''</code>, then element ''n'' of the result is 0. The values of the rest of the array elements are unspecified.
::* Otherwise fail.
:* If ''val'' is a JavaScript object:
::* If <code>''val''.length</code> is not a nonnegative integer, fail.
::* If <code>''val''.length !== ''t''.length</code>, fail.
::* Otherwise, the result is a C/C++ array of <code>''val''.length</code> elements of type <code>''t''.elementType</code>. Element ''i'' of the result is <code>ImplicitConvert(''val''[''i''], ''t''.elementType)</code>.
:* Otherwise fail.
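
The string-to-array rules above can be sketched in plain JavaScript, with typed arrays standing in for C arrays. The helper names (<code>isWellFormedUTF16</code>, <code>stringToJscharArray</code>, <code>stringToCharArray</code>) are hypothetical and not js-ctypes API, and the sketch assumes a runtime that provides <code>TextEncoder</code>. Typed arrays are zero-filled, which is just one permitted choice for the elements the rules leave unspecified.

```javascript
// Hypothetical helpers, not part of js-ctypes. Typed arrays stand in for C arrays.

// True if every surrogate in str is part of a high/low pair.
function isWellFormedUTF16(str) {
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) return false;
      i++; // skip the low surrogate just checked
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      return false; // low surrogate with no preceding high surrogate
    }
  }
  return true;
}

// jschar case: copy the 16-bit code units unchanged.
function stringToJscharArray(val, length) {
  if (length < val.length) throw new TypeError("array too short");
  const result = new Uint16Array(length); // zero-filled, so a terminator is present if there is room
  for (let i = 0; i < val.length; i++) result[i] = val.charCodeAt(i);
  return result;
}

// 8-bit character case: transcode UTF-16 to UTF-8.
function stringToCharArray(val, length) {
  if (!isWellFormedUTF16(val)) throw new TypeError("string is not well-formed UTF-16");
  const s = new TextEncoder().encode(val); // UTF-8 bytes
  const n = s.length;
  if (length < n) throw new TypeError("array too short");
  const result = new Uint8Array(length);   // element n is 0 whenever length > n
  result.set(s);
  return result;
}

console.log(Array.from(stringToCharArray("h\u00e9", 4))); // [ 104, 195, 169, 0 ]
```

Note that the fit check for 8-bit arrays uses the UTF-8 byte count, not <code>''val''.length</code>: the two-character string in the usage line needs three bytes, and it fits an array of length 3 without a terminator.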


* Otherwise ''t'' is a struct type.
Line 464: Line 462:
* If ''t'' is a pointer type and ''val'' is a number, <code>Int64</code> object, or <code>UInt64</code> object that can be exactly represented as an <code>intptr_t</code> or <code>uintptr_t</code>, the result is the same as casting that <code>intptr_t</code> or <code>uintptr_t</code> value to type ''t'' with a C-style cast.


* If ''t'' is a pointer type, pointer-sized type, or 64-bit type, and ''val'' is a string consisting entirely of an optional minus sign, followed by the characters "0x" or "0X", followed by one or more hexadecimal digits, then the result is the same as casting the number named by ''val'' to type ''t'' with a C-style cast.
* If ''t'' is a pointer type, pointer-sized type, or 64-bit type, and ''val'' is a string consisting entirely of an optional minus sign, followed by the characters "0x" or "0X", followed by one or more hexadecimal digits, then the result is the same as casting the number named by ''val'' to type ''t'' with a C-style cast. ''(Open issue: The conversion from string to pointer is very likely to be dropped. It would collide with auto-converting strings to pointer types like <code>char.ptr</code>, and the workaround is pretty straightforward:  <code>''t''(''val'') &rarr; ''t''(uintptr_t(''val''))</code>.)''


* Otherwise fail.
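
The hexadecimal-string rule above can be illustrated with a small plain-JavaScript parser. This is a sketch only: <code>parseHexString</code> is a hypothetical helper, and it assumes a 64-bit unsigned target and models the C-style cast as reduction modulo 2<sup>64</sup>.

```javascript
// Hypothetical helper, not part of js-ctypes.
// Assumption: the target is a 64-bit unsigned type, and the C-style cast
// is modeled as wrapping modulo 2^bits.
function parseHexString(val, bits = 64) {
  // Optional minus sign, then "0x" or "0X", then one or more hex digits; nothing else.
  const match = /^(-?)0[xX]([0-9a-fA-F]+)$/.exec(val);
  if (match === null) {
    throw new TypeError("not a hexadecimal literal: " + val);
  }
  const magnitude = BigInt("0x" + match[2]);
  const value = match[1] === "-" ? -magnitude : magnitude;
  return BigInt.asUintN(bits, value);
}

console.log(parseHexString("0x10"));          // 16n
console.log(parseHexString("-0x1") === 0xffffffffffffffffn); // true
```

Strings such as <code>"10"</code>, <code>"0x"</code>, or <code>" 0x10"</code> are rejected, since the rule requires the string to consist entirely of the sign, prefix, and digits.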
Line 491: Line 489:
  let ret = myfunc(2); // calls myfunc


Note that for simple types (integers and strings), we will autoconvert the argument at call time - there's no need to pass in a <code>ctypes.int32_t</code> object. The consumer should never need to instantiate such an object explicitly, unless they're using it to back a pointer - in which case we require explicit, strong typing. See later for examples.
Note that for simple types (integers and characters), we will autoconvert the argument at call time - there's no need to pass in a <code>ctypes.int32_t</code> object. The consumer should never need to instantiate such an object explicitly, unless they're using it to back a pointer - in which case we require explicit, strong typing. See later for examples.


Here is how to create an object of type <code>int32_t</code>:
Line 517: Line 515:
   
   
  let fopen = mylib.declare("fopen", ctypes.default_abi,
     FILE_ptr, ctypes.string, ctypes.string);
     FILE_ptr, ctypes.char.ptr, ctypes.char.ptr);
  let file = fopen("foo", "r");
  if (file.isNull())
     throw "fopen failed";
  file.contents(); // TypeError: type is unknown
''(Open issue: <code>fopen("foo", "r")</code> does not work under js-ctypes as currently specified.)''


Declaring a struct:
Line 599: Line 599:
  // Declare a function that returns a 64-bit unsigned int.
  const getfilesize = mylib.declare("getfilesize", ctypes.default_abi,
     ctypes.uint64_t, ctypes.string);
     ctypes.uint64_t, ctypes.char.ptr);
   
   
  // This autoconverts to a UInt64 object, not a JS number, even though the
Line 644: Line 644:
                 // (because m.x's getter autoconverts to an Int64 object)
  getint64(ctypes.addressOfField(m, 'x')); // works
''(Open issue: As above, the implicit conversion from JS string to <code>char *</code> in <code>getfilesize("/usr/share/dict/words")</code> does not work in js-ctypes as specified.)''


''(TODO - make this a real example:)''
Line 654: Line 656:
* Callbacks (JITting native wrappers that conform to a given C/C++ function-pointer type and call a JS function. Finding the right cx to use will be tricky.)
* Array slices, a way to get a <code>CData</code> object that acts like a view on a window of an array. E.g. ''carray''.slice(start, stop). Assigning one slice to another would memcpy.
==Auto-converting strings==
There are several issues:
# '''Lifetimes.''' This problem arises when autoconverting from JS to C/C++ only.
:When passing a string to a foreign function, like <code>foo(s)</code>, what is the lifetime of the autoconverted pointer? We're comfortable with guaranteeing that the pointer converted from <code>s</code> remains valid for the duration of the call. But then there are situations like:
  const TenStrings = ctypes.char.ptr.array(10);
  var arr = new TenStrings();
  arr[0] = s;  // What is the lifetime of the data arr[0] points to?
:The more implicit conversion we allow, the greater a problem this is; it's a tough trade-off.
# '''Non-null-terminated strings.''' This problem arises when autoconverting from C/C++ to JS only. It applies to C/C++ character arrays as well as pointers (but it's worse when dealing with pointers).
:In C/C++, the type <code>char *</code> effectively promises nothing about the pointed-to data. Autoconverting would make it hard to use APIs that return non-null-terminated strings (or structs containing <code>char *</code> pointers that aren't logically strings). The workaround would be to declare them as a different type.
# '''Unicode.''' This problem does not apply to conversions between JS strings and <code>jschar</code> arrays or pointers; only <code>char</code> arrays or pointers.
:Converting both ways raises issues about what encoding should be assumed. We assume JS strings are UTF-16 and <code>char</code> strings are UTF-8, which is not the right thing on Windows. However, Windows offers a lot of APIs that accept 16-bit strings, and for those <code>jschar</code> is the right thing.
# '''Casting away const.''' This problem arises only when converting from a JS string to a C/C++ pointer type.  The string data must not be modified, but the C/C++ types <code>char *</code> and <code>jschar *</code> suggest that the referent might be modified.
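
To make the null-termination and Unicode issues concrete, here is a plain-JavaScript sketch of reading a C string out of a byte buffer. It is not js-ctypes code: <code>readCString</code> is a hypothetical helper, a <code>Uint8Array</code> stands in for a C <code>char</code> array, the bytes are assumed to be UTF-8 as stated above, and the sketch assumes a runtime that provides <code>TextDecoder</code>. The explicit <code>maxLength</code> bound is an assumption added so that data without a terminator cannot cause a read past the buffer.

```javascript
// Hypothetical helper, not part of js-ctypes. A Uint8Array stands in for a C char array.
// Assumption: the bytes are UTF-8; the caller supplies an upper bound on the length.
function readCString(bytes, maxLength = bytes.length) {
  const limit = Math.min(maxLength, bytes.length);
  let end = 0;
  // Stop at the first null byte, or at the bound if no terminator is found.
  while (end < limit && bytes[end] !== 0) end++;
  // fatal: true makes malformed UTF-8 throw a TypeError instead of being replaced.
  const decoder = new TextDecoder("utf-8", { fatal: true });
  return decoder.decode(bytes.subarray(0, end));
}

const buffer = Uint8Array.from([0x68, 0xc3, 0xa9, 0x00, 0x78]);
console.log(readCString(buffer));        // "hé"
console.log(readCString(buffer, 1));     // "h"
```

With a bare <code>char *</code> there is no buffer length to fall back on, which is why automatic conversion from pointers is riskier than conversion from arrays.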


=Implementation notes=