ServerJS/Binary/B: Difference between revisions

s/codec/charset/ in most places, and noted that ByteString().toByteString() can return itself instead of making a copy since it is immutable.
Line 3: Line 3:
= Philosophy =
= Philosophy =


This proposal is not an object-oriented variation on pack and unpack with notions of inherent endianness, read/write head position, or intrinsic codec information.  The objects described in this proposal are merely for the storage and direct manipulation of strings and arrays of byte data.  Some object-oriented conveniences are made, but implementing pack, unpack, or an object-oriented analog thereof is left as an exercise for a future proposal of a more abstract type or a 'struct' module (as mentioned by Ionut Gabriel Stan on [http://groups.google.com/group/serverjs/msg/592442ba98c6c70e the list]).  This goes against most mentioned [[ServerJS/Binary|prior art]].
This proposal is not an object-oriented variation on pack and unpack with notions of inherent endianness, read/write head position, or intrinsic codec or charset information.  The objects described in this proposal are merely for the storage and direct manipulation of strings and arrays of byte data.  Some object-oriented conveniences are made, but implementing pack, unpack, or an object-oriented analog thereof is left as an exercise for a future proposal of a more abstract type or a 'struct' module (as mentioned by Ionut Gabriel Stan on [http://groups.google.com/group/serverjs/msg/592442ba98c6c70e the list]).  This goes against most mentioned [[ServerJS/Binary|prior art]].


This proposal also does not provide named member functions for any particular subset of the possible codecs or digests that might operate on a byte string or array.  Instead, convenience member functions are provided for interfacing with any named codec or digest module, assuming that the given module exports the specified interface. (As supported originally by Robert Schultz, Davey Waterson, Ross Boucher, and tacitly myself, Kris Kowal, on the [http://groups.google.com/group/serverjs/browse_thread/thread/be72ef3d8146731d/06c27162b698eef5?lnk=gst First proposition] thread on the mailing list).  This proposal does not address the need for stream objects to support pipelined codecs and hash digests (mentioned by Tom Robinson and Robert Schultz in the same conversation).
This proposal also does not provide named member functions for any particular subset of the possible charsets, codecs, compression algorithms, or digests that might operate on a byte string or array.  Instead, convenience member functions are provided for interfacing with any named codec or digest module, assuming that the given module exports the specified interface. (As supported originally by Robert Schultz, Davey Waterson, Ross Boucher, and tacitly myself, Kris Kowal, on the [http://groups.google.com/group/serverjs/browse_thread/thread/be72ef3d8146731d/06c27162b698eef5?lnk=gst First proposition] thread on the mailing list).  This proposal does not address the need for stream objects to support pipelined codecs and hash digests (mentioned by Tom Robinson and Robert Schultz in the same conversation).


This proposal also reflects both group sentiment and a pragmatic point about properties.  This isn't a decree that properties like "length" should be consistently used throughout the ServerJS APIs.  However, given that all platforms support properties at the native level (to host String and Array objects) and that byte strings and arrays will require support at the native level, pursuing client-side interoperability is beyond the scope of this proposal and therefore properties have been specified.  (See comments by Kris Zyp about the implementability of properties in all platforms, comments by Davey Waterson from Aptana about the counter-productivity of attempting to support this API in browsers, and support for properties over accessor and mutator functions by Ionut Gabriel Stan and Cameron McCormack on the [http://groups.google.com/group/serverjs/browse_thread/thread/be72ef3d8146731d/06c27162b698eef5?lnk=gst mailing list]).
This proposal also reflects both group sentiment and a pragmatic point about properties.  This isn't a decree that properties like "length" should be consistently used throughout the ServerJS APIs.  However, given that all platforms support properties at the native level (to host String and Array objects) and that byte strings and arrays will require support at the native level, pursuing client-side interoperability is beyond the scope of this proposal and therefore properties have been specified.  (See comments by Kris Zyp about the implementability of properties in all platforms, comments by Davey Waterson from Aptana about the counter-productivity of attempting to support this API in browsers, and support for properties over accessor and mutator functions by Ionut Gabriel Stan and Cameron McCormack on the [http://groups.google.com/group/serverjs/browse_thread/thread/be72ef3d8146731d/06c27162b698eef5?lnk=gst mailing list]).


The byte types provide functions for encoding, decoding, and transcoding, but they are all shallow interfaces that defer to a codec manager module, which may in turn use a system-level codec or a pair of pure JavaScript modules to transcode through an array or stream of canonical Unicode code points.
The byte types provide functions for encoding, decoding, and transcoding, but they are all shallow interfaces that defer to a charset manager module, which may in turn use a system-level charset or a pair of pure JavaScript modules to transcode through an array or stream of canonical Unicode code points.
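
The pure JavaScript path described above, transcoding through an array of Unicode code points, might look roughly like the following. This is a minimal sketch, not the proposal's interface: the <code>latin1</code> and <code>utf8</code> objects, their <code>decode</code>/<code>encode</code> methods, and the <code>transcode</code> function are hypothetical names chosen for illustration, and the UTF-8 side only encodes and does not validate surrogates.

```javascript
// Hypothetical charset modules (not defined by the proposal): each one maps
// between an array of byte values and an array of Unicode code points.
const latin1 = {
  // ISO-8859-1 bytes map one-to-one onto code points 0..255.
  decode: (bytes) => bytes.slice(),
};

const utf8 = {
  // Encode each code point as 1 to 4 UTF-8 bytes (no surrogate validation).
  encode: (codePoints) => {
    const out = [];
    for (const cp of codePoints) {
      if (cp < 0x80) {
        out.push(cp);
      } else if (cp < 0x800) {
        out.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
      } else if (cp < 0x10000) {
        out.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
      } else {
        out.push(
          0xf0 | (cp >> 18),
          0x80 | ((cp >> 12) & 0x3f),
          0x80 | ((cp >> 6) & 0x3f),
          0x80 | (cp & 0x3f)
        );
      }
    }
    return out;
  },
};

// Transcode by decoding to code points with one module and encoding with the other.
function transcode(bytes, source, target) {
  return target.encode(source.decode(bytes));
}
```

For instance, <code>transcode([0x63, 0x61, 0x66, 0xe9], latin1, utf8)</code> turns the ISO-8859-1 bytes of "café" into their five-byte UTF-8 form.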


= Specification =
= Specification =
Line 28: Line 28:
: Use the numbers in arrayOfNumbers as the bytes.
: Use the numbers in arrayOfNumbers as the bytes.
: If any element is outside the range 0...255, an exception (''TODO'') is thrown.
: If any element is outside the range 0...255, an exception (''TODO'') is thrown.
; ByteString(string, codec)
; ByteString(string, charset)
: Convert a string. The ByteString will contain string encoded with codec.
: Convert a string. The ByteString will contain string encoded with charset.
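
The range check on the array-of-numbers constructor above could be sketched as follows. The exception type is still marked ''TODO'' in the text, so <code>RangeError</code> is an assumption here, and <code>bytesFromNumbers</code> is a hypothetical helper name rather than part of the proposal.

```javascript
// Hypothetical helper: copies the input and rejects anything that is not an
// integer in 0...255. RangeError is an assumption; the proposal leaves the
// exception type as a TODO.
function bytesFromNumbers(arrayOfNumbers) {
  return arrayOfNumbers.map((n, i) => {
    if (!Number.isInteger(n) || n < 0 || n > 255) {
      throw new RangeError("element " + i + " is outside 0...255: " + n);
    }
    return n;
  });
}
```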


=== Instance properties ===
=== Instance properties ===
Line 43: Line 43:
: Returns a transcoded copy in a ByteArray.
: Returns a transcoded copy in a ByteArray.
; toByteString()
; toByteString()
: Copy.
: Returns itself, since there's no need to copy an immutable ByteString.
; toByteString(sourceCodec, targetCodec)
; toByteString(sourceCodec, targetCodec)
: Returns a transcoded copy.
: Returns a transcoded copy.
; toArray()
; toArray()
: Returns an array containing the bytes as numbers.
: Returns an array containing the bytes as numbers.
; toArray(codec)
; toArray(charset)
: Returns an array containing the decoded Unicode code points.
: Returns an array containing the decoded Unicode code points.
; toString()
; toString()
: Returns a debug representation like "[ByteString 10]", where 10 is the length of the ByteString.
: Returns a debug representation like "[ByteString 10]", where 10 is the length of the ByteString.
; decodeToString(codec)
; decodeToString(charset)
: Returns the ByteString decoded as a string.
: Returns the ByteString decoded as a string.
; indexOf(byte)
; indexOf(byte)
Line 88: Line 88:
* valueOf()
* valueOf()


ByteString does not implement toUpperCase() or toLowerCase() since they are not meaningful without the context of a codec.
ByteString does not implement toUpperCase() or toLowerCase() since they are not meaningful without the context of a charset.
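
The point about toByteString() returning itself can be illustrated with a small stand-in. This is only a sketch under the assumption that immutability is modelled with <code>Object.freeze</code>; <code>SketchByteString</code> and its <code>_bytes</code> field are hypothetical, and a real ByteString would be implemented natively.

```javascript
// Illustrative stand-in only, not the proposal's implementation.
class SketchByteString {
  constructor(arrayOfNumbers) {
    // Keep a frozen private copy so later changes to the input cannot leak in.
    this._bytes = Object.freeze(arrayOfNumbers.slice());
    this.length = this._bytes.length;
    Object.freeze(this);
  }

  // Nothing can mutate the bytes, so sharing the instance is as safe as copying it.
  toByteString() {
    return this;
  }

  // Arrays are mutable, so callers get a fresh copy every time.
  toArray() {
    return this._bytes.slice();
  }

  // Debug representation in the "[ByteString 10]" style.
  toString() {
    return "[ByteString " + this.length + "]";
  }
}
```

Because the instance is frozen, <code>b.toByteString() === b</code> holds while <code>b.toArray()</code> still hands out an independent array.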


== ByteArray ==
== ByteArray ==
Line 107: Line 107:
: Use numbers in arrayOfBytes as contents.
: Use numbers in arrayOfBytes as contents.
: Throws an exception if any element is outside the range 0...255 (''TODO'').
: Throws an exception if any element is outside the range 0...255 (''TODO'').
; ByteArray(string, codec)
; ByteArray(string, charset)
: Create a ByteArray from a JavaScript string, the result being encoded with codec.
: Create a ByteArray from a JavaScript string, the result being encoded with charset.


Unlike the Array, the ByteArray is not variadic so that its initial length constructor is not ambiguous with its copy constructor.
Unlike the Array, the ByteArray is not variadic so that its initial length constructor is not ambiguous with its copy constructor.
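
The ambiguity being avoided is visible in the built-in Array constructor itself. The snippet below only demonstrates standard JavaScript behaviour; the comparison with <code>Uint8Array</code>, a standard type that postdates this proposal, is offered as an analogy and is not something the proposal refers to.

```javascript
// Array is variadic, so a single numeric argument is a special case.
const byLength = new Array(3);      // length 3, no elements set
const byElements = new Array(3, 4); // two elements: 3 and 4

// Uint8Array is not variadic: a number is always a length, and contents are
// always passed as a single array-like argument.
const zeroed = new Uint8Array(3);      // three zero bytes
const copied = new Uint8Array([3, 4]); // the bytes 3 and 4
```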
Line 122: Line 122:


* toArray() -> an array of the byte values
* toArray() -> an array of the byte values
* toArray(codec) -> an array of the code points, decoded
* toArray(charset) -> an array of the code points, decoded
* toString() -> a string representation like "[ByteArray 10]"
* toString() -> a string representation like "[ByteArray 10]"
* decodeToString(codec) -> decoded
* decodeToString(charset) -> decoded
* toByteArray() -> just a copy
* toByteArray() -> just a copy
* toByteArray(sourceCodec, targetCodec) -> transcoded
* toByteArray(sourceCodec, targetCodec) -> transcoded
Line 148: Line 148:
The String prototype will be extended with the following members:
The String prototype will be extended with the following members:


; toByteArray(codec)
; toByteArray(charset)
: Converts a string to a ByteArray encoded in codec.
: Converts a string to a ByteArray encoded in charset.
; toByteString(codec)
; toByteString(charset)
: Converts a string to a ByteString encoded in codec.
: Converts a string to a ByteString encoded in charset.
; charCodes()
; charCodes()
: Returns an array of Unicode code points (as numbers).
: Returns an array of Unicode code points (as numbers).
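
A possible reading of charCodes() is sketched below as a free function rather than a String prototype extension, to avoid modifying built-ins in an example. It assumes that "code points" means full Unicode code points rather than UTF-16 code units, so a character outside the Basic Multilingual Plane yields one number, not two.

```javascript
// Sketch of charCodes() as a standalone function (an assumption; the proposal
// places it on String.prototype). Array.from iterates a string by code point,
// so surrogate pairs are combined before codePointAt(0) is applied.
function charCodes(string) {
  return Array.from(string, (ch) => ch.codePointAt(0));
}
```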
Line 159: Line 159:
The Array prototype will be extended with the following members:
The Array prototype will be extended with the following members:


; toByteArray(codec)
; toByteArray(charset)
: Converts an array of Unicode code points to a ByteArray encoded in codec.
: Converts an array of Unicode code points to a ByteArray encoded in charset.
; toByteString(codec)
; toByteString(charset)
: Converts an array of Unicode code points to a ByteString encoded in codec.
: Converts an array of Unicode code points to a ByteString encoded in charset.
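
Where the platform already provides an encoder, the Array conversions above could defer to it instead of a pure JavaScript charset module. The sketch below assumes an environment with the standard <code>TextEncoder</code> global, which only emits UTF-8, so it stands in for a single value of the charset argument. <code>codePointsToUtf8Bytes</code> is a hypothetical name, and it returns a plain array of numbers rather than a ByteArray.

```javascript
// Hypothetical helper: code points -> UTF-8 byte values via the platform encoder.
function codePointsToUtf8Bytes(codePoints) {
  // Build a string from the code points, then let TextEncoder produce UTF-8.
  const text = String.fromCodePoint(...codePoints);
  return Array.from(new TextEncoder().encode(text));
}
```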


== General Requirements ==
== General Requirements ==