ServerJS/Binary/B
All platforms support two types for interacting with binary data: ByteArray and ByteString. The ByteArray type resembles the interface of Array in that it is mutable and extensible, and indexing returns the numeric value of the byte at the given position, or undefined. The ByteString type resembles the interface of String in that it is immutable and indexing returns a ByteString of length 1. These types are exported by the 'binary' top-level module. (The idea of using these particular two types and their respective names originated with Jason Orendorff in the Binary API Brouhaha discussion.)
Philosophy
This proposal is not an object-oriented variation on pack and unpack with notions of inherent endianness, read/write head position, or intrinsic codec information. The objects described in this proposal are merely for the storage and direct manipulation of strings and arrays of byte data. Some object-oriented conveniences are provided, but implementing pack, unpack, or an object-oriented analog thereof is left to a future proposal of a more abstract type or a 'struct' module (as mentioned by Ionut Gabriel Stan on the list). This goes against most of the prior art mentioned.
This proposal also does not provide named member functions for any particular subset of the possible codecs or digests that might operate on a byte string or array. Instead, convenience member functions are provided for interfacing with any named codec or digest module, assuming that the given module exports the specified interface. (This approach was originally supported by Robert Schultz, Davey Waterson, Ross Boucher, and tacitly myself, Kris Kowal, on the First proposition thread on the mailing list.) This proposal does not address the need for stream objects to support pipelined codecs and hash digests (mentioned by Tom Robinson and Robert Schultz in the same conversation).
This proposal also reflects both group sentiment and a pragmatic point about properties. This isn't a decree that properties like "length" should be used consistently throughout the ServerJS APIs. However, all platforms support properties at the native level (to host String and Array objects), and byte strings and arrays will require support at the native level anyway. Pursuing client-side interoperability is therefore beyond the scope of this proposal, and properties have been specified. (See comments by Kris Zyp about the implementability of properties on all platforms, comments by Davey Waterson from Aptana about the counter-productivity of attempting to support this API in browsers, and support for properties over accessor and mutator functions by Ionut Gabriel Stan and Cameron McCormack on the mailing list.)
ByteString
A ByteString is an immutable, fixed-width representation of a C unsigned char (byte) array. ByteString supports the String API, and indexing returns a byte substring of length 1.
The ByteString constructor accepts:
- ByteString()
- ByteString(byteString)
- ByteString(byteArray)
- ByteString(array)
- ByteString(string, codecModuleId)
The ByteString object has the following methods:
- encode(string, codecModuleId)
ByteString instances support the following:
- immutable length property
- toByteArray()
- toArray()
- toString(codecModuleId)
- decode(codecModuleId)
- hash(digestModuleId)
- compress(compressionModuleId)
- indexOf(Number or ByteString)
- lastIndexOf(Number or ByteString)
- charAt(offset) -> ByteString
- charCodeAt(offset) -> Number
- byteAt(offset) -> Number (same as charCodeAt)
- split(Number or ByteString) -> Array of ByteStrings
- substring(first, last) or substring(first) to the end
- substr(first, length) or substr(first) to the end
- The + operator returning new ByteStrings
- The immutable [] operator returning ByteStrings
- toSource() which would return "ByteString([])" for an empty byte string
- valueOf() returns itself
ByteString does not implement toUpperCase() or toLowerCase().
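The immutability and indexing behavior described above can be illustrated with a minimal sketch. SketchByteString is a hypothetical stand-in written in plain JavaScript, not the ByteString exported by the 'binary' module. It assumes out-of-range values are masked to 8 bits, handles only the Number form of indexOf, and uses charAt in place of the [] operator, which cannot be overloaded without native support.

```javascript
// Hypothetical sketch only: NOT the 'binary' module's ByteString.
class SketchByteString {
  constructor(bytes = []) {
    // Copy the input and mask each value to 8 bits (assumption of this sketch).
    const copy = Array.from(bytes, (b) => b & 0xff);
    // Properties defined with only a value are non-writable by default.
    Object.defineProperty(this, "_bytes", { value: Object.freeze(copy) });
    Object.defineProperty(this, "length", { value: copy.length, enumerable: true });
    Object.freeze(this);
  }
  // Returns a byte string of length 1, like indexing in the proposal.
  charAt(offset) {
    return new SketchByteString(this._bytes.slice(offset, offset + 1));
  }
  // byteAt and charCodeAt both return the numeric byte value.
  byteAt(offset) { return this._bytes[offset]; }
  charCodeAt(offset) { return this.byteAt(offset); }
  // Number form only; the proposal also allows a ByteString argument.
  indexOf(byte) { return this._bytes.indexOf(byte); }
  // Returns a fresh, mutable copy of the bytes as plain numbers.
  toArray() { return this._bytes.slice(); }
}

const s = new SketchByteString([104, 105]);
s.charAt(0).length; // 1
s.byteAt(1);        // 105
```

Freezing both the instance and its backing array is one way to approximate immutability in script; a native implementation would enforce it directly.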
ByteArray
A ByteArray is a mutable, flexible representation of a C unsigned char (byte) array.
The ByteArray constructor has the following forms:
- ByteArray()
- ByteArray(length)
- ByteArray(byteArray)
- ByteArray(byteString)
- ByteArray(array)
- ByteArray(string, codecModuleId)
Unlike Array, the ByteArray constructor is not variadic, so its initial-length form is not ambiguous with its copy forms.
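The ambiguity being avoided is visible in the built-in Array constructor. The first two lines below are standard JavaScript behavior; sketchByteArrayInit is a hypothetical helper, not part of the proposal, and it assumes the length form yields zero-filled entries.

```javascript
// Standard JavaScript: Array is variadic, so one numeric argument means
// "length" while several arguments mean "elements".
const byLength = new Array(3);      // three empty slots, length 3
const byElements = new Array(3, 4); // the elements 3 and 4, length 2

// Hypothetical sketch of a non-variadic initializer: exactly one argument,
// and its type alone selects the form.
function sketchByteArrayInit(arg) {
  if (arg === undefined) return [];
  // Length form (zero fill is an assumption of this sketch).
  if (typeof arg === "number") return new Array(arg).fill(0);
  // Copy form: any array-like or iterable of numbers, masked to 8 bits.
  return Array.from(arg, (b) => b & 0xff);
}

sketchByteArrayInit(3);   // [0, 0, 0]
sketchByteArrayInit([3]); // [3]
```

With a single argument, a one-byte array must be written as a copy of [3], so it can never be confused with a length of 3.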
The ByteArray object has the following methods:
- encode(string, codecModuleId)
ByteArray instances support the following:
- mutable length property
- extending a byte array fills the new entries with 0.
- toByteString()
- toArray()
- toString(codecModuleId)
- decode(codecModuleId) returns String
- hash(digestModuleId)
- compress(compressionModuleId)
- concat(iterable)
- join(byteString, byteArray, or Number)
- pop()
- push(…variadic Numbers…)
- shift()
- unshift(…variadic Numbers…)
- reverse() in place reversal
- slice()
- sort()
- splice()
- toSource() returns a string like "ByteArray([])" for an empty byte array.
- valueOf() returns itself
- The + operator returning new ByteArrays
- The mutable [] operator for numbers
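The mutable length property and its zero-fill rule can be sketched as follows. SketchByteArray is a hypothetical plain-JavaScript stand-in, not the ByteArray exported by the 'binary' module. It uses get and set methods in place of the [] operator, and masking values to 8 bits is an assumption of this sketch.

```javascript
// Hypothetical sketch only: NOT the 'binary' module's ByteArray.
class SketchByteArray {
  constructor(arg = 0) {
    // A number selects the length form; anything else is copied.
    this._bytes = typeof arg === "number"
      ? new Array(arg).fill(0)
      : Array.from(arg, (b) => b & 0xff);
  }
  get length() { return this._bytes.length; }
  set length(n) {
    const old = this._bytes.length;
    this._bytes.length = n;                // truncates, or grows with empty slots
    if (n > old) this._bytes.fill(0, old); // zero-fill only the new entries
  }
  get(i) { return this._bytes[i]; }
  set(i, v) { this._bytes[i] = v & 0xff; }
  // Variadic push, returning the new length as Array.prototype.push does.
  push(...values) {
    for (const v of values) this._bytes.push(v & 0xff);
    return this.length;
  }
  pop() { return this._bytes.pop(); }
  // In-place reversal.
  reverse() { this._bytes.reverse(); return this; }
  toArray() { return this._bytes.slice(); }
}

const a = new SketchByteArray([1, 2]);
a.length = 4;
a.toArray(); // [1, 2, 0, 0]
```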
String
The String prototype will be extended with the following members:
- toByteArray(codecModuleId)
- toByteString(codecModuleId)
Array
The Array prototype will be extended with the following members:
- toByteArray(codecModuleId)
- toByteString(codecModuleId)
Conventions
"codecModuleId" always defaults to "utf8". Codec modules must always export at least "encode" and "decode" methods to support byte strings and arrays. Digest modules must always export a "hash" method that accepts an Array, ByteArray, or ByteString object as its argument. Compression modules must always export a "compress" method.
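A minimal sketch of how a convenience function might dispatch to a codec module under these conventions follows. Everything named here is hypothetical: sketchModules stands in for the platform's module loader, and the signatures encode(string) returning an array of byte values and decode(bytes) returning a string are assumptions, since the proposal names the methods but does not fix their arguments. The "utf8" entry is built on the standard TextEncoder and TextDecoder globals.

```javascript
// Hypothetical stand-in for the module loader; a real platform would
// resolve codecModuleId through require().
const sketchModules = new Map([
  ["utf8", {
    // Assumed signature: string in, array of byte values out.
    encode: (string) => Array.from(new TextEncoder().encode(string)),
    // Assumed signature: byte values in, string out.
    decode: (bytes) => new TextDecoder("utf-8").decode(Uint8Array.from(bytes)),
  }],
]);

// Looks up a codec and checks that it exports the required interface.
function loadCodec(codecModuleId = "utf8") {
  const codec = sketchModules.get(codecModuleId);
  if (!codec || typeof codec.encode !== "function" || typeof codec.decode !== "function") {
    throw new Error("codec module must export encode and decode: " + codecModuleId);
  }
  return codec;
}

// Convenience wrappers; omitting codecModuleId falls back to "utf8".
function sketchEncode(string, codecModuleId) {
  return loadCodec(codecModuleId).encode(string);
}
function sketchDecode(bytes, codecModuleId) {
  return loadCodec(codecModuleId).decode(bytes);
}

sketchEncode("h\u00e9");         // [104, 195, 169]
sketchDecode([104, 195, 169]); // "h\u00e9"
```

Checking the exported interface at lookup time keeps the byte types free of codec-specific knowledge, which matches the proposal's choice not to name member functions for particular codecs.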