ServerJS/Encodings

From MozillaWiki

Revision as of 16:33, 9 April 2009

Rationale

Streams need encoding support. A low-level API should also be available for this.

There is some discussion on the mailing list (see <http://groups.google.com/group/serverjs/browse_thread/thread/6365b2a54615a134>); this page summarizes those efforts.

Encoding Names

The encoding names should be among those supported by iconv, which seem to be a superset of the names listed at http://www.iana.org/assignments/character-sets.

The following encodings are required:

  • US-ASCII
  • UTF-8
  • UTF-16
  • ISO-8859-1

Encoding names must be case-insensitive.
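
As an illustration of the case-insensitivity requirement, here is a minimal sketch of a name lookup. The helper names `sameEncoding` and `canonicalName` are hypothetical and not part of the proposal, and the list contains only the four required encodings.

```javascript
// Hypothetical helpers, for illustration only; not part of the proposed API.
const REQUIRED_ENCODINGS = ['US-ASCII', 'UTF-8', 'UTF-16', 'ISO-8859-1'];

// Compare two encoding names without regard to case.
function sameEncoding(a, b) {
  return String(a).toLowerCase() === String(b).toLowerCase();
}

// Return the listed spelling of a required encoding, or null if it is unknown.
function canonicalName(name) {
  const match = REQUIRED_ENCODINGS.find((known) => sameEncoding(known, name));
  return match === undefined ? null : match;
}
```

With this sketch, 'utf-8', 'Utf-8' and 'UTF-8' all resolve to the same entry.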

API

OK, so probably this should be a module:

 var enc = require('encodings')

Simple methods

For convenience, there should be simple methods for converting between encodings:

 string = enc.convertToString(sourceEncoding, byteStringOrArray)
 byteString = enc.convertFromString(targetEncoding, string)
 byteString = enc.convert(sourceEncoding, targetEncoding, byteStringOrArray)
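
The three methods could be approximated as follows. This is only a sketch under stated assumptions: Node.js `Buffer` stands in for ByteString and ByteArray, the `NODE_NAMES` table and the `nodeName` helper are hypothetical, only the four required encodings are handled, and UTF-16 is assumed to be little-endian without a byte order mark.

```javascript
// Sketch only: Node.js Buffer stands in for ByteString/ByteArray.
// Hypothetical table mapping the required names onto Node's encoding names.
const NODE_NAMES = new Map([
  ['us-ascii', 'ascii'],
  ['utf-8', 'utf8'],
  ['utf-16', 'utf16le'], // assumption: little-endian, no byte order mark
  ['iso-8859-1', 'latin1']
]);

// Look up an encoding name case-insensitively.
function nodeName(encodingName) {
  const name = NODE_NAMES.get(String(encodingName).toLowerCase());
  if (!name) throw new Error('Unsupported encoding: ' + encodingName);
  return name;
}

// Decode bytes in sourceEncoding into a string.
function convertToString(sourceEncoding, byteStringOrArray) {
  return Buffer.from(byteStringOrArray).toString(nodeName(sourceEncoding));
}

// Encode a string into bytes in targetEncoding.
function convertFromString(targetEncoding, string) {
  return Buffer.from(string, nodeName(targetEncoding));
}

// Transcode bytes by going through a string.
function convert(sourceEncoding, targetEncoding, byteStringOrArray) {
  const text = convertToString(sourceEncoding, byteStringOrArray);
  return convertFromString(targetEncoding, text);
}
```

Going through an intermediate string keeps the sketch short; a real implementation on top of iconv would presumably transcode directly.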

Checking for available encodings

enc.supports(encodingName)
Checks whether encodingName is supported and returns true if so, false otherwise.
enc.listEncodings([encodingCheckerFunction or regex])
encodingCheckerFunction takes the encoding name as a parameter and returns true-ish if the encoding should be listed. Regexes should also be supported. If the parameter is omitted, all supported encodings are returned.
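
A minimal sketch of both checks, assuming an implementation that knows only the four required encodings (the `SUPPORTED` list is hypothetical; a real implementation would ask iconv or a similar backend):

```javascript
// Assumption: only the four required encodings are available.
const SUPPORTED = ['US-ASCII', 'UTF-8', 'UTF-16', 'ISO-8859-1'];

// True if encodingName is supported, compared case-insensitively.
function supports(encodingName) {
  const wanted = String(encodingName).toLowerCase();
  return SUPPORTED.some((name) => name.toLowerCase() === wanted);
}

// List encodings accepted by a checker function or matched by a regex.
// Without an argument, all supported encodings are returned.
function listEncodings(filter) {
  if (filter === undefined) return SUPPORTED.slice();
  if (filter instanceof RegExp) {
    // search() always starts at index 0, so a regex with the "g" flag is safe.
    return SUPPORTED.filter((name) => name.search(filter) !== -1);
  }
  return SUPPORTED.filter((name) => Boolean(filter(name)));
}
```

For example, `listEncodings(/^utf/i)` would select the two UTF encodings, and `listEncodings(function (name) { return name.length > 5 })` would use the checker function instead.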

Class: Converter

There should also be a class, enc.Converter, for more advanced conversion.

[Constructor] Converter(from, to)
Where from and to are the encoding names.
[Method] push(byteStringOrArray)
Converts input from a ByteString or ByteArray. The results are stored in an internal buffer. Parts of byteStringOrArray that could not be converted (as can happen with multi-byte encodings) are kept in a separate buffer.
Returns nothing.
[Method] get([byteArray,] [maximumSize])
Reads maximumSize bytes, or as many bytes as are available, from the internal buffer. If byteArray is specified, the data is written into that ByteArray.
Returns a ByteString if byteArray is not specified, or byteArray itself otherwise.

Example usage:

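 var Converter = require('encodings').Converter
 var converter = new Converter('iso-8859-1', 'utf-32')
 converter.push(input) // input: a ByteString or ByteArray in ISO-8859-1
 var output = converter.get() // the converted bytes as a ByteString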
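
To make the buffering behaviour concrete, here is a rough sketch of such a class on Node.js. It is not the proposed implementation: `Buffer` stands in for ByteString and ByteArray, the encoding names are Node's own ('latin1', 'utf8', 'utf16le') rather than the names required above, and because a `Buffer` cannot grow, `get` writes at most as many bytes as fit into the buffer it is given.

```javascript
const { StringDecoder } = require('string_decoder');

// Sketch only. Buffer stands in for ByteString/ByteArray, and the encoding
// names are Node's own ('latin1', 'utf8', 'utf16le'), not the proposal's.
class Converter {
  constructor(from, to) {
    // StringDecoder holds back incomplete multi-byte sequences between calls,
    // which plays the role of the separate buffer for unconverted input.
    this.decoder = new StringDecoder(from);
    this.to = to;
    this.output = Buffer.alloc(0); // converted bytes not yet read
  }

  // Convert the input and append the result to the internal buffer.
  push(byteStringOrArray) {
    const text = this.decoder.write(Buffer.from(byteStringOrArray));
    this.output = Buffer.concat([this.output, Buffer.from(text, this.to)]);
  }

  // Read up to maximumSize bytes, optionally into an existing buffer.
  get(byteArray, maximumSize) {
    // Allow get(maximumSize) without a target buffer.
    if (typeof byteArray === 'number') {
      maximumSize = byteArray;
      byteArray = undefined;
    }
    let size = this.output.length;
    if (maximumSize !== undefined) size = Math.min(size, maximumSize);
    // Assumption: a Buffer cannot grow, so never write past its end.
    if (byteArray) size = Math.min(size, byteArray.length);
    const chunk = this.output.subarray(0, size);
    this.output = this.output.subarray(size);
    if (!byteArray) return Buffer.from(chunk);
    chunk.copy(byteArray);
    return byteArray;
  }
}
```

If a multi-byte character is split across two `push` calls, the first call contributes nothing to the output and the character appears only after the second call.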