Tamarin:String implementation: Difference between revisions

* JS_GetStringChars() returns a pointer to UTF-16 characters, and JS_GetStringBytes() returns a pointer to UTF-8 characters. Both buffers are guaranteed to live as long as the string instance lives. SM maintains a separate cache for this purpose, where string buffers are garbage-collected. Other encodings may be requested as well.


=== StUTF8String ===


This TT helper class was used to wrap a String instance (which contained UTF-8 data) into a class providing direct access to the string buffer. The new String code offers a stack-based <tt>StUTF8String</tt> that holds UTF-8 data and provides access to it. The pcre code needs this class and a second class, <tt>StIndexableUTF8String</tt>, since pcre is UTF-8 based. This leads to a performance slowdown that could be avoided if a regular expression parser were used that worked with UTF-16 data.
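
The idea of a stack-based UTF-8 holder can be sketched roughly as follows. This is not the Tamarin code: the class name <tt>Utf8View</tt>, its constructor signature, and its methods are hypothetical and chosen only for illustration, and a <tt>std::string</tt> stands in for whatever buffer management the real <tt>StUTF8String</tt> uses. The sketch assumes the source string is available as UTF-16 and shows why every use costs a full conversion pass, which is the slowdown described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a stack-based UTF-8 view of a UTF-16 string.
// The buffer is owned by the object and released when it goes out of scope.
class Utf8View {
public:
    Utf8View(const char16_t* src, std::size_t len) {
        assert(src != nullptr || len == 0);
        buf_.reserve(len);  // lower bound; non-ASCII input needs more room
        for (std::size_t i = 0; i < len; ++i) {
            std::uint32_t cp = src[i];
            if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < len &&
                src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
                // Combine a valid surrogate pair into one code point.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
                ++i;
            } else if (cp >= 0xD800 && cp <= 0xDFFF) {
                cp = 0xFFFD;  // unpaired surrogate: use the replacement character
            }
            append(cp);
        }
    }

    const char* c_str() const { return buf_.c_str(); }
    std::size_t length() const { return buf_.size(); }  // length in bytes

private:
    // Encode one code point as 1 to 4 UTF-8 bytes.
    void append(std::uint32_t cp) {
        if (cp < 0x80) {
            buf_ += static_cast<char>(cp);
        } else if (cp < 0x800) {
            buf_ += static_cast<char>(0xC0 | (cp >> 6));
            buf_ += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            buf_ += static_cast<char>(0xE0 | (cp >> 12));
            buf_ += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            buf_ += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            buf_ += static_cast<char>(0xF0 | (cp >> 18));
            buf_ += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            buf_ += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            buf_ += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }

    std::string buf_;
};
```

A caller would construct such an object on the stack immediately before handing <tt>c_str()</tt> and <tt>length()</tt> to a UTF-8 based library, and let it go out of scope afterwards. Mapping byte offsets in the UTF-8 buffer back to character indexes is a separate problem, which is presumably what the indexable variant addresses.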