Changes

Jump to: navigation, search

User:Jorend/Deterministic hash tables

1,521 bytes added, 16:19, 23 February 2012
Background
= Background =
Most hash table APIs (including C++'s <code>unordered_map</code>, Java's <code>java.util.HashMap</code>, Python's <code>dict</code>, Ruby's <code>Hash</code>, and many others) allow users to iterate over all the entries in the hash table in an arbitrary order. This exposes the user to nondeterminism in an aspect of the implementationlibrary's behavior (the iteration order) that is unspecified and indeed intentionally arbitrary.
<code>Map</code> and <code>Set</code> data structures are proposed for a future version of the ECMAScript programming language. The standard committee would like to avoid underspecification specify deterministic behavior if possible. There are several reasons for this: * There is evidence that some programmers find arbitrary iteration order surprising or confusing at first. [http://stackoverflow.com/questions/2453624/unsort-hashtable][http://stackoverflow.com/questions/1872329/storing-python-dictionary-entries-in-the-order-they-are-pushed][https://groups.google.com/group/comp.lang.python/browse_thread/thread/15f3b4a5cd6221b1/1b6621daf5d78d73][http://bytes.com/topic/c-sharp/answers/439203-hashtable-items-order][http://stackoverflow.com/questions/1419708/how-to-keep-the-order-of-elements-in-hashtable][http://stackoverflow.com/questions/7105540/hashtable-values-reordered]* Property enumeration order is unspecified in ECMAScript, yet all the major implementations have been forced to converge on insertion order, for compatibility with the Web as it is. There is, therefore, some concern that if TC39 does not specify a deterministic iteration order, “the web will just go and specify it for us”.[https://mail.mozilla.org/pipermail/es-discuss/2012-February/020576.html]* Hash table iteration order can expose some bits of object hash codes. This imposes some astonishing security concerns on the hashing function implementer. For example, an object's address must not be recoverable from the exposed bits of its hash code. (Revealing object addresses to untrusted ECMAScript code, while not exploitable by itself, would be a bad security bug on the Web.) Can a data structure retain the performance of traditional, arbitrary-order hash tables while also storing the order in which entries were added, so that iteration order is deterministic?
Tyler Close has developed a nondeterministic hash table that is structured like this (pseudocode):
638
edits

Navigation menu