User:Jorend/Deterministic hash tables: Difference between revisions
(→Method) |
|||
Line 45: | Line 45: | ||
[[Image:jorendorff-dht-figure-1.png]] | [[Image:jorendorff-dht-figure-1.png]] | ||
CloseTable is more memory-efficient than DenseTable, because the latter is really wasteful of memory. | |||
[[Image:jorendorff-dht-figure-2.png]] | [[Image:jorendorff-dht-figure-2.png]] | ||
DenseTable and OpenTable must initialize the entire table each time they resize. CloseTable also allocates large chunks of memory, but like a <code>vector</code>, it does not need to write to that memory until there is data that needs to be stored there. | |||
[[Image:jorendorff-dht-InsertSmallTest-speed.png]] | [[Image:jorendorff-dht-InsertSmallTest-speed.png]] | ||
This measures the time required to fill a table to about 100 entries, repeatedly. | |||
[[Image:jorendorff-dht-InsertLargeTest-speed.png]] | [[Image:jorendorff-dht-InsertLargeTest-speed.png]] | ||
This test measures the time required to fill a single gigantic table. | |||
The jagged shape of this graph is robust. It reflects the fact that the table doubles in size as it grows, and rehashing is a significant part of the expense of populating a huge table. | |||
[[Image:jorendorff-dht-LookupHitTest-speed.png]] | [[Image:jorendorff-dht-LookupHitTest-speed.png]] | ||
[[Image:jorendorff-dht-LookupMissTest-speed.png]] | [[Image:jorendorff-dht-LookupMissTest-speed.png]] | ||
In an OpenTable, lookups that miss are slower than lookups that hit ''when there is at least one collision''. This is because we keep probing the hash table until we find an empty entry. | |||
DenseTable is only slightly slower for misses, perhaps because more lookups have zero collisions. (?) |
Revision as of 01:38, 23 February 2012
Abstract
A deterministic hash table proposed by Tyler Close was implemented in C++ and its performance was compared to two hash table implementations using open addressing. Speed and memory usage were measured.
Speed. The Close table implementation was very fast, faster than the open addressing implementations. It is unclear why; theory suggests it "should" be slower, and measurement confirms that the Close table is doing more memory accesses and more branches. More investigation is needed.
Memory. The Close table implementation allocates 29% more memory on average than the leanest open addressing implementation on 32-bit systems, 37% more on 64-bit systems. The implementation can probably trade speed for compactness, but again, more investigation is needed.
Background
Most hash table APIs (including C++'s unordered_map
, Java's java.util.HashMap
, Python's dict
, Ruby's Hash
, and many others) allow users to iterate over all the entries in the hash table in an arbitrary order. This exposes the user to nondeterminism in the implementation.
Map
and Set
data structures are proposed for a future version of the ECMAScript programming language. The standard committee would like to avoid underspecification if possible. Can a data structure retain the performance of traditional, arbitrary-order hash tables while also storing the order in which entries were added, so that iteration is deterministic?
...
Method
The code I used to make these pictures is available at: https://github.com/jorendorff/dht
The project contains two complete hash map implementations: OpenTable and CloseTable. A third implementation, DenseTable, is a thin wrapper around the dense_hash_map
type from Sparsehash. The three classes have the same API and were all benchmarked using the same templates (in hashbench.cpp).
Hash table implementation design notes:
- The API was designed to match the proposal for ECMAScript Map objects as of February 2012.
- The Key and Value types are
uint64_t
because ECMAScript values are 64 bits in the implementation I'm most familiar with.
- OpenTable and CloseTable are meant to be as fast and as memory-efficient as possible. Pretty much everything that can be omitted was omitted. For example, the hashing function is trivial: hash(key) = key, and neither OpenTable nor CloseTable further munges the hashcode before using it as a table index. Rationale: Making each implementation as fast as possible should highlight any performance difference between OpenTable and CloseTable, which is the purpose of the exercise. Using a more sophisticated hashing function would slow down both implementations, reducing the observed difference between the two techniques.
- DenseTable is provided as a baseline. (It's nice to have some realistic numbers in the graphs too.)
- dense_hash_map and OpenTable both implement straightforward hash tables with open addressing. The main difference between the two is one of tuning. dense_hash_map has a maximum load factor of 0.5. OpenTable has a maximum load factor of 0.75, which causes it to use about half as much memory most of the time.
- The purpose of the typedefs KeyArg and ValueArg is to make it possible to switch the API from pass-by-value to pass-by-reference by editing just a couple of lines of code. (I tried this. Pass-by-reference is no faster on 64-bit machines.)
Benchmark design notes:
- There would be more tests but it takes a long time to write them. Feel free to send me pull requests.
- The program runs each benchmark many different times in order to produce enough numbers that noise is visually obvious in the resulting graph. (Most of the speed graphs are nice and smooth.)
Results
CloseTable is more memory-efficient than DenseTable, because the latter is really wasteful of memory.
DenseTable and OpenTable must initialize the entire table each time they resize. CloseTable also allocates large chunks of memory, but like a vector
, it does not need to write to that memory until there is data that needs to be stored there.
This measures the time required to fill a table to about 100 entries, repeatedly.
This test measures the time required to fill a single gigantic table.
The jagged shape of this graph is robust. It reflects the fact that the table doubles in size as it grows, and rehashing is a significant part of the expense of populating a huge table.
In an OpenTable, lookups that miss are slower than lookups that hit when there is at least one collision. This is because we keep probing the hash table until we find an empty entry.
DenseTable is only slightly slower for misses, perhaps because more lookups have zero collisions. (?)