User:Jorend/Deterministic hash tables: Difference between revisions

Jump to navigation Jump to search
Line 18: Line 18:
= Method =
= Method =


The code I used to make these pictures is available at: https://github.com/jorendorff/dht


The project contains two complete hash map implementations: OpenTable and CloseTable. A third implementation, DenseTable, is a thin wrapper around the <code>dense_hash_map</code> type from [https://code.google.com/p/sparsehash/ Sparsehash]. The three classes have the same API and were all benchmarked using the same templates (in hashbench.cpp).


Source code: https://github.com/jorendorff/dht
Design notes:
 
* The API was designed to match the [http://wiki.ecmascript.org/doku.php?id=harmony:simple_maps_and_sets proposal for ECMAScript Map objects] as of February 2012.
 
* The Key and Value types are <code>uint64_t</code> because ECMAScript values are 64 bits in the implementation I'm most familiar with.
 
* OpenTable and CloseTable are meant to be as fast and as memory-efficient as possible. Pretty much everything that can be omitted was omitted. For example, the hashing function is trivial: ''hash(key) = key'', and neither OpenTable nor CloseTable further munges the hashcode before using it as a table index. Rationale: Making each implementation as fast as possible should highlight any performance difference between OpenTable and CloseTable, which is the purpose of the exercise. Using a more sophisticated hashing function would slow down both implementations, reducing the observed difference between the two techniques.
 
* DenseTable is provided as a baseline. (It's nice to have some realistic numbers in the graphs too.)
 
* dense_hash_map and OpenTable both implement straightforward hash tables with open addressing. The main difference between the two is one of tuning. dense_hash_map has a maximum load factor of 0.5. OpenTable has a maximum load factor of 0.75, which causes it to use about half as much memory most of the time.
 
* The purpose of the typedefs KeyArg and ValueArg is to make it possible to switch the API from pass-by-value to pass-by-reference by editing just a couple of lines of code. (I tried this. Pass-by-reference is no faster on 64-bit machines.)


= Results =
= Results =
638

edits

Navigation menu