Changes

Jump to: navigation, search

User:Jorend/Deterministic hash tables

1,164 bytes added, 21:07, 27 February 2012
Theory
'''Speed.''' We consider the speed of lookups only. The <code>get</code>, <code>has</code>, <code>set</code>, and <code>delete</code> operations all start with a lookup. We will quantify the speed of lookups in terms of how many memory locations must be accessed. To simulate locality effects, we treat multiple reads from a single entry, or from the table object, as a single read.
 
<table class="data">
<tr>
<th colspan=3>Number of collisions</th>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
</tr>
<tr>
<th rowspan=4>Number of reads</th>
<th rowspan=2>Unsuccessful lookup (miss)</th>
<th><code>OpenTable</code></th>
<td>2</td>
<td>3</td>
<td>4</td>
</tr>
<tr>
<th><code>CloseTable</code></th>
<td>2</td>
<td>3</td>
<td>4</td>
</tr>
<tr>
<th rowspan=2>Successful lookup (hit)</th>
<th><code>OpenTable</code></th>
<td>2</td>
<td>2 to 3</td>
<td>2 to 4</td>
</tr>
<tr>
<th><code>CloseTable</code></th>
<td>3</td>
<td>3 to 4</td>
<td>3 to 5</td>
</tr>
</table>
In an open addressing hash table with fill factor ''f'', an unsuccessful lookup performs about ''1/(1-f)'' probes.[https://en.wikipedia.org/wiki/Hash_table#Performance_analysis] In an <code>OpenTable</code>, at worst ''f=F=3/4'' and so an unsuccessful lookup performs 4 probes on average. In the usual case, ''f=3F/4=9/16'', so about 2.3 probes on average. A successful lookup performs half as many probes on average. Before probing the table, the lookup must access the <code>OpenTable</code> struct itself, so the final numbers are as shown in the table: 5 reads, 3.3 reads, etc.
A Close table uses direct chaining. Each lookup must read first the <code>CloseTable</code> object itself, and next the <code>hashTable</code>. Then, for an unsuccessful lookup, the whole chain is walked; the expected length of the chain is simply the number of keys in the table divided by the number of hash buckets. For a <code>CloseTable</code>, this quotient is at most 8/3 and typically 2. A successful lookup would perform about half as many probes, if keys were distributed evenly, but instead they are distributed randomly, making successful lookups slower; by simulation, I found the expected number of probes is 2..3 under full load and 2 under typical load.
<table class="data">
<td>4.7
<td>4
<td>-4.3 <td>-4
</tr>
</table>
638
edits

Navigation menu