Jump to content

Primary clustering

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 129.97.22.123 (talk) at 18:26, 31 July 2005 (Corrected grammar.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Primary clustering is the tendency for certain collision resolution schemes to create long sequences of filled slots in a hash table of keys.

For example, linear probing has a hash function h(K). N is the number of slots in the hash table. If the position calculated by h(K) is already filled, then the position h(K) + 1 (mod N) is tried. If that position is full, h(K) + 2 (mod N) is tried, and so on. This has a tendency in the long run to create long chains of filled entries. The downside is that looking up an entry can take nearly linear time if there is a lot of clustering.

Note: clusters of filled slots can grow exponentially in some cases. For example, imagine two large clusters of filled slots separated by one empty slot. Placing an entry into this slot will double the cluster size.