Universal hash function

In computer science, a universal hash function is a theoretical construct primarily used to show that an algorithm based on a hash function cannot be forced to have bad performance by an adversary. Bad performance in hashing comes from collisions, and a universal hash function guarantees that these cannot be forced to occur too often.

The formal definition involves a set of keys, K, a set of values, V, and a family of hash functions, H, mapping keys to values. Let |V| denote size of V, the number of possible values. Then for all pairs of distinct keys x and y in K, H is a 2-universal family of hash functions if

\Pr _{h\in H}[h(x)=h(y)]\leq {\frac {1}{|V|}}.

More stringently, H is a strongly 2-universal family if for all pairs of distinct keys x and y in K, and for all pairs x′ and y′ in V,

\Pr _{h\in H}[h(x)=x'{\mbox{ and }}h(y)=y']={\frac {1}{|V|^{2}}}.

Without qualification, the latter definition is probably intended.

Example

Let the keys, K, be {0,1,…,p−1} for some prime, p, and let the values, V, be {0,1,…n−1}. Define a 2-parameter family of functions, h_a,b, as

h_{a,b}(x)=((ax+b)\mod p)\mod n,\,\!

H=\{h_{a,b}:a\in \{1,\ldots ,p-1\},b\in \{0,\ldots ,p-1\}\}.\,\!

References

{{cite book

  | last = Knuth
  | first = Donald Ervin
  | authorlink = Donald Ervin Knuth
  | title = [[The Art of Computer Programming], Vol. III: Sorting and Searching]
  | edition = 2e
  | year = 1998
  | publisher = Addison-Wesley
  | location = Reading, Mass ; London
  | notes = v. 1. Fundamental algorithms -- v. 2. Seminumerical algorithms -- v. 3. Sorting and searching.
  | id = ISBN 0-201-89685-0
 }}