Talk:Extendible hashing
Appearance
![]() | Computing Unassessed | |||||||||
|
Implementations
I just found that gdbm implements this algorithm. Might be of interest for students. --phil (talk) 20:51, 26 June 2009 (UTC)
Ronald Fagin has his own page here on Wikipedia, how to link it? I ? Unicode (talk) 12:40, 10 September 2009 (UTC)
Sample Code
I think i found a bug in the implementation: What happens if during a split the items of the bucket can't be redistributed (i.e. all items stay in one bucket) because one additional bit is not enough to distinguish between the hashvalues of the key.
Example
I try to add subsequently keys whose hashvalues are (I use the most significant bits). The capacity of my buckets are 1. 100 110 in the beginning everything is empty gd=0 [---] ---> [---] d=0 Adding 100 gd=0 [---] ---> [100] d=0 Adding 110 [---] ---> [100] + 110 d=0 bucket is full => initiate split result of split: [100] d=1 [---] d=1 split according to most significant bit => *boom* all key-values stay in one bucket the right thing to do would be doubling the directory more than once or more algorithmically speaking: double size of directory until at least one value of the bucket to split is moved into the new bucket situation after *one* split and *one* doubling of directory gd=1 [0--] --- > [---] d=1 [1--] --- > [100] + 110 d=1 => double directory and split *again* result of split [100] d=2 [110] d=2 gd=2 [00-] ---> [---] d=1 [01-] -------^ [10-] ---> [100] d=2 [11-] ---> [110] d=2
I'd suggest the following fix. Which is not optimal 'cause it depends on recursion :)
void put(K k, V v) {
Page<K, V> p = getPage(k);
// ensure invariant (p.d < gd)
if (p.full() && p.d == gd) {
List<Page<K, V>> pp2 = new ArrayList<EH2.Page<K, V>>(pp);
pp.addAll(pp2);
++gd;
}
if (p.full()) {
// redistribute values into new page
Page<K, V> p1, p2;
p1 = new Page<K, V>();
p2 = new Page<K, V>();
for (Map.Entry<K, V> entry : p.m.entrySet()) {
K k2 = entry.getKey();
V v2 = entry.getValue();
int h = k2.hashCode() & ((1 << gd) - 1);
if (((h >> p.d) & 1) == 1) {
p2.put(k2, v2);
} else {
p1.put(k2, v2);
}
}
// update directory
for (int i = 0; i < pp.size(); ++i) {
if (pp.get(i) == p) {
if (((i >> p.d) & 1) == 1) {
pp.set(i, p2);
} else {
pp.set(i, p1);
}
}
}
// try to add new key into correct page
int h = k.hashCode();
if (((h >> p.d) & 1) == 1) {
if (!p2.full()) {
p2.put(k, v);
}
} else {
if (!p1.full()) {
p1.put(k, v);
}
}
p1.d = p2.d = p.d + 1;
// if all values went to one page we have to do everything again
if (p1.empty() || p2.empty())
this.put(k, v);
} else {
p.put(k, v);
}
}