Distance oracle

In computing, a distance oracle is a data structure for calculating distances between vertices in a graph.

Introduction

Let G(V,E) be an undirected, weighted graph, with n=|V| nodes and m=|E| edges. We would like to answer queries of the form "what is the distance between the nodes s and t?".

One way to do this is simply to run Dijkstra's algorithm from s. With a Fibonacci-heap priority queue this takes $O(m+n\log n)$ time per query, and requires no extra space (besides the graph itself).
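As an illustration of this baseline, here is a minimal Python sketch of a single-pair query with Dijkstra's algorithm. It assumes the graph is given as an adjacency dictionary (a hypothetical representation, not one prescribed by the text) and uses the standard-library binary heap, so it runs in $O((m+n)\log n)$ rather than the Fibonacci-heap bound quoted above.

```python
import heapq

def dijkstra_distance(graph, s, t):
    """Return the shortest-path distance from s to t, or inf if unreachable.

    `graph` maps each node to a list of (neighbor, weight) pairs with
    non-negative weights.
    """
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            return d  # the first time t is popped, d is final
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

# Hypothetical example graph (undirected, so each edge is listed twice).
example_graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("a", 1), ("c", 2)],
    "c": [("a", 4), ("b", 2), ("d", 5)],
    "d": [("c", 5)],
}
```

Every call repeats the full search, which is what the pre-processing approaches below try to avoid.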

In order to answer many queries more efficiently, we can spend some time pre-processing the graph and building an auxiliary data structure.

A simple data structure that achieves this goal is a matrix which specifies, for each pair of nodes, the distance between them. This structure allows us to answer queries in constant time $O(1)$ , but requires $O(n^{2})$ extra space. It can be initialized in time $O(n^{3})$ using an all-pairs shortest paths algorithm, such as the Floyd–Warshall algorithm.
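The other extreme can be sketched just as briefly. The following Python fragment fills the full distance matrix with the Floyd–Warshall algorithm; the edge-list input format and the function name are assumptions made for this example.

```python
def all_pairs_distances(n, edges):
    """Return an n x n matrix of shortest-path distances.

    `edges` is a list of (u, v, weight) triples over nodes 0..n-1,
    interpreted as undirected edges.
    """
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Allow each node in turn to serve as an intermediate stop.
    for mid in range(n):
        for i in range(n):
            for j in range(n):
                through = dist[i][mid] + dist[mid][j]
                if through < dist[i][j]:
                    dist[i][j] = through
    return dist

# Hypothetical example: the same four-node graph, with nodes numbered 0..3.
example_edges = [(0, 1, 1), (0, 2, 4), (1, 2, 2), (2, 3, 5)]
matrix = all_pairs_distances(4, example_edges)
```

After this $O(n^{3})$ initialization, a query is a single lookup such as `matrix[0][3]`.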

A distance oracle lies between these two extremes. It uses less than $O(n^{2})$ space in order to answer queries in less than $O(m+n\log n)$ time. Most distance oracles have to compromise on accuracy, i.e. they do not return the exact distance but rather a constant-factor approximation of it.

Approximate distance oracle

Thorup and Zwick^[1] describe more than 10 different distance oracles. They then suggest a new distance oracle that, for every integer $k\geq 1$, requires space $O(kn^{1+1/k})$, such that any subsequent distance query can be approximately answered in time $O(k)$. The approximate distance returned has stretch at most $2k-1$, that is, the ratio of the estimated distance to the actual distance lies between 1 and $2k-1$. The initialization time is $O(kmn^{1/k})$.

Some special cases include:

For $k=1$ we get the simple distance matrix.
For $k=2$ we get a structure using $O(n^{1.5})$ space which answers each query in constant time and approximation factor at most 3.
For $k=\lfloor \log n\rfloor$, we get a structure using $O(n\log n)$ space, query time $O(\log n)$, and stretch $O(\log n)$.

Higher values of k do not improve the space or preprocessing time.
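A short calculation shows why. It takes the logarithm to be base 2 and ignores the rounding of k, both of which are assumptions of this sketch, since the text leaves them unspecified:

$$
n^{1/\log_{2}n}=2^{\log_{2}n/\log_{2}n}=2,
\qquad\text{so}\qquad
kn^{1+1/k}\Big|_{k=\log_{2}n}=2n\log_{2}n=O(n\log n).
$$

For $k>\log_{2}n$ the factor $n^{1/k}$ stays between 1 and 2 while the factor k keeps growing, so $kn^{1+1/k}\geq kn>n\log_{2}n$ and the space bound only gets worse.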

Oracle for general metric spaces

The oracle is built from a decreasing (nested) collection of k+1 sets of vertices, $A_{0}\supseteq A_{1}\supseteq \cdots \supseteq A_{k}$:

$A_{0}=V$
For every $i=1,...,k-1$ : $A_{i}$ contains each element of $A_{i-1}$ , independently, with probability $n^{-1/k}$ . Note that the expected size of $A_{i}$ is $n^{1-i/k}$ . The elements of $A_{i}$ are called i-centers.
$A_{k}=\emptyset$
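The sampling step can be sketched in Python as follows. The function name and the list-of-lists return format are hypothetical. The retry when $A_{k-1}$ comes out empty is an addition of this sketch that the description above does not mention; without it, some node could have no (k-1)-center at all.

```python
import random

def sample_center_sets(vertices, k, rng=None):
    """Return [A_0, A_1, ..., A_k] with A_0 = vertices and A_k = []."""
    if not vertices or k < 1:
        raise ValueError("need at least one vertex and k >= 1")
    rng = rng or random.Random()
    keep_probability = len(vertices) ** (-1.0 / k)
    while True:
        levels = [list(vertices)]
        for _ in range(1, k):
            # Each element of A_{i-1} survives independently into A_i.
            levels.append([v for v in levels[-1]
                           if rng.random() < keep_probability])
        if levels[-1]:
            break  # assumption of this sketch: resample if A_{k-1} is empty
    levels.append([])  # A_k is the empty set
    return levels
```

For example, `sample_center_sets(list(range(50)), 3)` returns four nested lists, the first holding all 50 vertices and the last empty.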

For every node v, we calculate its distance from each of these sets:

For every $i=0,...,k-1$: $d(A_{i},v)=\min\{d(w,v)\mid w\in A_{i}\}$ and $p_{i}(v)=\arg\min\{d(w,v)\mid w\in A_{i}\}$. I.e., $p_{i}(v)$ is the i-center nearest to v, and $d(A_{i},v)$ is the distance between them. Note that for a fixed v, this distance is weakly increasing with i. Also note that for every v, $d(A_{0},v)=0$ and $p_{0}(v)=v$.
$d(A_{k},v)=\infty$ .
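For a small finite metric these quantities can be computed by brute force. The sketch below assumes the metric is supplied as a Python function `dist(a, b)` and that the sets $A_{0},\ldots ,A_{k}$ are given as a list of lists; both are illustrative choices rather than part of the original construction, which works on graphs more efficiently.

```python
def nearest_centers(vertices, levels, dist):
    """Return (p, d) where p[v][i] = p_i(v) and d[v][i] = d(A_i, v), i = 0..k.

    `levels` is [A_0, ..., A_k]; every A_i with i < k must be non-empty.
    """
    k = len(levels) - 1
    p, d = {}, {}
    for v in vertices:
        p[v], d[v] = [], []
        for i in range(k):
            center = min(levels[i], key=lambda w: dist(w, v))
            p[v].append(center)
            d[v].append(dist(center, v))
        p[v].append(None)          # A_k is empty, so there is no k-center
        d[v].append(float("inf"))  # d(A_k, v) = infinity
    return p, d

# Hypothetical example: six points on a line, with hand-picked levels.
points = [0, 1, 2, 3, 4, 5]
example_levels = [points, [1, 4], []]
p, d = nearest_centers(points, example_levels, lambda a, b: abs(a - b))
```

In this example the node 3 has `p[3] == [3, 4, None]` and `d[3] == [0, 1, inf]`, matching the observations that $p_{0}(v)=v$ and that the distances are weakly increasing with i.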

For every v, compute its bunch:

$B(v)=\cup _{i=0}^{k-1}\{w\in A_{i}\setminus A_{i+1}|d(w,v)<d(A_{i+1},v)\}$

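For each level i, the bunch of v contains all vertices of $A_{i}\setminus A_{i+1}$ that are strictly closer to v than every vertex of $A_{i+1}$. In particular, since $d(A_{k},v)=\infty$, every vertex of $A_{k-1}$ belongs to every bunch. It is possible to show that the expected size of $B(v)$ is at most $kn^{1/k}$.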

For every bunch $B(v)$, construct a hash table that holds, for every $w\in B(v)$, the distance $d(w,v)$.

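The total size of the data structure is $O(kn+\sum _{v\in V}|B(v)|)=O(kn+nkn^{1/k})=O(kn^{1+1/k})$ in expectation, where the $O(kn)$ term accounts for storing $p_{i}(v)$ and $d(A_{i},v)$ for every node v and every level i.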
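The bunch definition translates almost literally into code. The following brute-force Python sketch uses a plain `dict` as the hash table for each bunch; as before, the `dist` function and the list-of-lists `levels` argument are assumptions of the example.

```python
def build_bunches(vertices, levels, dist):
    """Return {v: {w: d(w, v) for w in B(v)}} for levels = [A_0, ..., A_k]."""
    k = len(levels) - 1
    level_sets = [set(level) for level in levels]
    bunches = {}
    for v in vertices:
        table = {}
        for i in range(k):
            # d(A_{i+1}, v); infinity when A_{i+1} is empty (i + 1 == k).
            bound = min((dist(w, v) for w in levels[i + 1]),
                        default=float("inf"))
            for w in levels[i]:
                if w not in level_sets[i + 1] and dist(w, v) < bound:
                    table[w] = dist(w, v)
        bunches[v] = table
    return bunches

# Hypothetical example: six points on a line, with hand-picked levels.
points = [0, 1, 2, 3, 4, 5]
example_levels = [points, [1, 4], []]
bunches = build_bunches(points, example_levels, lambda a, b: abs(a - b))
```

Here `bunches[3]` is `{3: 0, 1: 2, 4: 1}`: the node itself from level 0, plus both 1-centers, since every vertex of $A_{k-1}$ is closer than the empty set $A_{k}$.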

Having this structure initialized, the following algorithm finds the distance between two nodes, u and v:

$w:=u,i:=0$
while $w\notin B(v)$:
- $i:=i+1$
- $(u,v):=(v,u)$
- $w:=p_{i}(u)$
return $d(w,u)+d(w,v)$
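To tie the pieces together, here is a compact, self-contained Python sketch of the whole oracle for a small finite metric, written under stated assumptions rather than as a reference implementation. The metric is passed as a function `dist`, preprocessing is brute force (so it does not achieve the initialization time quoted earlier), and the sampling is retried if $A_{k-1}$ is empty. The query reads $d(w,u)$ from the stored value $d(A_{i},u)$, which equals it because $w=p_{i}(u)$.

```python
import random

def nearest(centers, v, dist):
    """Return (p, d(p, v)) for the element p of `centers` nearest to v."""
    c = min(centers, key=lambda w: dist(w, v))
    return c, dist(c, v)

def build_oracle(vertices, dist, k, rng=None):
    """Return (near, bunches); near[v][i] = (p_i(v), d(A_i, v)), i < k."""
    rng = rng or random.Random()
    prob = len(vertices) ** (-1.0 / k)
    while True:
        levels = [list(vertices)]
        for _ in range(1, k):
            levels.append([v for v in levels[-1] if rng.random() < prob])
        if levels[-1]:
            break  # assumption of this sketch: resample if A_{k-1} is empty
    levels.append([])  # A_k
    level_sets = [set(level) for level in levels]
    near, bunches = {}, {}
    for v in vertices:
        near[v] = [nearest(levels[i], v, dist) for i in range(k)]
        bunches[v] = {}
        for i in range(k):
            # d(A_{i+1}, v), which is infinity for the empty set A_k.
            bound = near[v][i + 1][1] if i + 1 < k else float("inf")
            for w in levels[i]:
                if w not in level_sets[i + 1] and dist(w, v) < bound:
                    bunches[v][w] = dist(w, v)
    return near, bunches

def query(near, bunches, u, v):
    """Return an estimate of d(u, v) with stretch at most 2k - 1."""
    w, i = u, 0
    while w not in bunches[v]:
        i += 1
        u, v = v, u          # swap the roles of the two endpoints
        w = near[u][i][0]    # w := p_i(u)
    return near[u][i][1] + bunches[v][w]  # d(w, u) + d(w, v)

# Hypothetical example: 30 points on a line with k = 2 (stretch at most 3).
points = list(range(30))
line_dist = lambda a, b: abs(a - b)
near, bunches = build_oracle(points, line_dist, k=2, rng=random.Random(1))
estimate = query(near, bunches, 3, 17)
```

The loop always terminates by level k-1, because every vertex of $A_{k-1}$ lies in every bunch. Whatever sets the random sampling produces, `estimate` lies between the true distance 14 and three times that value.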

Improvements

Their result was improved by Pătrașcu and Roditty,^[2] who suggest a distance oracle of size $O(n^{4/3}m^{1/3})$ which returns a factor 2 approximation.

References

[1] Thorup, M.; Zwick, U. (2005). "Approximate distance oracles". Journal of the ACM. doi:10.1145/1044731.1044732.

[2] Pătrașcu, M.; Roditty, L. (2010). "Distance Oracles beyond the Thorup–Zwick Bound". 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS). doi:10.1109/FOCS.2010.83.