Geometry of binary search trees

In computer science, a geometric view of binary search trees (Geometry of BSTs) is a novel approach to the study of binary search tree algorithms, representing them as a set of planar points satisfying a certain property.^[1] In particular, an access sequence x₁, . . . , x_m (sequence of searches you perform on a binary search tree (BST) with a key set {1,2,...,n}) is mapped to the set of points {(x_i, i)}, where X-axis represents key space and Y-axis represents time; to which a set of touched nodes is added. By touched nodes we mean the following. Consider a BST access algorithm with a single pointer to a node in the tree. At the beginning of an access to a given key x_i, this pointer is initialized to the root of the tree. Whenever the pointer moves to or is initialized to a node, we say that the node is touched.^[2] We represent a BST algorithm for a given input sequence by drawing a point for each item that gets touched.

Example

Assume the following BST on 4 nodes is given: The key set is {1, 2, 3, 4}.

Let 3, 1, 4, 2 be the access sequence.

In the first access, only the node 3 is touched.
In the second access, the nodes 3 and 1 are touched.
In the third access - 3 and 4 are touched.
In the fourth access, touch 3, then 1, and after that 2.

The touches are represented geometrically: If an item x is touched at time t, then a point (x, t) is plotted.

BST Model

We consider BST data structures supporting only searches on a set of keys {1, 2, . . . , n}. Moreover, we consider only successful searches, which we call accesses. The input to the data structure is thus a sequence X, called the access sequence, of keys x₁, x₂, . . . , x_m chosen from the set of keys.

The BST access algorithm has a single pointer to a node in the BST. At the beginning of an access to a given key x_i, this pointer is initialized to the root of the tree. The algorithm may then perform any sequence of the following unit-cost operations such that the node containing x_i is eventually the target of the pointer.

Move the pointer to its left child.
Move the pointer to its right child.
Move the pointer to its parent.
Perform a single rotation on the pointer and its parent.

Whenever the pointer moves to or is initialized to a node, we say that the node is touched. The time taken by a BST to execute a sequence X of accesses to keys x₁, x₂, . . . , x_m is the number of unit-cost operations performed, which is at least the number of nodes it touches, which in turn is at least m.

Arborally Satisfied Points Sets

A point set is said to be arborally satisfied if the following property holds: for any pair of points that do not both lie on the same horizontal or vertical line, there exists a third point which lie in the rectangle spanned by the first two points (either inside or on the boundary).

Theorem

A point set containing the points (x_i, i) is arborally satisfied if and only if it corresponds to a valid BST for the input sequence x₁, x₂, . . . , x_m.

Proof

First, we prove that the point set for any valid BST algorithm is arborally satisfied. Consider points (x, i) and (y, j), where x is touched at time i and y is touched at time j. Assume by symmetry that x < y and i < j. We need to show that there exists a third point in the rectangle with corners as (x, i) and (y, j). Also let LCA_t(a, b) denote the lowest common ancestor of nodes a and b right before time t. We have a few cases:

If LCA_i(x, y) ≠ x, then we can use the point (LCA_i(x, y), i), since LCA_i(x, y) must have been touched if x was.
If LCA_j(x, y) ≠ y, then we can use the point (LCA_j(x, y), j).
If neither of the above two cases hold, then we must have x be an ancestor of y right before

time i and y be an ancestor of x right before time j. Then at some time k (i ≤ k < j), y must have been rotated above x, so we can use the point (y, k).

Next, we show the other direction: given an arborally satisfied point set, we can construct a valid BST corresponding to that point set. Now we will organize our BST into a treap which is organized in heap-order by next-touch-time. Note that next-touch-time has ties and is thus not uniquely defined, but this isn’t a problem as long as we pick a way to break ties. When we reach time i, the nodes touched form a connected subtree at the top, by the heap ordering property. We can now take this subtree, assign new next-touch-times and rearrange into a new local treap. Now, if a pair of nodes, x and y, straddle the boundary between the touched and untouched part of the treap, then if y is to be touched sooner than x then (x, now) → (y, next − touch(y)) is an unsatisfied rectangle because the leftmost such point would be the right child of x, not y.

Corollary

Finding the best BST execution for the input sequence x₁, x₂, . . . , x_m is equivalent to finding the minimum cardinality superset of points (that contains the input in geometric representation) that is arborally satisfied.

Greedy Algorithm

The following greedy algorithm constructs arborally satisfiable sets:

Sweep the point set with a horizontal line by increasing y coordinate.
At time i, place the minimal number of points at y = i to make the point set up to y ≤ i arborally satisfied. This minimal set of points is uniquely defined: for any unsatisfied rectangle formed with(x_i, i) in one corner, add the other corner at y = i.

The algorithm is optimal.^[3]

Reverse Transformation

Given geometric view that satisfies the necessary condition of being arborally satisfied, we can reconstruct the exact shape of the tree.

References

^ Demaine, Erik D.; Harmon, Dion; Iacono, John; Kane, Daniel; Pătraşcu, Mihai (2009), "The Geometry of Binary Search Trees" (PDF), in Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2009), New York: 496–505{{citation}}: CS1 maint: extra punctuation (link)
^ Demaine, Erik D.; Harmon, Dion; Iacono, John; Pătraşcu, Mihai (2007), "DYNAMIC OPTIMALITY—ALMOST" (PDF), SIAM Journal on Computing, 37 (1): 240–251{{citation}}: CS1 maint: extra punctuation (link)
^ Fox, Kyle (2011), "Upper Bounds for Maximally Greedy Binary Search Trees" (PDF), WADS'11 Proceedings of the 12th international conference on Algorithms and data structures,: 411–422{{citation}}: CS1 maint: extra punctuation (link)

[1] Demaine, Erik D.; Harmon, Dion; Iacono, John; Kane, Daniel; Pătraşcu, Mihai (2009), "The Geometry of Binary Search Trees" (PDF), in Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2009), New York: 496–505{{citation}}: CS1 maint: extra punctuation (link)

[2] Demaine, Erik D.; Harmon, Dion; Iacono, John; Pătraşcu, Mihai (2007), "DYNAMIC OPTIMALITY—ALMOST" (PDF), SIAM Journal on Computing, 37 (1): 240–251{{citation}}: CS1 maint: extra punctuation (link)

[3] Fox, Kyle (2011), "Upper Bounds for Maximally Greedy Binary Search Trees" (PDF), WADS'11 Proceedings of the 12th international conference on Algorithms and data structures,: 411–422{{citation}}: CS1 maint: extra punctuation (link)

[1]

[2]

[3]