PH-tree

PH-tree
PH-tree
Type	tree
Invented	2014
Invented by	Tilmann Zäschke
Operation
Time complexity in big O notation
Operation	Average
Search	O(logMn)
Insert
Space complexity

The PH-tree is a is a tree data structure used for spatial access methods, i.e. for indexing multi-dimensional information such as geographical coordinates, points, feature vectors(), rectangles or [bounding boxes] of arbitray shapes. The structure of a PH-tree is similar to that of a quadtree or octree. However, it uses a different splitting policy that is based on the bit-representation of coordinates/keys. (?) The splitting strategy allows it to scale well with dimensionality $k$/$d$ and size $n$ while providing fast insert/update/remove operations. The bit-level splitting policy also imposes a maximum depth (avoiding degenerated trees). There is no concept of rebalancing.

PH-tree stands for Prefix Hypercube tree. PH-Tree originally stood for PATRICIA Hypercube tree. However the reference to PATRICIA tries (Radix tree) is misleading because it is often understood to be a person's name and because PATRICIA tries actually store characters rather than numbers.

Description

The basic PH-tree can only index $k$/$d$ dimensional vectors or point with integer values. However, it is easy to extend the PH-tree to index floating point coordinates and/or $k$/$d$ dimensional box. Basic Structure A $k$/$d$-dimensional PH-tree is a tree of nodes where each node contains structured in quadrants . The structure partitions space by recursively subdividing it into 2^kd quadrants (see below[] how this scales with high dimensionality). Each quadrant contains at most one entry, either a user vector-value-pair (leaf quadrant) or sub-node. Let’s assume we want to store a key/value pair K/V where K has two dimensions, e.g. (x0, x1). The splitting is determined by the bit-representation of K.

Floating point PH-tree

Region PH-tree

Scalability

Dimensions

In high dimensions with view entries, the PH-Tree may have only a single node, i.e. it “degenerates” into a B-Tree[] with Morton order. Entries

Uses

The fast add/remove operations make it a good candidate for fast changing datasets, especially large ones. One application is therefore gaming[]? where many entities (simulated objects or other players) may need fast or frequent updates.

The node-size is fixed which means that it is mainly suited for in-memory uses. Persistent storage tends to benefit from indexes with configurable node size to align node size with page size on disk. This is, for example, possible with R-Trees[].

Implementations

Java
C++
C++
C++

References

^ https://www2.cs.sfu.ca/CourseCentral/454/jpei/slides/R-Tree.pdf

[1] ttps://www2.cs.sfu.ca/CourseCentral/454/jpei/slides/R-Tree.pdf

[1]