Jump to content

PH-tree

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by TilmannZ (talk | contribs) at 16:33, 8 February 2022 (Initial). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
(diff) ← Previous revision | Latest revision (diff) | Newer revision → (diff)
PH-tree
Typetree
Invented2014
Invented byTilmann Zäschke
Time complexity in big O notation
Operation Average Worst case
Search O(logMn) O(n)[1]
Insert O(n)
Space complexity

The PH-tree is a is a tree data structure used for spatial access methods, i.e. for indexing multi-dimensional information such as geographical coordinates, points, feature vectors(), rectangles or [bounding boxes] of arbitray shapes. The structure of a PH-tree is similar to that of a quadtree or octree. However, it uses a different splitting policy that is based on the bit-representation of coordinates/keys. (?) The splitting strategy allows it to scale well with dimensionality $k$/$d$ and size $n$ while providing fast insert/update/remove operations. The bit-level splitting policy also imposes a maximum depth (avoiding degenerated trees). There is no concept of rebalancing.


PH-tree stands for Prefix Hypercube tree. PH-Tree originally stood for PATRICIA Hypercube tree. However the reference to PATRICIA tries (Radix tree) is misleading because it is often understood to be a person's name and because PATRICIA tries actually store characters rather than numbers.


Description

The basic PH-tree can only index $k$/$d$ dimensional vectors or point with integer values. However, it is easy to extend the PH-tree to index floating point coordinates and/or $k$/$d$ dimensional box. Basic Structure A $k$/$d$-dimensional PH-tree is a tree of nodes where each node contains structured in quadrants . The structure partitions space by recursively subdividing it into 2^kd quadrants (see below[] how this scales with high dimensionality). Each quadrant contains at most one entry, either a user vector-value-pair (leaf quadrant) or sub-node. Let’s assume we want to store a key/value pair K/V where K has two dimensions, e.g. (x0, x1). The splitting is determined by the bit-representation of K.


Floating point PH-tree

Region PH-tree

Scalability

Dimensions

In high dimensions with view entries, the PH-Tree may have only a single node, i.e. it “degenerates” into a B-Tree[] with Morton order. Entries


Uses

The fast add/remove operations make it a good candidate for fast changing datasets, especially large ones. One application is therefore gaming[]? where many entities (simulated objects or other players) may need fast or frequent updates.

The node-size is fixed which means that it is mainly suited for in-memory uses. Persistent storage tends to benefit from indexes with configurable node size to align node size with page size on disk. This is, for example, possible with R-Trees[].

Implementations

  • Java
  • C++
  • C++
  • C++



References