Random binary tree

In computer science and probability theory, a random binary tree is a binary tree selected at random from some probability distribution on binary trees. Different distributions have been used, leading to different properties for these trees.
Random binary trees have been used for analyzing the average-case complexity of data structures based on binary search trees. For this application it is common to use random trees formed by inserting nodes one at a time according to a random permutation. Adding and removing nodes directly in a random binary tree will in general disrupt its random structure, but the treap and related randomized binary search tree data structures use the principle of binary trees formed from a random permutation in order to maintain a balanced binary search tree dynamically as nodes are inserted and deleted.
Other distributions on random binary trees include the uniform discrete distribution in which all distinct trees are equally likely, distributions on a given number of nodes obtained by repeated splitting, and trees generated by Galton–Watson processes, for which (unlike the other models) the number of nodes in the tree is not fixed.
For random trees that are not necessarily binary, see random tree.
Background

A binary tree is a rooted tree in which each node may have up to two children (the nodes directly below it in the tree), and those children are designated as being either left or right. It is sometimes convenient instead to consider extended binary trees in which each node is either an external node with zero children, or an internal node with exactly two children. A binary tree that is not in extended form may be converted into an extended binary tree, with all of the original nodes converted into the internal nodes of the extended binary tree, by adding additional external nodes as children of the nodes of the given tree, so that after this addition the internal nodes all have exactly two children. In the other direction, an extended binary tree with at least one internal node may be converted back into a non-extended binary tree by removing all its external nodes. In this way, these two forms are almost entirely equivalent for the purposes of mathematical analysis, except that the extended form allows a tree consisting of a single external node, which does not correspond to anything in the non-extended form. For the purposes of computer data structures, the two forms differ, as the external nodes of the first form may be represented explicitly as objects in a data structure.[1]
When the internal nodes of a binary tree are labeled by ordered keys of some type (such as distinct numbers), and the inorder traversal of the tree produces the sorted sequence of keys, the result is a binary search tree. Generally, the external nodes of such a tree remain unlabeled.[2] Binary trees may also be studied with all nodes unlabeled, or with labels that are not given in sorted order. For instance, the Cartesian tree data structure uses labeled binary trees that are not necessarily binary search trees.[3]
A random binary tree is a random tree drawn from a certain probability distribution on binary trees. In many cases, these probability distributions are defined using a given set of keys, and describe the probabilities of binary search trees having those keys. However, other distributions are possible, not necessarily generating binary search trees, and not necessarily giving a fixed number of nodes.[4]
From random permutations

For any set of numbers (or, more generally, values from some total order), one may form a binary search tree in which each number is inserted in sequence as a leaf of the tree, without changing the structure of the previously inserted numbers. The position into which each number should be inserted is uniquely determined by a binary search in the tree formed by the previous numbers. In the random permutation model of random binary trees, each possible insertion order (that is, each permutation of the given numbers) is equally likely.[5]
For instance, if the three numbers (1,3,2) are inserted into a binary search tree in that sequence, the number 1 will sit at the root of the tree, the number 3 will be placed as its right child, and the number 2 as the left child of the number 3. There are six different permutations of the numbers (1,2,3), but only five trees may be constructed from them. That is because the permutations (2,1,3) and (2,3,1) form the same tree. Thus, this tree has probability $2/6 = 1/3$ of being generated, whereas the other four trees each have probability $1/6$.[4]
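As an illustration of this model, the following minimal Python sketch (not taken from the cited sources; the class and function names are ours) builds a binary search tree by repeated leaf insertion and tallies the shapes produced by all six permutations of (1, 2, 3), reproducing the probabilities 1/3 and 1/6 described above.

```python
from itertools import permutations

class Node:
    """Binary search tree node; children are None or Node."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key as a new leaf, leaving existing nodes in place."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def shape(root):
    """A nested-tuple encoding of the tree's shape and keys."""
    if root is None:
        return None
    return (root.key, shape(root.left), shape(root.right))

counts = {}
for perm in permutations((1, 2, 3)):
    root = None
    for key in perm:
        root = insert(root, key)
    counts[shape(root)] = counts.get(shape(root), 0) + 1

# One shape (root 2 with children 1 and 3) arises twice; the other four arise once.
for s, c in counts.items():
    print(c / 6, s)
```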
Expected depth of a node
For any fixed choice of a value $x$ in a given set of $n$ numbers, if one randomly permutes the numbers and forms a binary tree from them as described above, the expected value of the length of the path from the root of the tree to $x$ is at most $2\ln n + O(1)$, where "$\ln$" denotes the natural logarithm function and the $O$ introduces big O notation. This follows because the expected number of ancestors of $x$ is by linearity of expectation equal to the sum, over all other values $y$ in the set, of the probability that $y$ is an ancestor of $x$. And a value $y$ is an ancestor of $x$ exactly when $y$ is the first element to be inserted from the elements in the interval $[\min(x,y),\max(x,y)]$. Thus, the values that are adjacent to $x$ in the sorted sequence of values have probability $1/2$ of being an ancestor of $x$, the values one step farther away have probability $1/3$, etc. Adding these probabilities for all positions in the sorted sequence gives twice a harmonic number, leading to the bound above. A bound of this form holds also for the expected search length of a path to a fixed value $x$ that is not part of the given set.[6]
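As a check on this argument, the expected depth of each key can be computed exactly from the ancestor probabilities described above (a key at distance $d$ in sorted order is an ancestor with probability $1/(d+1)$) and compared against the $2\ln n + O(1)$ bound. The following Python sketch is purely illustrative; the function names are ours.

```python
import math

def expected_depth(rank, n):
    """Expected number of ancestors of the key with the given sorted rank
    (0-based) in a random binary search tree on n keys: the key at distance
    d in sorted order is an ancestor with probability 1/(d+1)."""
    return sum(1.0 / (abs(rank - other) + 1) for other in range(n) if other != rank)

n = 1000
worst = max(expected_depth(r, n) for r in range(n))
print(worst, 2 * math.log(n))  # the worst expected depth stays below 2 ln n + O(1)
```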
The longest path
Although not as easy to analyze as the average path length, there has also been much research on determining the expectation (or high probability bounds) of the length of the longest path in a binary search tree generated from a random insertion order. This length, for a tree with $n$ nodes, is almost surely
$$\frac{1}{\beta}\ln n \approx 4.311\ln n,$$
where $\beta$ is the unique number in the range $0 < \beta < 1$ satisfying the equation
$$2\beta e^{1-\beta} = 1.$$[7]
Expected number of leaves
In the random permutation model, each of the numbers from the set of $n$ numbers used to form the tree, except for the smallest and largest, has probability $1/3$ of being a leaf in the tree, because it is a leaf exactly when it is inserted after both of its neighbors in sorted order, and this happens in exactly two of the six equally likely orderings of it and its two neighbors. By similar reasoning, the smallest and largest of the numbers have probability $1/2$ of being a leaf. Therefore, the expected number of leaves is the sum of these probabilities, which for $n \ge 2$ is exactly $(n+1)/3$.[8]
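Because a key is a leaf exactly when it appears after both of its sorted-order neighbors in the insertion order, the expected number of leaves can be estimated by Monte Carlo simulation without building any trees, as in the following illustrative Python sketch (the helper names are ours):

```python
import random

def count_leaves(perm):
    """Count keys that are inserted after both of their sorted-order neighbors
    (equivalently, leaves of the resulting binary search tree)."""
    n = len(perm)
    position = [0] * n               # position[key] = step at which key is inserted
    for index, key in enumerate(perm):
        position[key] = index
    leaves = 0
    for key in range(n):
        left_ok = key == 0 or position[key] > position[key - 1]
        right_ok = key == n - 1 or position[key] > position[key + 1]
        if left_ok and right_ok:
            leaves += 1
    return leaves

n, trials = 100, 10000
average = sum(count_leaves(random.sample(range(n), n)) for _ in range(trials)) / trials
print(average, (n + 1) / 3)  # the two values should be close
```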
Strahler number
The Strahler number of a tree is a more sensitive measure of the distance from a leaf, in which a node has Strahler number $i$ whenever it has either a child with that number or two children with number $i-1$. For $n$-node random binary search trees, simulations suggest that the expected Strahler number is $\log_3 n + O(1)$. However, only the upper bound $\log_3 n + o(\log n)$ has actually been proven.[9]
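The recursive definition of the Strahler number translates directly into code. The following Python sketch (our own illustration, with trees encoded as nested tuples) computes it for a small example:

```python
def strahler(tree):
    """Strahler number of a binary tree given as nested (left, right) tuples,
    with None standing for a missing child; a leaf has Strahler number 1."""
    if tree is None:
        return 0
    left, right = tree
    a, b = strahler(left), strahler(right)
    if a == b and a > 0:
        return a + 1          # two children with equal numbers: increase by one
    return max(a, b, 1)       # otherwise inherit the larger child's number

# A complete binary tree with four leaves has Strahler number 3.
leaf = (None, None)
print(strahler(((leaf, leaf), (leaf, leaf))))  # -> 3
```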
Treaps and randomized binary search trees
In applications of binary search tree data structures, it is rare for the values in the tree to be inserted without deletion in a random order, limiting the direct applications of random binary trees. However, algorithm designers have devised data structures that allow insertions and deletions to be performed in a binary search tree, at each step maintaining as an invariant the property that the shape of the tree is a random variable with the same distribution as a random binary search tree.[10]
If a given set of ordered numbers is assigned numeric priorities (distinct numbers unrelated to their values), these priorities may be used to construct a Cartesian tree for the numbers, a binary tree that has as its inorder traversal sequence the sorted sequence of the numbers and that is heap-ordered by priorities. Although more efficient construction algorithms are known, it is helpful to think of a Cartesian tree as being constructed by inserting the given numbers into a binary search tree in priority order. Thus, by choosing the priorities either to be a set of independent random real numbers in the unit interval, or by choosing them to be a random permutation of the numbers from $1$ to $n$ (where $n$ is the number of nodes in the tree), and by maintaining the heap ordering property using tree rotations after any insertion or deletion of a node, it is possible to maintain a data structure that behaves like a random binary search tree. Such a data structure is known as a treap or a randomized binary search tree.[10]
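A minimal treap sketch in Python follows (purely illustrative, and not the implementation of the cited papers): each node stores an independently drawn uniform random priority, insertion proceeds as in an ordinary binary search tree, and rotations then restore the heap order on priorities. With the min-heap convention used here, the node of smallest priority plays the role of the first-inserted key in the random permutation model.

```python
import random

class TreapNode:
    def __init__(self, key):
        self.key = key
        self.priority = random.random()  # independent uniform random priority
        self.left = None
        self.right = None

def rotate_right(node):
    child = node.left
    node.left, child.right = child.right, node
    return child

def rotate_left(node):
    child = node.right
    node.right, child.left = child.left, node
    return child

def insert(root, key):
    """Insert key as in a binary search tree, then rotate the new node
    upward until the heap order on priorities is restored."""
    if root is None:
        return TreapNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
        if root.left.priority < root.priority:   # min-heap on priorities
            root = rotate_right(root)
    else:
        root.right = insert(root.right, key)
        if root.right.priority < root.priority:
            root = rotate_left(root)
    return root

root = None
for key in [5, 2, 8, 1, 9, 3]:
    root = insert(root, key)
```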
Variants of the treap including the zip tree and zip-zip tree replace the tree rotations by "zipping" operations that split and merge trees, and that limit the number of random bits that need to be generated and stored alongside the keys. The result of these optimizations is still a tree with a random structure, but one that does not exactly match the random permutation model.[11]
Uniformly random binary trees

The number of binary trees with $n$ nodes (or extended binary trees with $n$ internal nodes and $n+1$ external nodes) is the Catalan number $\tfrac{1}{n+1}\binom{2n}{n}$. For $n = 1, 2, 3, 4, 5, \dots$ these numbers of trees are $1, 2, 5, 14, 42, \dots$
Thus, if one of these trees is selected uniformly at random, its probability is the reciprocal of a Catalan number. Trees generated from a model in this distribution are sometimes called random binary Catalan trees.[12] They have expected depth proportional to the square root of $n$, rather than to the logarithm.[13] More precisely, the expected depth of a randomly chosen node in an $n$-node tree of this type is $\sqrt{\pi n} + O(1)$.[14]
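The Catalan numbers, and hence the probability of any particular shape under the uniform model, are straightforward to compute, as in this short illustrative Python sketch:

```python
from math import comb

def catalan(n):
    """Number of distinct binary trees with n nodes."""
    return comb(2 * n, n) // (n + 1)

print([catalan(n) for n in range(1, 6)])   # [1, 2, 5, 14, 42]
print(1 / catalan(3))                      # probability of each 3-node shape: 0.2
```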
The expected Strahler number of a uniformly random $n$-node binary tree is $\log_4 n + O(1)$, lower than the expected Strahler number of random binary search trees.[15] In some cases the analysis of random binary trees under the random permutation model can be automatically transferred to the uniform model.[16]
Due to their large heights, this model of equiprobable random trees is not generally used for binary search trees. However, it has other applications, including:
- Modeling the parse trees of algebraic expressions in compiler design.[17] Here the internal nodes of the tree represent binary operations in an expression and the external nodes represent the variables or constants on which the expressions operate. The bound on Strahler number translates into the number of registers needed to evaluate an expression.[18]
- Modeling river networks, the original application for which the Strahler number was developed.[19]
- Modeling possible evolutionary trees for a fixed number of species. In this application, an extended binary tree is used, with the species at its external nodes.[20]
Galton–Watson process
The Galton–Watson process describes a family of distributions on trees in which the number of children at each node is chosen randomly, independently of other nodes. For binary trees, two versions of the Galton–Watson process are in use, differing only in whether an extended binary tree with only one node, an external root node, is allowed:
- In the version where the root node may be external, it is chosen to be internal with some specified probability $p$ or external with probability $1-p$. If it is internal, its two children are trees generated recursively by the same process.
- In the version where the root node must be internal, its left and right children are determined to be internal with probability $p$ or external with probability $1-p$, independently of each other. In the case where they are internal, they are the roots of trees that are generated recursively by the same process.
Trees generated in this way have been called binary Galton–Watson trees. In the special case where $p = 1/2$ they are called critical binary Galton–Watson trees.[21] This probability marks a phase transition for the binary Galton–Watson process: for $p \le 1/2$ the resulting tree is almost certainly finite, whereas for $p > 1/2$ it is infinite with positive probability. More precisely, for any $p$, the probability that the tree remains finite is $\min\bigl(1, \tfrac{1-p}{p}\bigr)$.[22]
Another way to generate the same trees is to make a sequence of coin flips, with probability $p$ of heads and probability $1-p$ of tails, until the first flip at which the number of tails exceeds the number of heads (for the model in which an external root is allowed) or exceeds one plus the number of heads (when the root must be internal), and then use this sequence of coin flips to determine the choices made by the recursive generation process, in depth-first order.[23] Because the number of internal nodes equals the number of heads in this coin flip sequence, all trees with a given number of nodes are generated from (unique) coin flip sequences of the same length, and are equally likely, regardless of $p$. That is, the choice of $p$ affects the variation in the size of trees generated by this process, but for a given size the trees are generated uniformly at random.[24] For values of $p$ below the critical probability $1/2$, smaller values of $p$ will produce trees with a smaller expected size, while larger values of $p$ will produce trees with a larger expected size. At the critical probability $p = 1/2$ there is no finite bound on the expected size of trees generated by this process.
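The coin-flipping formulation is easy to simulate. The following Python sketch (illustrative only, for the version in which the root may be external) flips coins until tails outnumber heads or a cap is reached, and compares the observed fraction of finite trees with the formula $\min(1, (1-p)/p)$ given above:

```python
import random

def finite_tree_size(p, cap=10_000):
    """Generate one binary Galton-Watson tree (version with a possibly external
    root) by coin flips in depth-first order: heads = internal, tails = external.
    Returns the number of internal nodes, or None if the tree has not finished
    after `cap` flips (treated as effectively infinite)."""
    heads = tails = 0
    while tails <= heads:
        if heads + tails >= cap:
            return None
        if random.random() < p:
            heads += 1
        else:
            tails += 1
    return heads

trials = 1000
for p in (0.3, 0.5, 0.7):
    finite = sum(finite_tree_size(p) is not None for _ in range(trials))
    # At the critical value p = 1/2 the cap slightly undercounts finite trees.
    print(p, finite / trials, min(1.0, (1 - p) / p))
```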
Galton–Watson processes were originally developed to study the spread and extinction of human surnames, and have been widely applied more generally to the dynamics of human or animal populations. The process has been generalized to models where the probability of a node being internal or external at a given level of the tree (a generation, in the population dynamics application) is not fixed, but depends on the number of nodes at the previous level.[25] A version of this process, with the critical probability $p = 1/2$, has been studied as a model for speciation, where it is known as the critical branching process. In this process, each species has an exponentially distributed lifetime, and over the course of its lifetime produces child species at a rate equal to the rate parameter of that lifetime distribution, so that births and extinctions occur at equal rates. When a child is produced, the parent continues as the left branch of the evolutionary tree, and the child becomes the right branch.[26] Another application of critical Galton–Watson trees (in the version where the root must be internal) arises in the Karger–Stein algorithm for finding minimum cuts in graphs, using a recursive edge contraction process. This algorithm calls itself twice recursively, with each call having probability at least $1/2$ of preserving the correct solution value. The random tree models the subtree of correct recursive calls. The algorithm succeeds on a graph of $n$ vertices whenever this random tree of correct recursive calls has a branch of depth $\Omega(\log n)$, reaching the base case of its recursion. The success probability is $\Omega(1/\log n)$, producing one of the logarithmic factors in the algorithm's runtime.[27]
Devroye and Robson consider a related continuous-time random process in which each external node is eventually replaced by an internal node with two external children, at an exponentially distributed time after its first appearance as an external node. The number of external nodes in the tree, at any time, is modeled by a simple birth process or Yule process in which the members of a population give birth at a constant rate: giving birth to one child, in the Yule process, corresponds to being replaced by two children, in Devroye and Robson's model. If this process is stopped at any fixed time, the result is a binary tree of a random size (depending on the stopping time), distributed according to the random permutation model for that size. Devroye and Robson use this model as part of an algorithm to quickly generate trees in the random permutation model, described by their numbers of nodes at each depth rather than by their exact structure.[28]
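Because the exponential waiting times are memoryless, the next external node to be replaced in this process is uniformly random among the current external nodes, so the evolution of the depth profile can be sketched very simply (an illustrative Python fragment, not Devroye and Robson's algorithm):

```python
import math
import random

def external_depths(steps):
    """Simulate the shape of the continuous-time process by repeatedly replacing
    a uniformly random external node (uniform because the exponential waiting
    times are memoryless) with an internal node that has two external children;
    only the depths of the external nodes are tracked."""
    depths = [0]                      # start from a single external root
    for _ in range(steps):
        i = random.randrange(len(depths))
        d = depths.pop(i)
        depths.extend([d + 1, d + 1])
    return depths

n = 10000
depths = external_depths(n)
print(sum(depths) / len(depths), 2 * math.log(n))  # average depth close to 2 ln n
```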
Binary tries
Another form of binary tree, the binary trie or digital search tree, has a collection of binary numbers labeling some of its external nodes. The internal nodes of the tree represent prefixes of their binary representations that are shared by two or more of the numbers. The left and right children of an internal node are obtained by extending the corresponding prefix by one more bit, a zero or a one bit respectively. If this extension does not match any of the given numbers, or it matches only one of them, the result is an external node; otherwise it is another internal node. Random binary tries have been studied, for instance for sets of $n$ random real numbers generated independently in the unit interval. Despite the fact that these trees may have some empty external nodes, they tend to be more balanced than random binary search trees. More precisely, for $n$ uniformly random real numbers in the unit interval, or more generally for any square-integrable probability distribution on the unit interval, the average depth of a node is asymptotically $\log_2 n$, and the average height of the whole tree is asymptotically $2\log_2 n$. The analysis of these trees can be applied to the computational complexity of trie-based sorting algorithms.[29]
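A binary trie over random numbers in the unit interval can be built by recursively splitting on successive bits of the binary expansions, as in the following illustrative Python sketch (which records only the depths at which the numbers end up):

```python
import math
import random

def trie_depths(values, bit=0, depth=0):
    """Depths of the external nodes holding each value in the binary trie:
    a prefix shared by two or more values becomes an internal node, split on
    the next bit of the binary expansion; otherwise it is an external node."""
    if len(values) <= 1:
        return [depth] * len(values)   # external node (possibly empty)
    zeros = [v for v in values if int(v * 2 ** (bit + 1)) % 2 == 0]
    ones = [v for v in values if int(v * 2 ** (bit + 1)) % 2 == 1]
    return (trie_depths(zeros, bit + 1, depth + 1)
            + trie_depths(ones, bit + 1, depth + 1))

n = 1024
values = [random.random() for _ in range(n)]
depths = trie_depths(values)
print(sum(depths) / len(depths), math.log2(n))  # average depth close to log2 n
```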
A variant of the trie, the radix tree or compressed trie, eliminates empty external nodes and their parent internal nodes. The remaining internal nodes correspond to prefixes for which both possible extensions, by a zero or a one bit, are used by at least one of the randomly chosen numbers. For a radix tree for $n$ uniformly distributed binary numbers, the shortest leaf–root path has length $\log_2 n - O(1)$ and the longest leaf–root path has length $\log_2 n + O(\sqrt{\log n})$, both with high probability.[30]
Random split trees
Luc Devroye and Paul Kruszewski describe a recursive process for constructing random binary trees with $n$ nodes. It generates a real-valued random variable $x$ in the unit interval $[0,1]$, assigns the first $xn$ nodes (rounded down to an integer number of nodes) to the left subtree, the next node to the root, and the remaining nodes to the right subtree. Then, it continues recursively using the same process in the left and right subtrees. If $x$ is chosen uniformly at random in the interval, the result is the same as the random binary search tree generated by a random permutation of the nodes, as any node is equally likely to be chosen as root. However, this formulation allows other distributions to be used instead. For instance, in the uniformly random binary tree model, once a root is fixed each of its two subtrees must also be uniformly random, so the uniformly random model may also be generated by a different choice of distribution (depending on $n$) for $x$. As they show, by choosing a beta distribution on $x$ and by using an appropriate choice of shape to draw each of the branches, the mathematical trees generated by this process can be used to create realistic-looking botanical trees.[31]
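The splitting process can be sketched as follows (an illustrative Python fragment; the beta parameters shown are arbitrary placeholders rather than the particular choices of Devroye and Kruszewski):

```python
import random

def split_tree(n, draw=random.random):
    """Build a random split tree on n nodes as a nested dict: draw x in [0,1],
    send floor(x*n) nodes to the left subtree, one node to the root, and the
    remaining nodes to the right subtree, then recurse on both sides."""
    if n == 0:
        return None
    x = draw()
    left_size = min(int(x * n), n - 1)   # floor, clamped so one node remains for the root
    return {"size": n,
            "left": split_tree(left_size, draw),
            "right": split_tree(n - 1 - left_size, draw)}

# A uniform draw reproduces the random binary search tree model; a beta draw
# (parameters chosen here only for illustration) gives differently balanced trees.
uniform_tree = split_tree(100)
beta_tree = split_tree(100, draw=lambda: random.betavariate(2, 2))
```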
Notes
- ^ Knuth (1997).
- ^ Knuth (1973).
- ^ Vuillemin (1980).
- ^ a b Sedgewick & Flajolet (2013), p. 286.
- ^ Morin (2014).
- ^ Hibbard (1962); Knuth (1973); Mahmoud (1992), p. 75.
- ^ Robson (1979); Pittel (1985); Devroye (1986); Mahmoud (1992), pp. 91–99; Reed (2003).
- ^ Brown & Shubert (1984).
- ^ Kruszewski (1999).
- ^ a b Martínez & Roura (1998); Seidel & Aragon (1996); Morin (2014).
- ^ Tarjan, Levy & Timmel (2021); Gila, Goodrich & Tarjan (2023).
- ^ Sedgewick & Flajolet (2013), p. 287.
- ^ Knuth (2005), p. 15.
- ^ Sedgewick & Flajolet (2013), p. 288.
- ^ Devroye & Kruszewski (1995).
- ^ Mahmoud (1992), p. 70.
- ^ Mahmoud (1992), p. 63.
- ^ Flajolet, Raoult & Vuillemin (1979).
- ^ Shreve (1966).
- ^ Aldous (1996).
- ^ Burd, Waymire & Winn (2000).
- ^ This is a special case of a general theorem about criticality and extinction probabilities in Galton–Watson processes, according to which the extinction probability is the smallest positive root of the formula $f(\xi) = \xi$, where $f$ is the probability-generating function of the distribution on the number of children, here $f(\xi) = (1-p) + p\xi^2$. See e.g. Jagers (2011), Theorem 2.1, p. 92. Jagers carries out the calculation of this root for the binary case on p. 97.
- ^ For the connection between trees and random walks (as generated by random coin flips) see e.g. Section 6, "Walks and trees" pp. 483–486, of Harris (1952).
- ^ Broutin, Devroye & Fraiman (2020). More generally, every Galton–Watson process, conditioned on producing trees of a certain size, produces the same probability distribution as a critical Galton–Watson process: see section 2 of Kennedy (1975).
- ^ Jagers (2011).
- ^ Popovic (2004).
- ^ Karger & Stein (1996).
- ^ Devroye & Robson (1995).
- ^ Devroye (1984).
- ^ Devroye (1992).
- ^ Devroye & Kruszewski (1996).
References
- Aldous, David (1996), "Probability distributions on cladograms", in Aldous, David; Pemantle, Robin (eds.), Random Discrete Structures, The IMA Volumes in Mathematics and its Applications, vol. 76, Springer-Verlag, pp. 1–18, doi:10.1007/978-1-4612-0719-1_1
- Brown, Gerald G.; Shubert, Bruno O. (1984), "On random binary trees", Mathematics of Operations Research, 9: 43–65, doi:10.1287/moor.9.1.43
- Broutin, Nicolas; Devroye, Luc; Fraiman, Nicolas (April 2020), "Recursive functions on conditional Galton–Watson trees" (PDF), Random Structures & Algorithms, 57 (2), Wiley: 304–316, doi:10.1002/rsa.20921
- Burd, Gregory A.; Waymire, Edward C.; Winn, Ronald D. (February 2000), "A self-similar invariance of critical binary Galton–Watson trees", Bernoulli, 6 (1): 1–21, JSTOR 3318630
- Devroye, Luc (1984), "A probabilistic analysis of the height of tries and of the complexity of triesort" (PDF), Acta Informatica, 21: 229–237, doi:10.1007/BF00264248
- Devroye, Luc (1986), "A note on the height of binary search trees" (PDF), Journal of the ACM, 33 (3): 489–498, doi:10.1145/5925.5930
- Devroye, Luc (January 1992), "A note on the probabilistic analysis of patricia trees" (PDF), Random Structures & Algorithms, 3 (2): 203–214, doi:10.1002/rsa.3240030209
- Devroye, Luc; Kruszewski, Paul (1995), "A note on the Horton–Strahler number for random trees" (PDF), Information Processing Letters, 56 (2): 95–99, doi:10.1016/0020-0190(95)00114-R
- Devroye, Luc; Kruszewski, Paul (1996), "The botanical beauty of random binary trees" (PDF), in Brandenburg, Franz J. (ed.), Graph Drawing: 3rd Int. Symp., GD'95, Passau, Germany, September 20–22, 1995, Lecture Notes in Computer Science, vol. 1027, Springer-Verlag, pp. 166–177, doi:10.1007/BFb0021801, ISBN 978-3-540-60723-6
- Devroye, Luc; Robson, John Michael (December 1995), "On the generation of random binary search trees" (PDF), SIAM Journal on Computing, 24 (6): 1141–1156, doi:10.1137/s0097539792224954
- Flajolet, P.; Raoult, J. C.; Vuillemin, J. (1979), "The number of registers required for evaluating arithmetic expressions" (PDF), Theoretical Computer Science, 9 (1): 99–125, doi:10.1016/0304-3975(79)90009-4
- Gila, Ofek; Goodrich, Michael T.; Tarjan, Robert E. (2023), "Zip-zip trees: making zip trees more balanced, biased, compact, or persistent", in Morin, Pat; Suri, Subhash (eds.), Algorithms and Data Structures – 18th International Symposium, WADS 2023, Montreal, QC, Canada, July 31 – August 2, 2023, Proceedings, Lecture Notes in Computer Science, vol. 14079, Springer, pp. 474–492, arXiv:2307.07660, doi:10.1007/978-3-031-38906-1_31
- Harris, T. E. (1952), "First passage and recurrence distributions", Transactions of the American Mathematical Society, 73 (3): 471–486, doi:10.1090/s0002-9947-1952-0052057-2
- Hibbard, Thomas N. (1962), "Some combinatorial properties of certain trees with applications to searching and sorting", Journal of the ACM, 9 (1): 13–28, doi:10.1145/321105.321108
- Jagers, Peter (2011), "Extinction, persistence, and evolution", in Chalub, Fabio A. C. C.; Rodrigues, José Francisco (eds.), The Mathematics of Darwin's Legacy, Mathematics and Biosciences in Interaction, Basel: Birkhäuser, pp. 91–104, doi:10.1007/978-3-0348-0122-5_5, ISBN 9783034801225
- Karger, David R.; Stein, Clifford (1996), "A new approach to the minimum cut problem" (PDF), Journal of the ACM, 43 (4): 601, doi:10.1145/234533.234534
- Kennedy, Douglas P. (1975), "The Galton–Watson process conditioned on the total progeny", Journal of Applied Probability, 12: 800–806, doi:10.2307/3212730
- Knuth, Donald E. (1973), "6.2.2 Binary Tree Searching", The Art of Computer Programming, Vol. III: Sorting and Searching, Addison-Wesley, pp. 422–451
- Knuth, Donald E. (1997), "2.3.4.5 Path Length", The Art of Computer Programming, Vol. I: Fundamental Algorithms (3rd ed.), Addison-Wesley, pp. 399–406
- Knuth, Donald E. (2005), "Draft of Section 7.2.1.6: Generating All Trees", The Art of Computer Programming, vol. IV
- Kruszewski, Paul (1999), "A note on the Horton-Strahler number for random binary search trees", Information Processing Letters, 69 (1): 47–51, doi:10.1016/S0020-0190(98)00192-6
- Mahmoud, Hosam M. (1992), Evolution of Random Search Trees, John Wiley & Sons
- Martínez, Conrado; Roura, Salvador (1998), "Randomized binary search trees", Journal of the ACM, 45 (2): 288–323, CiteSeerX 10.1.1.17.243, doi:10.1145/274787.274812
- Morin, Pat (March 22, 2014), "Chapter 7: Random Binary Search Trees", Open Data Structures (in pseudocode) (PDF) (0.1Gβ ed.), pp. 145–164
- Pittel, B. (1985), "Asymptotical growth of a class of random trees", Annals of Probability, 13 (2): 414–427, doi:10.1214/aop/1176993000
- Popovic, Lea (November 2004), "Asymptotic genealogy of a critical branching process", Annals of Applied Probability, 14 (4), doi:10.1214/105051604000000486
- Reed, Bruce (2003), "The height of a random binary search tree", Journal of the ACM, 50 (3): 306–332, doi:10.1145/765568.765571
- Robson, J. M. (1979), "The height of binary search trees", Australian Computer Journal, 11: 151–153
- Seidel, Raimund; Aragon, Cecilia R. (1996), "Randomized search trees", Algorithmica, 16 (4–5): 464–497, doi:10.1007/s004539900061
- Sedgewick, Robert; Flajolet, Philippe (2013), "Chapter 6: Trees", An Introduction to the Analysis of Algorithms (2nd ed.), Addison-Wesley, ISBN 9780133373486
- Shreve, Ronald L. (January 1966), "Statistical law of stream numbers", The Journal of Geology, 74 (1): 17–37, JSTOR 30075174
- Tarjan, Robert E.; Levy, Caleb C.; Timmel, Stephen (2021), "Zip trees", ACM Transactions on Algorithms, 17 (4): 34:1–34:12, doi:10.1145/3476830
- Vuillemin, Jean (1980), "A unifying look at data structures", Communications of the ACM, 23 (4): 229–239, doi:10.1145/358841.358852
Further reading
- Drmota, Michael (2009), Random Trees : An Interplay between Combinatorics and Probability, Springer-Verlag, doi:10.1007/978-3-211-75357-6, ISBN 978-3-211-75355-2