Jump to content

Tree sort

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 163.158.220.202 (talk) at 23:07, 13 September 2017 (Undid revision 800420428 by Deacon Vorbis (talk)). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
Tree sort
ClassSorting algorithm
Data structureArray
Worst-case performanceO(n²) (unbalanced) O(n log n) (balanced)
Best-case performanceO(n log n) [citation needed]
Average performanceO(n log n)
Worst-case space complexityΘ(n)
OptimalYes, if balanced

A tree sort is a sort algorithm that builds a binary search tree from the elements to be sorted, and then traverses the tree (in-order) so that the elements come out in sorted order. Its typical use is sorting elements online: after each insertion, the set of elements seen so far is available in sorted order.

Efficiency

Adding one item to a binary search tree is on average an O(log n) process (in big O notation). Adding n items is an O(n log n) process, making tree sorting a 'fast sort' process. Adding an item to an unbalanced binary tree requires O(n) time in the worst-case: When the tree resembles a linked list (degenerate tree). This results in a worst case of O(n²) time for this sorting algorithm. This worst case occurs when the algorithm operates on an already sorted set, or one that is nearly sorted. Expected O(n log n) time can however be achieved by shuffling the array.

The worst-case behaviour can be improved by using a self-balancing binary search tree. Using such a tree, the algorithm has an O(n log n) worst-case performance, thus being degree-optimal for a comparison sort. However, trees require memory to be allocated on the heap, which is a significant performance hit when compared to quicksort and heapsort. When using a splay tree as the binary search tree, the resulting algorithm (called splaysort) has the additional property that it is an adaptive sort, meaning that its running time is faster than O(n log n) for inputs that are nearly sorted.

Example

The following tree sort algorithm in pseudocode accepts a collection of comparable items and outputs the items in ascending order:

STRUCTURE BinaryTree
    BinaryTree:LeftSubTree
    Object:Node
    BinaryTree:RightSubTree

PROCEDURE Insert(BinaryTree:searchTree, Object:item)
    IF searchTree.Node IS NULL THEN
        SET searchTree.Node TO item
    ELSE
        IF item IS LESS THAN searchTree.Node THEN
            Insert(searchTree.LeftSubTree, item)
        ELSE
            Insert(searchTree.RightSubTree, item)

PROCEDURE InOrder(BinaryTree:searchTree)
    IF searchTree.Node IS NULL THEN
        EXIT PROCEDURE
    ELSE
        InOrder(searchTree.LeftSubTree)
        EMIT searchTree.Node
        InOrder(searchTree.RightSubTree)

PROCEDURE TreeSort(Collection:items)
    BinaryTree:searchTree
   
    FOR EACH individualItem IN items
        Insert(searchTree, individualItem)
   
    InOrder(searchTree)


In a simple functional programming form, the algorithm (in Haskell) would look something like this:

data Tree a = Leaf | Node (Tree a) a (Tree a)

insert :: Ord a => a -> Tree a -> Tree a
insert x Leaf = Node Leaf x Leaf
insert x (Node t y s)
    | x <= y = Node (insert x t) y s
    | x > y  = Node t y (insert x s)

flatten :: Tree a -> [a]
flatten Leaf = []
flatten (Node t x s) = flatten t ++ [x] ++ flatten s

treesort :: Ord a => [a] -> [a]
treesort = flatten . foldr insert Leaf

In the above implementation, both the insertion algorithm and the retrieval algorithm have O(n²) worst-case scenarios.

Tree Sort code in Python

class TreeNode(object):
	def __init__(self, data):
		self.data = data
		self.left = None
		self.parent = None
		self.printed = False
		self.right = None
	def insert(self, node):
		root = self
		while True:
			if node.data < root.data:
				if root.left is None:
					node.parent = root
					root.left = node
					break
				else:
					root = root.left
			else:
				if root.right is None:
					node.parent = root
					root.right = node
					break
				else:
					root = root.right
	def sorted(self):
		a = []
		node = self
		while True:
			if node.left and not node.left.printed:
				node = node.left
			elif node.data is not None and not node.printed:
				node.printed = True
				a.append(node.data)
			elif node.right and not node.right.printed:
				node = node.right
			else:
				if node.parent:
					node = node.parent
				else:
					return a

def tree_sort(a):
	start = sum(a)/len(a)
	tree = TreeNode(start)
	for x in a:
		tree.insert(TreeNode(x))
	a = tree.sorted()
	a.pop(a.index(start))
	return a