Jump to content

Sethi–Ullman algorithm

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 129.132.179.36 (talk) at 19:41, 19 September 2004. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
(diff) ← Previous revision | Latest revision (diff) | Newer revision → (diff)

When generating code for arithemic expressions, the compiler has to decide which is the best way to translate the expression in terms of number of instructions used as well as number of registers needed to evaluate a certain subtree (especially if free registers are scarce). The so called Sethi-Ullman algorithm fulfills the property of producing code which needs the least number of instructions possible as well as the least number of storage references (under the assumption that at the most commutativity and associativity apply to the operators used, but laws like a * b + a * c = a * (b + c) do not hold). Please note that the algorithm succeeds as well if neither commutativity nor associativity hold for the expressions used, and therefore arithmetic transformations can not be applied.

The simple Sethi-Ullman algorithm works as follows (for a load-store architecture): for every non-constant leaf in the abstract syntax tree, it assigns a 1 (i.e. 1 register is needed to hold the variable/field/etc.). It then goes up in the tree, assigning the number of registers needed to evaluate the respective subtrees of a node. For every node n, the algorithm checks how many registers are needed by the node or leaf below. If the number of registers needed in the left subtree (l) are not equal to the number of registers needed in the right subtree (r), the number of registers needed for the current node is max(l, r). If l==r, then the number of registers needed for the current node is l+1.

How code should be emitted: if the number of registers needed to compute the left subtree is bigger than the number of registers for the right subtree, then the left subtree is evaluated first (since it may be possible that the one more register needed by the right subtree makes the left subtree spill). If the right subtree needs more registers than the left subtree, the right subtree is evaluated first accordingly. If both subtrees need equal much registers, then the order of evaluation is irrelevant.

In an advanced version of the Sethi-Ullman algorithm, the arithmetic expressions are first transformed, exploiting the algebraic properties of the operators used.