A* search algorithm

In computer science, A* (pronounced "A star") is a best-first, tree search algorithm that finds the least-cost path from a given initial node to one goal node (out of one or more possible goals).

It uses a distance-plus-cost heuristic function (usually denoted $f(x)$ ) to determine the order in which the search visits nodes in the tree. The distance-plus-cost heuristic is a sum of two functions: the path-cost function (usually denoted $g(x)$ , which may or may not be a heuristic) and an admissible "heuristic estimate" of the distance to the goal (usually denoted $h(x)$ ). The path-cost function $g(x)$ is the cost from the starting node to the current node.

Since the $h(x)$ part of the $f(x)$ function must be an admissible heuristic, it must never overestimate the distance to the goal. Thus for an application like routing, $h(x)$ might represent the straight-line distance to the goal, since that is physically the smallest possible distance between any two points (or nodes, for that matter).
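As a small illustration of the routing case, the sketch below computes a straight-line heuristic for a tiny map and checks that it stays at or below the true remaining cost. The coordinates, the node names, and the hand-computed `true_cost` table are hypothetical values chosen for the example.

```python
import math

def straight_line(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Hypothetical map: node -> (x, y). Roads are assumed to run A-B and B-C only.
coords = {"A": (0, 0), "B": (3, 0), "C": (3, 4)}
goal = "C"

# h(x): straight-line distance from each node to the goal.
h = {node: straight_line(pos, coords[goal]) for node, pos in coords.items()}

# True cheapest road distance to the goal, worked out by hand for this map
# (A-B has length 3, B-C has length 4).
true_cost = {"A": 7.0, "B": 4.0, "C": 0.0}

# Admissibility: the estimate is never larger than the true remaining cost.
admissible = all(h[node] <= true_cost[node] for node in coords)
```

Here the estimate for A is the length of the diagonal, which is shorter than the road route through B, so the heuristic is optimistic but not exact.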

The algorithm was first described in 1968 by Peter Hart, Nils Nilsson, and Bertram Raphael. In their paper, it was called algorithm A. Since using this algorithm yields optimal behavior for a given heuristic, it has been called A*.

This algorithm has been generalized into a bidirectional heuristic search algorithm; see bidirectional search.

Algorithm description

A* incrementally searches all routes leading from the starting point until it finds the shortest path to a goal. Like all informed search algorithms, it searches first the routes that appear to be most likely to lead towards the goal. What sets A* apart from a greedy best-first search is that it also takes the distance already traveled into account (the $g(x)$ part of the heuristic is the cost from the start, and not simply the local cost from the previously expanded node).

Starting with a given node, the algorithm repeatedly expands the node with the lowest $f(x)$ value, that is, the node with the lowest estimated total cost. A* maintains a set of partial solutions (paths ending in unexpanded leaf nodes of expanded nodes) stored in a priority queue. The priority assigned to a path $x$ is determined by the function $f(x)=g(x)+h(x)$. The algorithm continues until a goal has a lower $f(x)$ value than any node in the queue (or until the tree is fully traversed). Multiple goals may be passed over if there is a path that may lead to a lower-cost goal.

The lower $f(x)$ , the higher the priority (so a min-heap could be used to implement the queue).

 function A*(start, goal)
     var closed := the empty set
     var q := make_queue(path(start))
     while q is not empty
         var p := remove_first(q)
         var x := the last node of p
         if x in closed
             continue
         if x = goal
             return p
         add x to closed
         foreach y in successors(x)
             enqueue(q, p, y)
     return failure

The closed set can be omitted (yielding a tree search algorithm) if either a solution is guaranteed to exist or the successors function is adapted to reject cycles.
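The pseudocode above might be rendered in Python roughly as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `a_star`, the convention that `successors(x)` yields `(neighbor, step_cost)` pairs, the tie-breaking counter, and the example graph with its estimates are all choices made for illustration.

```python
import heapq
from itertools import count

def a_star(start, goal, successors, h):
    """Return (cost, path) for a cheapest path, or None if there is none.

    successors(x) yields (neighbor, step_cost) pairs (assumed convention);
    h(x) estimates the remaining cost from x to the goal.
    """
    tie = count()  # tie-breaker so the heap never has to compare paths
    queue = [(h(start), next(tie), 0, [start])]  # entries: (f, tie, g, path)
    closed = set()
    while queue:
        _, _, g, path = heapq.heappop(queue)  # lowest f first
        x = path[-1]
        if x in closed:
            continue
        if x == goal:
            return g, path
        closed.add(x)
        for y, cost in successors(x):
            if y not in closed:
                g_y = g + cost
                heapq.heappush(queue, (g_y + h(y), next(tie), g_y, path + [y]))
    return None

# Hypothetical weighted graph and a consistent heuristic for goal "G".
graph = {
    "S": [("A", 1), ("B", 4)],
    "A": [("B", 2), ("G", 6)],
    "B": [("G", 2)],
    "G": [],
}
estimate = {"S": 4, "A": 3, "B": 2, "G": 0}

result = a_star("S", "G", lambda x: graph[x], estimate.get)
unreachable = a_star("G", "S", lambda x: graph[x], estimate.get)
```

In this example the search returns the path S, A, B, G with cost 5, and the reverse query returns `None` because no edges lead out of G. Storing whole paths in the queue mirrors the pseudocode; a parent map would use less memory.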

Properties

Like breadth-first search, A* is complete in the sense that it will always find a solution if there is one.

If the heuristic function $h$ is admissible, meaning that it never overestimates the actual minimal cost of reaching the goal, then A* is itself admissible (or optimal) provided no closed set is used. If a closed set is used, then $h$ must also be monotonic (or consistent) for A* to be optimal. This means that $h$ never overestimates the cost of getting from a node to its neighbor, so the value of $f$ never decreases along a path. Formally, for all paths $x,y$ where $y$ is a successor of $x$:

$$g(x)+h(x)\leq g(y)+h(y)$$
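Because $g(y)$ equals $g(x)$ plus the cost of the step from $x$ to $y$, the inequality can be checked edge by edge as $h(x)\leq c(x,y)+h(y)$, where $c(x,y)$ is that step cost (a symbol introduced here for the example). The sketch below applies this check to a hypothetical graph; the adjacency-list format and both heuristic tables are assumptions.

```python
def is_consistent(graph, h):
    """True if h[x] <= cost(x, y) + h[y] holds for every edge x -> y."""
    return all(
        h[x] <= cost + h[y]
        for x, edges in graph.items()
        for y, cost in edges
    )

# Hypothetical graph: node -> list of (successor, step_cost).
graph = {
    "S": [("A", 1), ("B", 4)],
    "A": [("B", 2), ("G", 6)],
    "B": [("G", 2)],
    "G": [],
}

consistent_h = {"S": 4, "A": 3, "B": 2, "G": 0}
# Still admissible (true costs to G are S=5, A=4, B=2), but h drops by 3
# across the S -> A edge, whose cost is only 1, so it is not consistent.
inconsistent_h = {"S": 4, "A": 1, "B": 2, "G": 0}

ok = is_consistent(graph, consistent_h)
bad = is_consistent(graph, inconsistent_h)
```

The second table shows that admissibility alone does not imply consistency.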

A* is also optimally efficient for any heuristic $h$ , meaning that no algorithm employing the same heuristic will expand fewer nodes than A*, except when there are several partial solutions where $h$ exactly predicts the cost of the optimal path.

While optimal in arbitrary graphs, it is not guaranteed to perform better than simpler search algorithms that are more informed about the problem domain. For example, in a maze-like environment, the only way to reach the goal might be to first travel one way (away from the goal) and eventually double back. In this case trying nodes closer to your destination first may cost you more time.

Special cases

Generally speaking, depth-first search and breadth-first search can be viewed as two special cases of the A* algorithm. Dijkstra's algorithm, another example of a best-first search algorithm, is the special case of A* where $h(x)=0$ for all $x$. For depth-first search, consider a global counter $C$ initialized to a very large value. Every time a node is processed, the current value of $C$ is assigned as the $h$ value of each of its newly discovered neighbors, and after each single assignment $C$ is decreased by one. Thus the earlier a node is discovered, the higher its $h(x)$ value.
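The Dijkstra case can be seen directly in code: with $h(x)=0$ the priority $f$ reduces to $g$, so nodes leave the queue in order of increasing distance from the start. The following sketch records that order; the function name `expansion_order` and the example graph are hypothetical.

```python
import heapq

def expansion_order(graph, start, h):
    """Expand all reachable nodes best-first by f = g + h.

    Returns a list of (node, g) in the order the nodes were expanded.
    """
    queue = [(h(start), 0, start)]  # entries: (f, g, node)
    closed = set()
    order = []
    while queue:
        _, g, x = heapq.heappop(queue)
        if x in closed:
            continue
        closed.add(x)
        order.append((x, g))
        for y, cost in graph[x]:
            if y not in closed:
                heapq.heappush(queue, (g + cost + h(y), g + cost, y))
    return order

# Hypothetical graph: node -> list of (successor, step_cost).
graph = {
    "S": [("A", 1), ("B", 4)],
    "A": [("B", 2), ("G", 6)],
    "B": [("G", 2)],
    "G": [],
}

# h = 0 everywhere: the search behaves like Dijkstra's algorithm.
order = expansion_order(graph, "S", lambda x: 0)
```

For this graph the nodes are expanded as S, A, B, G with distances 0, 1, 3, 5, which are the shortest-path distances from S.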

Why A* is admissible and computationally optimal

A* is both admissible and considers fewer nodes than any other admissible search algorithm with the same heuristic, because A* works from an “optimistic” estimate of the cost of a path through every node that it considers — optimistic in that the true cost of a path through that node to the goal will be at least as great as the estimate. But, critically, as far as A* “knows”, that optimistic estimate might be achievable.

When A* terminates its search, it has, by definition, found a path whose actual cost is lower than the estimated cost of any path through any open node. But since those estimates are optimistic, A* can safely ignore those nodes. In other words, A* will never overlook the possibility of a lower-cost path and so is admissible.

Suppose now that some other search algorithm A terminates its search with a path whose actual cost is not less than the estimated cost of a path through some open node. Algorithm A cannot rule out the possibility, based on the heuristic information it has, that a path through that node might have a lower cost. So while A might consider fewer nodes than A*, it cannot be admissible. Accordingly, A* considers the fewest nodes of any admissible search algorithm whose heuristic estimate is no more accurate.
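The role of optimism in this argument can be illustrated with a small experiment: the same search is run once with an estimate that never exceeds the true cost and once with an estimate that overstates it for one node. This is only a sketch; the function `best_first`, the graph, and both estimate tables are hypothetical.

```python
import heapq

def best_first(graph, start, goal, h):
    """A*-style search; returns the cost of the first goal path popped."""
    queue = [(h[start], 0, start)]  # entries: (f, g, node)
    closed = set()
    while queue:
        _, g, x = heapq.heappop(queue)
        if x == goal:
            return g
        if x in closed:
            continue
        closed.add(x)
        for y, cost in graph[x]:
            heapq.heappush(queue, (g + cost + h[y], g + cost, y))
    return None

# Hypothetical graph: the cheapest route is S -> B -> G with cost 4.
graph = {"S": [("A", 1), ("B", 2)], "A": [("G", 5)], "B": [("G", 2)], "G": []}

optimistic = {"S": 0, "A": 0, "B": 2, "G": 0}    # never above the true cost
pessimistic = {"S": 0, "A": 0, "B": 10, "G": 0}  # overstates B (true cost 2)

cost_optimistic = best_first(graph, "S", "G", optimistic)
cost_pessimistic = best_first(graph, "S", "G", pessimistic)
```

With the optimistic table the search returns cost 4. With the overestimate, node B looks too expensive to try, and the search stops at the route through A with cost 6.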

Complexity

The time complexity of A* depends on the heuristic. In the worst case, the number of nodes expanded is exponential in the length of the solution (the shortest path), but it is polynomial when the heuristic function $h$ meets the following condition:

$$|h(x)-h^{*}(x)|\leq O(\log h^{*}(x))$$

where $h^{*}$ is the optimal heuristic, i.e. the exact cost to get from $x$ to the goal. In other words, the error of $h$ should not grow faster than the logarithm of the “perfect heuristic” $h^{*}$ that returns the true distance from $x$ to the goal (Russell and Norvig 2003, p. 101).
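The sketch below does not test the logarithmic bound itself; it only illustrates how strongly the number of expanded nodes can depend on the heuristic. It searches an obstacle-free 10 by 10 grid from one corner to the opposite corner, once with $h=0$ and once with the Manhattan distance, which is exact on such a grid. The grid, the function names, and the tie-breaking rule (prefer the smaller $h$ among equal $f$) are assumptions made for the example.

```python
import heapq

def grid_search(size, h):
    """A* on an open 4-connected grid; returns (path cost, nodes expanded)."""
    start, goal = (0, 0), (size - 1, size - 1)
    first = h(start, goal)
    queue = [(first, first, 0, start)]  # entries: (f, h, g, cell)
    closed = set()
    while queue:
        _, _, g, cell = heapq.heappop(queue)
        if cell in closed:
            continue
        closed.add(cell)
        if cell == goal:
            return g, len(closed)
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in closed:
                est = h(nxt, goal)
                heapq.heappush(queue, (g + 1 + est, est, g + 1, nxt))
    return None

def zero(cell, goal):
    return 0

def manhattan(cell, goal):
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

blind = grid_search(10, zero)          # uninformed: h = 0
informed = grid_search(10, manhattan)  # exact heuristic on an open grid
```

Both runs find a path of cost 18, but the uninformed run expands all 100 cells while the informed run expands only the 19 cells on one shortest path. On grids with obstacles the Manhattan distance is no longer exact and the gap narrows.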

More problematic than its time complexity is A*’s memory usage. In the worst case, it must also remember an exponential number of nodes. Several variants of A* have been developed to cope with this, including iterative deepening A* (IDA*), memory-bounded A* (MA*), simplified memory-bounded A* (SMA*), and recursive best-first search (RBFS).

References

Dechter, Rina; Pearl, Judea (1985). "Generalized best-first search strategies and the optimality of A*". Journal of the ACM. 32 (3): 505–536.
Hart, P. E.; Nilsson, N. J.; Raphael, B. (1968). "A Formal Basis for the Heuristic Determination of Minimum Cost Paths". IEEE Transactions on Systems Science and Cybernetics SSC4 (2): 100–107.
Hart, P. E.; Nilsson, N. J.; Raphael, B. (1972). "Correction to 'A Formal Basis for the Heuristic Determination of Minimum Cost Paths'". SIGART Newsletter. 37: 28–29.
Nilsson, N. J. (1980). Principles of Artificial Intelligence. Palo Alto, California: Tioga Publishing Company. ISBN 0935382011.
Pearl, Judea (1984). Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley. ISBN 0-201-05594-5.
Russell, S. J.; Norvig, P. (2003). Artificial Intelligence: A Modern Approach. pp. 97–104. ISBN 0-13-790395-2.

External links

Generation5's A* Explorer - A Windows application allowing you to step through the A* algorithm one move at a time.
Justin Heyes-Jones' A* algorithm tutorial
Herbert Glarner's Interactive Single Step Simulation in VB 6.0, implemented as a DLL, including a GUI allowing simulation in user-defined grids.
Another A* Pathfinding for Beginners (note: incorrectly states that A* always needs a "closed set")
Amit's Thoughts on Path-Finding and A*
Sven Koenig's Demonstration of Lifelong Planning A* and A*
Tony Stentz's Papers on D* (Dynamic A*) Path-Finding
Remko Tronçon and Joost Vennekens's JSearch demo: demonstrates various search algorithms, including A*.
Sune Trudslev's Path finding in C# article
A* search algorithm module in LPC