What is the A* search algorithm?
In this article, I will explain the concept and the flowchart of the A* search algorithm.
Table of Contents
What is the A* search algorithm?
A* search algorithm is a kind of search algorithm, and this can be used to find the optimal/shortest path from an initial state to a goal state. This algorithm is an extended version of the Dijkstra’s algorithm and A* search algorithm works more efficiently than the Dijkstra’s algorithm. This search algorithm can be used in several ways. For example, it can be used for car navigation, or to find the best solution to a puzzle.
How does A* search algorithm work?
Starting from an initial state, the A* search algorithm seeks a state which seems the closest to the goal. Say we start from the initial state S, and we want to get to the goal state G. Suppose we can go to 3 other states from S, say A, B, and C. In this case, A* search algorithm takes the state which “seems” the closest to G. Let’s say B is closest to G, and so A* search algorithm takes B as a next state. Then this algorithm takes the closest state and iterates this process until it gets to G. This is the simple explanation of this algorithm.
How do we find the “closest” state?
Now we see the abstract process of A* search algorithm. But how we find the closest state to the goal? How can we define the distance from a state to the goal? Actually, this is not that complicated and A* search algorithm uses f(n) to define the distance to the goal. Here, n is a specific state, and f(n) is the distance from the state n to the goal. Therefore, A* search algorithm always chooses the state which has the min f(n) as the next step.
But, again, how do we calculate the f(n)? We can calculate it with the equation below.
$$f(n) = h(n) + g(n)$$
Here f(n) is the sum of h(n) and g(n). h(n) is the expected distance from the state n to the goal, and g(n) is the expected distance from the initial state to the state n. Therefore, the sum of h(n) and g(n), which is f(n), is the expected distance from the initial state to the goal state passing the state n.
How to calculate the h(n) and g(n)?
The next question you might have is how we can calculate those h(n) and g(n). g(n) is easy to calculate because it is the distance from the initial state. We know the path from the initial state to the state n, so we can just use the information to calculate it. But how about h(n)? We don’t know the path to the goal yet, so what we can do is to predict the distance to the goal. We have to define this h(n) function by ourselves and the efficiency of this algorithm highly depends on the goodness of this function. If h(n) can predict the distance correctly, this algorithm can find the optimal path in a short time and uses less memory. However, if this function cannot predict the distance well, it takes time a lot to find the answer. Next, I will explain how to define this function using an example.
Example: 8 tile puzzle
Now, let’s use the A* search algorithm to find the shortest path to solve an 8 tile puzzle. 8 tile puzzle is the smaller version of the 15 tile puzzle. If you are not familiar with it, check this URL first.
Suppose we start from the initial state below and try to find the path to the goal state below.
Here, 0 means the black space. Now we can move to the other 3 states from the initial state moving the blank.
These are the 3 states we can move to. Now, we have to choose one state which is the closest to the goal state. To do this, we have to calculate the h(n) and g(n) and let me start from the g(n). As you see in the picture above, the g(n) of the initial state is 0, and that of the other states is 1. This is because the distance from the initial state to the initial state is obviously 0, and the distance from the initial state to those 3 states are 1. Let’s move to the h(n).
To get the value h(n), we have to define the function first. Here, for example, I will define the h(n) to be “the number of spaces that is out of the correct position”. So, for example, the h(n) of the initial state is 4 because 0 – 3 is out of the correct position. This function predicts the distance to the goal state to some degree because, the less h(n), closer to the goal state.
The picture above shows the h(n),g(n), and f(n) of each state. Remember, the f(n) = h(n) + g(n), so the f(n) of the next states are 4,6, and 4. Now the left and right states have the same f(n), so we can choose either left or right, but this time let’s take the left one.
Next, we can go to another state from the left state.
The f(n) of the next state is 6 and this is larger than that of the right state, which is 4. Therefore, instead of moving to the next states, we go back and take the right state because it has the smallest f(n) in this picture.
The A* search algorithm iterate this process until it gets to the goal state.
This time, I defined the h(n) to be the number of spaces that are out of the correct position. However, this h(n) is not the best and you can define your own h(n) to make this algorithm faster.
Admissibility of A* search algorithm
Once the algorithm finds a goal state, the path is always the optimal path. It is mathematically proven that once the A* search algorithm finds a path to the goal, it is always the optimal path. This feature is called “admissibility”. Furthermore, the A* search algorithm can always find the optimal path even though the h(n) is defined terribly, which is also mathematically proven. However, the bad h(n) takes much time, so it is not practical if h(n) is not defined well.