Reputation: 405
I have looked at the implementation of the problem on the internet and i have a question: whenever you add a node you need to search it in the open list(why it is not enough to search it in the closed list?)? And why do you search it to see if you can minimize the cost given the fact that you always generate the neighbors of the smallest score node in the open list? A node in the open list with an older parent does have a smaller cost than the same node but with a newer parent?
Upvotes: 1
Views: 25783
Reputation: 1740
I'll start with a short, question-relevant, background to Best First Search, and then I'll specifically address both of your questions (in hope that, by then, they have already been answered).
This search is famous for using a special evaluation function that estimates the distance to the goal node (an estimation for "how far are you from the goal node?"). This function is called "heuristic". For example, a very simple and intuitive heuristic for a navigation problem (shortest path from city A to city B) would use the heuristic of the Euclidean distance, which is the shortest possible route (as if the two cities are directly connected by a road). This function has an important property of being admissible, since it is a lower estimate of that distance. This is crucial for the proof of Best First Search being able to find the optimal solution.
Given such an heuristic function, the algorithm now prioritizes all discovered nodes by their cost (the heuristic function). This is done using a priority queue. Actually, you can implement Best First Search using the exact BFS algorithm, by only switching BFS's queue with a priority queue. This algorithm is sometimes referred to as Informed Search, because it has some prior knowledge of the world (expressed via the heuristic function) and it uses it to guide the search (expressed via searching nodes by their heuristic value).
You can often hear about the "Expand" operation in search algorithms. In big domains, where the search instance is too big to be populated in-memory, this concept is crucial. However, this idea can be abstractly applied to small domains as well, without any additional cost. The idea here is that each node has an operator Expand
that, once invoked, returns (in big domains: create and returns) the list of all neighbor nodes. A node that was reached through the Expand
operation is often called generated.
Best First Search maintains two lists during the search: Open List and Closed List.
The open list is a collection of all generated nodes. This means that those are nodes that were neighbors of expanded nodes. As mentioned above, the open list is often implemented as a priority queue so the search can simply dequeue the next best (i.e., highest priority) node.
The closed list is a collection of all expanded nodes. This means that those are nodes that were already "searched". This prevents the search from visiting nodes again and again. A side note: in big domains, the closed list can't fit all nodes, so the closed list has to be implemented smartly. For example, it is possible to reduce the memory required using a Bloom Filter.
1. "Why it is not enough to search it in the closed list?"
I hope you already understand the answer to this question. The closed list starts empty and during the search, expanded nodes are being pushed into it. These are nodes that are not the goal node. but rather "visited" nodes that were already traversed.
2. "Why do you search it to see if you can minimize the cost given the fact that you always generate the neighbors of the smallest score node in the open list?"
As mentioned above, an admissible heuristic only promises an under-estimation of the actual cost to get to the goal node. This means that simple taking the smallest each time doesn't promise to find the cheapest/shortest path. However, a huge benefit of the algorithm is that it can "truncates"/"prune" expensive paths, once a path was already found. For example, a path of cost 100 was found. We still do not know if it is the optimal path. We need to keep searching the graph, but now we can prune any path the leaves a node with an heuristic value of 100 or greater, since we already found a similar or cheaper route. Note that this allows us to finish the search without necessarily searching all of the graph.
Upvotes: 20