How do I make my A star search algorithm more efficient?

Question

I have a grid in matplotlib (20*20 or 40*40 based on user's choice) that contains data divided based on LatLong location. Each cell in that grid represents a 0.002 or 0.001 area (ex: [-70.55, 43.242] to [-70.548, 43.244]). The grid is coloured basesd on threshold value, say anything above 30 is green and anything below is red.

I implemented the A start algorithm to go from one location (cell) to another on that graph while avoiding all green cells. The cost of traveling on a border of a green and red cell is 1.3 while the diagonal cost is 1.5, and traveling between two red cells is 1.

I'm using the diagonal distance heuristic, and for each cell I'm getting all possible neighbours and setting their G value based on the threshold.

Now I'm able to get a correct path most of the time, and for cells that are nearby, it runs at sub 1 sec. but when I go further it takes 14-18 seconds.

I don't understand what wrong I'm doing here? I've been trying to figure out a way to improve it but failed.

Here's a fragment of the algorithm. I would like to note that determining the accessible neighbours and setting the G value might not be issue here because each function call runs at around 0.02 - 0.03 seconds.

Any advice is appreciated! Thanks

def a_star(self, start, end):
    startNode = Node(None, start)
    endNode = Node(None, end)
    openList = []
    closedList = []
    openList.append(startNode)

    while len(openList) > 0:

        current = openList[0]
        if current.location == endNode.location:
            path = []
            node = current
            while node is not None:
                path.append(node)
                node = node.parent
            return path[::-1]

        openList.pop(0)
        closedList.append(current)

       # This takes around 0.02 0.03 seconds
        neighbours = self.get_neighbors(current)

        for neighbour in neighbours:
            append = True
            for node in closedList:
                if neighbour.location == node.location:
                    append = False
                    break
            for openNode in openList:
                if neighbour.location == openNode.location:
                    if neighbour.g < openNode.g:
                        append = False
                        openNode.g = neighbour.g
                        break
            if append:

                neighbour.h = round(self.heuristic(neighbour.location, endNode.location), 3)
                neighbour.f = round(neighbour.g + neighbour.h, 3)

                bisect.insort_left(openList, neighbour)
    return False

Edit: Add Node fragment

 class Node:
    def __init__(self, parent, location):
        self.parent = parent
        self.location = location
        self.g = 0
        self.h = 0
        self.f = 0

Edit 2: Add Image

Circle is start location, star is destination. Yellow cells are not accessible, so no diagonal path on yellow cells, and cannot travel between two yellow cells.

trincot · Accepted Answer

As @lenic already pointed out, the inner loops you have do not belong in an A* algorithm.

The first one (for node in closedList) should be replaced by a check of whether a node is in a set (instead of a list): that way you don't need iteration.

The seconde one (openNode in openList) is unnecessary when you initialise all g property values with infinity (except the start node). Then you can just compare the new g value with the one that is already stored in the neighbor node.

Furthermore, I would suggest creating nodes for the whole graph as soon as the graph is created. That will be of benefit when you need to do multiple queries on the same graph.

Also, instead of bisect.insort_left I would propose using heapq.heappush. This will be faster, as it does not really sort the queue completely, but just ensures the heap property is maintained. The time complexity is the same though. More importantly, the time complexity of getting the next value from it is better than openList.pop(0).

I would suggest working with costs 10, 13 and 15 instead of 1, 1.3 and 1.5, as integer operations have no issues with precision.

For the same reason I would not work with fractional location coordinates. Since they are all just as far apart (e.g. 0.002), we can just use sequential integers (0, 1, 2, ...) for both coordinates. An extra function could take the solution and a reference coordinate pair to convert those integer coordinates back to "world" coordinates.

I have made some assumptions:

It is strictly forbidden to go through "hostile" cells. No edges will exist for crossing those. For instance, if the starting location is in the middle of four hostile cells, there will be no path. One could consider an alternative where these edges would get an extreme high cost, so that you would always be able to propose a path.
The straight edges on the boundary of the grid are all allowed. Their cost will be 1.3 (13) when the single cell, that neighbors it, is hostile. So actually in each of the two dimensions there is one more location than there are cells
The threshold input will be a fraction between 0 and 1, indicating the fraction of the cells that should be friendly with respect to the total number of cells, and this value will be converted to a "split" value to distinguish between friendly and hostile cells.

Here is the code you could use as inspiration:

from heapq import heappop, heappush

class Node:
    def __init__(self, location):
        self.location = location
        self.neighbors = []
        self.parent = None
        self.g = float('inf')
        self.f = 0

    def clear(self):
        self.parent = None
        self.g = float('inf')
        self.f = 0

    def addneighbor(self, cost, other):
        # add edge in both directions
        self.neighbors.append((cost, other))
        other.neighbors.append((cost, self))

    def __gt__(self, other):  # make nodes comparable
        return self.f > other.f

    def __repr__(self):
        return str(self.location)

class Graph:
    def __init__(self, grid, thresholdfactor):
        # get value that corresponds with thresholdfactor (which should be between 0 and 1)
        values = sorted([value for row in grid for value in row])
        splitvalue = values[int(len(values) * thresholdfactor)]
        print("split at ", splitvalue)
        # simplify grid values to booleans and add extra row/col of dummy cells all around
        width = len(grid[0]) + 1
        height = len(grid) + 1
        colors = ([[False] * (width + 1)] +
            [[False] + [value < splitvalue for value in row] + [False] for row in grid] +
            [[False] * (width + 1)])

        nodes = []
        for i in range(height):
            noderow = []
            nodes.append(noderow)
            for j in range(width):
                node = Node((i, j))
                noderow.append(node)
                cells = [colors[i+1][j]] + colors[i][j:j+2]  # 3 cells around location: SW, NW, NE
                for di, dj in ((1, 0), (0, 0), (0, 1), (0, 2)):  # 4 directions: W, NW, N, NE
                    cost = 0
                    if (di + dj) % 2:  # straight
                        # if both cells are hostile, then not allowed
                        if cells[0] or cells[1]:  # at least one friendly cell
                            # if one is hostile, higher cost
                            cost = 13 if cells[0] != cells[1] else 10
                        cells.pop(0)
                    elif cells[0]:  # diagonal: cell must be friendly
                        cost = 15
                    if cost:
                        node.addneighbor(cost, nodes[i-1+di][j-1+dj])
        self.nodes = nodes

    @staticmethod
    def reconstructpath(node):
        path = []
        while node is not None:
            path.append(node)
            node = node.parent
        path.reverse()
        return path

    @staticmethod
    def heuristic(a, b):
        # optimistic score, assuming all cells are friendly
        dy = abs(a[0] - b[0])
        dx = abs(a[1] - b[1])
        return min(dx, dy) * 15 + abs(dx - dy) * 10

    def clear(self):
        # remove search data from graph 
        for row in self.nodes:
            for node in row:
                node.clear()

    def a_star(self, start, end):
        self.clear()
        startnode = self.nodes[start[0]][start[1]]
        endnode = self.nodes[end[0]][end[1]]
        startnode.g = 0
        openlist = [startnode] 
        closed = set()
        while openlist:
            node = heappop(openlist)
            if node in closed:
                continue
            closed.add(node)
            if node == endnode:
                return self.reconstructpath(endnode)
            for weight, neighbor in node.neighbors:
                g = node.g + weight
                if g < neighbor.g:
                    neighbor.g = g
                    neighbor.f = g + self.heuristic(neighbor.location, endnode.location)
                    neighbor.parent = node
                    heappush(openlist, neighbor)

I encoded the graph you included as image, to see how the code would behave:

grid = [
    [38, 32, 34, 24,  0, 82,  5, 41, 11, 32,  0, 16,  0,113, 49, 34, 24,  6, 15, 35],
    [61, 61,  8, 35, 65, 31, 53, 25, 66,  0, 21,  0,  9,  0, 31, 75, 20,  8,  3, 29],
    [43, 66, 47,114, 38, 41,  1,108,  9,  0,  0,  0, 39,  0, 27, 72, 19, 14, 24, 25],
    [45,  5, 37, 23,102, 25, 49, 34, 41, 49, 35, 15, 29, 21, 66, 67, 44, 31, 38, 91],
    [47, 94, 48, 69, 33, 95, 18, 75, 28, 70, 38, 78, 48, 88, 21, 66, 44, 70, 75, 23],
    [23, 84, 53, 23, 92, 14, 71, 12,139, 30, 63, 82, 16, 49, 76, 56,119,100, 47, 21],
    [30,  0, 32, 90,  0,195, 85, 65, 18, 57, 47, 61, 40, 32,109,255, 88, 98, 39,  0],
    [ 0,  0,  0,  0, 39, 39, 76,167, 73,140, 58, 56, 94, 61,212,222,141, 50, 41, 20],
    [ 0,  0,  0,  5,  0,  0, 21,  2,132,100,218, 81,  0, 62,135, 42,131, 80, 14, 19],
    [ 0,  0,  0,  0,  0, 15,  9, 55, 70, 71, 42,117, 65, 63, 59, 81,  4, 40, 77, 46],
    [ 0,  0,  0,  0, 55, 52,101, 93, 30,166, 56, 19, 76,103, 54, 37, 24, 23, 59, 98],
    [ 0,  0,  0,  0,  9,  9, 44,149, 11,134, 90, 64, 44, 57, 61, 79,270,201, 84,  6],
    [ 0,  0,  0, 22,  1, 15,  0, 25, 30,101,154, 60, 97, 64, 15,162, 27, 91, 71,  0],
    [ 0,  0,  1, 35,  5, 10,  0, 55, 25,  0,200, 81, 31, 53, 42, 74,127,154,  7,  0],
    [ 0,  0,187, 17, 45, 66, 91,191, 70,189, 18, 25, 67, 32, 40, 79,103, 79, 59,  0],
    [ 0, 21, 16, 14, 19, 58,278, 56,128, 95,  3, 52,  9, 27, 25, 43, 62, 25, 38,  0],
    [ 4,  3, 11, 26,119,165, 53, 85, 46, 81, 19, 11, 12, 19, 18,  9, 16,  6, 37,  0],
    [ 5,  0,  0, 65,158,153,118, 38,123, 46, 28, 24,  0, 21, 11, 20,  5,  1, 10,  0],
    [17,  4, 28, 81,101,101, 46, 25, 44, 12, 41,  6, 27,  8,  4, 32, 40,  1,  1,  0],
    [26, 20, 84, 42,112, 27, 14, 16,  5, 13,  3, 43,  6, 18, 12, 44,  5,  0,  0,  5]
]

graph = Graph(grid, 0.5) # provide the threshold at initialisation
path = graph.a_star((7, 4), (14, 18))
print(path)

How do I make my A star search algorithm more efficient?

Answers (2)

Related Questions