BFS implementation in python not fast enough

Question

I'm trying to speed up my BFS algorithm, that solves rush hour games. I feed the algorithm a parent board, and I'm wondering whether the implementation of my BFS algorithm or the building blocks used in my code that I call from the algorithm should change.

This is the BFS algorithm at the moment:

def start_bfs(start_grid):
vehicle_dict = dict()
queue = deque()
queue.append(pd.DataFrame(start_grid))
while queue:
    vehicle_dict.clear()
    df_grid = queue.pop()
    vehicle_dict = set_up(vehicle_dict,df_grid)
    for key, value in vehicle_dict.iteritems():
        check_adjecent(key, value, vehicle_dict)
        movelist, object_location, direction = get_movelist(key, value, vehicle_dict, df_grid)
        if not movelist:
            continue
        while movelist:
                node = movelist.pop(0)
                new_vehicle_dict = move(node, copy.deepcopy(vehicle_dict), df_grid)
                child_df_grid = create_board(new_vehicle_dict)
                if not True in [child_df_grid.equals(x) for x in archive]:
                    check_goal(node[1], new_vehicle_dict, archive, df_grid, child_df_grid) 
                    queue.append(copy.deepcopy(child_df_grid))
                    archive.append(copy.deepcopy(child_df_grid))
        archive.append(copy.deepcopy(df_grid))

The start grid, for example, looks like this:

 start_grid = [['.', '.', 'T', 'F', 'F', 'E'],
         ['.', '.', 'T', '.', '.', 'E'],
         ['.', '.', 'T', 'R', 'R', 'E'],
         ['.', '.', '.', 'C', '.', '.'],
         ['A', 'B', 'B', 'C', '.', '.'],
         ['A', '.', '.', 'C', 'H', 'H']]

Is there something I'm doing inherently wrong here?

Dennis Soemers · Accepted Answer

Do your set_up(vehicle_dict,df_grid), get_movelist(key, value, vehicle_dict, df_grid), and/or move(node, copy.deepcopy(vehicle_dict), df_grid) modify the df_grid passed into them? If not, I highly doubt many of those deep copies you're doing are necessary. You'll want to carefully check to make sure yourself, but I think the following lines dont need all those deep copies:

queue.append(copy.deepcopy(child_df_grid))
archive.append(copy.deepcopy(child_df_grid))
archive.append(copy.deepcopy(df_grid))

I think you can also move archive.append(copy.deepcopy(df_grid)) to before the while loop, so that you can immediately discard moves that do nothing.

Checking whether you've seen a board before using

if not True in [child_df_grid.equals(x) for x in archive]:

also seems inefficient. What kind of object is archive anyway? I'd recommend using a set (not a list!). That may not work correctly if your boards are represented as pd.DataFrames though (which also seems kind of overcomplicated at first glance here without seeing how your other functions are implemented). A simple custom class to contain your data, with correctly implemented __eq__ and __hash__ functions (required for set) would probably be better. Then you can efficiently check if something is truly new using:

if not child_df_grid in archive:

BFS implementation in python not fast enough

Answers (1)

Related Questions