spidermarn
spidermarn

Reputation: 939

Need help understanding this python depth first search code

I'm quite new to Python coding and have difficulty understanding the following code below. It is on graph theory using DFS to find the largest area amongst all areas of islands. 1 represents an island and 0 represents water in a grid.

def maxAreaOfIsland(grid):
    row, col = len(grid), len(grid[0])
    def dfs(i, j):
        if 0 <= i <= row - 1 and 0 <= j <= col - 1 and grid[i][j]:
            grid[i][j] = 0

#scans through all rows & cols and 
#turns number in the grid into 0 if all conditions are true?

            return  1 + dfs(i - 1, j) + dfs(i + 1, j) + dfs(i, j - 1) + dfs(i, j + 1)
        return 0  

# recursive function that checks up, down, left, right in the grid. 
# when does it return 1?

    return max(dfs(i, j) for i in range(row) for j in range(col))

maxAreaOfIsland([[1,0,1,1,1],
                 [0,0,0,1,1],
                 [1,1,1,0,1]]) 
Out: 6

I have included comments, which reflect my understanding so far, but not sure if it's correct. I'm quite confused from line 4 onwards, particularly the recursive part. Could someone explain in detail? Typically these kind of codes tend to have a queue/dequeue to record whether the island has been visited, but I don't think this code has that?

Upvotes: 0

Views: 527

Answers (1)

user10325516
user10325516

Reputation:

I guess the question is really about understanding algorithm not Python. Provided Python code is pretty easy.

The code contains function maxAreaOfIsland which in turn comtains recursive function dfs. These 2 functions form 2 layers of computation. Lets look at those layers separately.

# outer layer
def maxAreaOfIsland(grid):
    row, col = len(grid), len(grid[0])
    # function dfs() definition
    return max(dfs(i, j) for i in range(row) for j in range(col))

So outer layer is very simple - compute dfs(i, j) for all possible i and j then choose maximum computed value.

# inner layer - slightly modified
def dfs(i, j):
    # recursive case
    if (0 <= i <= row - 1 and 0 <= j <= col - 1) and grid[i][j] == 1:
        grid[i][j] = 0  # this is how we remember visited cells since we don't count zeros
        # optional prints to look at the grid during computation
        # print(i, j)
        # print(*grid, sep='\n', end='\n\n')
        count_current = 1
        count_neighbors = dfs(i - 1, j) + dfs(i + 1, j) + dfs(i, j - 1) + dfs(i, j + 1)
        return  count_current + count_neighbors
    # trivial case and out-of-borders case
    else:
        return 0

Inner layer is a liitle bit more complicated. What it does? (1) It gets i and j. (2) If the cell contains 0 then it's trivial case (water) or we are out of the grid - just return 0. (3) If the cell contains 1 then it's recursive case (land) - function starts to count amount of all the 1 adjacent to the given cell with every 1 counted turning into 0 to avoid double counting.

Your sample grid has 3 rows (0, 1, 2) and 5 columns (0, 1, 2, 3, 4). Suppose we are at i = 0, j = 2. It is 1. We count it (current result is 1), turn it into 0 and look at its neighbors one by one - upper neighbor is out of the grid, bottom neighbor is 0, left neighbor is 0, right neighbor is 1. We dont return current result but proceed to the right neigbor i = 0, j = 3. We count it (cuurent result is 2), turn it into 0 and look at neighbors. Upper neighbor is out of the grid, bottom neighbor is 1. We stop here, we dont return current result, we remember about 2 more neighbors, we proceed to the bottom neighbor i = 1, j = 3. We count it (current result is 3), turn it into 0 and look at neighbors. Upper neighbor is 1. We stop here, we dont return current result, we remember about 3 more neighbors, we proceed to the upper neighbor i = 0, j = 3. And so on.

My advice is to draw simple sample grid (with a pen on a piece of paper) and manually apply dfs algorithm to it.

Upvotes: 1

Related Questions