Reputation: 267
I was recently solving the following leet code problem "Edit Distance". I came up with the following code, which passes:
def dfs(i, j):
if j == len(word2) or i == len(word1):
return (len(word1) - i) + (len(word2) - j)
elif (i, j) in dp:
return dp[(i, j)]
res = float("inf")
if word1[i] == word2[j]:
res = min(res, dfs(i + 1, j + 1))
res = min(res, 1 + dfs(i, j + 1), 1 + dfs(i + 1, j), 1 + dfs(i + 1, j + 1))
dp[(i, j)] = res
return dp[(i, j)]
dp = {}
return dfs(0, 0)
However, as I looked through solutions, I noticed that many users employed a slight optimization compared to my code; I adapted my code to represent this optimization.
def dfs(i, j):
if j == len(word2) or i == len(word1):
return (len(word1) - i) + (len(word2) - j)
elif (i, j) in dp:
return dp[(i, j)]
res = float("inf")
if word1[i] == word2[j]:
res = min(res, dfs(i + 1, j + 1))
else:
res = min(res, 1 + dfs(i, j + 1), 1 + dfs(i + 1, j), 1 + dfs(i + 1, j + 1))
dp[(i, j)] = res
return dp[(i, j)]
dp = {}
return dfs(0, 0)
The change is subtle and can be summarized in the following way. If we identify a match between two indices i
and j
, we can greedily choose to proceed with the subproblem represented by indices i + 1
and j + 1
, completely ignoring the subproblems represented by replacing, deleting, or inserting at the previous index. In other words, if a given index pair has a match, we can guarantee that performing no operations at this point will yield a more optimal solution than replacing, deleting, or inserting at this index. However, I am struggling to prove that this statement is true. At a given subproblem, it is obvious that performing no operations is better than replacing, since both result in the same sub-problem (i + 1
, j + 1
). However, we must add one to the replace total to reflect replacing at index i
; thus, it is obvious that no operations will always produce a more optimal result. However, for the cases of inserting or deleting an index i
, proving no-operation optimality is not so simple. Namely, inserting or deleting lead to different sub-problems that do not blatantly imply that no operations will yield an optimal result. In the case of inserting, we add one to account for the insertion at index i
and proceed with the sub-problem (i, j + 1)
. On the other hand, for deleting, we add one to acccount for the deletion at index i
and proceed with the subproblem (i + 1, j)
. As such, I need to somehow prove that the result of subproblem 1 + (i, j + 1)
and 1 + (i + 1, j)
are both less than or equal to the result of subproblem (i + 1, j + 1)
. I do not know how to proceed. Any advice?
Upvotes: 0
Views: 37