Reputation: 3468
I'm always confused about how dynamic programming uses the matrix to solve a problem. I understand roughly that the matrix is used to store the results from previous subproblems, so that it can be used in later computation of a bigger problem.
But, how does one determine the dimension of the matrix, and how do we know what value each row/column of the matrix should represent? ie, is there like a generic procedure of constructing the matrix?
For example, if we're interested in making changes for S amount of money using coins of value c1,c2,....cn, what should be the dimension of the matrix, and what should each column/row represent?
Any directional guidance will help. Thank you!
Upvotes: 28
Views: 11014
Reputation: 4854
A problem becomes eligible for dynamic programming when it exhibits both Overlapping Sub-problems as well as Optimal Substructure.
Secondly, dynamic programming comes in two variations:
Dynamic Programming stems from the ideology that a large problem can be further broken down into sub-problems. The bottom-up version simply starts with solving these sub-problems first and gradually building up the target solution. The top-down approach relies on using auxiliary storage doing away with re-computation.
is there like a generic procedure of constructing the matrix?
It really depends on what problem you're solving and how you're solving it! Matrices are typically used in tabulation, but it always need not be a matrix. The main goal here is to have the solutions to the sub-problems readily available on demand, it could be stored in an array, a matrix or even a hash-table.
The classic book Introduction to Algorithms demonstrates the solution to the rod-cutting problem in both ways where a 1D array is used as auxiliary storage.
For example, if we're interested in making changes for S amount of money using coins of value c1,c2,....cn, what should be the dimension of the matrix and what should each column/row represent?
If I'm not wrong, you're referring to the "total unique ways to make change" variant of the coin-change problem. You need to find the total ways a given amount can be constructed using given set of coins. There is a great video on this that breaks it down pretty well. It uses a bottom-up approach: https://www.youtube.com/watch?v=DJ4a7cmjZY0
Assume you need to construct amount n = 10
from the given subset of coins c = {1, 2, 10}
Take an empty set and keep adding the coins one per row from c
. For every next row, one coin from the set is added. The columns represent the sub-problems. For i
in n = 1 : 10
, the i
th column represents the the total number of ways i
can be constructed using the coins in that row:
---------------------------------------------------------
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---------------------------------------------------------
|{} | | | | | | | | | | | |
---------------------------------------------------------
|{1} | | X | | | | | | | | | |
---------------------------------------------------------
|{1, 2} | | | | | | | | | | | |
---------------------------------------------------------
|{1, 2, 10}| | | | Y | | | | | | | Z |
---------------------------------------------------------
In this table, X
represents the number of ways amount 1 can be constructed using the coin {1}
, Y
represents the number of ways amount 3 can be represented using the coins {1, 2, 10}
and Z
represents the number of ways amount 10 can be represented using the coins {1, 2, 10}
.
How are the cells populated?
Initially, the entire first column headed by 0
is filled with 1
s because no matter how many coins you have, for the amount 0 you have exactly one way to make change that is to make no change.
The rest of the first row with the empty subset {}
is filled with 0
s because you can't make a change for any positive amount with no coins.
Now the matrix looks like this:
---------------------------------------------------------
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---------------------------------------------------------
|{} | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
---------------------------------------------------------
|{1} | 1 | X | | | | | | | | | |
---------------------------------------------------------
|{1, 2} | 1 | | | | | | | | | | |
---------------------------------------------------------
|{1, 2, 10}| 1 | | | Y | | | | | | | Z |
---------------------------------------------------------
Now, how do we fill X
? You have two alternatives, either to use the 1
coin in this new super set or to not use it. If you did not use the coin, the ways are same as the above row that is 0
. But since 1
can be used to make a change of amount 1
, we use that coin, and subtract 1
from the amount 1
to be left with 0
. Now lookup, 0
's ways in the same row, that is the column previous to that of X
which is 1
. So add it to the amount from the top row to have a total of 1
. So you fill this cell as 1
.
Upvotes: 15
Reputation: 11183
But, how does one determine the dimension of the matrix, and how do we know what value each row/column of the matrix should represent? ie, is there like a generic procedure of constructing the matrix?
You need to find the recurrence relation and the state(number of parameters) required to represent a subproblem. The whole idea of DP is to avoid re-computation of a subproblem. You compute a subproblem only once the first time you require it, store it in memory and refer to the stored value when required. So if you want to refer to the stored result of a subproblem later, you need to have a key that uniquely identifies the subproblem. The state of the subproblem is usually good choice for this key. If a subproblem has 3 parameters x
, y
, z
, then a tuple (value of x, value of y, value of z)
is a good key to store result of the subproblem in a hash table for example. If these values are positive integers, you can use a matrix i.e., multi dimensional array instead of a hash table. Let's develop the ideas of finding the recurrence relation and identifying the state required to uniquely represent a subproblem so that your confusion about the matrix dimensions is cleared.
The most important step in being able to solve a DP problem(any recursive problem in general) is identifying and being able to write down the recurrence relationship. Once the recurrence relation is identified, I'd say 90% of the work is done. Let's first see how to write down the recurrence relation.
Three important ideas in any recursive problem is
Let's take merge sort as example. It is not a DP problem as there are no overlapping subproblems but for the purpose of introducing recurrence relation, it is a good choice as it is famous and easy to understand. As you might already know, the trivial case in merge sort is array of size 0 or 1. Recursion step is to divide the problems into two subproblems of half the size of the current problem and combination step is the merging algorithm. Finally we can write the recurrence relation for merge sort as follows:
sort(0, n) = merge(sort(0, n/2), sort(n/2, n))
In the above recurrence relation for sort
algorithm, the problem of range (0, n) is divided into two subproblems (0, n/2) and (n/2, 0). The combination step is the merge algorithm.
Now let's try to deduce the recurrence relation for some DP problems. You should be able to derive the dimensions of the state(and hence your confusion about dimensions of matrix) from the recurrence relation.
Remember that to find the recurrence relation, we need to identify the subproblems. Identifying subproblems is not always straightforward. Only practice of more DP problems to gain better intuition at these problems and identifying the patterns, trial and error etc are required.
Let's identify the recurrence relations for two problems that look almost similar but require different approach. I chose this problems only because the question was about confusion regarding the dimensions of the matrix.
- Given coins of different denominations and an amount, find the minimum number of coins required to make the amount.
Let's represent the problem/algorithm of finding the minimum number of coins required for a given amount n as F(n)
. If the denominations are p, q, r.
If we know the answer for F(n-p)
, F(n-q)
and F(n-r)
i.e., the minimum number of coins required to make amounts n-p, n-q and n-r respectively, we can take the minimum of these and 1 to get the number of coins required to make the amount n
.
The subproblems here are F(n-p)
, F(n-q)
and F(n-r)
and the combination step is to take the minimum of these values and adding one.
So the recurrence relation is:
F(n) = min(F(n-p), F(n-q), F(n-r)) + 1
# Base conditions
F(0) = 0
F(n) = infinity if n < 0
There is optimal substructure and there are repeated problems(if it is not obvious, take a sample problem and draw the recursion tree) and so we can use some storage to avoid repeated computation. Each of the subproblem is a node in the recursion tree.
From the recurrence relation you can see that the function F takes only one parameter i.e., one parameter is enough to represent the subproblem/node in the recursion tree and hence a 1D array or a hash table keyed by single value can be used to store the result of the subproblems.
- Given coins of different denominations and an amount, find total number of combination of coins required to make the amount.
This problem is more subtle. Pause and think for moment and try to identify the recurrence relation.
Let's use the same terminology as above problem i.e., let's say the amount is n and p, q, r are the denominations.
Does the same recurrence as the above problem work? If F(n)
represents the total number of combinations of counts to make n out of given denominations, can we combine F(n-p)
, F(n-q)
and F(n-r)
is some way to get F(n)? How about just adding them? Does F(n) = F(n-p) + F(n-q) + F(n-r)
hold?
Take n = 3 and two denominations p, q = 1, 2
With above recurrence relation we get the answer as 3 corresponding to the splits [1, 1, 1], [1, 2], [2, 1] which is incorrect as [1, 2] and [2, 1] is the same combination of denominations. The above recurrence is calculating the number of permutations instead of combinations. To avoid the repeated results, we need to bring in order about the coins. We can choose it ourself by mandating that p comes before q and q comes before r. Focus on the number of combination with each denomination. Since we are enforcing the order ourself among the available denominations [p, q, r].
Let's start with p and solve the following recurrence.
F(n, only p allowed) = F(n-p, only p allowed)
## Base condition
F(0) = 1 # There is only one way to select 0 coins which is not selecting any coinss
Now let's allow the next denomination q and then solve the following recurrence.
F(n, p and q allowed) = F(n-q, p and q allowed) + F(n, only p allowed)
Finally,
F(n, p q and r allowed) = F(n-r, p q and r allowed) + F(n, p and q allowed)
The above three recurrence relations in general can be written as follows where i
is the index in the denominations.
# F(n, i) = with denominations[i] + without denominations[i]
F(n, i) = F(n - denominations[i], i) + F(n, i-1)
## Base conditions
F(n, i) = 1 if n == 0
F(n, i) = 0 if n < 0 or i < 0
From the recurrence relation, we can see that you need two state variables to represent a subproblem and hence a 2D array or a hash table keyed by combination of these two values(a tuple for example) is needed to cache the results of subproblems.
Also see Thought process for arriving at dynamic programming solution of Coins change problem
Upvotes: 4
Reputation: 944
This chapter explains it very well: http://www.cs.berkeley.edu/~vazirani/algorithms/chap6.pdf At page 178 it gives some approaches to identify the sub problems that allow you to apply dynamic programming.
Upvotes: 2
Reputation: 13289
An array used by a DP solution is almost always based on the dimensions of the state space of the problem - that is, the valid values for each of its parameters
For example
fib[i+2] = fib[i+1] + fib[i]
Is the same as
def fib(i):
return fib(i-1)+fib(i-2]
You can make this more apparent by implementing memoization in your recursive functions
def fib(i):
if( memo[i] == null )
memo[i] = fib(i-1)+fib(i-2)
return memo[i]
If your recursive function has K parameters, you'll likely need a K-dimensional matrix.
Upvotes: -2