Reputation:
At a recent phone interview I was asked the following dynamic programming problem but couldn't come up with an algorithm for it:
Suppose there is a path with n
positions. Consider the set S = {A,B,C}
. Each position on the path has an associated non-empty subset of S
. For each position on the path, we can choose one element from its associated subset. For a given position i
on the path, its “value” is determined by the total number of distinct elements from the positions it has access to. The positions it has access to is given by the set {i-1, i, i+1}
(for i=1
it is just {0,1}
and for i=n
it is just {n, n-1}
). We want to maximize the sum of the “value” of all positions.
So for example, if I had n=5
and the following subsets for each position 1…5
:
S1 = {A,C}, S2={A, B}, S3={A,B,C}, S4={A,C}, S5={A,B,C}
Then one such possible arrangement to maximize the sum would be [A, B, C, A, B]
, which would be 2 + 3 + 3 + 3 + 2 = 13
.
I'd like to write an algorithm that, given a list of subsets (where the nth subset corresponds to the nth position), returns the maximum sum of the value of all positions. It should be bounded by a polynomial function of n
.
Example Input: [['A', 'C'], ['A', 'B'], ['A', 'B', 'C'], ['A', 'C'], ['A', 'B', 'C']]
Expected Output: 13
Given that my phone interview is already over with and that I've still been unable to solve this problem after giving it more thought, I'd rather just see a working solution at this point. Thanks!
Upvotes: 0
Views: 1268
Reputation: 880
I'm relatively certain this iterative version always produces the max, though I can't prove it.
Note: this assumes each set doesn't contain duplicates, if that's the case needs slight modification
if only one position in path, select any value from it's set
else:
starting from first position on path
While (not at last element of path) {
if position set only has 1 value, select it
else if position set has unique value not in neighbors' sets (or single neighbor for ends), select it
else select value that's not the same as prior position, prioritizing a value that's in (next position + 1)'s set (assuming that position isn't out of bounds)
outputArray[position] = value
position++
}
//at last position
if only 1 value select it
else select a value different from previous
outputArray[position] = value
outputArray should now contain values from each set that maximize distinctness among neighbors
Upvotes: 0
Reputation: 5409
The key to solving the problem is to realize that, given an arrangement A
with a certain score
, the new score of A
after appending an element z
depends only on the final two elements of A
.
Given an array ending with the (not necessarily distinct) elements x
and y
, the increase in score after appending an element z
is:
1 // from z on itself
+ 1 if (z != y and z != x) else 0 // from y gaining z as neighbor
+ 1 if (z != y) else 0 // from z gaining y as neighbor
For your example, there are 4 possible arrangements with the first two positions:
Subsets:
S1 = {A, C}, S2 = {A, B},
S3 = {A, B, C}, S4 = {A, C}, S5 = {A, B, C}
After placing the first two elements:
[A, A] max score = 2
[A, B] max score = 4
[C, A] max score = 4
[C, B] max score = 4
after appending a third element (from S3
), all possible 'last two' elements and the maximum score of any arrangement with those 'last two' elements:
After S3 = {A, B, C}
[A, A] max score = 5
[A, B] max score = 7
[A, C] max score = 6
[B, A] max score = 7
[B, B] max score = 5
[B, C] max score = 7
Here, for instance, the unique maximal score arrangement ending in A, A
is [C, A, A]
, although we only care about the last two values and the score.
After all five subsets, the feasible 'last two elements' of arrangements and the maximum score of any corresponding arrangement will be:
[A, A] max score = 11
[A, B] max score = 13
[A, C] max score = 12
[C, A] max score = 13
[C, B] max score = 13
[C, C] max score = 11
so we can return 13. With extra bookkeeping throughout the algorithm, we can also reconstruct the full arrangement.
Here's the three-variable Dynamic Programming (DP) formula:
DP(index, y, z) is defined as the
maximum score for an arrangement on PathArray[0, 1, ..., index]
with final two elements [y, z], for any y in Subsets[index-1]
and z in Subsets[index]
DP(index, y, z) = max over all x in Subsets[index-2] of
(DP(index-1, x, y) + AddedScore(x, y, z))
The answer to your question is the maximum value of DP(n-1, a, b)
for any valid a
and b
.
I've excluded the base case when the path has length 2
: you should directly solve for the score of the one and two element cases.
1
.4
if the elements are not equal, otherwise, the score is 2
.A Python implementation:
def dp_solve(subsets):
if len(subsets) == 1:
return 1
def added_value(grandparent, parent, child) -> int:
return (1
+ (1 if child != grandparent and child != parent else 0)
+ (1 if child != parent else 0))
pair_dict = collections.Counter()
for x, y in itertools.product(subsets[0], subsets[1]):
pair_dict[x, y] = 4 if x != y else 2
for subset in subsets[2:]:
new_dict = collections.Counter()
for old_key, old_value in pair_dict.items():
grandparent, parent = old_key
for child in subset:
new_value = old_value + added_value(grandparent,
parent,
child)
new_dict[parent, child] = max(new_dict[parent, child],
new_value)
pair_dict = new_dict
return max(pair_dict.values())
my_lists = [['A', 'C'], ['A', 'B'], ['A', 'B', 'C'], ['A', 'C'], ['A', 'B', 'C']]
print(dp_solve(my_lists)) # Prints 13
Upvotes: 0