Chris.Caldwell

Reputation: 436

algorithm to maximize sum of unique set with multiple contributors

I'm looking for an approach to maximize the value of a common set composed of contributions from multiple sources, with a fixed number of contributions from each.

Example problem: 3 people each have a hand of cards. Each hand contains a unique set, but the 3 sets may overlap. Each player can pick three cards to contribute to the middle. How can I maximize the sum of the 9 contributed cards, where the 9 cards in the middle must all be unique (no value contributed more than once)?

Upvotes: 3

Views: 429

Answers (3)

sascha

Reputation: 33532

Integer programming sounds like a viable approach. Without guaranteeing it, this problem also feels NP-hard, meaning there is probably no general algorithm that beats brute force without assumptions about the possible input (IP solvers actually do assume a lot / are tuned for real-world problems).

(Alternative off-the-shelf approaches: constraint programming and SAT solvers. CP is easy to formulate and faster at pure combinatorial search, but weaker at branch-and-bound-style maximization; SAT is harder to formulate, since counters need to be built, and while its combinatorial search is very fast, it again has no concept of maximization and needs a decision-problem-like transform.)

Here is a complete Python-based example solving this problem (in the hard-constraint version: each player must play his full quota of N_PLAY cards). As I'm using cvxpy, the code is written in a math-like style and should be easy to read even without knowing Python or the library!

Before presenting the code, some remarks:

General remarks:

  • The IP approach is heavily dependent on the underlying solver!
    • Commercial solvers (Gurobi and co.) are the best
    • Good open-source solvers: CBC, GLPK, lpsolve
    • The default solver in cvxpy is not ready for this (once the problem grows)!
    • In my experiments, with my data, commercial solvers scale very well!
    • A popular commercial solver needs a few seconds for:
      • N_PLAYERS = 40 , CARD_RANGE = (0, 400) , N_CARDS = 200 , N_PLAY = 6
  • Using cvxpy is not best practice, as it was created for very different use-cases; this induces some penalty in model-creation time
    • I'm using it because I'm familiar with it and I love it

Improvements: Problem

  • We are solving the each-player-plays-exactly-N_PLAY-cards variant here
    • Sometimes there is no solution
    • Your model description does not formally describe how to handle this
    • General idea to improve the code: a bigM-style penalty-based objective, e.g. Maximize(n_unique * bigM + classic_score)
      • (where bigM is a very big number)
      • (a rough sketch of this follows below)
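
A rough sketch of that penalty idea (my own addition, not part of the answer's code): it reuses x, p, conflicts and N_PLAY exactly as defined in the full model below, and the same pre-1.0 cvxpy API; the hard "exactly N_PLAY" constraint becomes "at most N_PLAY", and the objective first rewards playing as many (unique) cards as possible:

# sketch only: soft / penalty-based variant of the full model below
bigM = 10 ** 6                                    # bigger than any achievable card sum

soft_constraints = [x[pa, ca] + x[pb, cb] <= 1    # uniqueness, as before
                    for (pa, ca), (pb, cb) in conflicts]
soft_constraints.append(cvx.sum_entries(x, axis=1) <= N_PLAY)  # "at most" -> always feasible

n_played = cvx.sum_entries(x)                     # number of (unique) cards actually played
classic_score = cvx.sum_entries(cvx.mul_elemwise(p.flatten(order='F'), cvx.vec(x)))
soft_problem = cvx.Problem(cvx.Maximize(bigM * n_played + classic_score),
                           soft_constraints)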

Improvements: Performance

  • We are building all those pairwise conflicts and use a classic not-both constraint
    • The number of conflicts can grow a lot, depending on the task
    • Improvement idea (too lazy to add; a sketch follows below):
      • Calculate the set of maximal cliques of the conflict graph and add one constraint per clique
      • Will be much more powerful, but:
        • For general conflict graphs this problem should be NP-hard too, so an approximation algorithm would be needed
        • (as opposed to other applications like time-intervals, where this set can be calculated in polynomial time because the graphs are chordal)
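
For illustration, here is a rough sketch of those clique constraints (my own addition, using networkx, which the answer's code does not use); it assumes p, N_PLAYERS, N_CARDS and the variable x from the code below. Note that for this particular problem the conflict graph is simply a disjoint union of cliques, one per duplicated card value, so the maximal cliques could also be read off directly by grouping positions by value:

import networkx as nx

conflict_graph = nx.Graph()
for p_a in range(N_PLAYERS):
    for c_a in range(N_CARDS):
        for p_b in range(p_a + 1, N_PLAYERS):
            for c_b in range(N_CARDS):
                if p[p_a, c_a] == p[p_b, c_b]:
                    conflict_graph.add_edge((p_a, c_a), (p_b, c_b))

clique_constraints = []
for clique in nx.find_cliques(conflict_graph):    # all maximal cliques (Bron-Kerbosch)
    if len(clique) >= 2:
        # at most one position holding this duplicated value may be played
        clique_constraints.append(sum(x[i, j] for (i, j) in clique) <= 1)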

Code:

import numpy as np
import cvxpy as cvx
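# note: this code targets the pre-1.0 cvxpy API (cvx.Bool, cvx.sum_entries,
# cvx.mul_elemwise); newer cvxpy releases renamed these functions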
np.random.seed(1)

""" Random problem """
N_PLAYERS = 5
CARD_RANGE = (0, 20)
N_CARDS = 10
N_PLAY = 3
card_set = np.arange(*CARD_RANGE)
p = np.empty(shape=(N_PLAYERS, N_CARDS), dtype=int)
for player in range(N_PLAYERS):
    p[player] = np.random.choice(card_set, size=N_CARDS, replace=False)
print('Players and their cards')
print(p)

""" Preprocessing:
    Conflict-constraints
        -> if p[i, j] == p[x, y] => don't allow both
    Could be made more efficient
"""
conflicts = []
for p_a in range(N_PLAYERS):
    for c_a in range(N_CARDS):
        for p_b in range(p_a + 1, N_PLAYERS):  # sym-reduction: p_b > p_a by construction
            for c_b in range(N_CARDS):
                if p[p_a, c_a] == p[p_b, c_b]:
                    conflicts.append(((p_a, c_a), (p_b, c_b)))
# print(conflicts)  # debug

""" Solve """
# Decision-vars
x = cvx.Bool(N_PLAYERS, N_CARDS)

# Constraints
constraints = []

# -> Conflicts
for (p_a, c_a), (p_b, c_b) in conflicts:
    # don't allow both -> linearized
    constraints.append(x[p_a, c_a] + x[p_b, c_b] <= 1)

# -> N to play
constraints.append(cvx.sum_entries(x, axis=1) == N_PLAY)

# Objective
objective = cvx.sum_entries(cvx.mul_elemwise(p.flatten(order='F'), cvx.vec(x)))  # 2d -> 1d flattening
                                                       # ouch -> C vs. Fortran storage
# print(objective)  # debug

# Problem
problem = cvx.Problem(cvx.Maximize(objective), constraints)
problem.solve(verbose=False)
print('MIP solution')
print(problem.status)
print(problem.value)
print(np.round(x.T.value))

sol = x.value
nnz = np.where(abs(sol - 1) <= 0.01)  # being careful with fp-math
sol_p = p[nnz]
assert sol_p.shape[0] == N_PLAYERS * N_PLAY

""" Output solution """
for player in range(N_PLAYERS):
    print('player: ', player, 'with cards: ', p[player, :])
    print(' plays: ', sol_p[player*N_PLAY:player*N_PLAY+N_PLAY])

Output:

Players and their cards
[[ 3 16  6 10  2 14  4 17  7  1]
 [15  8 16  3 19 17  5  6  0 12]
 [ 4  2 18 12 11 19  5  6 14  7]
 [10 14  5  6 18  1  8  7 19 15]
 [15 17  1 16 14 13 18  3 12  9]]
MIP solution
optimal
180.00000005500087
[[ 0.  0.  0.  0.  0.]
 [ 0.  1.  0.  1.  0.]
 [ 1.  0.  0. -0. -0.]
 [ 1. -0.  1.  0.  1.]
 [ 0.  1.  1.  1.  0.]
 [ 0.  1.  0. -0.  1.]
 [ 0. -0.  1.  0.  0.]
 [ 0.  0.  0.  0. -0.]
 [ 1. -0.  0.  0.  0.]
 [ 0.  0.  0.  1.  1.]]
player:  0 with cards:  [ 3 16  6 10  2 14  4 17  7  1]
 plays:  [ 6 10  7]
player:  1 with cards:  [15  8 16  3 19 17  5  6  0 12]
 plays:  [ 8 19 17]
player:  2 with cards:  [ 4  2 18 12 11 19  5  6 14  7]
 plays:  [12 11  5]
player:  3 with cards:  [10 14  5  6 18  1  8  7 19 15]
 plays:  [14 18 15]
player:  4 with cards:  [15 17  1 16 14 13 18  3 12  9]
 plays:  [16 13  9]

Upvotes: 1

Ari

Reputation: 7556

Looks like a packing problem, where you want to pack 3 disjoint subsets of your original sets, each of size 3, and maximize the sum. You can formulate it as an ILP. Without loss of generality, we can assume the cards represent natural numbers ranging from 1 to N.

  • Let a_i in {0,1} indicate if player A plays card with value i, where i is in {1,...,N}. Notice that if player A doesn't have card i in his hand, a_i is set to 0 in the beginning.
  • Similarly, define b_i and c_i variables for players B and C.
  • Also, similarly, let m_i in {0,1} indicate if card i will appear in the middle, i.e., one of the players will play a card with value i.

Now you can say:

Maximize Sum(m_i . i), subject to:

For each i in {1,...,N}:

  1. a_i, b_i, c_i, m_i are in {0, 1}
  2. m_i = a_i + b_i + c_i

and, summing over all i:

  3. Sum(a_i) = 3, Sum(b_i) = 3, Sum(c_i) = 3

Discussion

Notice that constraints 1 and 2 force the uniqueness of each card in the middle: since m_i is at most 1, at most one of a_i, b_i, c_i can be 1.

I'm not sure how large a problem commercial or non-commercial solvers can handle with this program, but notice that this is really a binary linear program, which might be simpler to solve than a general ILP, so it might be worth trying for the size you are looking at.
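
For concreteness, here is a small sketch of this program in cvxpy (using the current, post-1.0 API, unlike the answer above); the value range and hands are made up purely for illustration:

import numpy as np
import cvxpy as cvx

N = 13                                        # card values 1..N (made-up instance)
hands = [{1, 2, 4, 6, 7, 9, 11, 12, 13},      # player A
         {1, 2, 3, 5, 8, 9, 10, 11, 13},      # player B
         {2, 3, 4, 6, 7, 8, 10, 11, 12}]      # player C
n_play = 3
values = np.arange(1, N + 1)

# a_i / b_i / c_i as one boolean vector per player, m_i for the middle
plays = [cvx.Variable(N, boolean=True) for _ in hands]
m = cvx.Variable(N, boolean=True)

constraints = [m == sum(plays)]               # constraint 2: m_i = a_i + b_i + c_i
for hand, a in zip(hands, plays):
    constraints.append(cvx.sum(a) == n_play)  # constraint 3: play exactly 3 cards
    missing = [i for i in range(N) if (i + 1) not in hand]
    if missing:
        constraints.append(a[missing] == 0)   # a_i fixed to 0 for cards not in the hand

problem = cvx.Problem(cvx.Maximize(values @ m), constraints)
problem.solve()                               # needs a MIP-capable solver installed
print(problem.status, problem.value)
for name, a in zip('ABC', plays):
    print(name, 'plays', values[np.round(a.value).astype(bool)])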

Upvotes: 1

Prune

Reputation: 77837

Sort each hand, dropping duplicate values. Delete anything past the 10th-highest card of any hand (3 hands * 3 cards/hand, plus 1): nobody can contribute a card that low. For accounting purposes, make a directory by card value, showing which hands hold each value. For instance, given players A, B, C and these hands

A [1, 1, 1, 6, 4, 12, 7, 11, 13, 13, 9, 2, 2]

B [13, 2, 3, 1, 5, 5, 8, 9, 11, 10, 5, 5, 9]

C [13, 12, 11, 10, 6, 7, 2, 4, 4, 12, 3, 10, 8]

We would sort and de-dup the hands. 2 is the 10th-highest card of hand C, so we drop all values 2 and below. Then build the directory

A [13, 12, 11, 9, 7, 6, 4]
B [13, 11, 10, 9, 8, 5, 3]
C [13, 12, 11, 10, 8, 7, 6, 4, 3]

Directory:

13  A B C
12  A C
11  A B C
10  B C
 9  A B
 8  B C
 7  A C
 6  A C
 5  B
 4  A C
 3  B C
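
A small sketch of this preprocessing in Python (my own; the hands are the ones from the example, the variable names are made up):

from collections import defaultdict

hands = {
    'A': [1, 1, 1, 6, 4, 12, 7, 11, 13, 13, 9, 2, 2],
    'B': [13, 2, 3, 1, 5, 5, 8, 9, 11, 10, 5, 5, 9],
    'C': [13, 12, 11, 10, 6, 7, 2, 4, 4, 12, 3, 10, 8],
}
N_PLAY = 3
keep = len(hands) * N_PLAY + 1                # 10: nobody contributes below that rank

# sort descending and drop duplicates
dedup = {name: sorted(set(cards), reverse=True) for name, cards in hands.items()}

# cutoff: the 10th-highest card of any hand that has one (here: 2, from hand C)
ranked = [cards[keep - 1] for cards in dedup.values() if len(cards) >= keep]
cutoff = min(ranked) if ranked else 0
dedup = {name: [c for c in cards if c > cutoff] for name, cards in dedup.items()}

# directory: card value -> hands holding it, highest value first
directory = defaultdict(list)
for name, cards in dedup.items():
    for c in cards:
        directory[c].append(name)
directory = dict(sorted(directory.items(), reverse=True))

for value, owners in directory.items():
    print(value, owners)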

Now, you need to implement a backtracking algorithm to choose cards in some order, get the sum of that order, and compare with the best so far. I suggest that you iterate through the directory, choosing a hand from which to obtain the highest remaining card, backtracking when you run out of contributors entirely, or when you get 9 cards.

I recommend that you maintain a few parameters to allow you to prune the investigation, especially when you get into the lower values.

  • Compute a maximum possible value: the sum of the top 9 values in the directory. If you hit this value, stop immediately, as you've found an optimum solution.

  • Make a high starting target: cycle through the hands in sequence, taking the highest usable card remaining in the hand. In this case, cycling A-B-C, we would have

    13, 11, 12, 9, 10, 8, 7, 5, 6 => 81
    // Note: because of the values I picked,
    // this happens to provide an optimum solution.
    // It will do so for a lot of the bridge-hand problem space.

  • Keep count of how many cards have been contributed by each hand; when one has given its 3 cards, disqualify it in some way: have a check in the choice code, or delete it from the local copy of the directory.

As you walk down the choice list, prune the search any time the remaining cards are insufficient to reach the best-so-far total. For instance, if you have a total of 71 after 7 cards, and the highest remaining card is 5, stop: you can't get to 81 with 5+4.
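
Putting those pieces together, here is a rough branch-and-bound sketch (my own; it reuses the directory built in the snippet above, enforces only the at-most-3-per-hand quota, and the function and variable names are made up):

def best_contribution(directory, n_hands=3, n_play=3):
    """Walk card values in descending order, decide which hand (if any)
    contributes each value, and prune branches that cannot beat the best
    total found so far."""
    values = sorted(directory, reverse=True)
    target = n_hands * n_play                     # 9 cards in total
    max_possible = sum(values[:target])           # sum of the top 9 directory values
    best = {'total': 0, 'picks': None}

    def extend(idx, used, picks, total):
        if best['total'] == max_possible:         # already optimal -> stop searching
            return
        if len(picks) == target or idx == len(values):
            if total > best['total']:
                best['total'], best['picks'] = total, list(picks)
            return
        remaining = target - len(picks)
        # optimistic bound: even the highest remaining values can't beat best-so-far
        if total + sum(values[idx:idx + remaining]) <= best['total']:
            return
        value = values[idx]
        for hand in directory[value]:             # try each hand still under its quota
            if used[hand] < n_play:
                used[hand] += 1
                picks.append((value, hand))
                extend(idx + 1, used, picks, total + value)
                picks.pop()
                used[hand] -= 1
        extend(idx + 1, used, picks, total)       # or skip this value entirely

    all_hands = {h for owners in directory.values() for h in owners}
    extend(0, {h: 0 for h in all_hands}, [], 0)
    return best['total'], best['picks'], max_possible

total, picks, bound = best_contribution(directory)
print('best total:', total, 'upper bound:', bound)
print('picks:', picks)

Seeding best['total'] with the greedy A-B-C cycling value described above would tighten the pruning from the very first branches.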

Does that get you moving?

Upvotes: 0
