dearn44
dearn44

Reputation: 3432

Pair bad players with good players while retaining ballance

I have a number of players that are split into two groups, good players and bad players. Each player is defined by a an arithmetic value and a possible group could be this one:

bad players
A 13
B 6
C 9
D 2

good players
X 25
Y 16
Z 17
K 10

Pairing a good player with a bad player will cause him to be worse by an amount equal to the bad players' value (so if I pair a good player with value 100 with a bad player of value 50, then the good players value drops to 50). I need to pair each good player with a bad player, but in such a way that the sum of the good players in the resulting list can be split into two groups of equal size that have the same sum.

In my example above a bad pairing would have been:

A - X 12
C - Z 8
B - Y 10
D - K 8

Now because of their pairing with bad players (A B C D) all the good players have lost some points and their value has changed. These new values cant be split into two groups of equal size so as the sum of the values in each group is equal. If I had a different combination though, lets say:

D - K 8
B - Z 11
C - X 16
A - Y 3 

I could split the good players now into the groups K, Z and X, Y that both have a value of 19 and so things remain neutral.

A solution in order to find an acceptable pairing of good and bad players would have been to use brute force and produce all the possible combinations, and check for each one if it is acceptable or not by trying to split it into two groups of equal value. Once the first acceptable solution is found we return it.

How can I improve this and directly find an acceptable pairing in this setting?

Upvotes: 3

Views: 218

Answers (2)

PM 2Ring
PM 2Ring

Reputation: 55479

Here's a reasonably efficient algorithm. It's ok on small data sets, but it will take time & RAM for larger data sets. I won't claim it's the best algorithm, but it's certainly much better than brute-force checking each good - bad pairing.

The key insight is that we don't need to look at the individual good - bad pairings, we can look at the total good - bad score of whole teams. Let the scores for two valid teams be (X, A) and (Y, B). That is, X is the total score of all the good players in team 1 and A is the total score of all the bad players in team 1. Similarly, Y is the score of the good players in team 2 and B is the score of the bad players in that team. Then

X - A = Y - B
X + B = Y + A
2(X + B) = X + B + Y + A
X + B = (X + B + Y + A) / 2
Let M = (X + Y + A + B) / 2
Thus B = M - X

So we just need to calculate M, the total of all players' scores, and then for each good player partition X we need to see if there's a matching bad player partition B. (Incidentally, the arithmetic above shows that X + Y + A + B must be even for a solution to exist, since B & X must be whole numbers).

To do that with reasonable efficiency, we create a dict of the bad player partitions. The dict key is the total score for that partition, the associated value is a list of all bad player partitions with that score.

Then we loop over the good player partition pairs. We find the total score for the 1st partition in a pair, subtract it from M to get the required bad score, and see if that bad score's in the dictionary. If it is, we've found a solution.

The function equal_parts is the fastest way I know in pure Python to split a list into all of its equal length partitions, see this answer I wrote a couple of years ago for details.

The code below finds a single solution for the data given in the question; that solution yields 2 options for each team. I've included some extra code (currently commented out) to generate some random fake data in order to test the code on a larger data set.

Update

There was a minor bug in the logic of the previous version. When I found a list of bad partitions with the correct total score b1 I used bad_total - b1 to get the complementary list of bad partitions for the other team. But that doesn't always work correctly. :oops: The new version uses a bad_parts dictionary that stores each bad partition both as a key and a value, which makes it easy to get the correct complementary partition.

Also, the old make_pairs function was a bit sloppy, so it could produce Team 1 lists and Team 2 lists that shared players. :other oops:. We now have a new function make_teams that calls a repaired make_pairs. The make_teams function prints the various Team 1 and Team 2 permutations in sections. Within each section no Team 1 list will share players with a Team 2 list.

I've also made a few other minor changes, the main one being that the names in a partition tuple are always sorted alphabetically. That makes their use as dictionary keys more robust. I've also tweaked the output format a little.

It may sometimes happen (eg in Solution 23 of the data you supplied in a comment) that an alleged Solution doesn't work: there's no permutation of bad players that can be combined with good players without generating a negative score. Hopefully, that's not a major problem. To make the code deal with that automatically we'd need to change it so that instead of printing results as it goes it would need to store the results in lists (or perhaps use generators), and then test that the resulting lists aren't empty before printing that solution. This would consume a lot more RAM, and would make the logic even more cluttered than it currently is.

from itertools import combinations, permutations
from collections import defaultdict
from random import seed, randrange

seed(37)

def equal_parts(lst):
    ''' yield all equal-sized pair partitions of lst '''
    first = lst[0]
    allset_diff = set(lst).difference
    for left in combinations(lst, len(lst) // 2):
        if left[0] != first:
            break
        yield left, tuple(sorted(allset_diff(left)))

def find_partitions(bad, good):
    print('bad', bad, len(bad))
    print('good', good, len(good))

    bad_total = sum(bad.values())
    good_total = sum(good.values())
    total = bad_total + good_total
    if total % 2 != 0:
        print('No solutions!')
        return

    magic = total // 2
    print('magic =', magic)

    # Create a dict of the good partition pairs
    good_parts = dict(equal_parts(sorted(good)))
    # Create a dict of the bad partition pairs, with each partition of 
    # the pair as a key and the complementary partiton as the value
    bad_parts = {}
    for bpart1, bpart2 in equal_parts(sorted(bad)):
        bad_parts[bpart1] = bpart2
        bad_parts[bpart2] = bpart1
    #print(bad_parts)

    # Get the sums of all the bad partitions, and save them in 
    # a dict of lists with the partition sum as the key and 
    # the partition in the value list
    bad_sums = defaultdict(list)
    for bpart in bad_parts:
        s = sum(bad[k] for k in bpart)
        bad_sums[s].append(bpart)
    bad_sums = dict(bad_sums)
    #print(bad_sums)

    # Sum the 1st of each pair of good partitions & see if there's a matching bad partition 
    count = 0
    for gpart1, gpart2 in good_parts.items():
        g1 = sum(good[k] for k in gpart1)
        b1 = magic - g1
        if b1 in bad_sums:
            count += 1
            print('SOLUTION', count)
            g2 = good_total - g1
            b2 = bad_total - b1
            blist1 = bad_sums[b1]
            # Create the complementary list of bad partitions
            blist2 = [bad_parts[k] for k in blist1]
            tot1 = g1 - b2
            tot2 = g2 - b1
            print(gpart1, g1, '-', blist2, b2, '=', tot1)
            print(gpart2, g2, '-', blist1, b1, '=', tot2, '\n')
            make_teams(gpart1, gpart2, blist1, blist2)

def make_pairs(gpart, bpart):
    for b in permutations(bpart):
        total = 0
        team = []
        for gkey, bkey in zip(gpart, b):
            score = good[gkey] - bad[bkey]
            if score <= 0:
                # Invalid pairing
                break
            total += score
            team.append('{}-{}={}'.format(gkey, bkey, score))
        else:
            print(', '.join(team), total)

def make_teams(gpart1, gpart2, blist1, blist2):
    section = 0
    for bpart2, bpart1 in zip(blist2, blist1):
        section += 1
        print('Section', section)
        print('Team 1:', ' '.join(gpart1), '+', ' '.join(bpart2))
        make_pairs(gpart1, bpart2)
        print('\nTeam 2:', ' '.join(gpart2), '+', ' '.join(bpart1))
        make_pairs(gpart2, bpart1)
        print()

# Make some fake data
def make_data(letters, lo, hi):
    return {s: randrange(lo, hi) for s in letters}

#while True:
    #bad = make_data('ZYXWVU', 1, 15)
    #good = make_data('ABCDEF', 10, 30)
    #bad_total = sum(bad.values())
    #good_total = sum(good.values())
    #if bad_total % 2 == good_total % 2:
        #break

bad = {'A': 13, 'B': 6, 'C': 9, 'D': 2,}
good = {'X': 25, 'Y': 16, 'Z': 17, 'K': 10,}

#bad = {'bA': 7, 'bB': 10, 'bC': 2, 'bD': 12, 'bE': 15, 'bF': 14, 'bG': 17, 'bH': 15} 
#good = {'gA': 19, 'gB': 36, 'gC': 9, 'gD': 15, 'gE': 24, 'gF': 23, 'gG': 24, 'gH': 24}

find_partitions(bad, good)

output

bad {'A': 13, 'B': 6, 'C': 9, 'D': 2} 4
good {'X': 25, 'Y': 16, 'Z': 17, 'K': 10} 4
magic = 49
SOLUTION 1
('K', 'Z') 27 - [('B', 'D')] 8 = 19
('X', 'Y') 41 - [('A', 'C')] 22 = 19 

Section 1
Team 1: K Z + B D
K-B=4, Z-D=15 19
K-D=8, Z-B=11 19

Team 2: X Y + A C
X-A=12, Y-C=7 19
X-C=16, Y-A=3 19

Here's the output from the fake data.

bad {'Z': 2, 'Y': 9, 'X': 5, 'W': 6, 'V': 11, 'U': 12} 6
good {'A': 23, 'B': 28, 'C': 28, 'D': 21, 'E': 28, 'F': 11} 6
magic = 92
SOLUTION 1
('A', 'B', 'C') 79 - [('U', 'V', 'Y')] 32 = 47
('D', 'E', 'F') 60 - [('W', 'X', 'Z')] 13 = 47 

Section 1
Team 1: A B C + U V Y
A-U=11, B-V=17, C-Y=19 47
A-U=11, B-Y=19, C-V=17 47
A-V=12, B-U=16, C-Y=19 47
A-V=12, B-Y=19, C-U=16 47
A-Y=14, B-U=16, C-V=17 47
A-Y=14, B-V=17, C-U=16 47

Team 2: D E F + W X Z
D-W=15, E-X=23, F-Z=9 47
D-W=15, E-Z=26, F-X=6 47
D-X=16, E-W=22, F-Z=9 47
D-X=16, E-Z=26, F-W=5 47
D-Z=19, E-W=22, F-X=6 47
D-Z=19, E-X=23, F-W=5 47

SOLUTION 2
('A', 'B', 'D') 72 - [('U', 'V', 'Z'), ('V', 'X', 'Y')] 25 = 47
('C', 'E', 'F') 67 - [('W', 'X', 'Y'), ('U', 'W', 'Z')] 20 = 47 

Section 1
Team 1: A B D + U V Z
A-U=11, B-V=17, D-Z=19 47
A-U=11, B-Z=26, D-V=10 47
A-V=12, B-U=16, D-Z=19 47
A-V=12, B-Z=26, D-U=9 47
A-Z=21, B-U=16, D-V=10 47
A-Z=21, B-V=17, D-U=9 47

Team 2: C E F + W X Y
C-W=22, E-X=23, F-Y=2 47
C-W=22, E-Y=19, F-X=6 47
C-X=23, E-W=22, F-Y=2 47
C-X=23, E-Y=19, F-W=5 47
C-Y=19, E-W=22, F-X=6 47
C-Y=19, E-X=23, F-W=5 47

Section 2
Team 1: A B D + V X Y
A-V=12, B-X=23, D-Y=12 47
A-V=12, B-Y=19, D-X=16 47
A-X=18, B-V=17, D-Y=12 47
A-X=18, B-Y=19, D-V=10 47
A-Y=14, B-V=17, D-X=16 47
A-Y=14, B-X=23, D-V=10 47

Team 2: C E F + U W Z
C-U=16, E-W=22, F-Z=9 47
C-U=16, E-Z=26, F-W=5 47
C-W=22, E-U=16, F-Z=9 47
C-Z=26, E-U=16, F-W=5 47

SOLUTION 3
('A', 'B', 'E') 79 - [('U', 'V', 'Y')] 32 = 47
('C', 'D', 'F') 60 - [('W', 'X', 'Z')] 13 = 47 

Section 1
Team 1: A B E + U V Y
A-U=11, B-V=17, E-Y=19 47
A-U=11, B-Y=19, E-V=17 47
A-V=12, B-U=16, E-Y=19 47
A-V=12, B-Y=19, E-U=16 47
A-Y=14, B-U=16, E-V=17 47
A-Y=14, B-V=17, E-U=16 47

Team 2: C D F + W X Z
C-W=22, D-X=16, F-Z=9 47
C-W=22, D-Z=19, F-X=6 47
C-X=23, D-W=15, F-Z=9 47
C-X=23, D-Z=19, F-W=5 47
C-Z=26, D-W=15, F-X=6 47
C-Z=26, D-X=16, F-W=5 47

SOLUTION 4
('A', 'C', 'D') 72 - [('U', 'V', 'Z'), ('V', 'X', 'Y')] 25 = 47
('B', 'E', 'F') 67 - [('W', 'X', 'Y'), ('U', 'W', 'Z')] 20 = 47 

Section 1
Team 1: A C D + U V Z
A-U=11, C-V=17, D-Z=19 47
A-U=11, C-Z=26, D-V=10 47
A-V=12, C-U=16, D-Z=19 47
A-V=12, C-Z=26, D-U=9 47
A-Z=21, C-U=16, D-V=10 47
A-Z=21, C-V=17, D-U=9 47

Team 2: B E F + W X Y
B-W=22, E-X=23, F-Y=2 47
B-W=22, E-Y=19, F-X=6 47
B-X=23, E-W=22, F-Y=2 47
B-X=23, E-Y=19, F-W=5 47
B-Y=19, E-W=22, F-X=6 47
B-Y=19, E-X=23, F-W=5 47

Section 2
Team 1: A C D + V X Y
A-V=12, C-X=23, D-Y=12 47
A-V=12, C-Y=19, D-X=16 47
A-X=18, C-V=17, D-Y=12 47
A-X=18, C-Y=19, D-V=10 47
A-Y=14, C-V=17, D-X=16 47
A-Y=14, C-X=23, D-V=10 47

Team 2: B E F + U W Z
B-U=16, E-W=22, F-Z=9 47
B-U=16, E-Z=26, F-W=5 47
B-W=22, E-U=16, F-Z=9 47
B-Z=26, E-U=16, F-W=5 47

SOLUTION 5
('A', 'C', 'E') 79 - [('U', 'V', 'Y')] 32 = 47
('B', 'D', 'F') 60 - [('W', 'X', 'Z')] 13 = 47 

Section 1
Team 1: A C E + U V Y
A-U=11, C-V=17, E-Y=19 47
A-U=11, C-Y=19, E-V=17 47
A-V=12, C-U=16, E-Y=19 47
A-V=12, C-Y=19, E-U=16 47
A-Y=14, C-U=16, E-V=17 47
A-Y=14, C-V=17, E-U=16 47

Team 2: B D F + W X Z
B-W=22, D-X=16, F-Z=9 47
B-W=22, D-Z=19, F-X=6 47
B-X=23, D-W=15, F-Z=9 47
B-X=23, D-Z=19, F-W=5 47
B-Z=26, D-W=15, F-X=6 47
B-Z=26, D-X=16, F-W=5 47

SOLUTION 6
('A', 'D', 'E') 72 - [('U', 'V', 'Z'), ('V', 'X', 'Y')] 25 = 47
('B', 'C', 'F') 67 - [('W', 'X', 'Y'), ('U', 'W', 'Z')] 20 = 47 

Section 1
Team 1: A D E + U V Z
A-U=11, D-V=10, E-Z=26 47
A-U=11, D-Z=19, E-V=17 47
A-V=12, D-U=9, E-Z=26 47
A-V=12, D-Z=19, E-U=16 47
A-Z=21, D-U=9, E-V=17 47
A-Z=21, D-V=10, E-U=16 47

Team 2: B C F + W X Y
B-W=22, C-X=23, F-Y=2 47
B-W=22, C-Y=19, F-X=6 47
B-X=23, C-W=22, F-Y=2 47
B-X=23, C-Y=19, F-W=5 47
B-Y=19, C-W=22, F-X=6 47
B-Y=19, C-X=23, F-W=5 47

Section 2
Team 1: A D E + V X Y
A-V=12, D-X=16, E-Y=19 47
A-V=12, D-Y=12, E-X=23 47
A-X=18, D-V=10, E-Y=19 47
A-X=18, D-Y=12, E-V=17 47
A-Y=14, D-V=10, E-X=23 47
A-Y=14, D-X=16, E-V=17 47

Team 2: B C F + U W Z
B-U=16, C-W=22, F-Z=9 47
B-U=16, C-Z=26, F-W=5 47
B-W=22, C-U=16, F-Z=9 47
B-Z=26, C-U=16, F-W=5 47   

PS. I've thought of a way to handle much larger data sets quickly. It won't find all solutions, but it should find plenty of them. The idea is to divide the large data set into several smaller sets, solve those smaller sets independently, and then combine the solutions. The only tricky part is to do the division so that each smaller set is solvable. So in each set bad_total < good_total and bad_total%2 == good_total%2. It probably makes sense to sort the full data sets by score before attempting the division so that the bad & good groups of each smaller set are roughly matched.

Upvotes: 4

jedwards
jedwards

Reputation: 30210

The following code will produce a valid pairing partitioning:

from itertools import permutations

def find_pairing_partitioning(scores_g, scores_b):
    keys_g = sorted(scores_g.keys())
    keys_b = sorted(scores_b.keys())

    # Some initialization that we can do once
    all_sum = sum(scores_g.values()) - sum(scores_b.values())
    if all_sum % 2 == 1:
        raise RuntimeError("Can't evenly partition an odd sum")

    bin_siz = len(scores_g) // 2                    # How many elements we want in each bin

    for perm_b in permutations(keys_b):
        diff = {}
        for (kg,kb) in zip(keys_g, perm_b):
            kr = '%s-%s' % (kg,kb)
            diff[kr] = scores_g[kg] - scores_b[kb]

        for perm_d in permutations(diff.keys()):    # Note: Inefficient
            bin1_keys = perm_d[:bin_siz]
            bin2_keys = perm_d[bin_siz:]
            bin1_sum  = sum(diff[k] for k in bin1_keys)
            bin2_sum  = sum(diff[k] for k in bin2_keys)

            if bin1_sum == bin2_sum:
                yield bin1_keys, bin2_keys

scores_g = {'X':25, 'Y':16, 'Z':17, 'K':10}     # Scores, good players
scores_b = {'A':13, 'B':6,  'C':9,  'D':2}      # Scores, bad players

bin1_keys, bin2_keys = next(find_pairing_partitioning(scores_g, scores_b))
print("Found even partitioning:", bin1_keys, "vs", bin2_keys)

Example output: Found even partitioning: ('K-B', 'Z-D') vs ('X-A', 'Y-C')

The inner permutations call is inefficient, because it tests what are effectively the same partitionings multiple times, for example:

    A,B  vs.  C,D
    B,A  vs.  C,D
    C,D  vs.  A,B
    ...

So you might look into cleaning that up, or trade runtime efficiency for space efficiency and do something like this.

Upvotes: 3

Related Questions