Partition n players into two teams of size k and equal (or minimally different) strength

I came across this question recently; I thought about it a lot but could not find a solution:

Given a list of n players with strengths [s₁ , s₂, s₃ ... s_n], create two teams (A and B) of size k (k ≤ n/2), so that:

the total strength is maximized

the difference in strength is minimized

Strength(A) = sum of strength of all players in team A,
Strength(B) = sum of strength of all players in team B,
Total strength = strength(A) + strength (B),
Difference in strength = abs(strength(A) - strength(B)).

In case of same total strength, select the combination with the minimum difference in strength.

Example:
n = 5; k = 2  
players:   a  b  c  d  e  
strength:  4  4  2  2  5  

Option  Team A   Team B  Strength  Difference      
  1     [a,b]    [c,e]      15        1
  2     [a,b]    [d,e]      15        1   
  3     [a,c]    [b,e]      15        3
  4     [a,d]    [b,e]      15        3
  5     [a,c]    [b,d]      12        0
  6     [a,d]    [c,b]      12        0
  7     [a,d]    [c,e]      13        1
Option 1 and option 2 are winning combinations as their total strength is 15 (maximum), while their difference in strength is closer to the minimum than options 3 and 4.

My thoughts:

If 2k = n , strength is taken care of already (because all elements will be involved) and we just need to find two halves such that difference of sum of these two is minimum. But how to find that efficiently?

If 2k < n , we can probably sort the strength array and removed n-2k smallest elements and then we are back to 2k = n situation.

Upvotes: 3

Answers (1)

m69 ''snarky and unwelcoming''

Reputation: 12324

As mentioned in the comments, this is a variant of the Partitioning Problem, which itself is a special case of the Subset Sum Problem. These indeed have dynamic programming and approximation solutions, which you may be able to adapt to this problem. But the specific requirement of two equal-sized teams means that non-dp and non-greedy solutions are possible too.

Firstly, optimizing for total strength before taking the difference in strength between the teams into account, means that when the number of players n is odd, the weakest player can be discarded, and the size of the teams k is always half of n. If k is given as part of the input, then take the 2×k strongest players and discard the others.

(You could wonder whether the question was in fact to optimize for strength difference first, and then for total strength. If you find two subsets with difference x, then finding another two subsets with a similar difference y would mean you can combine them into two larger subsets with a smaller difference of |x-y|. This is an obvious basis for a dynamic programming approach.)

Alternative to dynamic programming solution

Let's look at the example of splitting n=23 (i.e. n=22) players into two teams of 11 players. If we used brute force to look at every option, we'd keep one of the players in team A (to avoid duplicate solutions) and try every combination of 10 additional players from the 21 others to complete team A. That means there are:

(n-1 choose k-1) = (21 choose 10) = 352,716 unique options

While this is a feasible number of options to check, larger numbers of players would quickly result in huge numbers of options; e.g. splitting 44 players into two teams of 22 would lead to more than 10¹² options.

We can drastically reduce the number of options that need to be checked, by starting with an initial split into two teams, and then checking which 1 player, 2 players, ... , 10 players we'd need to swap to reduce the strength difference the most. This can be done without having to consider swapping each possible subset of team A with each possible equal-sized subset of team B.

We could do the initial split into teams randomly, but if we sort the players by strength, and alternatingly add a player to team A or team B, this should limit the initial difference in strength D, which in turn should make it more likely that a solution with a limited number of swaps is found quickly (if there are several perfect solutions).

Then we consider swapping 1 player; we make a list of all players in team A (except the first one, which we'll always keep in team A to avoid duplicate solutions) and sort it from weakest to strongest. We also make a list of all players in team B, and sort it from weakest to strongest. Then we iterate over both lists at the same time, at each step moving to the next value in the list that brings the difference in strength between the current player from team A and team B closer to the initial value of D.

Note that we don't compare every player in the first list with every player in the second list in a nested loop. We only iterate over the lists once (this is similar to finding the two integers with the smallest difference in two arrays; see e.g. here).

If we come across a pair of players that, when swapped, decreases D, we store this pair and set the new value of D.

Now we consider swapping 2 players; we make a list of every pair of 2 players from team A (excluding player 1 again) and a list of every pair of players from team B, sort the lists from weakest to strongests (adding the strength of the two players). Then we iterate over both lists again, looking for a pair of pairs that, when swapped, decreases the value of D.

We go on doing the same for sets of 3, 4, ... 10 players. For the example of 23 players, the size of these lists would be:

           team A    team B

swap  1        10        11
swap  2        45        55
swap  3       120       165
swap  4       210       330
swap  5       252       462
swap  6       210       462
swap  7       120       330
swap  8        45       165
swap  9        10        55
swap 10         1        11
             ----      ----
             1023      2046

So, we'd find the optimal swap that results in two teams with the smallest difference in strength after at most 3,069 steps instead of 352,716 steps for the brute-force algorithm.

(We could further speed up the cases where there are several perfect solutions by checking swap sizes in the order 10, 1, 9, 2, 8, 3, 7, 4, 6, 5 to find a solution without having to generate the larger lists.)

The example of splitting 44 players into two teams of 22 would take at most 6,291,453 steps instead of more than 10¹² steps. In general, the maximum number of steps is:

2^k + 2^k−1 − 3

and the time complexity is:

O(2^k)

which doesn't look great, but is much better than the brute-force algorithm with its O(C(n-1,k-1)) complexity. Also, as soon as a solution with difference 0 or 1 is found, there is no need to look at further options, so a solution can be found after considering swaps of only 1 or a handful of players, and the average case complexity is much better than the worst-case complexity (this is discussed further below.)

Code example

Here's a Javascript code snippet as a proof of concept. Selections of players are represented by a bit array (you could also use an integer as a bit pattern). You'll see that the change in team strength after different swaps is calculated, but only one selection of players is actually swapped at the end; so it's not a greedy algorithm that gradually improves the strength difference by performing several swaps.

function compareStrength(a, b) { // for sorting players and selections
    return a.strength - b.strength;
}
function teamStrength(players) {
    return players.reduce(function(total, player) {return total + player.strength;}, 0);
}
function selectionStrength(players, selection) {
    return players.reduce(function(total, player, index) {return total + player.strength * selection[index];}, 0);
}
function nextPermutation(selection) { // reverse-lexicographical next permutation of a bit array
    var max = true, pos = selection.length, set = 1;
    while (pos-- && (max || !selection[pos])) if (selection[pos]) ++set; else max = false;
    if (pos < 0) return false;
    selection[pos] = 0;
    while (++pos < selection.length) selection[pos] = set-- > 0 ? 1 : 0;
    return true;
}
function swapPlayers(wTeam, sTeam, wSelect, sSelect) {
    for (var i = 0, j = 0; i < wSelect.length; i++) {
        if (wSelect[i]) {
            while (!sSelect[j]) ++j;
            var temp = wTeam[i];
            wTeam[i] = sTeam[j];
            sTeam[j++] = temp;
        }
    }
}
function equalTeams(players) {
    // SORT PLAYERS FROM WEAKEST TO STRONGEST
    players.sort(compareStrength);
    // INITIAL DISTRIBUTION OF PLAYERS INTO WEAKER AND STRONGER TEAM (ALTERNATING)
    var wTeam = [], sTeam = [];
    for (var i = players.length % 2; i < players.length; i += 2) {
        wTeam.push(players[i]);
        sTeam.push(players[i + 1]);
    }
    var teamSize = wTeam.length;
    // CALCULATE INITIAL STRENGTH DIFFERENCE
    var initDiff = teamStrength(sTeam) - teamStrength(wTeam);
    var bestDiff = initDiff;
    var wBestSel = [], sBestSel = [];
    // CHECK SELECTIONS OF EVERY SIZE
    for (var selSize = 1; selSize < teamSize && bestDiff > 1; selSize++) {
        var wSelections = [], sSelections = [], selection = [];
        // CREATE INITIAL SELECTION BIT-ARRAY FOR WEAKER TEAM (SKIP PLAYER 1)
        for (var i = 0; i < teamSize; i++)
            selection[i] = (i > 0 && i <= selSize) ? 1 : 0;
        // STORE ALL SELECTIONS FROM WEAKER TEAM AND THEIR STRENGTH
        do wSelections.push({selection: selection.slice(), strength: selectionStrength(wTeam, selection)});
        while (nextPermutation(selection));
        // SORT SELECTIONS FROM WEAKEST TO STRONGEST
        wSelections.sort(compareStrength);
        // CREATE INITIAL SELECTION BIT-ARRAY FOR STRONGER TEAM
        for (var i = 0; i < teamSize; i++)
            selection[i] = (i < selSize) ? 1 : 0;
        // STORE ALL SELECTIONS FROM STRONGER TEAM AND THEIR STRENGTH
        do sSelections.push({selection: selection.slice(), strength: selectionStrength(sTeam, selection)});
        while (nextPermutation(selection));
        // SORT SELECTIONS FROM WEAKEST TO STRONGEST
        sSelections.sort(compareStrength);
        // ITERATE OVER SELECTIONS FROM BOTH TEAMS
        var wPos = 0, sPos = 0;
        while (wPos < wSelections.length && sPos < sSelections.length) {
            // CALCULATE STRENGTH DIFFERENCE IF THESE SELECTIONS WERE SWAPPED
            var wStrength = wSelections[wPos].strength, sStrength = sSelections[sPos].strength;
            var diff = Math.abs(initDiff - 2 * (sStrength - wStrength));
            // SET NEW BEST STRENGTH DIFFERENCE IF SMALLER THAN CURRENT BEST
            if (diff < bestDiff) {
                bestDiff = diff;
                wBestSel = wSelections[wPos].selection.slice();
                sBestSel = sSelections[sPos].selection.slice();
                // STOP SEARCHING IF PERFECT SOLUTION FOUND (DIFFERENCE 0 OR 1)
                if (bestDiff < 2) break;
            }
            // ADVANCE TO NEXT SELECTION FROM WEAKER OR STRONGER TEAM
            if (2 * (sStrength - wStrength) > initDiff) ++wPos; else ++sPos;
        }
    }
    // PERFORM SWAP OF BEST PAIR OF SELECTIONS FROM EACH TEAM
    swapPlayers(wTeam, sTeam, wBestSel, sBestSel);
    return {teams: [wTeam, sTeam], strengths: [teamStrength(wTeam), teamStrength(sTeam)]};
}
var players = [{id:"Courtois", strength:65}, {id:"Mignolet", strength:21}, {id:"Casteels", strength:0},
               {id:"Alderweireld", strength:83}, {id:"Vermaelen", strength:69}, {id:"Kompany", strength:82},
               {id:"Vertonghen", strength:108}, {id:"Meunier", strength:30}, {id:"Boyata", strength:10},
               {id:"Dendoncker", strength:6}, {id:"Witsel", strength:96}, {id:"De Bruyne", strength:68},
               {id:"Fellaini", strength:87}, {id:"Carrasco", strength:30}, {id:"Tielemans", strength:13},
               {id:"Januzaj", strength:9}, {id:"Dembele", strength:80}, {id:"Chadli", strength:51},
               {id:"Lukaku", strength:75}, {id:"E. Hazard", strength:92}, {id:"Mertens", strength:75},
               {id:"T. Hazard", strength:13}, {id:"Batshuayi", strength:19}];
var result = equalTeams(players);
for (var t in result.teams) {
    for (var i in result.teams[t]) {
        document.write(result.teams[t][i].id + " (" + result.teams[t][i].strength + ") ");
    }
    document.write("<br>&rarr; team strength = " + result.strengths[t] + "<br><br>");
}

Probability of finding a perfect solution

When the algorithm finds a perfect solution (with a strength difference of 0 or 1), this cannot be improved further, so the algorithm can stop looking at other options and return the solution. This of course means that, for some input, a solution can be found almost instantly, and the algorithm can be used for a large number of players.

If there is no perfect solution, the algorithm has to run its full course to be sure it has found the best solution. With a large number of players, this could take a long time and use a lot of memory space (I was able to run a C++ version for up to 64 players on my computer).

Although it's straightforward to craft input that has no perfect solution (such as one player with strength 3 and the other players all with strength 1), testing with random data showed that the number of players for which almost all random input has a perfect solution is surprisingly low (similar to the Birthday Paradox).

With n=24 (two teams of 12) or more, ten million instances of random input provided not a single case where the strength difference between the teams was greater than 1, whether using 10, 100, 1000 or 10000 different integer values to express the strength of each player.

Upvotes: 1

Partition n players into two teams of size k and equal (or minimally different) strength

Answers (1)

Related Questions