Cyborg771
Cyborg771

Reputation: 405

An algorithm to sort a list of values into n groups so that the sum of each group is as close as possible

Basically I have a number of values that I need to split into n different groups so that the sums of each group are as close as possible to the sums of the others? The list of values isn't terribly long so I could potentially just brute force it but I was wondering if anyone knows of a more efficient method of doing this. Thanks.

Upvotes: 16

Views: 6643

Answers (5)

Kris
Kris

Reputation: 1398

This problem is called "multiway partition problem" and indeed is computationally hard. Googling for it returned an interesting paper "Multi-Way Number Paritioning where the author mentions the heuristic suggested by larsmans and proposes some more advanced algorithms. If the above heuristic is not enough, you may have a look at the paper or maybe contact the author, he seems to be doing research in that area.

Upvotes: 5

David Cary
David Cary

Reputation: 5500

Do you know how many groups you need to split it into ahead of time?

Do you have some limit to the maximum size of a group?

A few algorithms for variations of this problem:

Upvotes: 1

Fred Foo
Fred Foo

Reputation: 363817

If an approximate solution is enough, then sort the numbers descendingly, loop over them and assign each number to the group with the smallest sum.

groups = [list() for i in range(NUM_GROUPS)]
for x in sorted(numbers, reverse=True):
    mingroup = groups[0]
    for g in groups:
        if sum(g) < sum(mingroup):
           mingroup = g
    mingroup.append(x)

Upvotes: 9

phkahler
phkahler

Reputation: 5767

You can sum the numbers and divide by the number of groups. This gives you the target value for the sums. Sort the numbers and then try to get subsets to add up to the required sum. Start with the largest values possible, as they will cause the most variability in the sums. Once you decide on a group that is not the optimal sum (but close), you could recompute the expected sum of the remaining numbers (over n-1 groups) to minimize the RMS deviation from optimal for the remaining groups (if that's a metric you care about). Combining this "expected sum" concept with larsmans answer, you should have enough information to arrive at a fast approximate answer. Nothing optimal about it, but far better than random and with a nicely bounded run time.

Upvotes: 1

Geoffrey De Smet
Geoffrey De Smet

Reputation: 27337

Brute force might not work out as well as you think...

Presume you have 100 variables and 20 groups:

  • You can put 1 variable in 20 different groups, which makes 20 combinations.
  • You can put 2 variables in 20 different groups each, which makes 20 * 20 = 20^2 = 400 combinations.
  • You can put 3 variables in 20 different groups each, which makes 20 * 20 * 20 = 20^3 = 8000 combinations.
  • ...
  • You can put 100 variables in 20 different groups each, which makes 20^100 combinations, which is more than the minimum number of atoms in the known universe (10^80).

OK, you can do that a bit smarter (it doesn't matter where you put the first variable, ...) to get to something like Branch and Bound, but that will still scale horribly.

So either use a fast deterministic algorithm, like larsman proposes. Or if you need a more optimal solution and have the time to implement it, take a look at metaheuristic algorithms and software that implement them (such as Drools Planner).

Upvotes: 1

Related Questions