Splitting values into similarly distributed evenly sized groups

Question

Given a list of scalar values, how can we split the list into K evenly-sized groups such that the groups have similar distributions? Note that simplicity is strongly favored over efficiency.

I am currently doing:

sort values
create K empty groups: group_1, ..., group_K
while values is not empty:
    for group in groups:
        group.add(values.pop())
        if values is empty:
            break
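
For reference, the loop above can be sketched in Python roughly as follows. The function name `split_round_robin` is only illustrative, and the sketch assumes `values.pop()` takes from the end of the sorted list, so the largest values are dealt first.

```python
def split_round_robin(values, k):
    """Deal sorted values to k groups in turn, largest first."""
    # Sorting descending mirrors popping from the end of an ascending list.
    descending = sorted(values, reverse=True)
    # Group i receives every k-th value, starting at position i.
    return [descending[i::k] for i in range(k)]


groups = split_round_robin([5, 1, 4, 2, 3, 6], 2)
```

With this scheme the first group always receives the larger value in every round, so its sum is systematically higher than the last group's.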

btilly · Accepted Answer

This is a variation on what @m.raynal came up with that will work well even when n is just a fairly small multiple of k.

Sort the elements from smallest to largest.
Create k empty groups.
Put the groups into a priority queue ordered first by element count (fewest first), then by sum (largest first). (So the next group taken off the queue is always the one with the largest sum among all of those with the fewest elements.)
For each element in sorted order, take a group off the priority queue, add the element to it, and put the group back in the priority queue.
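
A minimal Python sketch of these steps, assuming the standard-library `heapq` module serves as the priority queue. The function name `split_balanced` and the tuple layout of the heap entries are illustrative choices, not something prescribed above.

```python
import heapq


def split_balanced(values, k):
    """Distribute values into k groups via a priority queue of groups."""
    groups = [[] for _ in range(k)]
    # Heap entries are (element count, negated sum, group index).
    # The smallest tuple is the group with the fewest elements and,
    # among those, the largest sum; the index only breaks exact ties.
    heap = [(0, 0, i) for i in range(k)]
    heapq.heapify(heap)
    for value in sorted(values):
        count, neg_sum, i = heapq.heappop(heap)
        groups[i].append(value)
        heapq.heappush(heap, (count + 1, neg_sum - value, i))
    return groups


groups = split_balanced(range(1, 9), 2)
```

For the values 1 through 8 and k = 2, this yields `[[1, 4, 5, 8], [2, 3, 6, 7]]`, with both groups summing to 18.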

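In practice this means that the first k elements go to groups arbitrarily (every group is tied at that point), and the next k elements go in reverse order, so the group holding the largest element of the first round gets the smallest element of the second. And then it gets clever about keeping things balanced.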

Depending on your application, the fact that the bottom two values are spaced predictably far apart could be a problem. If that is the case then you could complicate this by going "middle out". But that scheme is much more complicated.
