user3611325
user3611325

Reputation: 13

Distributing a list of numbers between sets to minimize number of sets

Say I have a circular list of numbers (last element is considered adjacent to first):

10 10 10 10 9 8 10 8 19

and I want to distribute them among sets of at most a given size such that each set contains consecutive numbers.

For this example, say my set size is 48.

I can do straightforward {10 10 10 10}, {9 8 10 8}, {19} thus yielding 3 sets.

But I can also do {10 10 10 9 8}, {10 8 19 10} - 2 sets (remember list is circular).

How can I calculate minimum number of sets that can contain all numbers? Preferably O(n) with respect to both computation and memory.

Upvotes: 1

Views: 110

Answers (2)

btilly
btilly

Reputation: 46497

Here is linear time.

So you have an array, like [10 10 10 10 9 8 10 8 19] and a bound like 48.

The first step is to figure out all of the maximal groups of consecutive elements whose sum is below your bound. Doing this we're going to populate another array with the position that the next consecutive group has to start at if you have one starting here.

To do this we keep track of where we start from, and where we go to, and the total of our current range. Whenever our total is under our bound, we advance the end and add to the total. Whenever our total is above the bound we mark the start of the next range for the start as end, advance start, and subtract that element from the total. We continue until either we find that everything goes into one range (in which case your final answer is 1) or start wraps around.

In this stage we are always advancing a pointer, we can advance end at most 2n times (more than that we have a wrap and stop) and end n times (until it wraps) so this is O(n) so far.

Now we can build from the end backwards the answer to the connected questions, "How many groups until I wrap around?" and "Where do I wrap around to with this many groups?" For those positions at the end that wrap, the answers are 1 and whatever your next_start is. For earlier ones, the answers are "one more group than my next_start" and "to the place that next_start jumps to. This is anothernoperations forO(n)` still.

Now we can scan the array for candidate start positions. A candidate start position is one that wraps to itself or farther. The size of the partition you get is "the number of groups until I wrap around". Your answer is the candidate start position with the minimum number of groups. This is again a O(n) step and gives you the answer.

Therefore your maximum running time is O(n).

Upvotes: 1

Michael Laszlo
Michael Laszlo

Reputation: 12239

You want to partition your data into sequences such that the total of each sequence does not exceed some constant.

Choose a starting point for the first sequence, say, 0. Greedily add elements to this sequence, then to the next sequence, and so on, until you have consumed all the data. This is one possible partition.

To make another partition, start from position 1. Again consume all the data greedily. This is another possible partition. Then do the same starting from position 2, and so on. You can stop once you have used every starting point that falls within the first sequence of the first partition that you made. Any further starting points will only result in a partition that is identical to one of the earlier ones you made.

data = [10, 10, 10, 10, 9, 8, 10, 8, 19]
max_size = 48
n = len(data)

# make a partition starting from position 0
bounds = [0]
total = 0
for pos in range(n):
  x = data[pos]
  if total + x <= max_size:
    total += x
  else:
    bounds.append(pos)
    total = x

# the first partition ends here
limit = bounds[1]
best_bounds = bounds

# make all other possible partitions
for start in range(1, limit):
  bounds = [start]
  total = 0
  pos = start
  while True:
    x = data[pos]
    if total + x <= max_size:
      total += x
    else:
      bounds.append(pos)
      total = x
    pos = (pos + 1) % n
    if pos == start:
      break
  if len(bounds) < len(best_bounds):
    best_bounds = bounds

# assemble and display the best partition
partition = []
sequence = []
start = best_bounds[0]
index = 1
pos = start
while True:
  if pos == best_bounds[index]:
    partition.append(sequence)
    if pos == start:
      break
    sequence = []
    index = (index+1)%len(best_bounds)
  sequence.append(data[pos])
  pos = (pos + 1) % n
print('partition of size %d: %s' % (len(partition), partition))

Upvotes: 1

Related Questions