FGreg

Reputation: 15330

Get indices of roughly equal sized chunks

This question is very similar to Python: Slicing a list into n nearly-equal-length partitions.

However, I don't want to actually slice a list. All I want are the start and stop index values of each chunk as if I was slicing the list.

So what I'd like is a function that takes input:

def partitionIndexes(totalsize, numberofpartitions):

and returns a list of tuples representing the start and end index of each partition. Each tuple should span roughly the same number of indices (within 1).

Example:

>>> partitionIndexes(105, 10)
[(0, 10),
 (11, 21),
 (22, 32),
 (33, 43),
 (44, 54),
 (55, 64),
 (65, 74),
 (75, 84),
 (85, 94),
 (95, 104)]

Notice how the first five partitions span 11 indices and the last five partitions span 10 indices.

If possible, I'd like to avoid having to generate an intermediate list of all indexes.

Upvotes: 3

Views: 1336

Answers (4)

gsandhu

Reputation: 511

Here is a generator. Note that it takes a chunk size rather than a number of partitions, and that it yields the full list of indices in each chunk rather than (start, stop) pairs:

from collections.abc import Iterator

def get_chunked_indices(total_size: int, chunk_size: int) -> Iterator[list[int]]:
    start = 0
    while start < total_size:
        # Clamp the final chunk to the end of the range
        stop = min(start + chunk_size, total_size)
        yield list(range(start, stop))
        start = stop
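For example (restating the generator so the snippet is self-contained; the sizes 10 and 4 are just illustrative), chunking 10 items into chunks of 4 yields two full chunks and one short one:

```python
from collections.abc import Iterator

def get_chunked_indices(total_size: int, chunk_size: int) -> Iterator[list[int]]:
    start = 0
    while start < total_size:
        stop = min(start + chunk_size, total_size)
        yield list(range(start, stop))
        start = stop

print(list(get_chunked_indices(10, 4)))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```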

Upvotes: 0

Tobias Ernst

Reputation: 4654

Here is a solution returning slices.

from typing import List

def partition_chunks(total_size: int, chunk_size: int) -> List[slice]:
    # Create a list of start indices
    chunk_slice = slice(0, total_size, chunk_size)
    values = range(0, total_size)
    start_indices = values[chunk_slice]

    # Create [start, end) slices
    slices: List[slice] = [slice(start, start + chunk_size) for start in start_indices]

    # Fix the last partition, as the end index of the last slice should be total_size
    slices[-1] = slice(start_indices[-1], total_size)

    return slices

Example:

slices: List[slice] = partition_chunks(total_size=1111, chunk_size=100)
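Restating the function so the check is self-contained: with these sizes there should be 12 slices, the last covering the remaining 11 indices, and each slice can be applied directly to a sequence:

```python
from typing import List

def partition_chunks(total_size: int, chunk_size: int) -> List[slice]:
    # Start indices 0, chunk_size, 2 * chunk_size, ...
    start_indices = range(0, total_size)[slice(0, total_size, chunk_size)]
    slices: List[slice] = [slice(start, start + chunk_size) for start in start_indices]
    # The last slice must end at total_size, not past it
    slices[-1] = slice(start_indices[-1], total_size)
    return slices

slices = partition_chunks(total_size=1111, chunk_size=100)
print(len(slices), slices[0], slices[-1])  # 12 slice(0, 100, None) slice(1100, 1111, None)
```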

Upvotes: 0

Timothy Shields

Reputation: 79531

You can do this with a simple generator function.

def partitionIndexes(totalsize, numberofpartitions):
    # Compute the base chunk size (floor division)
    chunksize = totalsize // numberofpartitions
    # How many chunks need an extra 1 added to the size?
    remainder = totalsize - chunksize * numberofpartitions
    a = 0
    for i in range(numberofpartitions):
        b = a + chunksize + (i < remainder)
        # Yield the inclusive-inclusive range
        yield (a, b - 1)
        a = b
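Restated as a self-contained check, this reproduces the question's expected output for `partitionIndexes(105, 10)`: the first five chunks get the extra index, the rest do not.

```python
def partitionIndexes(totalsize, numberofpartitions):
    chunksize = totalsize // numberofpartitions
    remainder = totalsize - chunksize * numberofpartitions
    a = 0
    for i in range(numberofpartitions):
        # The first `remainder` chunks are one index longer
        b = a + chunksize + (i < remainder)
        yield (a, b - 1)
        a = b

pairs = list(partitionIndexes(105, 10))
print(pairs[0], pairs[4], pairs[5], pairs[-1])  # (0, 10) (44, 54) (55, 64) (95, 104)
```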

Upvotes: 2

FGreg

Reputation: 15330

My trivial implementation expands upon this answer to the linked question and abuses the reduce built-in.

from functools import reduce

def partition(totalsize, n):
    lst = list(range(totalsize))
    chunks = [lst[i::n] for i in range(n)]
    indices = reduce(reducechunks, chunks)
    return indices


def reducechunks(listoftuples, nextchunk):
    if listoftuples[0] == 0:
        # This is the first chunk (still a raw list of indices starting at 0);
        # convert it into the initial list containing one tuple
        listoftuples = [(0, len(listoftuples) - 1)]

    # Start of this tuple is the end of the last one plus 1
    start = listoftuples[-1][1] + 1

    # End of this tuple is the start plus the length of the current chunk minus 1
    end = start + len(nextchunk) - 1

    # Append this tuple to the list of tuples to be passed to the next iteration
    listoftuples.append((start, end))
    return listoftuples

One limitation of this implementation is that it generates a list of all indexes.
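Restated as one self-contained snippet, this also reproduces the question's example:

```python
from functools import reduce

def reducechunks(listoftuples, nextchunk):
    if listoftuples[0] == 0:
        # First chunk: convert the raw index list into the initial tuple list
        listoftuples = [(0, len(listoftuples) - 1)]
    start = listoftuples[-1][1] + 1
    listoftuples.append((start, start + len(nextchunk) - 1))
    return listoftuples

def partition(totalsize, n):
    lst = list(range(totalsize))
    return reduce(reducechunks, [lst[i::n] for i in range(n)])

result = partition(105, 10)
print(result[:3], result[-1])  # [(0, 10), (11, 21), (22, 32)] (95, 104)
```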

Upvotes: 0
