user248237
user248237

Reputation:

Splitting a list into N parts of approximately equal length

What is the best way to divide a list into roughly equal parts? For example, if the list has 7 elements and is split it into 2 parts, we want to get 3 elements in one part, and the other should have 4 elements.

I'm looking for something like even_split(L, n) that breaks L into n parts.

def chunks(L, n):
    """ Yield successive n-sized chunks from L.
    """
    for i in range(0, len(L), n):
        yield L[i:i+n]

The code above gives chunks of 3, rather than 3 chunks. I could simply transpose (iterate over this and take the first element of each column, call that part one, then take the second and put it in part two, etc), but that destroys the ordering of the items.

Upvotes: 251

Views: 428178

Answers (30)

bitagoras
bitagoras

Reputation: 506

This will do the split into equally sized parts by one single expression:

myList = list(range(18))  # given list
N = 5  # desired number of parts

[myList[(i*len(myList))//N:((i+1)*len(myList))//N] for i in range(N)]
# [[0, 1, 2], [3, 4, 5, 6], [7, 8, 9], [10, 11, 12, 13], [14, 15, 16, 17]]

The parts will differ by not more than one element. The split of a list with 18 elements into 5 parts results in chunks of 3 + 4 + 3 + 4 + 4 = 18 elements where the two different sizes appear in alternating order.

An alternative code results in a slightly different partitioning with first all larger parts followed by the smaller parts:

n, m = divmod(len(myList), N)
[myList[i*n+min(i,m):(i+1)*n+min(i+1,m)] for i in range(N)]
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14], [15, 16, 17]]

The chunk sizes are now 4 + 4 + 4 + 3 + 3 = 18. The same result can be achieved by the function numpy.array_split(myList, N) from the numpy module.

Upvotes: 15

meferne
meferne

Reputation: 369

The same behavior like numpy.array_split written as a generator, and fast to understand:

def split(a, n):
    k, m = divmod(len(a), n)
    for i in range(n):
        size = k + 1 if i < m else k
        yield a[:size]
        a = a[size:]

Upvotes: 0

m3trik
m3trik

Reputation: 333

Here's a single function that handles most of the various split cases:

def splitList(lst, into):
    '''Split a list into parts.

    :Parameters:
        into (str) = Split the list into parts defined by the following:
            '<n>parts' - Split the list into n parts.
                ex. 2 returns:  [[1, 2, 3, 5], [7, 8, 9]] from [1,2,3,5,7,8,9]
            '<n>parts+' - Split the list into n equal parts with any trailing remainder.
                ex. 2 returns:  [[1, 2, 3], [5, 7, 8], [9]] from [1,2,3,5,7,8,9]
            '<n>chunks' - Split into sublists of n size.
                ex. 2 returns: [[1,2], [3,5], [7,8], [9]] from [1,2,3,5,7,8,9]
            'contiguous' - The list will be split by contiguous numerical values.
                ex. 'contiguous' returns: [[1,2,3], [5], [7,8,9]] from [1,2,3,5,7,8,9]
            'range' - The values of 'contiguous' will be limited to the high and low end of each range.
                ex. 'range' returns: [[1,3], [5], [7,9]] from [1,2,3,5,7,8,9]
    :Return:
        (list)
    '''
    from string import digits, ascii_letters, punctuation
    mode = into.lower().lstrip(digits)
    digit = into.strip(ascii_letters+punctuation)
    n = int(digit) if digit else None

    if n:
        if mode=='parts':
            n = len(lst)*-1 // n*-1 #ceil
        elif mode=='parts+':
            n = len(lst) // n
        return [lst[i:i+n] for i in range(0, len(lst), n)]

    elif mode=='contiguous' or mode=='range':
        from itertools import groupby
        from operator import itemgetter

        try:
            contiguous = [list(map(itemgetter(1), g)) for k, g in groupby(enumerate(lst), lambda x: int(x[0])-int(x[1]))]
        except ValueError as error:
            print ('{} in splitList\n   # Error: {} #\n {}'.format(__file__, error, lst))
            return lst
        if mode=='range':
            return [[i[0], i[-1]] if len(i)>1 else (i) for i in contiguous]
        return contiguous

r = splitList([1, '2', 3, 5, '7', 8, 9], into='2parts')
print (r) #returns: [[1, '2', 3, 5], ['7', 8, 9]]

Upvotes: 0

shmsi
shmsi

Reputation: 1078

The other solutions seem to be a bit long. Here is a one-liner using list comprehension and the NumPy function array_split. array_split(list, n) will simply split the list into n parts.

[x.tolist() for x in np.array_split(range(10), 3)]

Upvotes: 6

wim
wim

Reputation: 363294

This is the raison d'être for numpy.array_split*:

>>> import numpy as np
>>> print(*np.array_split(range(10), 3))
[0 1 2 3] [4 5 6] [7 8 9]
>>> print(*np.array_split(range(10), 4))
[0 1 2] [3 4 5] [6 7] [8 9]
>>> print(*np.array_split(range(10), 5))
[0 1] [2 3] [4 5] [6 7] [8 9]

*credit to Zero Piraeus in room 6

Upvotes: 287

GusW
GusW

Reputation: 1

def chunkify(target_list, chunk_size):
    return [target_list[i:i+chunk_size] for i in range(0, len(target_list), chunk_size)]

>>> l = [5432, 432, 67, "fdas", True, True, False, (4324,131), 876, "ofsa", 8, 909, b'765']
>>> print(chunkify(l, 3))
>>> [[5432, 432, 67], ['fdas', True, True], [False, (4324, 131), 876], ['ofsa', 8, 909], [b'765']]

Upvotes: 0

tixxit
tixxit

Reputation: 4122

You can write it fairly simply as a list generator:

def split(a, n):
    k, m = divmod(len(a), n)
    return (a[i*k+min(i, m):(i+1)*k+min(i+1, m)] for i in range(n))

Example:

>>> list(split(range(11), 3))
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10]]

Upvotes: 342

Shreyash Gorai
Shreyash Gorai

Reputation: 9

n = len(lst)
# p is the number of parts to be divided
x = int(n/p)

i = 0
j = x
lstt = []
while (i< len(lst) or j <len(lst)):
    lstt.append(lst[i:j])
    i+=x
    j+=x
print(lstt)

This is the simplest answer if it is known that the list divides into equal parts.

Upvotes: -1

Petr Vepřek
Petr Vepřek

Reputation: 1648

Another attempt at simple readable chunker that works.

def chunk(iterable, count): # returns a *generator* that divides `iterable` into `count` of contiguous chunks of similar size
    assert count >= 1
    return (iterable[int(_*len(iterable)/count+0.5):int((_+1)*len(iterable)/count+0.5)] for _ in range(count))

print("Chunk count:  ", len(list(         chunk(range(105),10))))
print("Chunks:       ",     list(         chunk(range(105),10)))
print("Chunks:       ",     list(map(list,chunk(range(105),10))))
print("Chunk lengths:",     list(map(len, chunk(range(105),10))))

print("Testing...")
for iterable_length in range(100):
    for chunk_count in range(1,100):
        chunks = list(chunk(range(iterable_length),chunk_count))
        assert chunk_count == len(chunks)
        assert iterable_length == sum(map(len,chunks))
        assert all(map(lambda _:abs(len(_)-iterable_length/chunk_count)<=1,chunks))
print("Okay")

Outputs:

Chunk count:   10
Chunks:        [range(0, 11), range(11, 21), range(21, 32), range(32, 42), range(42, 53), range(53, 63), range(63, 74), range(74, 84), range(84, 95), range(95, 105)]
Chunks:        [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], [32, 33, 34, 35, 36, 37, 38, 39, 40, 41], [42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52], [53, 54, 55, 56, 57, 58, 59, 60, 61, 62], [63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73], [74, 75, 76, 77, 78, 79, 80, 81, 82, 83], [84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94], [95, 96, 97, 98, 99, 100, 101, 102, 103, 104]]
Chunk lengths: [11, 10, 11, 10, 11, 10, 11, 10, 11, 10]
Testing...
Okay

Upvotes: -2

Solal Baudoin
Solal Baudoin

Reputation: 3

def chunk_array(array : List, n: int) -> List[List]:
    chunk_size = len(array) // n 
    chunks = []
    i = 0
    while i < len(array):
        # if less than chunk_size left add the remainder to last element
        if len(array) - (i + chunk_size + 1) < 0:
            chunks[-1].append(*array[i:i + chunk_size])
            break
        else:
            chunks.append(array[i:i + chunk_size])
            i += chunk_size
    return chunks

here's my version (inspired from Max's)

Upvotes: -1

Argus Waikhom
Argus Waikhom

Reputation: 607

1>

import numpy as np

data # your array

total_length = len(data)
separate = 10
sub_array_size = total_length // separate
safe_separate = sub_array_size * separate

splited_lists = np.split(np.array(data[:safe_separate]), separate)
splited_lists[separate - 1] = np.concatenate(splited_lists[separate - 1], 
np.array(data[safe_separate:total_length]))

splited_lists # your output

2>

splited_lists = np.array_split(np.array(data), separate)

Upvotes: 1

Ali Sajjad Rizavi
Ali Sajjad Rizavi

Reputation: 4510

Let's say you want to split a list [1, 2, 3, 4, 5, 6, 7, 8] into 3 element lists

like [[1,2,3], [4, 5, 6], [7, 8]], where if the last remaining elements left are less than 3, they are grouped together.

my_list = [1, 2, 3, 4, 5, 6, 7, 8]
my_list2 = [my_list[i:i+3] for i in range(0, len(my_list), 3)]
print(my_list2)

Output: [[1,2,3], [4, 5, 6], [7, 8]]

Where length of one part is 3. Replace 3 with your own chunk size.

Upvotes: 6

Max Shawabkeh
Max Shawabkeh

Reputation: 38643

This code is broken due to rounding errors. Do not use it!!!

assert len(chunkIt([1,2,3], 10)) == 10  # fails

Here's one that could work:

def chunkIt(seq, num):
    avg = len(seq) / float(num)
    out = []
    last = 0.0

    while last < len(seq):
        out.append(seq[int(last):int(last + avg)])
        last += avg

    return out

Testing:

>>> chunkIt(range(10), 3)
[[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
>>> chunkIt(range(11), 3)
[[0, 1, 2], [3, 4, 5, 6], [7, 8, 9, 10]]
>>> chunkIt(range(12), 3)
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

Upvotes: 78

Minions
Minions

Reputation: 5477

If you don't mind that the order will be changed, I recommend you to use @job solution, otherwise, you can use this:

def chunkIt(seq, num):
    steps = int(len(seq) / float(num))
    out = []
    last = 0.0

    while last < len(seq):
        if len(seq) - (last + steps) < steps:
            until = len(seq)
            steps = len(seq) - last
        else:
            until = int(last + steps)
        out.append(seq[int(last): until])
        last += steps
return out

Upvotes: -2

emremrah
emremrah

Reputation: 1765

def evenly(l, n):
    len_ = len(l)
    split_size = len_ // n
    split_size = n if not split_size else split_size
    offsets = [i for i in range(0, len_, split_size)]
    return [l[offset:offset + split_size] for offset in offsets]

Example:

l = [a for a in range(97)] should be consist of 10 parts, each have 9 elements except the last one.

Output:

[[0, 1, 2, 3, 4, 5, 6, 7, 8],
 [9, 10, 11, 12, 13, 14, 15, 16, 17],
 [18, 19, 20, 21, 22, 23, 24, 25, 26],
 [27, 28, 29, 30, 31, 32, 33, 34, 35],
 [36, 37, 38, 39, 40, 41, 42, 43, 44],
 [45, 46, 47, 48, 49, 50, 51, 52, 53],
 [54, 55, 56, 57, 58, 59, 60, 61, 62],
 [63, 64, 65, 66, 67, 68, 69, 70, 71],
 [72, 73, 74, 75, 76, 77, 78, 79, 80],
 [81, 82, 83, 84, 85, 86, 87, 88, 89],
 [90, 91, 92, 93, 94, 95, 96]]

Upvotes: 1

&#194;ngelo Polotto
&#194;ngelo Polotto

Reputation: 9531

I tried most part of solutions, but they didn't work for my case, so I make a new function that work for most of cases and for any type of array:

import math

def chunkIt(seq, num):
    seqLen = len(seq)
    total_chunks = math.ceil(seqLen / num)
    items_per_chunk = num
    out = []
    last = 0

    while last < seqLen:
        out.append(seq[last:(last + items_per_chunk)])
        last += items_per_chunk

    return out

Upvotes: -1

Anthony Manning-Franklin
Anthony Manning-Franklin

Reputation: 5028

This one provides chunks of length <= n, >= 0

def

 chunkify(lst, n):
    num_chunks = int(math.ceil(len(lst) / float(n))) if n < len(lst) else 1
    return [lst[n*i:n*(i+1)] for i in range(num_chunks)]

for example

>>> chunkify(range(11), 3)
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10]]
>>> chunkify(range(11), 8)
[[0, 1, 2, 3, 4, 5, 6, 7], [8, 9, 10]]

Upvotes: -1

Cipher.Chen
Cipher.Chen

Reputation: 43

I've written code in this case myself:

def chunk_ports(port_start, port_end, portions):
    if port_end < port_start:
        return None

    total = port_end - port_start + 1

    fractions = int(math.floor(float(total) / portions))

    results = []

    # No enough to chuck.
    if fractions < 1:
        return None

    # Reverse, so any additional items would be in the first range.
    _e = port_end
    for i in range(portions, 0, -1):
        print "i", i

        if i == 1:
            _s = port_start
        else:
            _s = _e - fractions + 1

        results.append((_s, _e))

        _e = _s - 1

    results.reverse()

    return results

divide_ports(1, 10, 9) would return

[(1, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9), (10, 10)]

Upvotes: -1

swateek
swateek

Reputation: 7580

#!/usr/bin/python


first_names = ['Steve', 'Jane', 'Sara', 'Mary','Jack','Bob', 'Bily', 'Boni', 'Chris','Sori', 'Will', 'Won','Li']

def chunks(l, n):
for i in range(0, len(l), n):
    # Create an index range for l of n items:
    yield l[i:i+n]

result = list(chunks(first_names, 5))
print result

Picked from this link, and this was what helped me. I had a pre-defined list.

Upvotes: 1

pylang
pylang

Reputation: 44565

See more_itertools.divide:

n = 2

[list(x) for x in mit.divide(n, range(5, 11))]
# [[5, 6, 7], [8, 9, 10]]

[list(x) for x in mit.divide(n, range(5, 12))]
# [[5, 6, 7, 8], [9, 10, 11]]

Install via > pip install more_itertools.

Upvotes: 10

Chłop Z Lasu
Chłop Z Lasu

Reputation: 131

My solution, easy to understand

def split_list(lst, n):
    splitted = []
    for i in reversed(range(1, n + 1)):
        split_point = len(lst)//i
        splitted.append(lst[:split_point])
        lst = lst[split_point:]
    return splitted

And shortest one-liner on this page(written by my girl)

def split(l, n):
    return [l[int(i*len(l)/n):int((i+1)*len(l)/n-1)] for i in range(n)]

Upvotes: 3

job
job

Reputation: 9273

As long as you don't want anything silly like continuous chunks:

>>> def chunkify(lst,n):
...     return [lst[i::n] for i in xrange(n)]
... 
>>> chunkify(range(13), 3)
[[0, 3, 6, 9, 12], [1, 4, 7, 10], [2, 5, 8, 11]]

Upvotes: 139

PM 2Ring
PM 2Ring

Reputation: 55499

Here's a generator that can handle any positive (integer) number of chunks. If the number of chunks is greater than the input list length some chunks will be empty. This algorithm alternates between short and long chunks rather than segregating them.

I've also included some code for testing the ragged_chunks function.

''' Split a list into "ragged" chunks

    The size of each chunk is either the floor or ceiling of len(seq) / chunks

    chunks can be > len(seq), in which case there will be empty chunks

    Written by PM 2Ring 2017.03.30
'''

def ragged_chunks(seq, chunks):
    size = len(seq)
    start = 0
    for i in range(1, chunks + 1):
        stop = i * size // chunks
        yield seq[start:stop]
        start = stop

# test

def test_ragged_chunks(maxsize):
    for size in range(0, maxsize):
        seq = list(range(size))
        for chunks in range(1, size + 1):
            minwidth = size // chunks
            #ceiling division
            maxwidth = -(-size // chunks)
            a = list(ragged_chunks(seq, chunks))
            sizes = [len(u) for u in a]
            deltas = all(minwidth <= u <= maxwidth for u in sizes)
            assert all((sum(a, []) == seq, sum(sizes) == size, deltas))
    return True

if test_ragged_chunks(100):
    print('ok')

We can make this slightly more efficient by exporting the multiplication into the range call, but I think the previous version is more readable (and DRYer).

def ragged_chunks(seq, chunks):
    size = len(seq)
    start = 0
    for i in range(size, size * chunks + 1, size):
        stop = i // chunks
        yield seq[start:stop]
        start = stop

Upvotes: 7

leotrubach
leotrubach

Reputation: 1597

Here is my solution:

def chunks(l, amount):
    if amount < 1:
        raise ValueError('amount must be positive integer')
    chunk_len = len(l) // amount
    leap_parts = len(l) % amount
    remainder = amount // 2  # make it symmetrical
    i = 0
    while i < len(l):
        remainder += leap_parts
        end_index = i + chunk_len
        if remainder >= amount:
            remainder -= amount
            end_index += 1
        yield l[i:end_index]
        i = end_index

Produces

    >>> list(chunks([1, 2, 3, 4, 5, 6, 7], 3))
    [[1, 2], [3, 4, 5], [6, 7]]

Upvotes: 5

Carlos del Ojo
Carlos del Ojo

Reputation: 29

You could also use:

split=lambda x,n: x if not x else [x[:n]]+[split([] if not -(len(x)-n) else x[-(len(x)-n):],n)][0]

split([1,2,3,4,5,6,7,8,9],2)

[[1, 2], [3, 4], [5, 6], [7, 8], [9]]

Upvotes: 1

galliwuzz
galliwuzz

Reputation: 369

Rounding the linspace and using it as an index is an easier solution than what amit12690 proposes.

function chunks=chunkit(array,num)

index = round(linspace(0,size(array,2),num+1));

chunks = cell(1,num);

for x = 1:num
chunks{x} = array(:,index(x)+1:index(x+1));
end
end

Upvotes: -1

MaPePeR
MaPePeR

Reputation: 983

If you divide n elements into roughly k chunks you can make n % k chunks 1 element bigger than the other chunks to distribute the extra elements.

The following code will give you the length for the chunks:

[(n // k) + (1 if i < (n % k) else 0) for i in range(k)]

Example: n=11, k=3 results in [4, 4, 3]

You can then easily calculate the start indizes for the chunks:

[i * (n // k) + min(i, n % k) for i in range(k)]

Example: n=11, k=3 results in [0, 4, 8]

Using the i+1th chunk as the boundary we get that the ith chunk of list l with len n is

l[i * (n // k) + min(i, n % k):(i+1) * (n // k) + min(i+1, n % k)]

As a final step create a list from all the chunks using list comprehension:

[l[i * (n // k) + min(i, n % k):(i+1) * (n // k) + min(i+1, n % k)] for i in range(k)]

Example: n=11, k=3, l=range(n) results in [range(0, 4), range(4, 8), range(8, 11)]

Upvotes: 29

liscju
liscju

Reputation: 99

Using list comprehension:

def divide_list_to_chunks(list_, n):
    return [list_[start::n] for start in range(n)]

Upvotes: 7

amit12690
amit12690

Reputation: 167

Implementation using numpy.linspace method.

Just specify the number of parts you want the array to be divided in to.The divisions will be of nearly equal size.

Example :

import numpy as np   
a=np.arange(10)
print "Input array:",a 
parts=3
i=np.linspace(np.min(a),np.max(a)+1,parts+1)
i=np.array(i,dtype='uint16') # Indices should be floats
split_arr=[]
for ind in range(i.size-1):
    split_arr.append(a[i[ind]:i[ind+1]]
print "Array split in to %d parts : "%(parts),split_arr

Gives :

Input array: [0 1 2 3 4 5 6 7 8 9]
Array split in to 3 parts :  [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8, 9])]

Upvotes: 3

Ilya Tuvschev
Ilya Tuvschev

Reputation: 179

The same as job's answer, but takes into account lists with size smaller than the number of chuncks.

def chunkify(lst,n):
    [ lst[i::n] for i in xrange(n if n < len(lst) else len(lst)) ]

if n (number of chunks) is 7 and lst (the list to divide) is [1, 2, 3] the chunks are [[0], [1], [2]] instead of [[0], [1], [2], [], [], [], []]

Upvotes: 1

Related Questions