Deathcrush

Reputation: 77

Creating batches with itertools.islice

I am trying to use the itertools.combinations and itertools.islice functions to create a number of batches on which computations can be performed in parallel. I use the following function to create my batches:

import itertools
import math
from scipy.special import comb

def construct_batches(n, k, batch_size):

    combinations_slices = []

    # Calculate the number of batches
    n_batches = math.ceil(comb(n, k, exact=True) / batch_size)

    # Construct an iterator for the combinations
    combinations = itertools.combinations(range(n), k)

    while len(combinations_slices) < n_batches:
        combinations_slices.append(itertools.islice(combinations, batch_size))

    return combinations_slices

After performing some computations, I find out which batches and elements are relevant. So I have a list of batches (e.g. batches = [2,3,1]) and a list of elements (e.g. elements = [5,7,0]). To my amazement/horror Python has the following behaviour. Suppose I want to check if my slices are correct. Then

combinations_slices = construct_batches(n,k,batch_size)

list(combinations_slices[0])
Out[491]: 
[(0, 1, 2, 3),
 (0, 1, 2, 4),
 (0, 1, 2, 5),
 (0, 1, 2, 6),
 (0, 1, 2, 7),
 (0, 1, 2, 8),
 (0, 1, 2, 9),
 (0, 1, 3, 4),
 (0, 1, 3, 5),
 (0, 1, 3, 6)]

list(combinations_slices[1])
Out[492]: 
[(0, 1, 3, 7),
 (0, 1, 3, 8),
 (0, 1, 3, 9),
 (0, 1, 4, 5),
 (0, 1, 4, 6),
 (0, 1, 4, 7),
 (0, 1, 4, 8),
 (0, 1, 4, 9),
 (0, 1, 5, 6),
 (0, 1, 5, 7)]

This is all nice and jolly, and shows the approach has worked. However, if I use a list comprehension to select the "relevant" batches as combinations_slices = [combinations_slices[i] for i in range(len(combinations_slices)) if i in batches], then the output is (sadly):

combinations_slices = construct_batches(n,k,batch_size)

batches = [2,3,1]

combinations_slices = [combinations_slices[i] for i in range(len(combinations_slices)) if i in batches]

list(combinations_slices[0])
Out[509]: 
[(0, 1, 2, 3),
 (0, 1, 2, 4),
 (0, 1, 2, 5),
 (0, 1, 2, 6),
 (0, 1, 2, 7),
 (0, 1, 2, 8),
 (0, 1, 2, 9),
 (0, 1, 3, 4),
 (0, 1, 3, 5),
 (0, 1, 3, 6)]

list(combinations_slices[1])
Out[510]: 
[(0, 1, 3, 7),
 (0, 1, 3, 8),
 (0, 1, 3, 9),
 (0, 1, 4, 5),
 (0, 1, 4, 6),
 (0, 1, 4, 7),
 (0, 1, 4, 8),
 (0, 1, 4, 9),
 (0, 1, 5, 6),
 (0, 1, 5, 7)]

Is there any way to obtain the desired behaviour without casting everything to lists (in general these lists of combinations could be large so I would run out of memory...)? Suggestions appreciated...
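For context, the behaviour can be reproduced with a minimal sketch using plain itertools: an islice does not record a starting position, so whichever slice is consumed first simply takes the next items from the shared underlying iterator.

```python
import itertools

it = iter(range(10))
first = itertools.islice(it, 3)
second = itertools.islice(it, 3)

# Consumption order, not creation order, decides the contents:
print(list(second))  # [0, 1, 2]
print(list(first))   # [3, 4, 5]
```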

Upvotes: 1

Views: 1694

Answers (1)

Kim

Reputation: 1664

I'm a little confused by your code and description, but here are some pointers.

There are a couple of good batching tools in the more-itertools library. Take a look at chunked and ichunked in the grouping section. You'll need to pip install more-itertools to make them available to your code.

ichunked produces (lazy) islices and returns a generator, so it should not use much memory. But once you've read an islice or consumed the output of the generator, they're exhausted and can't be iterated over again.

from more_itertools import ichunked

numbers = range(27)
for batch in ichunked(numbers, 5):
    print(batch)
print(ichunked(numbers, 5))

Output (ichunked):

<itertools.islice object at 0x0000020BF63DD4F0>
<itertools.islice object at 0x0000020BF4519130>
<itertools.islice object at 0x0000020BF63DD4F0>
<itertools.islice object at 0x0000020BF4519130>
<itertools.islice object at 0x0000020BF63DD4F0>
<generator object ichunked at 0x00000261AFED7AC0>
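For reference, here is a rough stdlib-only sketch of what ichunked-style lazy batching does (the helper name lazy_batches is illustrative, not part of any library). Note the caveat: each batch must be fully consumed before advancing to the next, otherwise the batches would overlap, since they all draw from one shared iterator.

```python
import itertools

def lazy_batches(iterable, size):
    """Yield lazy batches over one shared iterator, ichunked-style."""
    it = iter(iterable)
    while True:
        # Peek one element to detect exhaustion of the source.
        try:
            head = next(it)
        except StopIteration:
            return
        # Each batch is the peeked element plus the next size - 1 items.
        yield itertools.chain([head], itertools.islice(it, size - 1))

for batch in lazy_batches(range(12), 5):
    print(list(batch))  # prints [0, 1, 2, 3, 4], then [5, 6, 7, 8, 9], then [10, 11]
```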

chunked may be more useful if you need to analyse batches and then later submit them to your parallel computation. It will produce lists that you can read more than once, but as they're lists, they will use much more memory:

from more_itertools import chunked

numbers = range(27)
for batch in chunked(numbers, 5):
    print(batch)
print(chunked(numbers, 5))

Output (chunked):

[0, 1, 2, 3, 4]
[5, 6, 7, 8, 9]
[10, 11, 12, 13, 14]
[15, 16, 17, 18, 19]
[20, 21, 22, 23, 24]
[25, 26]
<callable_iterator object at 0x000002163A25E370>

chunked itself returns a callable iterator, and the lists it yields provide random access to their elements. This is useful, but uses a lot of RAM. You could get around the large memory footprint of the chunked option by using ichunked and performing two passes over the same data. Something like:

def selected_batches():
    for batch_index, batchiter in enumerate(ichunked(numbers, 5)):
        if fulfils_criteria(batchiter):
            yield batch_index


def submit_to_parallel_computation():
    selected_batch_indexes = selected_batches()
    try:
        selected_batch_index = next(selected_batch_indexes)
        for batch_index, batchiter in enumerate(ichunked(numbers, 5)):
            if batch_index == selected_batch_index:
                add_to_parallel_work_queue(batchiter)
                # add_to_parallel_work_queue(list(batchiter))
                selected_batch_index = next(selected_batch_indexes)
    except StopIteration:
        # no more selected batches
        pass
    wait_for_results()
This is a good match if the analysis/chunking part of the computation must use as little memory as possible and is a much smaller part of the overall computational task than the parallel computation you're submitting these batches to. It also relies on the original data source itself being something you can iterate over more than once.

If this is not the case (so you can't re-use the input data) you might consider creating an intermediate file (e.g. using the csv module from the standard library).
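A minimal sketch of that intermediate-file idea, assuming the rows are integer combination tuples (write_rows and read_batch are illustrative names, not library functions). Writing streams one row at a time, and reading uses islice over the csv reader, so neither pass holds the full data in memory.

```python
import csv
import itertools

def write_rows(rows, path):
    """Stream rows (e.g. combination tuples) to a CSV file one at a time."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(row)

def read_batch(path, batch_index, batch_size):
    """Re-read a single batch of rows without loading the whole file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        start = batch_index * batch_size
        return [tuple(map(int, row))
                for row in itertools.islice(reader, start, start + batch_size)]
```

For example, write_rows(itertools.combinations(range(10), 4), "combos.csv") followed by read_batch("combos.csv", 1, 10) returns the same ten tuples as the second batch in the question, starting at (0, 1, 3, 7).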

Upvotes: 2
