Implementation of memory-efficient ThreadPool for large-scale itertools product

Question

I have a dictionary of 15 variables, each with 3 values, for which I need to generate a product of all possible combinations (3**15 = 14.3M combinations). I'm using multi-threading with a 12 core processor to process the combinations (likely jumping to 64 cores).

I'm using itertools.product to generate the different combinations, and ThreadPool with imap_unordered to run the multiprocessing. Additionally, I'm using deque to remove the result as soon as it's available. However, I'm finding that the memory consumption is blowing up to about 2.5GB. I understand that the itertools.product is an iterable and therefore should not be storing much data in memory, but that doesn't seem to be the case.

Below is my code, and I'm wondering if anyone can help me figure out how I can better optimize the memory utilization.

Additionally, I'm wondering how the chunk size in the imap_unordered plays a role in memory efficiency. I was trying different numbers to see how it effects memory usage (including 10, 100, 1000, 10000) but it doesn't seem to impact much other than stabilizing the memory utilization at around 2.5GB. If I don't include the chunk size, memory tends to blow up >5GB.

I also tried changing the number of threads from 12 to 1, and that also did not impact the memory usage. However, using the single-processor implementation (commented out below) reduces the memory usage to only ~30MB.

import numpy as np
import itertools
import multiprocessing
import queue
import functools
from multiprocessing import pool, dummy

def dummy_func(values, keys):
    print( dict(zip(keys, values)) )
    return

def main():
    num_threads = multiprocessing.cpu_count()

    parameters = {'a': ['7.0', '9.0', '11.0'], 'b': ['125p', '200p', '275p'], 
                  'c': ['320n', '440n', '560n'], 'd': ['400p', '500p', '600p'], 
                  'e': ['262p', '374p', '486p'], 'f': ['13p', '25p', '37p'], 
                  'g': ['19p', '40p', '61p'], 'h': ['7p', '16p', '22p'], 
                  'i': ['7p', '16p', '22p'], 
                  'j': ['0.7200000000000004', '1.1500000000000008', '1.5700000000000012'], 
                  'k': ['4', '8', '11'], 'l': ['41', '77', '113'], 'm': ['4', '8', '11'], 
                  'n': ['16p', '31p', '46p'], 'o': ['20n', '30n', '35n']}
    keys = list(parameters)

    # process simulations for all permutations using single process
    #for values in itertools.product(*map(parameters.get, keys)):
    #    dummy_func(values, keys)

    # process simulations for all permutations using multi-threading
    with multiprocessing.pool.ThreadPool(num_threads) as workers:
        queue.deque(workers.imap_unordered(functools.partial(dummy_func, keys=keys), 
                                           itertools.product(*map(parameters.get, keys)), 100))
    return

if __name__ == "__main__":
    main()

Implementation of memory-efficient ThreadPool for large-scale itertools product

Answers (1)

Related Questions