Reputation: 33
I have this code:
import itertools
import string

def loop():
    alphabet = string.digits + string.letters
    for key in itertools.product(alphabet, repeat=6):
        ...
I am using 4 processes using this code:
if __name__ == '__main__':
    jobs = []
    for i in range(4):
        p = multiprocessing.Process(target=loop)
        jobs.append(p)
        p.start()
This just runs the entire function four times. I need to split the workload into four parts and have each process handle its own part, so in this case I need to split the keys I'm generating into four groups. For example:
Process 1 workload
100,101,102,103
Process 2 workload
104,105,106,107
Process 3 workload
108,109,110,111
Process 4 workload
112,113,114,115
I think you should understand what I want to do..
I tried having each process loop through everything and throw away the items that aren't its own, but that gets very slow for long keys. With 1,000,000 items, process 4 would loop 750,000 times without doing anything and then process the last 250,000; process 3 would loop 500,000 times, process the next 250,000, and finish at 750,000. That's a lot of wasted computing power.
Upvotes: 3
Views: 1825
Reputation: 47840
You need to divide the workload beforehand and pass it in to your function when you call Process. Generally speaking, this can be a hard problem, but in your case it's pretty trivial since you're just generating Cartesian products: simply slice off the first character and attach it separately.
That is, instead of generating repeat=6, use repeat=5 and iterate through the possibilities for the first letter yourself, passing each to a separate process.
For example:
def loop(first, sequence):
    for seq in sequence:
        # product() yields tuples, so join them before prepending the first character
        key = first + ''.join(seq)
        ....
and call it with:
alphabet = ...
for letter in alphabet:
    p = Process(target=loop, args=(letter, itertools.product(alphabet, repeat=5)))
    # etc.
This will spawn one process per letter in your alphabet; you could do exactly four splits or other things like that by passing ranges for the first character, too.
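To make the "exactly four splits" variant concrete, here is a minimal sketch, not a definitive implementation. It assumes Python 3 (where string.letters is spelled string.ascii_letters), and split_evenly is a hypothetical helper introduced only for this illustration. Each worker receives a range of first characters and builds the rest of the product itself, so no iterator has to be handed across the process boundary. The demo uses a key length of 3 so it finishes quickly; the question uses 6.

```python
import itertools
import multiprocessing
import string


def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def loop(first_chars, alphabet, length):
    """Visit every key of the given length whose first character is in first_chars."""
    count = 0
    for first in first_chars:
        # Build the remaining positions inside the worker itself.
        for rest in itertools.product(alphabet, repeat=length - 1):
            key = first + "".join(rest)
            count += 1  # placeholder for the real per-key work
    return count


if __name__ == "__main__":
    alphabet = string.digits + string.ascii_letters
    jobs = []
    for chunk in split_evenly(alphabet, 4):
        # length=3 is an assumption to keep the demo fast
        p = multiprocessing.Process(target=loop, args=(chunk, alphabet, 3))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()
```

With the 62-character alphabet this gives chunks of 16, 16, 15 and 15 first characters, so no process iterates over keys it then throws away.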
Upvotes: 1
Reputation: 56720
It sounds like each task only requires a small amount of data, so try using multiprocessing.Pool to create a pool of workers. It will start a pool of worker processes, and send a chunk of items to each worker. Use something like imap_unordered to map all the input combinations to their results.
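A rough sketch of what that could look like, assuming Python 3. The check function and its "abc" comparison are hypothetical stand-ins for whatever the real per-key work is, and find_matches is a name made up for this example; the keys are shortened to 3 lowercase letters so the demo runs quickly.

```python
import itertools
import multiprocessing
import string


def check(key_chars):
    """Hypothetical per-key task: join the characters and return (key, result)."""
    key = "".join(key_chars)
    return key, key == "abc"  # placeholder test, stands in for the real work


def find_matches(alphabet, length, workers=4, chunksize=1000):
    """Return the keys for which check() reports True, using a worker pool."""
    keys = itertools.product(alphabet, repeat=length)
    with multiprocessing.Pool(processes=workers) as pool:
        # The parent hands keys to the workers in batches of `chunksize`;
        # results come back in whatever order the workers finish.
        results = pool.imap_unordered(check, keys, chunksize=chunksize)
        return [key for key, matched in results if matched]


if __name__ == "__main__":
    print(find_matches(string.ascii_lowercase, 3))
```

Note that the parent process still generates every key here, so if the per-key work is very cheap, a larger chunksize (or the first-character split from the other answer) may be needed to keep the workers busy.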
Upvotes: 0