user2578185

Reputation: 437

Multiprocessing in Python - how to use it in these loops?

The code below handles a huge amount of data, and I would like to know how I can use the multiprocessing module in Python to parallelize it and speed things up. Any help is appreciated.

import itertools
import operator

def code_counter(patients, codes):
    # patients must be sorted by 'ID' for groupby to group records correctly
    for key, group in itertools.groupby(patients, key=operator.itemgetter('ID')):
        group_codes = [item['CODE'] for item in group]
        yield [group_codes.count(code) for code in codes]

pats = []
for chunk in code_counter(patients, codes):
    pats.append(chunk)

Upvotes: 0

Views: 73

Answers (1)

dustin.b

Reputation: 1275

I think your problem lies in the use of yield: you can't yield data across process boundaries. As I understand it, you use yield because loading all the data at once would overload the RAM.

Maybe you can take a look at the multiprocessing Queue: http://docs.python.org/2/library/multiprocessing.html#exchanging-objects-between-processes

I didn't really get what you are trying to do with your code, so I can't give a precise example.
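That said, here is a rough sketch of the general idea rather than a drop-in solution: if the per-patient groups are independent, a multiprocessing.Pool can spread the counting work across worker processes. The helper count_codes_for_group and the name code_counter_parallel are my own, and I'm assuming patients is already sorted by 'ID', as your groupby requires.

import itertools
import operator
from multiprocessing import Pool

def count_codes_for_group(args):
    # Unpack one (group_codes, codes) pair and count each code's occurrences.
    group_codes, codes = args
    return [group_codes.count(code) for code in codes]

def code_counter_parallel(patients, codes, processes=4):
    # Materialize each group's codes first; the groupby iterators themselves
    # can't be sent to worker processes.
    groups = [[item['CODE'] for item in group]
              for _, group in itertools.groupby(patients, key=operator.itemgetter('ID'))]
    pool = Pool(processes)
    try:
        return pool.map(count_codes_for_group, [(g, codes) for g in groups])
    finally:
        pool.close()
        pool.join()

pats = code_counter_parallel(patients, codes)

Note that the Pool should be created from the main module (put the call under if __name__ == '__main__': on Windows), and building the groups list up front trades away some of the memory savings you got from yield in exchange for parallelism.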

Upvotes: 1
