Denis Kuzin

Reputation: 903

How to shuffle a large list of items in parallel in Python

I have a bottleneck in my Python calculations: I need to shuffle a large list (~10^9 elements). Current implementation:

import random
random.shuffle(items)  # items is the large list

With this method, only one core is used. Is it possible to shuffle a large list in parallel?

Upvotes: 1

Views: 873

Answers (1)

pylearner

Reputation: 537

You can look at the Process class from the multiprocessing module. For example:

import os
import random
from multiprocessing import Process


def worker_func(list_single):
    # your code
    # The child process works on its own copy of the sub-list,
    # so the parent's list is not modified by this shuffle.
    random.shuffle(list_single)
    print('PID ' + str(os.getpid()) + ' shuffled: ' + str(list_single))


if __name__ == '__main__':
    # Create a process list
    process_list = list()

    pid = os.getpid()
    print('Main Process is started and PID is: ' + str(pid))

    # Start one process per sub-list
    list_example = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    for list_single in list_example:
        p = Process(target=worker_func, args=(list_single, ))
        p.start()
        print('PID is:' + str(p.pid))
        process_list.append(p)

    # Wait until every child process has finished
    for p in process_list:
        p.join()

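If you want to process this data in parallel, you can use multithreading or multiprocessing: define a worker function and pass it as the target of a Process. For CPU-bound work such as shuffling, multiprocessing is usually the better fit in CPython, because threads are limited by the global interpreter lock. Keep in mind that each child process shuffles its own copy of its sub-list, so the results have to be sent back to the parent (for example through a Queue or Pool.map) before they can be used.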
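Shuffling fixed, contiguous chunks independently does not give a uniform shuffle of the whole list, because elements never leave their chunk. One way around this, as a rough sketch rather than part of the original answer, is to first scatter each element into a randomly chosen bucket, shuffle the buckets in separate processes, and concatenate the results. The names scatter_into_buckets, shuffle_bucket, parallel_shuffle and the n_buckets and seed parameters are made up for this illustration. The scatter step is still serial, and the data has to be pickled to and from the workers, so a speedup on ~10^9 elements is not guaranteed and should be measured.

```python
import random
from multiprocessing import Pool


def scatter_into_buckets(items, n_buckets, rng):
    """Assign every item to a uniformly random bucket (serial step)."""
    buckets = [[] for _ in range(n_buckets)]
    for item in items:
        buckets[rng.randrange(n_buckets)].append(item)
    return buckets


def shuffle_bucket(task):
    """Shuffle one bucket with its own seeded generator and return it."""
    seed, bucket = task
    random.Random(seed).shuffle(bucket)
    return bucket


def parallel_shuffle(items, n_buckets=4, seed=None):
    """Hypothetical helper: scatter, shuffle buckets in parallel, concatenate."""
    rng = random.Random(seed)
    buckets = scatter_into_buckets(items, n_buckets, rng)
    # Give each worker its own seed so the buckets are shuffled independently.
    tasks = [(rng.getrandbits(64), bucket) for bucket in buckets]
    with Pool(processes=n_buckets) as pool:
        shuffled = pool.map(shuffle_bucket, tasks)
    # Returns a new list; the input list is left unchanged.
    return [item for bucket in shuffled for item in bucket]
```

Call it from under an `if __name__ == '__main__':` guard, for example `result = parallel_shuffle(list(range(20)), n_buckets=4)`, so that worker processes can import the module safely on platforms that spawn instead of fork.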

Upvotes: 2
