Reputation: 9363
I am writing a Python script (in Python 2.7) in which I need to generate around 500,000 uniform random numbers within a range. I need to do this 4 times, perform some calculations on them, and write out the 4 files.
At the moment I am doing the following (this is just part of my for loop, not the entire code):
random_RA = []
for i in xrange(500000):
    random_RA.append(np.random.uniform(6.061, 6.505))  # FINAL RANDOM RA
random_dec = []
for i in xrange(500000):
    random_dec.append(np.random.uniform(min(data_dec_1), max(data_dec_1)))  # FINAL RANDOM 'dec'
to generate the random numbers within the range. I am running Ubuntu 14.04, and when I run the program I also open my system monitor to see how my 8 CPUs are working. I notice that while the program is running, only 1 of the 8 CPUs is at 100% load, and the entire program takes around 45 minutes to complete.
I noticed that it is possible to use all the CPUs to my advantage with the multiprocessing module.
I would like to know if this is enough in my example:
random_RA = []
for i in xrange(500000):
    multiprocessing.Process()
    random_RA.append(np.random.uniform(6.061, 6.505))  # FINAL RANDOM RA
i.e. adding just the line multiprocessing.Process(), would that be enough?
Upvotes: 0
Views: 162
Reputation: 2320
To get you started:
import multiprocessing
import random

def worker(i):
    random.uniform(1, 100000)
    print i, 'done'

if __name__ == "__main__":
    for i in range(4):
        t = multiprocessing.Process(target=worker, args=(i,))
        t.start()
    print 'All the processes have been started.'
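You must gate the t = multiprocessing.Process(...) call with __name__ == "__main__", because on platforms that start workers by re-importing the main module (Windows, for example), each worker loads this program (module) again to find out what it has to do. Without the gate, every worker would try to spawn more processes in turn.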
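The worker above discards the number it generates. As a minimal sketch of one way to get values back to the parent, each process can put its results on a multiprocessing.Queue. The helper name make_samples, the per-worker seed, and the sample count of 1000 are illustrative assumptions, not part of the original code; the sketch is written to run on both Python 2.7 and Python 3.

```python
import multiprocessing
import random

def make_samples(seed, n, low=1, high=100000):
    """Return n uniform samples from a generator seeded per worker (hypothetical helper)."""
    rng = random.Random(seed)  # separate generator so workers do not share a stream
    return [rng.uniform(low, high) for _ in range(n)]

def worker(i, n, queue):
    # Send the worker index with its samples so the parent can tell results apart.
    queue.put((i, make_samples(i, n)))

if __name__ == "__main__":
    queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(i, 1000, queue))
             for i in range(4)]
    for p in procs:
        p.start()
    # Drain the queue before joining: a worker still blocked in put() can stall join().
    results = dict(queue.get() for _ in procs)
    for p in procs:
        p.join()
    print(sorted(results.keys()))
```

Using the worker index as the seed keeps the run reproducible; whether that is desirable depends on the application.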
Just for completeness: generating 500,000 random numbers is not going to take you 45 minutes, so I assume there are some intensive calculations going on elsewhere. You may want to look at them closely.
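To check how cheap the generation itself is, np.random.uniform accepts a size argument and returns the whole array in one call. The sketch below times that call; the values in data_dec_1 are placeholders, since the real array is not shown in the question. Note also that the loop in the question recomputes min(data_dec_1) and max(data_dec_1) on every one of the 500,000 iterations, which may account for part of the run time if that array is large.

```python
import time
import numpy as np

N = 500000

# Placeholder standing in for the question's data_dec_1 array (assumed values).
data_dec_1 = np.array([-1.2, 0.3, 2.5])

start = time.time()
# One vectorized call per coordinate instead of 500,000 Python-level calls.
random_RA = np.random.uniform(6.061, 6.505, size=N)
# Compute the bounds once, outside any loop.
dec_min, dec_max = data_dec_1.min(), data_dec_1.max()
random_dec = np.random.uniform(dec_min, dec_max, size=N)
elapsed = time.time() - start

print("generated %d + %d values in %.4f s" % (len(random_RA), len(random_dec), elapsed))
```

If this runs in a fraction of a second, the remaining time is being spent in the calculations or file writing, and that is where profiling would be worthwhile.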
Upvotes: 0
Reputation: 7099
If you use multiprocessing, you should avoid shared state (like your random_RA list) as much as possible.
Instead, try to use a Pool and its map method:
from multiprocessing import Pool, cpu_count
import numpy as np

def generate_random_ra(x):
    return np.random.uniform(6.061, 6.505)

def generate_random_dec(x):
    # data_dec_1 is the array from the question and must be defined at module level
    return np.random.uniform(min(data_dec_1), max(data_dec_1))

if __name__ == "__main__":
    pool = Pool(cpu_count())
    random_RA = pool.map(generate_random_ra, xrange(500000))
    random_dec = pool.map(generate_random_dec, xrange(500000))
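Two caveats apply to the map calls above. Sending one task per number spends most of the time on inter-process communication, and on platforms that fork, the workers inherit the parent's global NumPy random state, so different workers can return identical values. A minimal sketch of one way around both is to have each task generate a whole chunk with its own explicitly seeded generator. The names uniform_chunk, split_sizes, and parallel_uniform, and the base_seed + k seeding scheme, are assumptions made for illustration; consecutive integer seeds are a convenience here, not a guarantee of statistically independent streams.

```python
from multiprocessing import Pool
import numpy as np

def split_sizes(total, n_chunks):
    """Split total into n_chunks sizes that sum to total (hypothetical helper)."""
    sizes = [total // n_chunks] * n_chunks
    sizes[-1] += total - sum(sizes)
    return sizes

def uniform_chunk(args):
    """Generate one chunk of uniform samples from a task-local generator."""
    seed, low, high, size = args
    rng = np.random.RandomState(seed)  # does not touch the inherited global state
    return rng.uniform(low, high, size)

def parallel_uniform(low, high, total, n_chunks=4, base_seed=0):
    tasks = [(base_seed + k, low, high, s)
             for k, s in enumerate(split_sizes(total, n_chunks))]
    pool = Pool(n_chunks)
    try:
        chunks = pool.map(uniform_chunk, tasks)  # one task per chunk, not per number
    finally:
        pool.close()
        pool.join()
    return np.concatenate(chunks)

if __name__ == "__main__":
    random_RA = parallel_uniform(6.061, 6.505, 500000)
    print(len(random_RA))
```

For only 500,000 numbers a single vectorized call is likely to be faster than any pool; chunked workers start to pay off when each chunk also carries a share of the heavy calculations.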
Upvotes: 1