Reputation: 391
I have code which looks something like this:
import numpy as np

A = np.zeros((10000, 10))
for i in range(10000):
    # Some time-consuming calculations which result in a 10 element 1D array 'a'
    A[i, :] = a
How can I parallelize the for loop so that the array A is filled out in parallel? My understanding is that multiple processes normally shouldn't be writing to the same variable, so it's not clear to me what the proper way of doing this is.
Upvotes: 2
Views: 2079
Reputation: 973
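Why not use the GPU's ability to do this in parallel?
import numpy as np
from numba import vectorize

# Compile an element-wise function for the GPU (requires a CUDA-capable device)
@vectorize(["float32(float32, float32)"], target='cuda')
def f(a, b):
    # Placeholder for the time-consuming per-element calculation
    for i in range(10000):
        a = 4 * b
    return a

a = np.ones((10000, 10), dtype=np.float32)
b = np.random.rand(10000, 10).astype('f')
result = f(a, b)  # f returns a new array; a itself is not modified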
Upvotes: 0
Reputation: 485
The code below creates one thread for each row of the array; I'm not sure how efficient it is, though.
import numpy as np
import threading

def thread_function(index, array):
    # aforementioned time-consuming calculation, resulting in 'a'
    a = np.ones(10)  # placeholder for calculation
    array[index, :] = a

if __name__ == "__main__":
    A = np.zeros((10000, 10))
    threads = []
    for i in range(10000):
        threads.append(threading.Thread(target=thread_function, args=(i, A)))
        threads[i].start()
    for i in range(10000):
        threads[i].join()
    print(A)
https://repl.it/@RobertClarke64/Python-Multithreading
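As it stands, it's not particularly fast, but it will hopefully be noticeably faster than running the calculations in series when each calculation takes a long time. Note that CPython threads only run in parallel while the work releases the GIL (as many NumPy operations and I/O calls do), so a calculation written in pure Python may see little benefit.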
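One possible variation, offered as a rough sketch rather than a definitive answer: instead of starting 10000 threads, a bounded pool from the standard library's concurrent.futures can hand out the rows, with each worker returning its row and only the main thread writing into A. That also sidesteps the concern in the question about several workers writing to the same variable. The names compute_row and fill_array, the placeholder calculation, and the default of 8 workers are all assumptions made for illustration.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

N_COLS = 10


def compute_row(i):
    # Hypothetical stand-in for the time-consuming calculation;
    # it returns a 10-element 1D array for row i.
    return np.full(N_COLS, float(i))


def fill_array(n_rows=10000, max_workers=8):
    A = np.zeros((n_rows, N_COLS))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so the i-th result
        # belongs to row i. Only this (main) thread writes into A.
        for i, row in enumerate(pool.map(compute_row, range(n_rows))):
            A[i, :] = row
    return A


if __name__ == "__main__":
    A = fill_array()
    print(A[:3])
```

Because the workers return their rows instead of writing into a shared array, the same structure should carry over to ProcessPoolExecutor, which offers the same map interface and may suit calculations that are CPU-bound in pure Python. In that case the call has to stay under the if __name__ == "__main__": guard, and each returned row is pickled back to the parent process.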
Upvotes: 2