Reputation: 478
Consider the following function, which adds up the natural numbers below a given x.
def f(x):
    lo=0
    for i in range(x):
        lo+=i
    return(lo)
To parallelize it using multiprocessing.dummy, I wrote the following:
from multiprocessing.dummy import Pool as ThreadPool
pool=ThreadPool(4)
def f_parallel(x1,x2,x3,x4):
    listo_parallel=[x1,x2,x3,x4]
    resulto_parallel=pool.map(f,listo_parallel)
    return(resulto_parallel)
It works, but I don't see any reduction in computation time. To check this, I defined the following functions, which also report the computation time.
import time
def f_t(x):
    st=time.time()
    lob=f(x)
    st=time.time()-st
    return(lob,st)
def f_parallel_t(x1,x2,x3,x4):
    listo_parallel=[x1,x2,x3,x4]
    st=time.time()
    resulto_parallel=pool.map(f,listo_parallel)
    st=time.time()-st
    return(resulto_parallel,st)
Now let's test. For x = 10**7, 9**7, 10**7-2 and 10**6, the normal f takes 0.53, 0.24, 0.53 and 0.04 seconds, respectively. For the four of them together, f_parallel takes 1.39 seconds! I expected to see about 0.53 seconds, because the computer I used has 4 CPUs and I chose 4 for the pool. Why is it behaving like this?
I also tried to read the documentation of Python 3.7's multiprocessing library, but the examples work only if I type them exactly the way they are written there. For example, consider the first example in that documentation. If I type
from multiprocessing import Pool
Pool(4).map(f,[10**7,9**7,10**7-2,10**6])
then nothing happens and I have to restart the shell (Ctrl+F6).
Also, pool.map is not really what I want: I want to tell Python to run f(x_i) on exactly CPU no. i. So I want to know which part of my computation is being done on which CPU at every step of my program.
Any help or guidance will be appreciated.
In case someone doesn't get what I really want to do with Python, I am uploading a screenshot of a Maple file I just made, which does exactly what I am asking how to do in Python in this question.
Upvotes: 0
Views: 304
Reputation: 478
Thanks to @FlyingTeller and @quamrana, who answered my other question, I now know how to write the Python program so that the four computations run in parallel and the total time is about the maximum of the four separate computation times. Here is the corrected code:
def f(x):
    lo=0
    for i in range(x):
        lo+=i
    return(lo)
from multiprocessing import Pool
def f_parallel(x1,x2,x3,x4):
    with Pool(processes=4) as pool:
        resulto_parallel=pool.map(f,[x1,x2,x3,x4])
    return(resulto_parallel)
import time
def f_parallel_t(x1,x2,x3,x4):
    st=time.time()
    ans=f_parallel(x1,x2,x3,x4)
    st=time.time()-st
    return(ans,st)
if __name__ == '__main__':
    print(f_parallel_t(10**7,10**6,10**7-2,9**7))
And the screenshot of the result when I run it:
Upvotes: 0
Reputation: 43495
In CPython, more or less the "standard" implementation, only one thread at a time can be executing Python bytecode (because of the global interpreter lock). So using threads to speed up CPU-bound computations like this one won't work in CPython.
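To see this effect in isolation, here is a minimal sketch that times the same work run sequentially and through a thread pool. The names sum_below, run_sequential, run_threaded and timed are made up for this illustration (sum_below does the same work as f in the question), and the inputs are smaller than in the question just to keep the run short. On a regular CPython build I would expect the two timings to come out roughly equal, but the exact numbers depend on the machine.

```python
import time
from multiprocessing.dummy import Pool as ThreadPool


def sum_below(n):
    """Same work as f in the question: add up 0, 1, ..., n - 1."""
    total = 0
    for i in range(n):
        total += i
    return total


def run_sequential(inputs):
    """Compute the sums one after another in the calling thread."""
    return [sum_below(n) for n in inputs]


def run_threaded(inputs, workers=4):
    """Compute the sums with a thread pool (multiprocessing.dummy)."""
    # The threads share one interpreter, so on a build with the GIL this
    # pure-Python loop is still executed by one thread at a time.
    with ThreadPool(workers) as pool:
        return pool.map(sum_below, inputs)


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    inputs = [10**6, 9**6, 10**6 - 2, 10**5]
    _, t_seq = timed(run_sequential, inputs)
    _, t_thr = timed(run_threaded, inputs)
    print(f"sequential: {t_seq:.3f} s, thread pool: {t_thr:.3f} s")
```

Threads are still useful when the work is mostly waiting (network, disk), since a waiting thread lets another one run.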
You could use multiprocessing.Pool instead. In general I would recommend using the Pool's imap_unordered method instead of plain map. The former will start yielding values as soon as they become available, while the latter returns a list after all calculations are done.
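As a rough sketch of what that could look like with the function from the question: because imap_unordered yields results in completion order, the worker below returns its input together with the result so the two can be matched up again. The names sum_below and sum_with_input are invented for this example; the inputs are the ones from the question.

```python
from multiprocessing import Pool


def sum_below(n):
    """Same work as f in the question: add up 0, 1, ..., n - 1."""
    total = 0
    for i in range(n):
        total += i
    return total


def sum_with_input(n):
    """Return the input next to its result, since results arrive unordered."""
    return n, sum_below(n)


if __name__ == "__main__":
    inputs = [10**7, 9**7, 10**7 - 2, 10**6]
    with Pool(processes=4) as pool:
        # Each pair is yielded as soon as its worker finishes, so the
        # smallest input will usually be printed first.
        for n, total in pool.imap_unordered(sum_with_input, inputs):
            print(f"f({n}) = {total}")
```

The `if __name__ == "__main__":` guard matters here: on platforms where worker processes are started by importing the main module again, unguarded pool creation would be re-executed in every worker.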
Getting to the core of your question, Python does not have a platform-independent method to specify on which CPU a process it starts will run. How so-called processor affinity works is very operating-system dependent, as you can see on the linked page. Of course you could use subprocess to run one of the mentioned utility programs, or you could use ctypes to execute the relevant system calls directly.
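On Linux specifically, the standard library also exposes the affinity call as os.sched_setaffinity, so a sketch without subprocess or ctypes is possible there. This is an assumption about your platform: the function does not exist on Windows or macOS, which is why the helper below falls back to doing nothing. The names sum_below, pin_to_cpu and worker are made up for this illustration, and the inputs are smaller than in the question to keep the run short.

```python
import os
from multiprocessing import Process, Queue


def sum_below(n):
    """Same work as f in the question: add up 0, 1, ..., n - 1."""
    total = 0
    for i in range(n):
        total += i
    return total


def pin_to_cpu(cpu):
    """Try to restrict the calling process to one CPU; report success."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # not available on this platform (e.g. Windows, macOS)
    try:
        os.sched_setaffinity(0, {cpu})  # pid 0 means "the calling process"
    except OSError:
        return False  # the system refused, e.g. CPU not allowed for us
    return True


def worker(cpu, n, results):
    """Pin this process to `cpu` if possible, then compute sum_below(n)."""
    pinned = pin_to_cpu(cpu)
    results.put((cpu, pinned, n, sum_below(n)))


if __name__ == "__main__":
    inputs = [10**6, 9**6, 10**6 - 2, 10**5]
    if hasattr(os, "sched_getaffinity"):
        cpus = sorted(os.sched_getaffinity(0))  # CPUs we are allowed to use
    else:
        cpus = list(range(os.cpu_count() or 1))
    results = Queue()
    procs = [
        Process(target=worker, args=(cpus[i % len(cpus)], n, results))
        for i, n in enumerate(inputs)
    ]
    for p in procs:
        p.start()
    collected = [results.get() for _ in procs]  # drain before joining
    for p in procs:
        p.join()
    for cpu, pinned, n, total in sorted(collected):
        print(f"cpu {cpu} (pinned: {pinned}): f({n}) = {total}")
```

One process per input is used here instead of a Pool so that each computation can be tied to one CPU number; with a Pool you do not control which worker picks up which input.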
Upvotes: 3