Reputation: 621
So, here's my situation.
I'm using PyDev in Eclipse, Python interpreter version 2.7.2 in Windows.
I'm using the built-in multiprocessing library to fork off a bunch of processes and parallelize a very compute-intensive loop. The tutorials I've looked at say to use
if __name__ == "__main__":
to prevent it from spawning off near-infinite processes and bringing my system to its knees, essentially.
The problem is, I am calling this from a module, not my main script; as such, nothing after it EVER gets executed. No chance for parallelism at all. Of course, if I remove it, I get the infiniprocess spam that kills the machine executing the code.
For reference's sake, here's the relevant code:
from tribe import DataCache
from tribe import WorldThread
from tribe import Actor
from time import sleep
import multiprocessing

class World:
    def __init__(self,numThreads,numActors,tickRate):
        print "Initalizing world..."
        self.cache = DataCache.DataCache()
        self.numThreads = numThreads
        self.numActors = numActors
        self.tickRate = tickRate
        self.actors = []
        self.processes = []
        for i in range(numActors):
            self.actors.append(Actor.Actor("test.xml",self.cache))
        print "Actors loaded."

    def start_world(self):
        print "Starting world"
        run_world = True;
        while run_world:
            self.world_tick()
            sleep(2)

    def world_tick(self):
        if __name__ == '__main__':
            print "World tick"
            actor_chunk = len(self.actors)/self.numThreads
            if len(self.processes)==0:
                for _ in range(self.numThreads):
                    new_process = multiprocessing.Process(WorldThread.WorldProcess.work, args=(_, self.actors[_*actor_chunk,(_+1)*actor_chunk]))
And the class it is calling:
class WorldProcess():
    def __init__(self):
        print "World process initilized."
        ''' Really, I'm not sure what kind of setup we'll be doing here yet. '''

    def work(self, process_number, actors):
        print "World process" + str(process_number) + " running."
        for actor in actors:
            actor.tick()
        print "World process" + str(process_number) + " completed."
Am I correct in my assessment that the whole if __name__ == "__main__": check only works if you have it in the executable script itself? If so, how do you safely fork off processes from within modules? If not, why isn't it working here?
Upvotes: 4
Views: 4267
Reputation: 2618
To control the number of processes, use the Pool class from multiprocessing:
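from multiprocessing import Pool

def f(x):
    return x*x

# The worker function must be defined before the pool is created, and on
# Windows the pool itself must be created under the entry-point guard.
if __name__ == "__main__":
    p = Pool(5)  # at most 5 worker processes
    print(p.map(f, [1,2,3]))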
(Edit: as per the comment, this is just a how-to for the Pool class.)
Using __name__ is not required, since you explicitly pass Process the actual Python function to run.
This:
def world_tick(self):
    if __name__ == '__main__':
        print "World tick"
        actor_chunk = len(self.actors)/self.numThreads
        if len(self.processes)==0:
            for _ in range(self.numThreads):
                new_process = multiprocessing.Process(WorldThread.WorldProcess.work, args=(_, self.actors[_*actor_chunk,(_+1)*actor_chunk]))
is very bad. Simplify it.
A better pattern will be:
class WorkArgs(object):
    ... many attributes follow ...

def proc_work(task):
    # Pool.map passes each list item as a single argument, so unpack the tuple here.
    world_thread, work_args = task
    world_thread.WorldProcess.work(work_args.a, work_args.b, ... etc)

p = Pool(5)
p.map(proc_work, [(world_thread, args0), (world_thread, args1), ...])
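For what it's worth, here is a rough, runnable sketch of that pattern. It is only an illustration under assumptions: Actor is a hypothetical stand-in for tribe.Actor (the real class isn't shown in the question), and chunk and tick_chunk are names made up for this example. The worker is a plain module-level function that takes one tuple, because Pool.map passes each list item as a single argument, and the slices use a colon (items[start:end]) rather than a comma. It is written to run on both Python 2.7 and Python 3.

```python
from multiprocessing import Pool


class Actor(object):
    """Hypothetical stand-in for tribe.Actor; the real class is not shown."""

    def __init__(self, name):
        self.name = name
        self.ticks = 0

    def tick(self):
        self.ticks += 1


def chunk(items, n):
    """Split items into n contiguous slices of nearly equal size."""
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def tick_chunk(task):
    """Worker: Pool.map hands over one item, here a (number, actors) tuple."""
    process_number, actors = task
    for actor in actors:
        actor.tick()
    # The actors were pickled into this process, so return the results
    # instead of relying on changes to the parent's objects.
    return process_number, [actor.ticks for actor in actors]


if __name__ == "__main__":
    actors = [Actor("actor%d" % i) for i in range(10)]
    pool = Pool(3)
    try:
        results = pool.map(tick_chunk, list(enumerate(chunk(actors, 3))))
    finally:
        pool.close()
        pool.join()
    print(results)
```

Keep in mind that each worker gets a pickled copy of its actors, so anything tick() changes has to be sent back through the return value, as done here.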
Hope this helps!
As a side note, pickling your arguments and passing them to other processes will result in importing your module. So, it is best to make sure your module doesn't perform any forking/magic/work unless it is told to (e.g., it only has function/class definitions or __name__ magic, not actual code blocks).
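To illustrate the side note, a minimal sketch of an import-safe layout, assuming everything lives in one file. The names report and main are hypothetical. The module level holds only imports and definitions, and the work starts from a function that is called under the guard, so a child process that has to import the file does not start more work by itself.

```python
import multiprocessing

# Module level: definitions only. Anything else placed here would run again
# in every child process that has to import this file.


def report(label):
    """Worker: return the label with the name of the process running it."""
    return "%s handled by %s" % (label, multiprocessing.current_process().name)


def main():
    """Start the pool only when explicitly asked to."""
    pool = multiprocessing.Pool(2)
    try:
        return pool.map(report, ["a", "b", "c"])
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    for line in main():
        print(line)
```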
Upvotes: 2
Reputation: 92567
Adding this as an answer, since it was in the comments:
if __name__ == "__main__" is something you do at the root level of a script that is going to be an entry point. It's a way to only do things if the script is being executed directly.
If you have a script that is the entry point, you do the __name__ == "__main__" check there. In a module where you want to multiprocess, you just loop and start your processes the same way you would loop and start threads.
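Roughly, that could look like the sketch below. It is a simplified illustration, not the asker's actual code: this World only holds a list of names, and work and slices are made-up helpers. In a real project the class and the worker would live in the module and the last block in the entry-point script; they are shown in one file here so the example runs on its own. Note that Process takes the function through the target keyword.

```python
import multiprocessing


def work(process_number, actor_names):
    """Worker: a plain module-level function, so it can be pickled on Windows."""
    for name in actor_names:
        print("process %d ticking %s" % (process_number, name))


class World(object):
    """Simplified, hypothetical stand-in for the World class in the question."""

    def __init__(self, num_processes, actor_names):
        self.num_processes = num_processes
        self.actor_names = actor_names

    def slices(self):
        """Return one slice of actor names per process (ceiling division)."""
        size = -(-len(self.actor_names) // self.num_processes)
        return [self.actor_names[i * size:(i + 1) * size]
                for i in range(self.num_processes)]

    def world_tick(self):
        # No __name__ check here: this method only runs when it is called.
        processes = [
            multiprocessing.Process(target=work, args=(number, names))
            for number, names in enumerate(self.slices())
        ]
        for process in processes:
            process.start()
        for process in processes:
            process.join()
        return [process.exitcode for process in processes]


# Entry-point script: the only place that needs the guard.
if __name__ == "__main__":
    world = World(2, ["a", "b", "c", "d", "e"])
    print(world.world_tick())
```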
Upvotes: 2