easythrees

Reputation: 1650

Import different sets of modules in main and in worker processes

Is it possible, when using the multiprocessing module, to get a Process to import a different library than the main process does? For example:

import multiprocessing as mp
import pprint
import sys
import threading

from Foo import Moo

class Worker(mp.Process):
    def __init__(self):
        print "Worker Init"
        mp.Process.__init__(self)

    def run(self):
        print "Worker Running"
        self._static_method()

    @staticmethod
    def _static_method():
        print "I'm a-static, get it?"

class TouchWorker(threading.Thread):
    def __init__(self):
        super(TouchWorker, self).__init__(name="Touchoo" + " TouchWorker")

    def run(self):
        print "Touchoo Running"

class Parasite(mp.Process):
    def __init__(self):
        print "Parasite Init"
        mp.Process.__init__(self)

    def run(self):
        print "Parasite Running"

class Encapsulator(object):
    def __init__(self):
        workers = []

        for _ in range(4):
            wrk = Worker()
            workers.append(wrk)

        for someWorker in workers:
            someWorker.start()

        par = Parasite()
        par.start()

if __name__ == '__main__':
    enc = Encapsulator()

I only really need the 'Foo' module in the 'Worker' and 'Parasite' processes. Is it possible to get them to import that module when they run?

Upvotes: 1

Views: 1061

Answers (2)

ivan_pozdeev

Reputation: 36036

To spawn child processes, multiprocessing uses fork() on UNIX; on Windows, it re-runs the program with special parameters that invoke code emulating the same behavior.

So when your child processes are created, they aren't actually initialized again: all the modules that the parent loaded are already loaded for them, too.
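The inheritance described above can be checked with a minimal sketch. It assumes a Unix-like system where the "fork" start method is available, and it uses the stdlib module colorsys as a hypothetical stand-in for whatever the parent has loaded (the names report and child_sees_parent_module are made up for illustration):

```python
import multiprocessing as mp
import sys

import colorsys  # hypothetical stand-in for a module loaded by the parent


def report(queue):
    # No import statement here: a forked child starts with a copy of the
    # parent's memory, which includes the parent's sys.modules.
    queue.put("colorsys" in sys.modules)


def child_sees_parent_module():
    # Assumption: a Unix-like system, where the "fork" start method exists.
    ctx = mp.get_context("fork")
    queue = ctx.Queue()
    proc = ctx.Process(target=report, args=(queue,))
    proc.start()
    inherited = queue.get()
    proc.join()
    return inherited


if __name__ == "__main__":
    print(child_sees_parent_module())
```

The child never runs an import of its own, yet it finds the module in sys.modules.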

So, if you wish to import a module:

  • in master but not in workers:
    • not possible, and there is no need to. All you can do is somehow hide the variables that reference the modules from the workers
  • in worker(s) but not in master:
    • import it inside worker function(s)
      • the import will be done in each worker, or
    • import it in master
      • import will be done once, in master, and children will automatically inherit it, or
    • import it in master, then del the resulting variable (so that it doesn't pollute the master's namespace), then import in workers again (that will reuse the existing module object from sys.modules)
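As a rough sketch of the first "worker only" option, the import can live inside the worker function. This again assumes a Unix-like system with the "fork" start method, and uses colorsys as a hypothetical stand-in for Foo; worker and run_demo are illustrative names, not part of the question's code:

```python
import multiprocessing as mp
import sys

# Hypothetical stand-in for the question's Foo module.
MODULE_NAME = "colorsys"


def worker(queue):
    # The import runs here, so it happens in the child process only.
    import colorsys  # noqa: F401
    queue.put(MODULE_NAME in sys.modules)


def run_demo():
    # Assumption: a Unix-like system, where the "fork" start method exists.
    ctx = mp.get_context("fork")
    before = MODULE_NAME in sys.modules  # parent state before the child runs
    queue = ctx.Queue()
    proc = ctx.Process(target=worker, args=(queue,))
    proc.start()
    loaded_in_child = queue.get()
    proc.join()
    after = MODULE_NAME in sys.modules  # parent state after the child ran
    return loaded_in_child, before, after


if __name__ == "__main__":
    print(run_demo())
```

The child reports the module as loaded, while the parent's sys.modules is the same before and after, since the import took place in a separate process.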

Upvotes: 1

Aaron

Reputation: 11075

Simply reverse the idiom needed to prevent infinite loops of process creation.

# this should look familiar
if __name__ != "__main__":
    from Foo import Moo

You may find, however, that it's easier to make your library load faster and just import it in the main file, which avoids all sorts of awkward scoping issues. One way to do this is to require a separate Moo.initialize() call once the subprocess starts; it would need to be executed by each child process, because memory is not shared.

A good general rule of thumb is that libraries should do no actual work on import so that they are loaded quickly. Once you call a function or class from said library, the necessary work is then performed.
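One way a library might follow that rule of thumb is to defer its setup until first use. This is only a sketch of the pattern; the table and the names square and is_initialized are hypothetical and stand in for whatever expensive work a real library does:

```python
# Nothing expensive happens at import time: the table starts out unbuilt.
_table = None


def _get_table():
    # Build the table on first use only (a placeholder for real setup work).
    global _table
    if _table is None:
        _table = {n: n * n for n in range(1000)}
    return _table


def is_initialized():
    # Report whether the deferred setup has already run in this process.
    return _table is not None


def square(n):
    # The first call triggers the setup; later calls reuse the table.
    return _get_table()[n]
```

Importing such a module is cheap, and each process that actually calls square() pays the setup cost once for itself.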

Upvotes: 1
