Tyler

Reputation: 155

Is it possible to use multiprocessing in a module on Windows?

I'm currently going through some pre-existing code with the goal of speeding it up. There are a few places that are extremely good candidates for parallelization. Since Python has the GIL, I thought I'd use the multiprocessing module.

However, from my understanding, the only way this will work on Windows is if I call the function that needs multiple processes from the highest-level script, behind the if __name__ == '__main__' safeguard. This particular program is meant to be distributed and imported as a module, so it would be clunky to have the user copy and paste that safeguard, and I'd really like to avoid that.

Am I out of luck or misunderstanding something as far as multiprocessing goes? Or is there any other way to do it with Windows?

Upvotes: 8

Views: 3699

Answers (4)

Federico Wagner

Reputation: 1

I've been developing an Instagram image scraper. To make the download and save operations run faster, I implemented multiprocessing in an auxiliary module. Note that this code is inside an auxiliary module and not inside the main module.

The solution I found is adding this line:

if __name__ != '__main__':

Pretty simple, but it actually works!

import multiprocessing
import os

import requests

def multi_proces(urls, profile):
    img_saved = 0
    if __name__ != '__main__':          # line needed for the sake of getting this NOT to crash
        processes = []
        for url in urls:
            # One process per URL; img_saved doubles as the index in the file name.
            process = multiprocessing.Process(target=download_save, args=(url, profile, img_saved))
            processes.append(process)
            img_saved += 1

        for proce in processes:
            proce.start()

        for proce in processes:
            proce.join()
    return img_saved

def download_save(url, profile, img_saved):
    file = requests.get(url, allow_redirects=True)  # Download
    path = os.path.join("scraped_data", profile, f"{profile}-{img_saved}.jpg")
    with open(path, 'wb') as out:  # Save
        out.write(file.content)

Upvotes: 0

Julian

Reputation: 31

For everyone still searching:

inside module

from multiprocessing import Process
    
def printing(a):
    print(a)

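def foo(name):
    var = {"process": {}}
    # name is the caller's __name__, so processes start only when the caller is the main script
    if name == "__main__":
        for i in range(10):
            # args must be a tuple, hence the trailing comma
            var["process"][i] = Process(target=printing, args=(str(i),))
            var["process"][i].start()

        for i in range(10):
            var["process"][i].join()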

inside main.py

import data

name = __name__

data.foo(name)

output:

>>2
>>6
>>0
>>4
>>8
>>3
>>1
>>9
>>5
>>7

I am a complete noob so please don't judge the coding OR presentation but at least it works.

Upvotes: 3

David

Reputation: 18271

As explained in comments, perhaps you could do something like

#client_main.py
from mylib.mpSentinel import MPSentinel

#client logic

if __name__ == "__main__":
    MPSentinel.As_master()

#mylib/mpSentinel.py

class MPSentinel(object):

    _is_master = False

    @classmethod
    def As_master(cls):
        cls._is_master = True

    @classmethod
    def Is_master(cls):
        return cls._is_master

It's not ideal in that it's effectively a singleton/global, but it would work around Windows' lack of fork. You could then check MPSentinel.Is_master() to make multiprocessing optional, which should prevent Windows from process bombing.
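A minimal sketch of how library code might consult such a sentinel. The class is repeated so the snippet is self-contained; run_jobs, square, and the workers parameter are hypothetical names used only for illustration. If the client never called As_master(), the work runs serially in the current process and no child process is started:

```python
from multiprocessing import Pool


class MPSentinel(object):
    """A global flag that the client's main script sets once."""

    _is_master = False

    @classmethod
    def As_master(cls):
        cls._is_master = True

    @classmethod
    def Is_master(cls):
        return cls._is_master


def square(x):
    # Module-level function so that worker processes can import it by name.
    return x * x


def run_jobs(values, workers=2):
    """Hypothetical library entry point: parallel only if the client opted in."""
    if MPSentinel.Is_master():
        with Pool(workers) as pool:
            return pool.map(square, values)
    # Sentinel not set (for example, inside a spawned child): stay serial.
    return [square(v) for v in values]
```

A client that calls MPSentinel.As_master() inside its if __name__ == "__main__" block gets the Pool path; every other caller falls back to the serial list comprehension.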

Upvotes: 1

Roland Smith

Reputation: 43495

On ms-windows, you should be able to import the main module of a program without side effects like starting a process.

When Python imports a module, it actually runs it.

So one way of doing that is to put the process-starting code in the if __name__ == '__main__' block.

Another way is to do it from within a function.

The following won't work on ms-windows:

from multiprocessing import Process

def foo():
    print('hello')

p = Process(target=foo)
p.start()

This is because it tries to start a process when importing the module.
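To see this import-time execution directly, here is a small sketch that writes a throwaway module to a temporary directory and imports it. The module name demo_side_effects and the helper import_and_report are made up for this illustration:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_and_report(module_name="demo_side_effects"):
    """Write a tiny throwaway module, import it, and return what it recorded."""
    source = (
        "events = []\n"
        "events.append('top level ran, __name__=' + __name__)\n"
        "if __name__ == '__main__':\n"
        "    events.append('guarded block ran')\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, module_name + ".py").write_text(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(module_name)
        finally:
            # Undo the changes to the import machinery made for this demo.
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
    return module.events
```

Calling import_and_report() returns a single entry, 'top level ran, __name__=demo_side_effects'. The top-level statement ran during the import, while the guarded block did not, because __name__ is the module's own name when it is imported.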

The following example from the programming guidelines is OK:

from multiprocessing import Process, freeze_support, set_start_method

def foo():
    print('hello')

if __name__ == '__main__':
    freeze_support()
    set_start_method('spawn')
    p = Process(target=foo)
    p.start()

This works because the code in the if block doesn't run when the module is imported.

But putting it in a function should also work:

from multiprocessing import Process

def foo():
    print('hello')

def bar():
    p = Process(target=foo)
    p.start()

When this module is run, it will define two new functions, not run them.
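As a rough sketch of how this could look for a distributed module, assuming a hypothetical mylib.py with made-up names square and parallel_squares: importing it starts nothing, and processes are created only when the function is called. With the spawn start method used on Windows, the child re-imports the user's main script, so the call itself still belongs behind the guard in that script, as the trailing comment shows:

```python
# mylib.py (hypothetical module name)
from multiprocessing import Pool


def square(x):
    # Defined at module level so child processes can find it after importing mylib.
    return x * x


def parallel_squares(values, workers=2):
    """Nothing is spawned at import time; processes start only when this is called."""
    with Pool(workers) as pool:
        return pool.map(square, values)


# The user's script (hypothetical) would then look like this:
#
#     import mylib
#
#     if __name__ == '__main__':
#         print(mylib.parallel_squares([1, 2, 3]))
```

The guard stays in the user's script, but it is a single line around the call rather than something the module has to copy into every entry point.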

Upvotes: 1
