Jorge Silva

Reputation: 245

Multiprocessing python not running in parallel

I have been trying to use Python's multiprocessing module to parallelize a computationally expensive task.

My code executes, but it doesn't run in parallel. I have been reading the multiprocessing manual page and forums to find out why, and I haven't figured it out yet.

I think the problem may be related to some kind of lock when executing the other modules that I created and imported.

Here is my code:

import time

##import my modules
import prepare_data
import filter_part
import wrapper_part
import utils
from myClasses import ML_set
from myClasses import data_instance

n_proc = 5

def main():
    if __name__ == '__main__':
        ##only main process should run this
        data = prepare_data.import_data() ##read data from file  
        data = prepare_data.remove_and_correct_outliers(data)
        data = prepare_data.normalize_data_range(data)
        features = filter_part.filter_features(data)

        start_t = time.time()
        ##parallelism will be used on this part
        best_subset = wrapper_part.wrapper(n_proc, data, features)

        print time.time() - start_t


import multiprocessing as mp

##my modules
from myClasses import ML_set
from myClasses import data_instance
import utils

def wrapper(n_proc, data, features):

    p_work_list = utils.divide_features(n_proc-1, features)
    n_train, n_test = utils.divide_data(data)

    workers = []

    for i in range(0,n_proc-1):
        print "sending process:", i
        p = mp.Process(target=worker_classification, args=(i, p_work_list[i], data, features, n_train, n_test))
        workers.append(p)  ##keep a handle so the process can be joined later
        p.start()

    for worker in workers:
        print "waiting for join from worker"
        worker.join()


def worker_classification(id, work_list, data, features, n_train, n_test):
    print "Worker ", id, " starting..."
    best_acc = 0
    best_subset = []
    while (work_list != []):
        test_subset = work_list.pop(0)  ##take the next subset so the loop terminates
        train_set, test_set = utils.cut_dataset(n_train, n_test, data, test_subset)
        _, acc = classification_decision_tree(train_set, test_set)
        if acc > best_acc:
            best_acc = acc
            best_subset = test_subset
    print id, " found best subset ->  ", best_subset, " with accuracy: ", best_acc

None of the other modules use the multiprocessing module, and they all work fine. At this stage I'm just testing parallelism, not even trying to get the results back, so there is no communication between processes and there are no shared-memory variables. Some variables are used by every process, but they are defined before the processes are spawned, so as far as I know each process has its own copy of them.

As output for 5 processes (n_proc = 5, so 4 workers plus the main process), I get this:

importing data from file...
sending process: 0
sending process: 1
Worker  0  starting...
0  found best subset ->   [2313]  with accuracy:  60.41
sending process: 2
Worker  1  starting...
1  found best subset ->   [3055]  with accuracy:  60.75
sending process: 3
Worker  2  starting...
2  found best subset ->   [3977]  with accuracy:  62.8
waiting for join from worker
waiting for join from worker
waiting for join from worker
waiting for join from worker
Worker  3  starting...
3  found best subset ->   [5770]  with accuracy:  60.07

The parallel part took around 55 seconds with 4 worker processes. With only 1 worker process, the execution time is 16 seconds:

importing data from file...
sending process: 0
waiting for join from worker
Worker  0  starting...
0  found best subset ->   [5870]  with accuracy:  63.32

I'm running this on Python 2.7 and Windows 8.

I tested my code on Ubuntu and it worked, so I guess something is wrong with Windows 8 and Python. Here is the output on Ubuntu:

importing data from file...
size trainset:  792  size testset:  302
sending process: 0
sending process: 1
Worker  0  starting...
sending process: 2
Worker  1  starting...
sending process: 3
Worker  2  starting...
waiting for join from worker
Worker  3  starting...
2  found best subset ->   [5199]  with accuracy:  60.93
1  found best subset ->   [3198]  with accuracy:  60.93
0  found best subset ->   [1657]  with accuracy:  61.26
waiting for join from worker
waiting for join from worker
waiting for join from worker
3  found best subset ->   [5985]  with accuracy:  62.25

I'll use Ubuntu for testing from now on, but I would still like to know why the code doesn't work on Windows.

Upvotes: 10

Views: 14294

Answers (1)

Dr. Jan-Philip Gehrcke

Reputation: 35836

Make sure to read the Windows guidelines in the multiprocessing manual:

Especially "Safe importing of main module":

Instead one should protect the “entry point” of the program by using if __name__ == '__main__': as follows:
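A minimal sketch of that pattern, modeled on the kind of example the multiprocessing documentation gives (the function name `foo` and its body are purely illustrative):

```python
from multiprocessing import Process, freeze_support


def foo():
    # Work done in the child process.
    print('hello')


if __name__ == '__main__':
    # Only the process started as a script reaches this block; a child
    # that re-imports the module skips it.
    freeze_support()  # only has an effect in frozen Windows executables
    p = Process(target=foo)
    p.start()
    p.join()
```

The guard is a module-level statement, not something nested inside a function that is itself called at import time.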

You violated this rule in the first code snippet of your question, so I did not look further than that. Hopefully the solution to the problems you observe is as simple as including this protection.
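Applied to a script shaped like yours, the layout could look roughly like the sketch below. It is not your code: `divide_work` is a hypothetical stand-in for `utils.divide_features`, the worker body is a placeholder for the real classification, and the data is just a list of integers.

```python
import multiprocessing as mp
import time


def worker_classification(worker_id, work_list):
    # Placeholder for the real worker: pick the largest item as the "best".
    best = max(work_list) if work_list else None
    print("worker %d best: %s" % (worker_id, best))


def divide_work(n_workers, items):
    # Hypothetical replacement for utils.divide_features: round-robin split.
    return [items[i::n_workers] for i in range(n_workers)]


def wrapper(n_proc, items):
    chunks = divide_work(n_proc - 1, items)
    workers = []
    for i, chunk in enumerate(chunks):
        p = mp.Process(target=worker_classification, args=(i, chunk))
        p.start()
        workers.append(p)
    # Wait for every worker before returning.
    for p in workers:
        p.join()


def main():
    start_t = time.time()
    wrapper(5, list(range(20)))
    print("elapsed: %.2f s" % (time.time() - start_t))


if __name__ == '__main__':
    # The guard sits at module level, so a child that re-imports this
    # file defines the functions above but never calls main().
    main()
```

The expensive preparation steps would go inside `main()`, so they run once in the parent and are never repeated when a child imports the file.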

The reason why this is important: on Unix-like systems, child processes are created by forking. The operating system creates an exact copy of the process that calls fork. That is, the child inherits all state from the parent. For instance, all functions and classes are already defined in the child.

On Windows, there is no such system call. Python has to perform the quite heavy task of starting a fresh Python interpreter in the child and re-creating, step by step, the state of the parent. For instance, all functions and classes need to be defined again. That is why heavy import machinery runs under the hood of a multiprocessing child on Windows. This machinery starts when the child imports the main module. In your case, this implies a call to main() in the child! You certainly do not want that.

You might find this tedious. I find it impressive that the multiprocessing module manages to provide one interface to the same functionality on two such different platforms. Really, with respect to process handling, POSIX-compliant operating systems and Windows are so different that it is inherently difficult to come up with an abstraction that works on both.
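The re-import can be observed directly. The sketch below assumes Python 3.4 or later, where the "spawn" start method (the one Windows uses) can be requested explicitly on any platform; on Python 2.7 the same effect only shows up when running on Windows. The function name `child` is illustrative.

```python
import multiprocessing as mp
import os

# Module-level code: with the "spawn" start method this line runs in the
# parent and again in every child that re-imports the module.
print("importing as %r in pid %d" % (__name__, os.getpid()))


def child():
    print("child running in pid %d" % os.getpid())


if __name__ == '__main__':
    # Request a spawn context explicitly (assumption: Python 3.4+).
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=child)
    p.start()
    p.join()
```

When this is run as a script, the "importing as" line should appear twice with different pids, and in the child the module name is no longer `'__main__'`, which is why the guard keeps the child from starting processes of its own.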

Upvotes: 3
