Seeeccccc

Reputation: 11

Multiprocessing program - Not Executing in parallel

I am currently scraping data from various sites. The code for the scrapers is stored in modules (x, y, z, a, b).

Each module exposes a dump function (for example, x.dump) that writes the scraped data to files. Every dump function takes a single argument, 'input'. Note: the dump functions are not all the same.

I am trying to run each of these dump functions in parallel. The following code runs without errors, but I have noticed that it still executes serially: x, then y, and so on.

Is this the correct way of going about the problem?

Are multithreading and multiprocessing the only native ways for parallel programming?

from multiprocessing import Process

import x.x as x
import y.y as y
import z.z as z
import a.a as a
import b.b as b

input = ""

f_list = [x.dump, y.dump, z.dump, a.dump, b.dump]
processes = []

for function in f_list:
        processes.append(Process(target=function, args=(input,)))

for process in processes:
        process.run()

for process in processes:
        process.join()

Upvotes: 1

Views: 123

Answers (2)

mmmmmm

Reputation: 32720

You should be calling process.start(), not process.run().

The start method does the work of starting the extra process and then running the run method in that process.

Python docs
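To see why the question's loop behaves serially, here is a minimal sketch showing that calling run() directly never creates a child process. The names record_pid and seen_pids are made up for illustration and are not part of the question's modules.

```python
import os
from multiprocessing import Process

# Hypothetical helper: remembers which process executed it.
seen_pids = []


def record_pid():
    seen_pids.append(os.getpid())


p = Process(target=record_pid)
p.run()  # no child is created: record_pid executes right here, in the caller

# The only PID recorded is our own, and the Process object was never started.
print(seen_pids == [os.getpid()])
print(p.pid)
```

Because each run() call blocks until the target returns, looping over the processes and calling run() on each is just a sequence of ordinary function calls.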

Upvotes: 0

John Zwinck

Reputation: 249652

That's because run() is the method that implements the task itself; you're not meant to call it from outside like that. You are supposed to call start(), which spawns a new process, calls run() in that process, and returns control to you so you can do more work (and later join()).
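A minimal sketch of the start()/join() pattern described above. Since the question's modules are not available, fake_dump is a hypothetical stand-in for x.dump, y.dump and the rest, and run_in_parallel is an illustrative helper name, not part of multiprocessing.

```python
import os
import time
from multiprocessing import Process


def fake_dump(source):
    """Hypothetical stand-in for x.dump, y.dump, ...: pretend to scrape briefly."""
    time.sleep(0.2)


def run_in_parallel(functions, arg):
    """Start one process per function, wait for all, return (pid, exitcode) pairs."""
    processes = [Process(target=f, args=(arg,)) for f in functions]
    for p in processes:
        p.start()  # returns immediately; the child process calls run() itself
    for p in processes:
        p.join()   # block until that child has exited
    return [(p.pid, p.exitcode) for p in processes]


if __name__ == "__main__":
    # The guard matters on platforms where children re-import the main module.
    results = run_in_parallel([fake_dump, fake_dump, fake_dump], "")
    print(results)
```

Each pair should show a PID different from the parent's and an exit code of 0 when the worker finished normally. Starting every process before joining any of them is what lets the workers overlap.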

Upvotes: 3
