Introducing delay in Python multiprocessing

Question

I have the following code:

from multiprocessing import Pool
import pandas as pd

def f(x):
    data = pd.read_sql(query[x], conn) #query and conn are particular to my PC so no point in pasting it here
    #do large math operations here
    return answer

if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))

I have 8 processors on my PC. Right now all of the worker processes access the database through conn simultaneously, which is causing problems on the database end.

How do I change the above code so that database access happens one process at a time? The moment one process finishes its database access, another should be free to access the database, while the process that has finished should continue with its math operations. Basically, I want to make sure that database access is never simultaneous, but that the database access code stays within the multiprocessing framework. As a last resort I can read all the data before processing it, but I would like to know whether this can be done without restructuring the existing code.

user3657941 · Accepted Answer

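Protect access to your database with a multiprocessing.Lock. Only the read sits inside the with block, so one worker queries the database at a time while the others carry on with their math: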

from multiprocessing import Pool, Lock
import pandas as pd

conn_lock = Lock()

def f(x):
    with conn_lock:  # only one worker may hold the lock, so reads are serialized
        data = pd.read_sql(query[x], conn)
    # lock is released here; do large math operations in parallel
    return answer

if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))
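
One caveat, offered as a sketch rather than a definitive fix: a module-level Lock only reaches the workers when they are forked from the parent process. Under the spawn start method (the default on Windows and macOS), each worker re-imports the module and creates its own separate lock, so nothing is actually serialized. Passing the lock through the Pool initializer should work under either start method. In the example below, init_worker and read_rows are hypothetical names, and read_rows is only a stand-in for pd.read_sql(query[x], conn) so that the snippet runs without a database.

```python
import time
from multiprocessing import Pool, Lock

conn_lock = None  # set inside each worker by init_worker


def init_worker(lock):
    """Runs once per worker: keep the shared lock in a module-level global."""
    global conn_lock
    conn_lock = lock


def read_rows(x):
    """Hypothetical stand-in for pd.read_sql(query[x], conn)."""
    time.sleep(0.01)  # pretend the query takes a moment
    return list(range(x + 1))


def f(x):
    with conn_lock:  # serialized: one worker reads at a time
        data = read_rows(x)
    # lock already released; this part runs in parallel across workers
    return sum(v * v for v in data)


def run(n_tasks=10, n_workers=5):
    lock = Lock()  # created once in the parent, then handed to every worker
    with Pool(n_workers, initializer=init_worker, initargs=(lock,)) as p:
        return p.map(f, range(n_tasks))


if __name__ == '__main__':
    print(run())
```

The lock is passed via initargs rather than as an argument to p.map because a multiprocessing.Lock can only be shared with child processes at the time they are created, not pickled into a task afterwards.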
