Reputation: 4807
I have the following code:
from multiprocessing import Pool
import pandas as pd
def f(x):
data = pd.read_sql(query[x], conn) #query and conn are particular to my PC so no point in pasting it here
#do large math operations here
return answer
if __name__ == '__main__':
p = Pool(5)
print(p.map(f, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))
I have 8 processors on my PC. Right now all processors access the database through conn
simultaneously which is causing some problems on database end.
How do I change the above code so that access to database is done one at a time. The moment database access by one processor is finished another processor is free to access the database again. The processor which has finished the database access should continue with doing the math operations. Basically, I am trying to make sure that database access is not simultaneous but database access code stays within the multiprocessing framework. As a last resort I can try to read the data before I process them but I was looking to see whether I can do it without changing the existing code.
Upvotes: 1
Views: 1109
Reputation:
Protect access to your database with a multiprocessing.Lock
:
from multiprocessing import Pool, Lock
import pandas as pd
conn_lock = Lock()
def f(x):
with conn_lock:
data = pd.read_sql(query[x], conn)
#do large math operations here
return answer
if __name__ == '__main__':
p = Pool(5)
print(p.map(f, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))
Upvotes: 1