Reputation: 2803
How can I create matplotlib figures in parallel? That is, I want to build the figures in worker processes and then display them in the main process.
Here is an example:
# input data
import pandas as pd, matplotlib.pyplot as plt
df = pd.DataFrame(data={'i':['A','A','B','B'],
                        'x':[1.,2.,3.,4.],
                        'y':[1.,2.,3.,4.]})
df.set_index('i', inplace=True)
df.sort_index(inplace=True)

# function which creates a figure from the data
def Draw(df, i):
    fig = plt.figure(i)
    ax = fig.gca()
    df = df.loc[i,:]
    ax.scatter(df['x'], df['y'])
    return fig

def DrawWrapper(x): return Draw(*x)

# creating figures in parallel
from multiprocessing import Pool
poolSize = 2
with Pool(poolSize) as p:
    args = [(df,'A'), (df,'B')]
    figs = p.map(DrawWrapper, args)

# attempt to visualize the results
fig = plt.figure('A')
plt.show()
# FIXME: get "RuntimeError: main thread is not in main loop"
How do I transfer the figure objects from the worker processes so that I can show the figures in the main process?
Thank you for your help!
[EDIT:] It was suggested that the problem might be solved by this thread.
Here is the corresponding code:
# input data
import pandas as pd, matplotlib.pyplot as plt
df = pd.DataFrame(data={'i':['A','A','B','B'],
                        'x':[1.,2.,3.,4.],
                        'y':[1.,2.,3.,4.]})
df.set_index('i', inplace=True)
df.sort_index(inplace=True)

# function which creates a figure from the data
def Draw(df, i):
    fig = plt.figure(i)
    ax = fig.gca()
    df = df.loc[i,:]
    ax.scatter(df['x'], df['y'])
    plt.show()

# creating figures in parallel
from multiprocessing import Process
args = [(df,'A'), (df,'B')]
for a in args:
    p = Process(target=Draw, args=a)
    p.start()

# FIXME: result is the same (might be even worse since I do not
# get any result which I could attempt to show):
# ...
# RuntimeError: main thread is not in main loop
# RuntimeError: main thread is not in main loop
Am I missing something?
Upvotes: 3
Views: 5494
Reputation: 339765
The answer to the linked question guards the code that starts the processes with an if __name__ == "__main__": clause. Hence the following should work here.
import pandas as pd
import matplotlib.pyplot as plt
import multiprocessing
#multiprocessing.freeze_support() # <- may be required on windows

df = pd.DataFrame(data={'i':['A','A','B','B'],
                        'x':[1.,2.,3.,4.],
                        'y':[1.,2.,3.,4.]})
df.set_index('i', inplace=True)
df.sort_index(inplace=True)

# function which creates a figure from the data
def Draw(df, i):
    fig, ax = plt.subplots()
    df = df.loc[i,:]
    ax.scatter(df['x'], df['y'])
    plt.show()

# creating figures in parallel
args = [(df,'A'), (df,'B')]

def multiP():
    for a in args:
        p = multiprocessing.Process(target=Draw, args=a)
        p.start()

if __name__ == "__main__":
    multiP()
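
Note that in the code above each figure is shown by its own child process. If the figures really have to end up in the main process, as the question asks, one possible workaround (a different technique from the code above, and only a sketch) is to let the workers render off-screen and send back PNG bytes instead of figure objects. The names render_png, show_pngs and main are hypothetical, and the data is passed as plain lists rather than a DataFrame to keep the example short.

```python
import io
from multiprocessing import Pool

from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


def render_png(task):
    """Worker: draw one scatter plot off-screen and return (label, PNG bytes)."""
    label, xs, ys = task
    fig = Figure()            # plain Figure, not registered with pyplot
    FigureCanvasAgg(fig)      # attach a non-GUI (Agg) canvas, so no GUI main loop is involved
    ax = fig.add_subplot(111)
    ax.scatter(xs, ys)
    ax.set_title(label)
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    return label, buf.getvalue()


def show_pngs(pngs):
    """Main process: display the rendered images with the interactive backend."""
    import matplotlib.image as mpimg
    import matplotlib.pyplot as plt

    for label, data in pngs:
        fig, ax = plt.subplots(num=label)
        ax.imshow(mpimg.imread(io.BytesIO(data), format="png"))
        ax.axis("off")
    plt.show()


def main():
    # Hypothetical input: one (label, x values, y values) tuple per figure.
    tasks = [("A", [1.0, 2.0], [1.0, 2.0]), ("B", [3.0, 4.0], [3.0, 4.0])]
    with Pool(2) as pool:
        pngs = pool.map(render_png, tasks)
    show_pngs(pngs)
```

To run it, call main() from under an if __name__ == "__main__": guard, for the same reason as in the code above. The trade-off is that the main process only receives static images, so the displayed plots cannot be zoomed or edited like live figures.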
Upvotes: 2