I have a function `f(df, x)`, where `df` is a large dataframe and `x` is a simple variable. The function `f` only reads from `df` and doesn't modify it. Is it possible to share the memory of `df` rather than copying it to the sub-processes when using `joblib.Parallel` or another `multiprocessing` module?
Two constraints:

- I can't make `df` a global variable, as I'd like to reuse the code to process other data.
- I can't convert `df` into a numpy array, as `f` needs to locate data using the index of `df`.
**Edit:** Will `df` be copied to the sub-processes while executing `Parallel` in the following code?
```python
from joblib import Parallel, delayed

def g(df):
    def f(x):
        nonlocal df
        ...
        return z
    list_res = Parallel(10)(delayed(f)(x) for x in iterables)
    return list_res