Wiseman2022

Reputation: 7

How do I optimise a Python loop with parallelization/multiprocessing?

I have to loop N times to calculate formulas and add the results to a DataFrame. My code works and takes a few seconds to process each item. However, it can only process one item at a time because I'm running the array through a for loop.

I tried updating the code and added the numba library to optimise it:

def calculationResults(myconfig,df_results,isvalid,dimension,....othersparams):
    for month in nb.prange(0, myconfig.len_production):
        calculationbymonth(month,df_results,....othersparams)
    return df_results

But it's still processing one item at a time. Any ideas?

Upvotes: 0

Views: 155

Answers (1)

Smaurya

Reputation: 187

We can use a parallelized apply with a function similar to the one below.

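import numpy as np
import pandas as pd
from multiprocessing import Pool

def parallelize_dataframe(df, func, n_cores=4):
    # Split the DataFrame into n_cores chunks, one per worker process.
    df_split = np.array_split(df, n_cores)
    pool = Pool(n_cores)
    # func receives one chunk and must return a DataFrame; chunks are concatenated in order.
    df = pd.concat(pool.map(func, df_split))
    pool.close()
    pool.join()
    return df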
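If the work is organised per month rather than per DataFrame chunk, as in the question, the same idea can be applied to the month index directly. Note that nb.prange only runs in parallel inside a function compiled with @nb.njit(parallel=True); called from plain Python it behaves like range, and numba's compiled mode does not work on DataFrames anyway. Below is a minimal sketch using only the standard library. The names calculation_by_month and calculate_all_months and the formula inside are hypothetical placeholders, not the asker's real code, and the sketch assumes each month can be computed independently and returned as a row instead of being written into a shared df_results.

```python
from multiprocessing import Pool


def calculation_by_month(month):
    """Hypothetical stand-in for the per-month formulas; returns one result row."""
    return {"month": month, "value": month ** 2}


def calculate_all_months(n_months, n_workers=4):
    """Compute every month independently and return the rows in month order."""
    months = range(n_months)
    if n_workers <= 1:
        # Serial fallback: handy for debugging and for comparing timings.
        return [calculation_by_month(m) for m in months]
    with Pool(n_workers) as pool:
        # map() preserves input order, so row i belongs to month i.
        return pool.map(calculation_by_month, months)
```

Call it under an `if __name__ == "__main__":` guard, for example `rows = calculate_all_months(12)`, and build the DataFrame once at the end with `pd.DataFrame(rows)`. Returning rows matters because each worker process gets its own copy of the data, so in-place writes to a DataFrame inside a worker would not be visible in the parent process.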

Upvotes: 1
