Aran Freel

Reputation: 3215

Concurrently write pandas DataFrame to xlsx

I have one DataFrame that needs to be split and written to separate Excel files based on the values in a specific column:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(28).reshape((7, 4)))

df['group'] = ['a', 'a', 'c', 'c', 'd', 'd', 'e']


    0   1   2   3   group
0   0   1   2   3   a
1   4   5   6   7   a
2   8   9   10  11  c
3   12  13  14  15  c
4   16  17  18  19  d
5   20  21  22  23  d
6   24  25  26  27  e

I need to split the DataFrame on the group column and write each part to its own xlsx file. I currently use:

for group in df['group'].unique():
    group_df = df[df['group'] == group]
    group_df.to_excel(some_path)  # some_path: placeholder for a per-group output path

Is there a way to do this concurrently rather than with a for loop?

Upvotes: 2

Views: 403

Answers (1)

Jacobm001

Reputation: 4539

Sort of. You would still need a for loop to break out into the individual threads.

That being said, you won't see any performance gains from using concurrency here. You have no external blocking APIs to wait on, and your operation is almost certainly I/O-bound rather than CPU-bound.
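For reference, here is a minimal sketch of what "a for loop that breaks out into threads" could look like, using the standard library's `concurrent.futures.ThreadPoolExecutor`. The names `write_group` and `write_groups_threaded`, the `out_dir` argument, and the one-file-per-group naming scheme are all hypothetical choices for illustration, not something from the question. Whether it actually runs faster than the plain loop is not guaranteed, so time both before adopting it.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import pandas as pd


def write_group(group_df: pd.DataFrame, path: Path) -> Path:
    # Write one group's rows to its own workbook.
    # to_excel needs an Excel engine such as openpyxl to be installed.
    group_df.to_excel(path)
    return path


def write_groups_threaded(df: pd.DataFrame, out_dir, column="group",
                          writer=write_group, suffix=".xlsx"):
    # Hypothetical helper: one output file per distinct value in `column`.
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor() as pool:
        # The loop is still here: it submits one write task per group.
        futures = [
            pool.submit(writer, group_df, out_dir / f"{name}{suffix}")
            for name, group_df in df.groupby(column)
        ]
        # result() waits for each task and re-raises any worker exception.
        return [future.result() for future in futures]
```

Calling `write_groups_threaded(df, "out")` on the DataFrame from the question would be expected to produce one workbook per group in an `out` directory (an assumed location). The `writer` parameter is only there so the write step can be swapped out, for example for a CSV writer when comparing timings.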

Upvotes: 1
