Reputation: 3215
I have one DataFrame that needs to be split and written to different Excel files based on a specific column:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(28).reshape((7, 4)))
df['group'] = ['a', 'a', 'c', 'c', 'd', 'd', 'e']
    0   1   2   3 group
0   0   1   2   3     a
1   4   5   6   7     a
2   8   9  10  11     c
3  12  13  14  15     c
4  16  17  18  19     d
5  20  21  22  23     d
6  24  25  26  27     e
I need to split on the group column and write each part to its own xlsx file. I currently use:
for group in df['group'].unique():
    group_df = df[df['group'] == group]
    group_df.to_excel(some_path)  # some_path: a different file for each group
Is there a way to do this concurrently rather than with a for loop?
Upvotes: 2
Views: 403
Reputation: 4539
Sort of. You would still need a for loop to hand each group off to its own thread.
That being said, you won't see any performance gain from using concurrency here. There are no external blocking APIs to wait on, and your operation is certainly I/O-bound rather than CPU-bound.
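For illustration only, here is a minimal sketch of what "a for loop that breaks out into threads" could look like. The helper name write_groups, its parameters, and the one-file-per-group naming scheme are assumptions made for this example, not something from the question, and as noted above it may well be no faster than the plain loop.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import numpy as np
import pandas as pd


def write_groups(df, out_dir, column="group", max_workers=4,
                 write=pd.DataFrame.to_excel, ext=".xlsx"):
    """Write one file per distinct value of `column`; return the paths written.

    Hypothetical helper: the name, the parameters, and the "<value><ext>"
    file naming are assumptions made for this sketch.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    futures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The loop is still here: it only hands each group to a worker thread.
        for key, group_df in df.groupby(column):
            path = out_dir / f"{key}{ext}"
            futures[path] = pool.submit(write, group_df, path)
        # result() re-raises any exception that occurred in a worker.
        for future in futures.values():
            future.result()
    return sorted(futures)
```

With the frame from the question, write_groups(df, "out") would be expected to create out/a.xlsx, out/c.xlsx, out/d.xlsx and out/e.xlsx, provided an Excel writer engine such as openpyxl is installed. Timing this against the original loop on real data is the only reliable way to tell whether the threads help at all.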
Upvotes: 1