Reputation: 96
I have a dask Series X filled with strings, each containing a lot of text, that I want to split into columns. This is what I was doing:
cols = 2867847
W = X.str.split(n=cols, expand=True)  # X has 3320 rows and npartitions=1000
I can't simply increase the number of partitions to account for the number of columns, because dask partitions a DataFrame row-wise. Is it possible to partition over the columns instead?
Upvotes: 0
Views: 138
Reputation: 57281
It is odd to use pandas-style dataframes with thousands of columns. Perhaps some other API would suit your situation better? Maybe dask.delayed, dask.bag, or xarray?
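As a rough illustration of the dask.bag option, here is a minimal sketch that keeps each row as a list of tokens instead of expanding it into millions of DataFrame columns. The `texts` list is a hypothetical stand-in for the real data, and whitespace splitting is assumed; whether this fits depends on what you need to do with the tokens afterwards.

```python
import dask.bag as db

# Hypothetical stand-in for the real data: a few whitespace-separated strings.
texts = ["a b c", "d e", "f g h i"]

# One record per string; partitioning is still over records, not over tokens.
bag = db.from_sequence(texts, npartitions=2)

# Each record becomes a variable-length list of tokens (no wide DataFrame).
tokens = bag.map(str.split)

# Per-record computations run lazily until compute() is called.
# The synchronous scheduler is used here only to keep the sketch simple.
lengths = tokens.map(len).compute(scheduler="synchronous")
first_tokens = tokens.map(lambda t: t[0]).compute(scheduler="synchronous")

print(lengths)
print(first_tokens)
```

If the data already lives in a dask Series, `X.to_bag()` should give an equivalent starting point in place of `db.from_sequence`.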
Upvotes: 1