Bo Qiang

Reputation: 789

Sort very large data with Dask?

I need to sort a data table that is far larger than the physical memory of the machine I am using. pandas cannot handle it because it needs to read the entire dataset into memory. Can Dask handle that?

Thanks!

Upvotes: 4

Views: 1507

Answers (1)

MRocklin

Reputation: 57301

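Yes. Call set_index on the column that you wish to sort by; the resulting dataframe is ordered by that column, which becomes its index. On a single machine, Dask uses your hard drive intelligently for excess space, so the data does not need to fit in memory.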

Upvotes: 3
