SantoshGupta7

Reputation: 6197

How to make Dask process fewer partitions/files at a time?

I am trying to use to_parquet, but it crashes my system with a memory error. I've discovered that it's trying to save 100-300 of my partitions at a time.

Is it possible to somehow specify that I want fewer partitions processed at a time in order to prevent a crash due to using up all the RAM?

Upvotes: 0

Views: 38

Answers (1)

MRocklin

Reputation: 57281

Dask will run as many tasks at a time as you give it threads. The tasks may be shown as "processing", but that just means that they have been sent to a worker, which will handle them when it has a spare thread.

"I am trying to use to_parquet but it crashes my system due to memory error."

However, it could still be that your partitions are so large that you can't fit several of them in memory at once. In that case you might want to choose a smaller partition size. See https://docs.dask.org/en/latest/best-practices.html#avoid-very-large-partitions for more information.

Upvotes: 1
