R zu
R zu

Reputation: 2074

Recalculate task under memory constraint

Each block in Dask array A leads to a block in each of Dask arrays B0, B1, ... Bn. The B arrays are subsequently saved to disk as zarr.

For each block in A, the corresponding blocks in B together exceeds the memory of a computing node. The memory can hold one block of one B array.

If I am just using a single computer to calculate the B arrays, will Dask recalculate each blocks of A to keep the live blocks of the B arrays under memory limit?

Can I hint the memory usage of each task to the scheduler?

Upvotes: 1

Views: 35

Answers (1)

MRocklin
MRocklin

Reputation: 57281

As of today Dask will never recalculate a task for memory reasons. It will store excess data to disk (assuming you are using the dask.distributed scheduler) with an LRU policy. Dask generally assumes that the result of individual tasks fits comfortably in memory.

Can I hint the memory usage of each task to the scheduler?

No.

Upvotes: 2

Related Questions