Leockl

Reputation: 2156

What can I do when I keep exceeding the memory budget while using Dask-ML?

I am using Dask-ML to run some code that uses quite a bit of RAM during training. The training dataset itself is not large, but the training process uses a fair bit of RAM. I keep getting the following error message, even though I have tried different values for n_jobs:

distributed.nanny - WARNING - Worker exceeded 95% memory budget. Restarting

What can I do?

PS: I have also tried using a Kaggle Kernel (which allows up to 16GB of RAM) and this didn't work, so I am trying Dask-ML now. I am also just connecting to the Dask cluster with its default parameters, using the code below:

from dask.distributed import Client
import joblib

client = Client()

with joblib.parallel_backend('dask'):
    ...  # my own code (model training) runs here

Upvotes: 0

Views: 622

Answers (1)

quasiben

Reputation: 1464

Dask has a detailed page on techniques to help with memory management. You might also be interested in configuring spilling to disk for Dask workers. For example, rather than letting a worker climb all the way to the 95% termination threshold (the restart you are seeing), you can lower the fractions at which workers start spilling to disk and pause taking new tasks.
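Here is a minimal sketch of tuning those thresholds through Dask's configuration; the exact fractions and the n_workers/memory_limit values are illustrative assumptions to adapt to your machine, not settings taken from your question:

import dask
from dask.distributed import Client

# Fractions of each worker's memory limit at which actions kick in
# (the defaults are roughly target=0.6, spill=0.7, pause=0.8, terminate=0.95).
# The values below are illustrative only; tune them for your workload.
dask.config.set({
    "distributed.worker.memory.target": 0.5,      # start spilling managed data to disk
    "distributed.worker.memory.spill": 0.6,       # spill based on process memory usage
    "distributed.worker.memory.pause": 0.75,      # pause accepting new tasks
    "distributed.worker.memory.terminate": 0.95,  # the nanny restarts the worker
})

# Set the config before creating the client so the workers pick it up.
client = Client(n_workers=2, memory_limit="8GB")

Starting fewer workers with a larger memory_limit each, as in the last line, can also help: each worker gets a bigger share of the machine's memory, which makes it less likely that a single training task blows past the 95% budget.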

Upvotes: 1
