Callmeat911

Reputation: 203

Why does my Google Colab session keep crashing?

I am using Google Colab on a dataset with 4 million rows and 29 columns. When I run the statement sns.heatmap(dataset.isnull()), it runs for some time, but after a while the session crashes and the instance restarts. This has been happening a lot, and so far I haven't been able to see any output. What could be the possible reason? Is the data/calculation too much? What can I do?

Upvotes: 20

Views: 60676

Answers (8)

Isaac Kobby Anni

Reputation: 1

I only changed the runtime to TPU and it worked perfectly for me. If you are dealing with dense graph data, especially with language-model embeddings for your data.x features, the embeddings increase the data size, and the TPU runtime has the most RAM of the available options to rescue you.

Upvotes: 0

Sasidhar Nandikolla

Reputation: 1

The most common cause is an out-of-memory error. One possible reason is that you specified too large a batch size while training your model; try reducing the batch size.
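For example, a minimal sketch of lowering the batch size (assuming a PyTorch training loop; the dataset and sizes here are illustrative):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Illustrative dataset: 10k samples, 29 features (any map-style dataset works)
data = TensorDataset(torch.randn(10_000, 29), torch.randint(0, 2, (10_000,)))

# Peak memory per step scales with batch_size; if the session crashes,
# keep halving it (e.g. 256 -> 128 -> 64 -> 32) until training fits in RAM
loader = DataLoader(data, batch_size=32, shuffle=True)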

Upvotes: 0

Amadeo Amadei

Reputation: 21

What worked for me was to click on the RAM/Disk Resources drop down menu, then 'Manage Sessions' and terminate my current session which had been active for days. Then reconnect and run everything again.

Before that, my code kept crashing even though it was working perfectly the previous day, so I knew there was nothing wrong coding wise.

After doing this, I also realized that the parameter n_jobs in GridSearchCV (https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html) plays a massive role in RAM consumption. For example, for me execution works fine and doesn't crash if n_jobs is set to None, 1 (same as None), or 2. Setting it to -1 (using all processors) or >3 crashes everything.
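A minimal sketch of the safe setting (the estimator and parameter grid here are illustrative; each extra worker holds its own copy of the estimator and data, which is presumably why -1 exhausts Colab's limited RAM):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)

# n_jobs=1 runs the search in a single process; higher values spawn that
# many parallel workers, multiplying memory use accordingly
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 10]},
    n_jobs=1,
    cv=3,
)
search.fit(X, y)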

Upvotes: 1

RAP

Reputation: 11

I would first suggest closing your browser and restarting the notebook. Look at the runtime logs and check whether CUDA is mentioned anywhere. If not, do a factory runtime reset and run the notebook. Check your logs again and you should find CUDA somewhere in there.
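For instance, a quick way to search the log for CUDA entries (a sketch; the log path is the one mentioned in Sam's answer below and may differ on your instance):

!grep -i cuda /var/log/colab-jupyter.log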

Upvotes: 1

Muhammad Talha

Reputation: 832

This error mostly occurs if you enable the GPU but do not use it. Change your runtime type to "None" and you will not face this issue again.

Upvotes: 2

yoavs

Reputation: 5

For me, passing specific arguments to the tfms augmentation broke the dataloader and crashed the session. I wasted a lot of time checking that the images weren't corrupt, cleaning up with the garbage collector, and more...
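The answer doesn't name the framework, but tfms is fastai's convention; a minimal sketch of sticking to the augmentation defaults (the dataset path and folder layout are assumptions):

from fastai.vision.all import *

# Assumed layout: an image folder with train/ and valid/ subfolders
path = Path("data/images")

# aug_transforms() with its default arguments; in cases like this one,
# unusual argument values in the batch transforms crashed the dataloader
dls = ImageDataLoaders.from_folder(
    path,
    item_tfms=Resize(224),
    batch_tfms=aug_transforms(),
)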

Upvotes: 0

Sam

Reputation: 391

I'm not sure what is causing your specific crash, but a common cause is an out-of-memory error. It sounds like you're working with a large enough dataset that this is probable. You might try working with a subset of the dataset and see if the error recurs.
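For example, a minimal sketch of the subsetting idea (assuming dataset is the pandas DataFrame from the question):

import seaborn as sns

# Draw the null-value heatmap from a random sample instead of all 4M rows;
# the boolean matrix passed to sns.heatmap then stays small enough for RAM
sns.heatmap(dataset.sample(n=100_000, random_state=0).isnull())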

Otherwise, Colab keeps logs in /var/log/colab-jupyter.log. You may be able to get more insight into what is going on by printing its contents. Either run:

!cat /var/log/colab-jupyter.log

Or, to get the messages alone (easier to read):

import json

# Each line of the log is a JSON object; print only its "msg" field
with open("/var/log/colab-jupyter.log", "r") as fo:
    for line in fo:
        print(json.loads(line)['msg'])

Upvotes: 23

user1114

Reputation: 1169

Another cause: you're using PyTorch and assign your model to the GPU, but don't assign an internal tensor to the GPU (e.g. a hidden state you create yourself).
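A minimal sketch of the correct placement (the nn.LSTM and tensor shapes here are illustrative):

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.LSTM(input_size=8, hidden_size=16).to(device)  # model on the GPU
x = torch.randn(5, 3, 8, device=device)                   # input on the GPU

# A manually created hidden state must be moved too; leaving it on the
# CPU while the model sits on the GPU triggers a device-mismatch error
h0 = torch.zeros(1, 3, 16, device=device)
c0 = torch.zeros(1, 3, 16, device=device)
out, _ = model(x, (h0, c0))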

Upvotes: 5
