Nguai al

Reputation: 958

Jupyter Lab instance crashes with 502 error

I am using a JupyterLab virtual notebook instance from GCP Vertex AI Workbench.

I am reading 2 billion rows of data, where each row consists of 3 columns of 8 bytes each.

I am reading 100 million rows at a time and concatenating each batch to a Pandas DataFrame.
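For reference, the loop has roughly this shape (a simplified sketch: read_chunk is a placeholder, since the real data source is not shown, and the row counts are scaled down so the sketch actually runs). Note that the final frame is about 2,000,000,000 × 3 × 8 bytes ≈ 48 GB, and concatenating inside the loop briefly holds both the old and the new frame, so peak memory runs well above that:

    import numpy as np
    import pandas as pd

    N_CHUNKS = 20           # 20 chunks of 100M rows = 2B rows in the real run
    CHUNK_ROWS = 1_000_000  # scaled down from 100_000_000 so the sketch runs

    def read_chunk(i: int) -> pd.DataFrame:
        # Placeholder for the real read (BigQuery, Parquet, CSV, ...):
        # three 8-byte float columns per row, as described above.
        return pd.DataFrame(np.random.rand(CHUNK_ROWS, 3), columns=["a", "b", "c"])

    # Concatenating once at the end, instead of inside the loop, avoids
    # re-copying the accumulated frame on every iteration and keeps peak
    # memory closer to the size of the final frame.
    chunks = [read_chunk(i) for i in range(N_CHUNKS)]
    df = pd.concat(chunks, ignore_index=True)
    print(df.memory_usage().sum() / 1e9, "GB")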

All of a sudden, the notebook becomes unresponsive with a 502 error.

I realized that the virtual machine had crashed.

Here is the spec of the virtual machine: n1-standard-64, 240 GB RAM, 100 GB drive.

One time I successfully reached 2 billion rows, but on another run, to my dismay, it crashed with that error.

The Google documentation just says to restart the kernel. That is not so easy when it took more than an hour to read the 2 billion rows of data; more than an hour of work is wasted each time.

What is causing this error? Why does it occur so inconsistently? Where is the error message for this crash? Or is this an error related to the Pandas DataFrame? I am creating a DataFrame that has 2 billion rows. If Pandas cannot handle rows of this magnitude, it should simply raise a runtime error, not crash the virtual machine.

Thanks in advance

Upvotes: 2

Views: 1290

Answers (1)

Jose Gutierrez Paliza

Reputation: 1428

This error happens because the code runs into port overlaps. It is supposed to be fixed, since the part of the code that stops the kernel has been changed on GitHub: the change replaced restart_kernel with shutdown_kernel.
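The change itself lives in the service's code on GitHub, but jupyter_client exposes both calls, so here is a minimal sketch (assuming jupyter_client's KernelManager API) of the difference between the two:

    from jupyter_client import KernelManager

    km = KernelManager()
    km.start_kernel()

    # restart_kernel stops the kernel and immediately starts a new one,
    # reusing the same connection file and ports by default; that reuse
    # is where port overlaps can occur if cleanup is incomplete.
    km.restart_kernel(now=True)

    # shutdown_kernel stops the kernel and cleans up its connection file
    # and ports, so the next kernel starts from a clean slate.
    km.shutdown_kernel(now=True)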

We also need to be sure that the container is cleaned up when shutting down the kernel.

To verify this, you can follow these steps (a scripted version of them follows the list):

  • Create a notebook
  • Run a few cells
  • Kill the kernel
  • Start a new kernel
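A scripted version of those steps (an illustration using jupyter_client; in practice you would click through the JupyterLab UI):

    from jupyter_client import KernelManager

    # Create/start a kernel, as opening a notebook would.
    km = KernelManager()
    km.start_kernel()
    kc = km.client()
    kc.start_channels()
    kc.wait_for_ready(timeout=60)

    # "Run a few cells."
    kc.execute("x = 1 + 1")
    kc.get_shell_msg(timeout=30)

    # "Kill the kernel" with a full shutdown so its ports and
    # connection file are released.
    kc.stop_channels()
    km.shutdown_kernel(now=True)

    # "Start a new kernel" -- it should come up cleanly, without
    # colliding with the old kernel's ports.
    km2 = KernelManager()
    km2.start_kernel()
    km2.shutdown_kernel(now=True)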

Upvotes: 1
