Leighton Ritchie

Reputation: 501

Jupyter Notebook/Lab: set current directory to the .ipynb file's directory

Desired behaviour

We have an existing workflow in vanilla Jupyter Notebook/Lab where we use relative paths to store outputs of some notebooks. Example:

In both notebooks, we produce the output by simply writing to ./output.log or so.

Problem

However, we are now trying Google Dataproc with the Jupyter optional component, and the current directory is always / regardless of which notebook the code is run from. This applies to both the Notebook and Lab interfaces.

What I've tried

Disabling c.FileContentsManager.root_dir='/' in /etc/jupyter/jupyter_notebook_config.py causes the current directory to be set to wherever I started jupyter notebook from, but it always stays at that initial starting folder instead of following the location of each .ipynb notebook file.

Any idea on how to restore the "dynamic" current directory behaviour?

Even if it's not possible, I'd like to understand how Dataproc even makes Jupyter behave differently.

Details

Upvotes: 2

Views: 2290

Answers (2)

Eben du Toit

Reputation: 466

A general solution for most use cases is described in this comment on the GitHub issue: https://github.com/ipython/ipython/issues/10123#issuecomment-354889020

Upvotes: 0

Sayan Bhattacharya

Reputation: 1368

No, it is not possible to always get the directory where your .ipynb file is located. Jupyter is running from the local filesystem of the master node of your cluster, and its kernel will always use the default system path.

In other cases (besides Dataproc) it is also not possible to consistently get the path of a Jupyter notebook. See the thread linked at the end of this answer for more on this topic.

You have to specify the directory path explicitly for your log file to be saved in the desired location.
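As a minimal sketch of specifying the path explicitly (not something Dataproc or Jupyter provides): the helper output_path and the variable NOTEBOOK_DIR below are hypothetical names introduced for illustration, and a temporary directory stands in for the notebook's real folder so the snippet runs anywhere.

```python
import tempfile
from pathlib import Path


def output_path(notebook_dir, filename="output.log"):
    """Return an absolute path for `filename` inside `notebook_dir`.

    The directory is created if it does not exist yet, so the result
    does not depend on the kernel's current working directory.
    """
    base = Path(notebook_dir).expanduser().resolve()
    base.mkdir(parents=True, exist_ok=True)
    return base / filename


# Hypothetical: in a real notebook this would be the folder that holds the
# .ipynb file, written out by hand. A temp directory is used here instead.
NOTEBOOK_DIR = Path(tempfile.mkdtemp()) / "project_a"

log_file = output_path(NOTEBOOK_DIR)
log_file.write_text("run finished\n")
```

Each notebook would set its own NOTEBOOK_DIR at the top, which keeps outputs next to the notebook even when the kernel starts in /.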

Note that the GCS folder in your Lab refers to the Google Cloud Storage bucket of your cluster. You can create .ipynb files in GCS, but when you execute one, it runs on the local filesystem. Thus you will not be able to save log files to GCS directly.


EDIT:

It's not only Dataproc that makes Jupyter behave differently. If you use Google Colab notebooks, you will see the same behaviour there.

The reason is that you're always executing code in the kernel, no matter where the notebook file is. In theory, multiple notebooks could connect to that same kernel, so you can't have multiple working directories for one kernel.
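The point about a single working directory per kernel can be illustrated with plain Python, since a kernel is one operating-system process. This is only a sketch using temporary directories as stand-ins for two notebook folders; dir_a and dir_b are illustrative names.

```python
import os
import tempfile

# Remember where the process started so it can be restored afterwards.
start_dir = os.getcwd()

# Two temporary directories standing in for two notebooks' folders.
dir_a = os.path.realpath(tempfile.mkdtemp())
dir_b = os.path.realpath(tempfile.mkdtemp())

# The working directory is a property of the whole process: changing it
# for one "notebook" replaces the previous value for all code in the kernel.
os.chdir(dir_a)
cwd_after_a = os.path.realpath(os.getcwd())

os.chdir(dir_b)
cwd_after_b = os.path.realpath(os.getcwd())

# Restore the original working directory.
os.chdir(start_dir)
```

After the second os.chdir call, nothing in the process still sees dir_a as the current directory, which is why relative paths such as ./output.log depend entirely on where the kernel happens to be at that moment.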

As mentioned, by default, when you start a notebook, the current working directory is set to the directory containing the notebook.

Link to the main thread -> https://github.com/ipython/ipython/issues/10123

Upvotes: 1
