Reputation: 176
I know there are many questions on this topic, but none of them helped me much.
I have a Python project directory, git_project (a git repository). I want to create a separate directory called notebooks where I will keep all my notebooks for analysis using git_project. I don't want to put the notebooks within the root of git_project. I have kept both the git_project and notebooks directories in a general directory where I keep all of my projects. I have the following structure:
my_projects
│
├── notebooks
│   └── notebook.ipynb
└── git_project
    ├── config
    │   └── cfg.json
    └── source
        └── config.py
The contents of config.py:
import json

def get_cfg():
    with open('config/cfg.json', 'r') as f:
        cfg = json.load(f)
    return cfg
Contents of notebook.ipynb:
import sys
import os

module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

from git_project.source.config import get_cfg
get_cfg()
Now when I run the code in notebook.ipynb I get the following error:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-7-6796ee7f0100> in <module>
----> 1 get_cfg()
~/Documents/my_projects/git_project/source/config.py in get_cfg()
1 def get_cfg():
----> 2 with open('config/cfg.json', 'r') as f:
3 cfg = json.load(f)
4 return cfg
FileNotFoundError: [Errno 2] No such file or directory: 'config/cfg.json'
However, if I move the notebook.ipynb file to the root of git_project, I do not get this error. This is just one example. I have many similar problems in other modules of git_project, and git_project contains code that is already running in the production environment, so changing anything in git_project is not feasible here.
But as I said, I do not want to move the notebooks inside git_project; I would rather keep them in a parallel directory for analysis purposes. I can provide more information if required.
I am using Python 3.6+, which no longer requires an __init__.py file to make a directory a package.
What should I do in order for this to work? Any help will be much appreciated.
Upvotes: 1
Views: 1760
Reputation: 5219
Follow the path naming convention correctly when writing your relative path.
If you want to import a module from a different location relative to your current file, you can use the following convention.
Here yamlUtils.py is one folder above, at utilities/yaml/yamlUtils.py. Make sure you have an __init__.py file in every folder that contains Python files.
import os, sys
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), "../utilities/")))
If you want to read the config files relatively, you can follow the same convention: .. means go one level up from the location of your notebook, and then you append the rest of the relative path. If you keep the same folder structure, you won't have to change anything and it will always work.
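As a rough sketch of that convention applied to the layout in the question, the snippet below builds the path to cfg.json by going one level up from the notebooks folder. The helper name load_cfg_relative_to and the sample JSON content are made up for illustration, and the demo runs on a throwaway copy of the folder structure rather than on the real project.

```python
import json
import tempfile
from pathlib import Path


def load_cfg_relative_to(notebook_dir):
    """Load cfg.json by going one level up from notebook_dir into git_project."""
    # ".." steps from notebooks/ up to my_projects/, then down into git_project/config.
    cfg_path = Path(notebook_dir) / ".." / "git_project" / "config" / "cfg.json"
    with open(cfg_path.resolve(), "r") as f:
        return json.load(f)


# Demo on a temporary copy of the layout from the question (hypothetical content).
with tempfile.TemporaryDirectory() as root:
    notebooks = Path(root) / "notebooks"
    config_dir = Path(root) / "git_project" / "config"
    notebooks.mkdir()
    config_dir.mkdir(parents=True)
    (config_dir / "cfg.json").write_text('{"debug": true}')
    demo_cfg = load_cfg_relative_to(notebooks)
    print(demo_cfg)
```

In a notebook you would presumably pass os.getcwd() as notebook_dir. Note that this reads the file directly and does not go through get_cfg(), so it only helps for code you control.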
Upvotes: 0
Reputation: 2416
A simple solution is to change the working directory.
Initially the working directory is the notebooks directory:
from sys import path
import os
print("Current Working Directory", os.getcwd())
Output:
Current Working Directory /home/user/git_project/notebooks
Then you change it to the root of your project:
os.chdir(os.path.dirname(path[0]))
print("New Working Directory", os.getcwd())
Output:
New Working Directory /home/user/git_project
After that, all imports with relative path on the project root directory should work.
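If you would rather not change the working directory for the whole notebook session, one possible variation is a small context manager that switches directories only around the calls that need it. This is a sketch, not part of the original suggestion; the name working_directory is hypothetical, and a temporary directory stands in for the project root.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily switch the working directory, restoring it afterwards."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always go back, even if the code inside the block raised.
        os.chdir(previous)


# Demo: a temporary directory plays the role of the project root.
before = os.getcwd()
with tempfile.TemporaryDirectory() as project_root:
    with working_directory(project_root):
        inside = os.getcwd()
after = os.getcwd()
print(inside)
print(after == before)
```

With the real project, the block body would be the call that relies on project-relative paths, for example get_cfg().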
Upvotes: 0
Reputation: 578
The issue is that when you call open('config/cfg.json', 'r'), the path is resolved relative to the current working directory, that is, the directory from which the Python code was launched. In this case, that is your my_projects/notebooks directory. You can see this by adding the following prints to get_cfg() inside config.py:
import os  # needed for os.getcwd(); config.py only imports json
print(os.getcwd()) # this prints out the current working directory
print(__file__) # this prints out the path of this script
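To make the effect concrete, here is a small self-contained sketch showing that the same relative path resolves to different absolute paths depending on the working directory. The folder names mirror the question, but they are created inside a temporary directory purely for the demonstration.

```python
import os
import tempfile

relative = os.path.join("config", "cfg.json")
start = os.getcwd()
resolved = []

with tempfile.TemporaryDirectory() as root:
    for name in ("notebooks", "git_project"):
        folder = os.path.join(root, name)
        os.mkdir(folder)
        os.chdir(folder)
        # abspath() joins the relative path onto the current working directory,
        # which is the same base that open() uses for a relative path.
        resolved.append(os.path.abspath(relative))
    os.chdir(start)  # leave the temporary directory before it is deleted

for path in resolved:
    print(path)
```

The first printed path ends in notebooks/config/cfg.json, which does not exist in the question's layout, while the second ends in git_project/config/cfg.json.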
As Ahmet suggested, modifying the path to ../git_project/config/cfg.json will work, but your Python implementation will then be tied to the location of the notebooks folder. If you decide to restructure the notebooks folder, it will break again.
One potential way is to build the path from the script path, __file__:
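import json
import os

def get_cfg():
    # Directory that contains this script (git_project/source)
    script_dirname = os.path.dirname(__file__)
    # Go one level up to git_project, then into config/
    config_path = os.path.join(script_dirname, '..', 'config', 'cfg.json')
    with open(config_path, 'r') as f:
        cfg = json.load(f)
    return cfg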
Similar suggestion: (Reading file using relative path in python project). This is also the approach that is suggested in python-packaging docs:
Files which are to be used by your installed library (e.g. data files to support a particular computation method) should usually be placed inside of the Python module directory itself. ... That way, code which loads those files can easily specify a relative path from the consuming module’s __file__ variable.
If you don't want to touch the current file inside git_project, you can run a change directory command in your Python notebook to point to the right location:
In [1]: %cd ../git_project
This line needs to be called once each time you restart the notebook kernel. You can verify the current working directory in the notebook as well:
In [2]: %ls
Upvotes: 1
Reputation: 388
As a follow-up to the discussion in the comments...
Even though approaches with relative paths work, it is often better to use a more scalable approach: an environment variable holding your project root.
This way, your notebooks are truly independent of the project and can be used with as many projects as you want.
Here is a great explanation of how to use ENV variables in Jupyter.
I prefer the dotenv approach.
Create a .env file in your notebooks folder. Add a variable with your project path:
MY_PROJECT_ROOT=/usr/any/path/you/want
Then, in your notebook:
import os
import sys
from dotenv import load_dotenv
load_dotenv() # this line loads .env file
Then your code:
module_path = os.path.abspath(os.getenv('MY_PROJECT_ROOT'))
if module_path not in sys.path:
    sys.path.append(module_path)
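One thing to watch for: if the variable is not set, os.getenv returns None and os.path.abspath(None) fails with a TypeError that does not say much. A possible guard is sketched below; the helper name add_project_root is hypothetical, and the sketch leaves out the load_dotenv() call, which is assumed to have run already.

```python
import os
import sys


def add_project_root(var_name="MY_PROJECT_ROOT"):
    """Append the directory named by an environment variable to sys.path."""
    root = os.getenv(var_name)
    if root is None:
        # Fail early with a readable message instead of a TypeError from abspath().
        raise RuntimeError(f"{var_name} is not set; check your .env file")
    root = os.path.abspath(root)
    if root not in sys.path:
        sys.path.append(root)
    return root
```

Calling add_project_root() right after load_dotenv() would then replace the three lines above.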
Upvotes: 0