Python project structure - Avoid setting directory in all files

Question

I have a Python project called MyProject. It contains the following:

__init__.py (empty, i.e. no code)
main.py (used to run "my final project")
A folder called data which for now only contains data.sqlite
A folder called utils which has an __init__.py and a few other .py files, e.g. data_handler.py.
Some other folders with the same structure as utils.

In my .py-files, e.g. data_handler.py, I include the usual

if __name__ == "__main__":

in order to run the file as a single module without executing everything from main.py.

As many of the files need to use the data from data.sqlite I use relative paths to find this. However, at the moment I end all of my files with

if __name__ == "__main__":
    os.chdir(r'C:\Users\my_pcs_username\Desktop\MyProject')  # raw string, no trailing backslash; needs "import os"
    # code to be executed in this file

This does not seem like the right way to work with multiple files, but the code does not work if I do not include the change of directory. When I run any of the files in PyCharm, the terminal is initialized with the following line: C:\Users\my_pcs_username\anaconda3\envs\MyProject\python.exe C:/Users/my_pcs_username/Desktop/MyProject/utils/data_handler.py

What is the right way to make a project like this where each of the files can run as its own module?

JL Peyret · Accepted Answer

This is only answering how to find your non-Python files without chdir and the like *.

How about this?

Basically, use pathlib.Path together with the __file__ variable, which corresponds to your Python source file. It is nearly always available (it might be absent in exotic environments where Python code is not stored in the file system).

from pathlib import Path

PA_SCRIPT = Path(__file__)

def main():
    sqldata = PA_SCRIPT.parent / "data" / "data.sqlite"
    data = sqldata.read_text()
    print(f"main: {data}")

if __name__ == "__main__":
    main()

files

.
├── __init__.py
├── data
│   └── data.sqlite
└── main.py

% cat data/data.sqlite 
I am your db

program output:

% py main.py          
main: I am your db
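The same idea extends to files that live in subfolders such as utils/data_handler.py. What follows is only a sketch under the layout described in the question: data_path and open_db are hypothetical helper names, not part of any library, and levels_up is an assumed parameter saying how many folders the script sits below the project root.

```python
import sqlite3
from pathlib import Path


def data_path(script_file, levels_up=1):
    """Return <project root>/data/data.sqlite for a script below the root.

    script_file is normally the calling module's own __file__.
    levels_up is how many folders the script sits below the project root
    (hypothetical parameter, introduced for this sketch).
    """
    script = Path(script_file).resolve()
    # parents[0] is the script's folder, parents[1] the folder above it, ...
    project_root = script.parents[levels_up]
    return project_root / "data" / "data.sqlite"


def open_db(script_file, levels_up=1):
    """Open the project database without relying on the working directory."""
    return sqlite3.connect(data_path(script_file, levels_up))


# In utils/data_handler.py this would be used as:
#     connection = open_db(__file__)               # utils/ is one level down
# and in main.py as:
#     connection = open_db(__file__, levels_up=0)  # main.py sits in the root
```

Because the path is anchored to the script's own location, it no longer matters which directory PyCharm or a terminal starts in, so the os.chdir call can be dropped.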

* How to make independently-runnable scripts isn't necessarily super-complicated, but assuming people even agree it's a good idea, there are probably multiple ways to do it.

I use click and sprinkle code that looks something like this for Python files that have a reason to be run solo.

That can be interpreted rather loosely: my constants.py file can be run solo, and it basically pretty-prints its globals(). That sounds odd, except that many of the values come from environment variables and so are not visible in the source code.

With regard to the data file, you'd use the same mechanism, but take into account the position of your Python script relative to the data directory. For utils/data_handler.py that gives sqldata = PA_SCRIPT.parent.parent / "data" / "data.sqlite", since PA_SCRIPT.parent is utils and its parent is the project root.

import click

# notice how this is imported all the way from the top?
# relative imports always give me a hard time.
from MyProject.utils.data_handler import Foo

# or to load the whole module
import MyProject.utils.data_handler as data_handler

@click.command()
def main():
    ...

if __name__ == "__main__":
    main()
