Reputation: 1301
I'm learning how to package Python projects for PyPI according to the tutorial (https://packaging.python.org/en/latest/tutorials/packaging-projects/). For the example project, they use the folder structure:
packaging_tutorial/
├── LICENSE
├── pyproject.toml
├── README.md
├── src/
│   └── example_package_YOUR_USERNAME_HERE/
│       ├── __init__.py
│       └── example.py
└── tests/
I am just wondering why the src/ folder is needed. Does it serve a particular purpose? Could one instead include the package directly in the top folder? E.g., would
packaging_tutorial/
├── LICENSE
├── pyproject.toml
├── README.md
├── example_package_YOUR_USERNAME_HERE/
│   ├── __init__.py
│   └── example.py
└── tests/
have any disadvantages or cause complications?
Upvotes: 20
Views: 4525
Reputation: 36289
There is an interesting blog post about this topic. Basically, using src prevents the package source folder from being imported instead of the installed package when tests are run from within the project directory (tests should always run against the installed package, so that the situation is the same as for a user).
Consider the following example project, where the name of the package under development is mypkg. It contains an __init__.py file and a non-code resource, DATA.txt:
.
├── mypkg
│   ├── DATA.txt
│   └── __init__.py
├── pyproject.toml
├── setup.cfg
└── test
    └── test_data.py
Here, mypkg/__init__.py accesses the DATA.txt resource and loads its content:
from importlib.resources import read_text
data = read_text('mypkg', 'DATA.txt').strip() # The content is 'foo'.
The script test/test_data.py checks that mypkg.data actually contains 'foo':
import mypkg

def test():
    assert mypkg.data == 'foo'
Now, running coverage run -m pytest from within the base directory gives the impression that everything is all right with the project:
$ coverage run -m pytest
[...]
test/test_data.py . [100%]
========================== 1 passed in 0.01s ==========================
However, there's a subtle issue. Running coverage run -m pytest runs pytest as a module, just like python -m pytest would, i.e. using the -m switch. This has a "side effect", as mentioned in the docs:
[...] As with the -c option, the current directory will be added to the start of sys.path. [...]
This means that when test/test_data.py imported mypkg, it didn't get the installed version; it got the package from the source tree in ./mypkg instead.
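To see which copy an import would pick up, one can ask the import system where it resolves a module. The following is a minimal sketch; the helper name resolves_inside is made up for illustration and is not part of pytest, coverage, or any packaging tool.

```python
import importlib.util
from pathlib import Path


def resolves_inside(module_name, directory):
    """Return True if `module_name` would be imported from within `directory`.

    Hypothetical helper for illustration only.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {module_name!r}")
    # Regular packages report their __init__.py as origin; namespace packages
    # have no file origin, so fall back to their search locations.
    if spec.origin and spec.origin not in ("built-in", "frozen"):
        candidates = [spec.origin]
    else:
        candidates = list(spec.submodule_search_locations or [])
    root = Path(directory).resolve()
    return any(root in Path(c).resolve().parents for c in candidates)
```

Calling resolves_inside("mypkg", Path.cwd()) at the top of a test session from the project root would then indicate whether the tests are about to run against the source tree rather than the installed package.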
Now, let's further assume that we forgot to include the DATA.txt resource in our project specification (after all, there is no MANIFEST.in). The file is therefore missing from the installed version of mypkg (installed e.g. via python -m pip install .). Running pytest directly reveals this:
$ pytest
[...]
======================= short test summary info =======================
ERROR test/test_data.py - FileNotFoundError: [Errno 2] No such file ...
!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!
========================== 1 error in 0.13s ===========================
Hence, when using coverage, the test passed even though the installation of mypkg was broken. The test didn't catch this because it ran against the source tree rather than the installed version. If we had used a src directory to contain the mypkg package, then adding the current working directory to sys.path via -m would have caused no problems, as there would no longer be a mypkg package in the current working directory.
In the end, though, using src is not a requirement but more of a convention/best practice. For example, requests doesn't use src, and it is still a popular and successful project.
Upvotes: 18