Reputation: 71
My first Python application was fairly small and all the code was in one directory. All the modules just did "import local_module" of each other. It was easy to do a "chmod +x" on the main script and run the application.
But now I'm creating a larger command line driven application which I'm expecting will run into tens of thousands of lines of code. So I want to scatter the code across various packages. This application will only run internally at work. Currently we're still using Python 2.6.6. It looks like there are a couple ways to structure things:
I've gotten the application working by having the main script do:
from __future__ import absolute_import
and then calling it via:
python -m main_dir.sub_dir.main_script
It looks like I can also change the Python path environment variable so that I can just call the main_script.py, or by something in the main script like:
sys.path.insert(0, os.path.join(THIS_DIR,'..'))
I don't feel like I understand enough to judge which is the best way to set up and run a multi-package application. I've done various Google searches and found lots of references on how to run one, which seem to fall into the two basic approaches listed above. But I haven't found much on how to set up a 50,000 line Python application.
Update:
Kevin, thank you for your answer. It helped improve my understanding of packages in Python, but I'm still a bit confused.
I created this directory structure:
my_app
\ sub_package1
\ sub_package2
In all three directories I created an empty __init__.py file.
In the directory my_app I created my_main.py with:
import sys
import sub_package1.sub1
import sub_package1.sub11
import at_main_level
import at_main_level_also
def _my_print (someString):
    sys.stdout.write("{0}\n".format(someString))
if __name__ == '__main__':
    _my_print ("Learning about Python packages and applications.")
    x = 3
    doubleX = at_main_level.double_number(x)
    _my_print ("Double of {0} is {1}".format(x, doubleX))
    doubleDoubleX = at_main_level_also.double_double_number(x)
    _my_print ("Double Double of {0} is {1}".format(x, doubleDoubleX))
    xMinus1 = sub_package1.sub1.subtract_1(x)
    _my_print ("{0}-1 is {1}".format(x, xMinus1))
    xMinus1Twice = sub_package1.sub11.subtract1_twice(x)
    _my_print ("{0}-2 is {1}".format(x, xMinus1Twice))
Also in my_app I created at_main_level.py with:
def double_number(x):
    return 2 * x
And finally in my_app I created at_main_level_also.py:
import at_main_level
def double_double_number(x):
    return at_main_level.double_number(at_main_level.double_number(x))
Then in sub_package1 I created sub1.py:
def subtract_1 (x):
    return x - 1
and sub11.py:
import sub1
def subtract1_twice(x):
    return sub1.subtract_1(sub1.subtract_1(x))
Now when I run my_main.py I get results that make sense to me:
Learning about Python packages and applications.
Double of 3 is 6
Double Double of 3 is 12
3-1 is 2
3-2 is 1
So I can write code:
1) Where one module uses code in another module at the top level.
2) Where a module at the top level can use code in a sub package, and code in the sub package can call a function defined in another module in the sub package.
But when in sub_package2 I create sub2.py:
from ..sub_package1 import sub1
def subtract_2 (x):
    return sub1.subtract_1(sub1.subtract_1(x))
Add to my_main.py
import sub_package2.sub2
and at the end of my main function:
xMinus2 = sub_package2.sub2.subtract_2(x)
_my_print ("{0}-2 is {1}".format(x, xMinus2))
I get the following when running my_main:
[my_app]$ python my_main.py
Traceback (most recent call last):
  File "./my_main.py", line 8, in <module>
    import sub_package2.sub2
  File "/home/hcate/work/temp/my_app/sub_package2/sub2.py", line 2, in <module>
    from ..sub_package1 import sub1
ValueError: Attempted relative import beyond toplevel package
So I don't yet understand how to create an application where code in one package can use code in another package.
What should I do differently?
Thanks.
Upvotes: 4
Views: 1627
Reputation: 43782
Creating multiple packages is not actually different from creating a single package correctly.
Don't ever mix up "directories in which your code is located" and "directories which are Python packages". That is, you should have one directory in which all of the packages and modules live; let's say it's /users/henry/myappcode/. Within that directory, there may be directories which are Python packages to organize the code; the difference is that the latter directories always appear in imports (when you are using absolute imports) and the former never do. As long as you do this, you don't need to mess with sys.path either.
Here is how to launch your app with the proper path so that the imports will work:
1) Simple version for getting started. You have files like this:
my_app_main.py
my_app/__init__.py
my_app/module1.py
my_app/module2.py
my_app/subpackage/...
my_app/...
and you can run it like this, from any directory, with an absolute or relative path:
python /users/henry/myappcode/my_app_main.py
When you invoke Python with the path to a Python script, Python automatically puts the location of that script (not the current directory) on sys.path, and so all of your .py files will automatically be able to import my_app.module and so on.
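To see the sys.path behaviour for yourself, here is a minimal sketch. It assumes nothing about your real layout: the temporary directories stand in for /users/henry/myappcode and for some unrelated working directory, and the function name is made up for illustration. It uses Popen rather than newer helpers so it should also run on an old interpreter.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def first_path_entry_of_script():
    """Run a throwaway script from an unrelated cwd.

    Returns (sys.path[0] seen by the script, script directory, cwd used).
    """
    code_dir = tempfile.mkdtemp()   # hypothetical stand-in for the code directory
    other_dir = tempfile.mkdtemp()  # unrelated working directory
    try:
        script = os.path.join(code_dir, "my_app_main.py")
        with open(script, "w") as f:
            # The script only reports the first entry of its own sys.path.
            f.write("import sys\nsys.stdout.write(sys.path[0])\n")
        proc = subprocess.Popen([sys.executable, script],
                                cwd=other_dir, stdout=subprocess.PIPE)
        out, _ = proc.communicate()
        # realpath() smooths over symlinked temp directories.
        return (os.path.realpath(out.decode()),
                os.path.realpath(code_dir),
                os.path.realpath(other_dir))
    finally:
        shutil.rmtree(code_dir)
        shutil.rmtree(other_dir)
```

The first returned value should match the script's directory rather than the working directory, which is why a main script sitting next to my_app/ can import my_app without any path tweaking.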
2) The above has the disadvantage that it has a "main script" which is different from a module and not in the same package as the rest of your code. You already know python -m, so the way to make it work in this case is:
PYTHONPATH=/users/henry/myappcode python -m my_app.main
given that you create my_app/main.py instead of my_app_main.py as above.
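A rough sketch of the -m launch, under stated assumptions: the package is built in a temporary directory standing in for /users/henry/myappcode, and module1.greet() plus the body of main.py are invented purely for illustration.

```python
import os
import shutil
import subprocess
import sys
import tempfile

# Hypothetical package contents; only the file names come from the layout above.
FILES = {
    "__init__.py": "",
    "module1.py": "def greet():\n    return 'hello from my_app.module1'\n",
    "main.py": (
        "from __future__ import absolute_import\n"
        "import sys\n"
        "from my_app import module1\n"
        "if __name__ == '__main__':\n"
        "    sys.stdout.write(module1.greet())\n"
    ),
}


def run_as_module():
    """Launch my_app.main with -m, with PYTHONPATH pointing at the code directory."""
    code_dir = tempfile.mkdtemp()   # stand-in for the code directory
    other_dir = tempfile.mkdtemp()  # run from somewhere else on purpose
    try:
        pkg = os.path.join(code_dir, "my_app")
        os.mkdir(pkg)
        for name, body in FILES.items():
            with open(os.path.join(pkg, name), "w") as f:
                f.write(body)
        # Same effect as: PYTHONPATH=<code_dir> python -m my_app.main
        env = dict(os.environ, PYTHONPATH=code_dir)
        proc = subprocess.Popen([sys.executable, "-m", "my_app.main"],
                                cwd=other_dir, env=env, stdout=subprocess.PIPE)
        out, _ = proc.communicate()
        return out.decode()
    finally:
        shutil.rmtree(code_dir)
        shutil.rmtree(other_dir)
```

Because main.py runs as part of the my_app package, it imports its siblings by their full names (my_app.module1), exactly as every other module in the package would.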
3) Option 2 is ugly. The way to make it pretty is to make your Python package installable (setup.py). Using the entry_points option, you can have a shell script automatically created which does the equivalent of the above command, but users only need to type my_app after installing.
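As a hedged illustration of the entry_points idea, the metadata for a setup.py might look roughly like this. The version number and the existence of a main() function inside my_app/main.py are assumptions, not something taken from your code; the metadata is kept in a dict here only so it can be inspected without invoking setuptools.

```python
# Sketch of setup.py metadata. Assumes my_app/main.py defines a function main().
SETUP_KWARGS = dict(
    name="my_app",
    version="0.1",  # placeholder version
    entry_points={
        # "<command name> = <module path>:<callable>"
        "console_scripts": ["my_app = my_app.main:main"],
    },
)


def run_setup():
    """Hand the metadata to setuptools (imported lazily, only when needed)."""
    from setuptools import setup, find_packages
    setup(packages=find_packages(), **SETUP_KWARGS)

# In an actual setup.py you would call run_setup() at the bottom of the file,
# or simply write the setup(...) call at module level.
```

After installation, the generated my_app command imports my_app.main and calls main(), so nobody has to set PYTHONPATH by hand.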
If you want to do development and not have to run python setup.py install every time, but have a convenient command, then just make your own shell script pointing at your code:
#!/bin/sh
PYTHONPATH=/users/henry/myappcode exec python -m my_app.main "$@"
Regarding your update, you wrote: "In the directory my_app I created my_main.py with: import sub_package1.sub1"
If you execute 'python my_app/my_main.py', then Python's path is set so that my_app is not a package, it's a directory containing top-level packages. This is why your later relative import fails with "relative import beyond toplevel package": you have arranged for sub_package1 to be a toplevel package.
Here is a general rule: you must never name a package directory on the command line. You can name a package, with -m (options 2 and 3 above), or you can invoke a script which is not inside a package (option 1).
Your relative import is correct, but it is failing because Python is not seeing my_app as being a package. Make sure that you are following either option 1 (main script not inside my_app) or option 2 (main launched with -m).
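Here is a small sketch of your layout that contrasts the two launch styles. It is built in a temporary directory, and my_main.py is simplified to a single cross-package call, so treat the file contents as illustrative rather than as your actual code.

```python
import os
import shutil
import subprocess
import sys
import tempfile

# Simplified copy of the layout from the question (contents are illustrative).
FILES = {
    "my_app/__init__.py": "",
    "my_app/sub_package1/__init__.py": "",
    "my_app/sub_package1/sub1.py": "def subtract_1(x):\n    return x - 1\n",
    "my_app/sub_package2/__init__.py": "",
    "my_app/sub_package2/sub2.py": (
        "from ..sub_package1 import sub1\n"
        "def subtract_2(x):\n"
        "    return sub1.subtract_1(sub1.subtract_1(x))\n"
    ),
    "my_app/my_main.py": (
        "import sys\n"
        "from my_app.sub_package2 import sub2\n"
        "if __name__ == '__main__':\n"
        "    sys.stdout.write('{0}-2 is {1}'.format(3, sub2.subtract_2(3)))\n"
    ),
}


def _run(args, cwd, env=None):
    """Run the interpreter with args; return (exit code, decoded stdout)."""
    proc = subprocess.Popen([sys.executable] + args, cwd=cwd, env=env,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, _ = proc.communicate()
    return proc.returncode, out.decode()


def compare_launch_styles():
    """Return (result of `-m my_app.my_main`, result of running the file by path)."""
    code_dir = tempfile.mkdtemp()  # the directory that contains my_app/
    try:
        for rel, body in FILES.items():
            path = os.path.join(code_dir, *rel.split("/"))
            if not os.path.isdir(os.path.dirname(path)):
                os.makedirs(os.path.dirname(path))
            with open(path, "w") as f:
                f.write(body)
        # Option 2: my_app is seen as a package, so "..sub_package1" resolves.
        env = dict(os.environ, PYTHONPATH=code_dir)
        as_module = _run(["-m", "my_app.my_main"], cwd=code_dir, env=env)
        # Naming the file inside the package: sys.path[0] becomes my_app/ itself,
        # so my_app is no longer importable as a package and the import fails.
        as_path = _run([os.path.join("my_app", "my_main.py")], cwd=code_dir)
        return as_module, as_path
    finally:
        shutil.rmtree(code_dir)
```

The first launch should print the subtraction result and exit cleanly, while the second should exit with a non-zero status because the script's own directory, not its parent, ends up on sys.path.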
Upvotes: 7