Reputation: 3326
I need to construct a list of modules that are provided by a list of Python distributions specified in a requirements.txt file. The distributions will first be installed so they should be available for inspection locally.
It looks like I should be able to use pip.req.parse_requirements to get the list of distributions from the requirements file. From there, how can I find the name of the module(s) that the distributions provide?
Upvotes: 4
Views: 681
Reputation: 931
Since, like you said, distributions are not the modules they contain, we run into a problem: The typical install process for a distribution -- which is, afaik, a collection of packages along with an installer -- is to download, unpack, and then run setup.py, which handles the remainder of the installation process.
The upshot is that, even given a Python distribution, you cannot actually tell what setup.py will do without running it. There may be conventions, and you may be able to pull out a lot of information and formulate a lot of good guesses, but running that setup.py file is really the only way to see what it actually installs into site-packages. Hence, parse_requirements, or really any of the pip internals, won't be useful for you unless you're only interested in distributions.
So, that being said, I think the best way to manage your problem would be to:
- Run pip install -r requirements.txt to actually install all packages.
- Walk sys.path, looking for .py and .pyc files and descending into subfolders that contain __init__.py (or __init__.pyc), to build a list of modules.

The last step may be doable in other, better ways; I'm not sure. Further, you still run the risk of missing dynamically created modules or other trickiness, but this should capture the majority of modules.
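The install-then-scan idea above can be sketched as a before/after snapshot. This is only a rough illustration, assuming Python 3.6+ (for the name attribute on the items pkgutil.iter_modules yields); top_level_modules and newly_installed are hypothetical helper names, not part of pip or the standard library.

```python
import pkgutil


def top_level_modules(paths=None):
    """Return the set of top-level module/package names found on `paths`.

    With paths=None, pkgutil.iter_modules scans every entry of sys.path.
    """
    return {info.name for info in pkgutil.iter_modules(paths)}


def newly_installed(before, after):
    """Names present in the `after` snapshot but not in `before`, sorted."""
    return sorted(after - before)


# Intended use (the install step is left out of this sketch on purpose):
#   before = top_level_modules()
#   ... run `pip install -r requirements.txt` ...
#   after = top_level_modules()
#   print(newly_installed(before, after))
```

Like the manual walk, this only sees what is discoverable on the path at scan time, so dynamically created modules are still missed, and only top-level names are reported.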
Edit:
Here's some code that should work for everything but zip files:
    import sys, os

    def walk_modules_os(root):
        """Yield module paths (as tuples of name parts) found under root."""
        def inner_walk(dir_path, mod_path):
            filelist = os.listdir(dir_path)
            pyfiles = set()
            dirs = []
            for name in filelist:
                if os.path.isdir(os.path.join(dir_path, name)):
                    dirs.append(name)
                else:
                    pre, ext = os.path.splitext(name)
                    if ext in ('.py', '.pyc', '.pyo'):
                        pyfiles.add(pre)
            if len(mod_path):
                # Below the root, a directory only counts if it is a package
                if '__init__' not in pyfiles:
                    return
                pyfiles.remove('__init__')
                yield mod_path
            for pyfile in pyfiles:
                yield mod_path + (pyfile,)
            for directory in dirs:
                sub = os.path.join(dir_path, directory)
                for mod in inner_walk(sub, mod_path + (directory,)):
                    yield mod

        root = os.path.realpath(root)
        if not os.path.isdir(root):
            return iter([])
        return iter(inner_walk(root, tuple()))

    # you could collect as a set of tuples and do set subtraction, too
    for path in sys.path:
        for mod in walk_modules_os(path):
            print(mod)
Edit 2:
Well, crikey. GWW has the right idea. A much better solution than mine.
Upvotes: 2
Reputation: 44093
You can probably use the built-in pkgutil module if your Python version is 2.3+.
For example,
    import sys, pkgutil

    mods = set()

    # You may not need this part if you don't care about the builtin modules
    print(sys.builtin_module_names)
    for m in sys.builtin_module_names:
        if m != '__main__':
            mods.add(m)

    # Keep only top-level names (no dots) from everything found on sys.path
    for loader, name, ispkg in pkgutil.walk_packages():
        if '.' not in name:
            mods.add(name)

    print(mods)
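If the distributions are already installed, their recorded metadata can also be consulted per distribution rather than scanning everything. A minimal sketch, assuming Python 3.8+ (where importlib.metadata is in the standard library). top_level_names is a hypothetical helper; top_level.txt is only written by some build tools, so the fallback through the recorded file list is a best-effort guess and will miss things such as single-file extension modules.

```python
from importlib import metadata


def top_level_names(dist):
    """Best-effort top-level import names for an installed Distribution."""
    declared = dist.read_text("top_level.txt")  # None if the file is absent
    if declared:
        return sorted(set(declared.split()))

    names = set()
    for path in dist.files or []:  # dist.files is None without a file record
        first = path.parts[0]
        # Skip metadata directories, bytecode caches, and paths that
        # escape site-packages (e.g. console scripts under ../../bin).
        if first in ("..", "__pycache__") or first.endswith(
            (".dist-info", ".egg-info")
        ):
            continue
        if len(path.parts) > 1:
            names.add(first)        # a package (or data) directory
        elif path.suffix == ".py":
            names.add(path.stem)    # a single-file module
    return sorted(names)
```

For a distribution name taken from requirements.txt, this could be called as top_level_names(metadata.distribution(name)); metadata.distribution raises PackageNotFoundError when the distribution is not installed.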
Upvotes: 3