Reputation: 37227
My friend just started learning Python and Flask, and his project is missing a lot of "best practices", e.g., a requirements.txt file.
He recently asked me for assistance, and to keep the project clean I want to set up a CI service (Travis), but I need to work out this file first.
Since he did not have a requirements.txt initially, all the information I have is his import statements and the output of pip freeze.
As there's no way to distinguish a direct requirement of the project from an indirect requirement pulled in by one of the packages, I want to find all "top-level" packages in the list. A "top-level package" is a package that is not required by any other package in the list. For example, urllib3 is required by requests, so when requests is present, urllib3 should not appear in the final result.
Is there a way to achieve this?
If anyone wants to help me with this specific instance, here's the output of pip freeze:
apturl==0.5.2
arrow==0.12.1
asn1crypto==0.24.0
binaryornot==0.4.4
blinker==1.4
Bootstrap-Flask==1.0.9
Brlapi==0.6.6
certifi==2018.1.18
chardet==3.0.4
Click==7.0
colorama==0.3.7
command-not-found==0.3
configparser==3.5.0
cookiecutter==1.6.0
cryptography==2.1.4
cupshelpers==1.0
decorator==4.1.2
defer==1.0.6
distro-info==0.18
dominate==2.3.5
Flask==1.0.2
Flask-Bootstrap4==4.0.2
Flask-Login==0.4.1
Flask-Mail==0.9.1
Flask-Moment==0.6.0
Flask-SQLAlchemy==2.3.2
Flask-WTF==0.14.2
future==0.17.1
httpie==0.9.8
httplib2==0.9.2
idna==2.6
ipython==5.5.0
ipython-genutils==0.2.0
itsdangerous==1.1.0
Jinja2==2.10
jinja2-time==0.2.0
keyring==10.6.0
keyrings.alt==3.0
language-selector==0.1
launchpadlib==1.10.6
lazr.restfulclient==0.13.5
lazr.uri==1.0.3
louis==3.5.0
macaroonbakery==1.1.3
Mako==1.0.7
MarkupSafe==1.1.0
mysqlclient==1.3.14
netifaces==0.10.4
oauth==1.0.1
olefile==0.45.1
pexpect==4.2.1
pickleshare==0.7.4
Pillow==5.1.0
poyo==0.4.2
prompt-toolkit==1.0.15
protobuf==3.0.0
pycairo==1.16.2
pycrypto==2.6.1
pycups==1.9.73
Pygments==2.2.0
pygobject==3.26.1
pymacaroons==0.13.0
PyNaCl==1.1.2
pyRFC3339==1.0
python-apt==1.6.3
python-dateutil==2.7.5
python-debian==0.1.32
pytz==2018.3
pyxdg==0.25
PyYAML==3.12
reportlab==3.4.0
requests==2.18.4
requests-unixsocket==0.1.5
ruamel.yaml==0.15.34
SecretStorage==2.3.1
simplegeneric==0.8.1
simplejson==3.13.2
six==1.11.0
SQLAlchemy==1.2.14
system-service==0.3
systemd-python==234
traitlets==4.3.2
ubuntu-drivers-common==0.0.0
ufw==0.35
unattended-upgrades==0.1
urllib3==1.22
usb-creator==0.3.3
visitor==0.1.3
wadllib==1.3.2
wcwidth==0.1.7
Werkzeug==0.14.1
whichcraft==0.5.2
WTForms==2.2.1
xkit==0.0.0
zope.interface==4.3.2
And here are the import statements, plus an additional pymysql that he told me about:
import os
from flask import *
from flask_bootstrap import Bootstrap
from flask_moment import Moment
from flask_wtf import FlaskForm
from wtforms import *
from wtforms.validators import *
from flask_sqlalchemy import SQLAlchemy
from flask_mail import Mail, Message
from werkzeug.security import generate_password_hash,check_password_hash
from flask_login import login_required , login_user,login_fresh,login_url,LoginManager,UserMixin,logout_user
Upvotes: 4
Views: 2126
Reputation: 41112
First, I wanted to suggest using pip's API, but pip is meant to be used as a command-line tool only ([PyPA]: Using pip from your program). Note that I did use the API successfully; I'm just not posting that code (at least for now).
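For completeness, the command-line route can still be driven from a script by running pip in a subprocess. This is only a minimal sketch, not the code referred to above: the helper names parse_freeze and pip_freeze are made up for illustration, and text=True assumes Python 3.7 or newer.

```python
import subprocess
import sys


def parse_freeze(text):
    """Turn `pip freeze` output into a {name: version} dict.

    Lines without '==' (editable installs, direct URLs, blanks) are skipped.
    """
    pkgs = {}
    for line in text.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep:
            pkgs[name] = version
    return pkgs


def pip_freeze():
    """Run pip as a command-line tool using the current interpreter."""
    out = subprocess.check_output(
        [sys.executable, "-m", "pip", "freeze"], text=True
    )
    return parse_freeze(out)
```

Calling pip_freeze() inside the project's VEnv would return the same data as the pip freeze listing in the question, but still without any dependency information, which is why the approach below is needed.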
Here's a way that uses pkg_resources ([ReadTheDocs]: Package Discovery and Resource Access using pkg_resources).
code00.py:
#!/usr/bin/env python

import os
import pkg_resources
import sys


def get_pkgs(reqs_file="requirements_orig.txt"):
    # Map each pkg name to (version, names of its direct dependencies).
    if reqs_file and os.path.isfile(reqs_file):
        # A requirements file carries no dependency info, so deps stay empty.
        ret = dict()
        with open(reqs_file) as f:
            for item in f.readlines():
                if "==" not in item:
                    continue  # Skip blank lines and unpinned entries
                name, ver = item.strip("\n").split("==")[:2]
                ret[name] = ver, ()
        return ret
    else:
        # Otherwise inspect the pkgs installed in the current environment.
        return {
            item.project_name: (item.version, tuple([dep.name for dep in item.requires()])) for item in pkg_resources.working_set
        }


def print_pkg_data(text, pkg_info):
    print("{:s}\nSize: {:d}\n\n{:s}".format(text, len(pkg_info), "\n".join(["{:s}=={:s}".format(*item) for item in pkg_info])))


def main(*argv):
    pkgs = get_pkgs(reqs_file=None)
    full_pkg_info = [(name, data[0]) for name, data in sorted(pkgs.items())]
    print_pkg_data("----------FULL LIST----------", full_pkg_info)
    # Collect every name that some installed pkg depends on.
    deps = set()
    for name in pkgs:
        deps = deps.union(pkgs[name][1])
    # Top-level pkgs are the ones that no other pkg depends on.
    min_pkg_info = [(name, data[0]) for name, data in sorted(pkgs.items()) if name not in deps]
    print_pkg_data("\n----------MINIMAL LIST----------", min_pkg_info)


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.\n")
    sys.exit(rc)
Output:
(py_064_03.06.08_test0) e:\Work\Dev\StackOverflow\q054292236> "e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" code00.py
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] 064bit on win32

----------FULL LIST----------
Size: 133

Babel==2.6.0
Click==7.0
Django==2.1.4
Flask==1.0.2
Jinja2==2.10
Keras==2.2.4
Keras-Applications==1.0.6
Keras-Preprocessing==1.0.5
Markdown==3.0.1
MarkupSafe==1.1.0
Pillow==5.3.0
PyQt5==5.9.2
PyQt5-sip==4.19.13
PyYAML==3.13
Pygments==2.3.1
QtAwesome==0.5.3
QtPy==1.5.2
Send2Trash==1.5.0
Sphinx==1.8.3
Werkzeug==0.14.1
absl-py==0.6.1
alabaster==0.7.12
asn1crypto==0.24.0
astor==0.7.1
astroid==2.1.0
backcall==0.1.0
bleach==3.0.2
certifi==2018.11.29
cffi==1.11.5
chardet==3.0.4
cloudpickle==0.6.1
colorama==0.4.1
cryptography==2.4.2
cycler==0.10.0
decorator==4.3.0
defusedxml==0.5.0
djangorestframework==3.9.0
docutils==0.14
entrypoints==0.2.3
fatiando==0.5
funcsigs==1.0.2
future==0.17.1
gast==0.2.0
grpcio==1.17.1
h5py==2.9.0
html5lib==1.0.1
idna==2.8
imagesize==1.1.0
ipaddr==2.2.0
ipykernel==5.1.0
ipython==7.2.0
ipython-genutils==0.2.0
ipywidgets==7.4.2
isort==4.3.4
itsdangerous==1.1.0
jedi==0.13.2
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.2.4
jupyter-console==6.0.0
jupyter-core==4.4.0
keyboard==0.13.2
keyring==17.1.1
kiwisolver==1.0.1
lazy-object-proxy==1.3.1
llvmlite==0.26.0
lxml==4.2.5
matplotlib==3.0.2
mccabe==0.6.1
mistune==0.8.4
nbconvert==5.4.0
nbformat==4.4.0
notebook==5.7.4
numba==0.41.0
numpy==1.15.4
numpydoc==0.8.0
opencv-python==3.4.4.19
packaging==18.0
pandas==0.23.4
pandocfilters==1.4.2
parso==0.3.1
patsy==0.5.1
pickleshare==0.7.5
pip==18.1
prometheus-client==0.5.0
prompt-toolkit==2.0.7
protobuf==3.6.1
psutil==5.4.8
pyOpenSSL==18.0.0
pycodestyle==2.4.0
pycparser==2.19
pycryptodome==3.7.2
pyflakes==2.0.0
pygame==1.9.4
pylint==2.2.2
pynput==1.4
pyparsing==2.3.0
python-dateutil==2.7.5
pytz==2018.7
pywin32==224
pywin32-ctypes==0.2.0
pywinpty==0.5.5
pyzmq==17.1.2
qtconsole==4.4.3
requests==2.21.0
rope==0.11.0
scapy==2.4.0
scipy==1.2.0
setuptools==40.6.3
sip==4.19.8
six==1.12.0
snowballstemmer==1.2.1
sphinxcontrib-websupport==1.1.0
spyder==3.3.2
spyder-kernels==0.3.0
statsmodels==0.9.0
tensorboard==1.12.1
tensorflow-gpu==1.12.0
tensorflow-tensorboard==1.5.1
termcolor==1.1.0
terminado==0.8.1
testpath==0.4.2
thrift==0.11.0
tornado==5.1.1
traitlets==4.3.2
typed-ast==1.1.1
urllib3==1.24.1
wcwidth==0.1.7
webencodings==0.5.1
wheel==0.32.3
widgetsnbextension==3.4.2
wrapt==1.10.11
xlrd==1.2.0

----------MINIMAL LIST----------
Size: 37

Babel==2.6.0
Click==7.0
Django==2.1.4
Flask==1.0.2
Keras==2.2.4
Keras-Applications==1.0.6
Keras-Preprocessing==1.0.5
Markdown==3.0.1
Pillow==5.3.0
PyQt5==5.9.2
PyQt5-sip==4.19.13
PyYAML==3.13
QtAwesome==0.5.3
QtPy==1.5.2
Sphinx==1.8.3
djangorestframework==3.9.0
fatiando==0.5
funcsigs==1.0.2
ipaddr==2.2.0
keyboard==0.13.2
lxml==4.2.5
opencv-python==3.4.4.19
pandas==0.23.4
patsy==0.5.1
pip==18.1
pyOpenSSL==18.0.0
pycryptodome==3.7.2
pygame==1.9.4
pynput==1.4
pywin32==224
scapy==2.4.0
spyder==3.3.2
statsmodels==0.9.0
tensorflow-gpu==1.12.0
tensorflow-tensorboard==1.5.1
thrift==0.11.0
xlrd==1.2.0
Notes:
(Stating the obvious): To get info about a pkg, that pkg needs to be installed. That's why my example doesn't use your file (which I named requirements_orig.txt), but the pkgs installed in my VEnv.
As you can see, in my case the pkg count dropped from 133 to 37, which I'd say is pretty manageable (of course, more filtering can be done).
I created the data structures on the assumption that a pkg name is a primary key (it uniquely identifies a pkg). If that is false, the code would need a few changes.
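On the primary-key assumption: names listed in a pkg's requirements can differ from the installed project name in letter case or in separators (for example click versus Click), so an exact string comparison may leave some dependencies in the minimal list. Below is a minimal sketch of a more tolerant comparison, assuming the same name -> (version, dependency names) mapping that get_pkgs() returns. The helpers normalize_name and top_level_pkgs are hypothetical names, not part of the code above, and the normalization rule is the one described in PEP 503.

```python
import re


def normalize_name(name):
    """Normalize a project name as described in PEP 503.

    Runs of '-', '_' and '.' become a single '-', and the result is lowercased.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


def top_level_pkgs(pkgs):
    """Return the names that no other pkg in the mapping depends on.

    pkgs maps name -> (version, iterable of dependency names).
    Names are compared in normalized form, but returned as given.
    """
    deps = {
        normalize_name(dep)
        for _version, dep_names in pkgs.values()
        for dep in dep_names
    }
    return sorted(name for name in pkgs if normalize_name(name) not in deps)
```

With this comparison, a mapping in which Flask lists click as a dependency would no longer report Click as a top-level pkg.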
Final note: If you also want to take your module's import list into account (to strip out even more pkgs, if possible), you could also try [Python.Docs]: modulefinder - Find modules used by a script. I used it in [SO]: What files are required for Py_Initialize to run? (@CristiFati's answer), only from the command line, but it should be trivial to use from a script.
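A rough sketch of what using modulefinder from a script could look like. The function name scan_imports is made up for illustration, and badmodules is an attribute that the documentation does not list, so treat that part as an assumption about the current standard-library implementation.

```python
from modulefinder import ModuleFinder


def scan_imports(script_path):
    """Return (found, missing) sets of top-level module names.

    found   - modules the script imports that could be located
    missing - modules the script tries to import that could not be located
    """
    finder = ModuleFinder()
    finder.run_script(script_path)
    # Keep only the top-level part, e.g. 'werkzeug' for 'werkzeug.security'.
    found = {name.split(".")[0] for name in finder.modules}
    found.discard("__main__")  # The scanned script itself
    missing = {name.split(".")[0] for name in finder.badmodules}
    return found, missing
```

Keep in mind that this yields import names (such as flask_bootstrap), not pkg names (such as Flask-Bootstrap4), so the result still has to be matched against the minimal list by hand or with some extra mapping.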
Upvotes: 2