Tommy

Reputation: 13652

How should I generate requirements.txt? pip freeze is not a good way

How should I generate requirements.txt for Python projects?

Here is the problem I am having with pip freeze. Suppose my package P requires A, B, and C. Suppose C is a library that depends on X, Y, and Z, but only X is needed by P. Then if I:

1) Install A
2) Install B
3) Install C, which installs X, Y, Z
4) Do a pip freeze into P's requirements.txt 

Then P's requirements.txt will look like:

1) A
2) B
3) C
4) X
5) Y
6) Z

But Y and Z are not actually required in my Python installation for P to run.

Many of the answers assume that Y must be required. However, Python is a dynamic language. It is very often the case that, for example, C is a huge library that uses numpy or pandas for some functionality, but P doesn't call that part of the library. In that case, I don't really need to pull those in if I know which parts of C are needed by P. If all libraries were "small", this would be rare, but there are a lot of "kitchen sink" libraries.

As far as I can tell, running pip freeze to generate P's requirements lists every installed package, including dependencies of dependencies, and thus produces a superset of P's actual dependencies.
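
To make the "superset" point concrete, here is a minimal sketch of the scenario above. The dependency graph is hypothetical and hard-coded to match the A, B, C, X, Y, Z example; it does not query pip or any real package metadata.

```python
def transitive_closure(direct, graph):
    """Return every package reachable from the direct requirements."""
    seen = set()
    stack = list(direct)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        # Packages missing from the graph are treated as having no dependencies.
        stack.extend(graph.get(pkg, ()))
    return seen


# Hypothetical dependency graph matching the example in the question.
graph = {"C": ["X", "Y", "Z"]}
direct = ["A", "B", "C"]

# Roughly what a freeze of the environment would list after installing A, B, C.
frozen = transitive_closure(direct, graph)

# Everything that appears in the freeze but was never requested directly.
extra = sorted(frozen - set(direct))

print(sorted(frozen))
print(extra)
```

The difference between the two sets (X, Y, Z here) is exactly what pip freeze adds on top of the packages that were requested by name.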

Upvotes: 11

Views: 18063

Answers (4)

alegria

Reputation: 1482

I've answered this question in a different Stack Overflow post, https://stackoverflow.com/a/65666949/1512555, where I recommended using pip-compile from pip-tools.

Upvotes: 1

Foxy Fox

Reputation: 491

  1. Install the pipreqs library (e.g. conda install -c conda-forge pipreqs)
  2. Change dir to the project folder (cd your/repository)
  3. Run the command pipreqs --force

Or just pipreqs --force your/repository.

See additional information in the official source: https://pypi.org/project/pipreqs/

Upvotes: 3

gopiariv

Reputation: 494

There is a Python module called pipreqs. It generates requirements.txt based on the imports in the project.
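
The general idea of import-based discovery can be sketched with the standard library's ast module. This is only an illustration of the technique, not pipreqs's actual implementation, and the function name top_level_imports is made up for this example. A real tool also has to filter out standard-library modules and map import names to PyPI distribution names, which do not always match.

```python
import ast


def top_level_imports(source):
    """Collect the top-level module names imported by a piece of source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import os.path" contributes "os".
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0); they refer to the project itself.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


example = "import os.path\nfrom numpy import array\nfrom . import helpers\n"
print(sorted(top_level_imports(example)))
```

Because this approach only sees what the project's own code imports, it lists direct dependencies and leaves transitive ones to be resolved at install time.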

Upvotes: 6

Kevin

Reputation: 30151

The purpose of a virtualenv is to have total control over the packages installed.

Suppose you only listed A, B, C, and X. Every time you create a new virtualenv from that requirements file, you'll get the latest versions of Y and Z. There are several problems with this:

  1. You can't know you're not using Y: For a sufficiently complex project, it is nearly impossible to audit every codepath to ensure C never calls into Y. You're not just worrying about your own code any more; you're worrying about C's code as well. This just doesn't scale.
  2. Even if you're just importing Y, you're using it: Python allows arbitrary code execution at import time. A new version of Y could do all sorts of obnoxious things at import time, such as printing to stdout, monkey patching X, or just about anything else you can imagine. A well-designed Y shouldn't do these things, but you'll find the quality of packages on PyPI highly variable.
  3. New versions of Y can pull in new dependencies: If you include a new version of Y, you could end up adding package W to your virtualenv too, because the new version of Y requires it. As more packages are added, the first two problems are exacerbated. Worse, you might find that the new version of Y depends on a newer version of X, in which case you won't end up with the packages you actually want.
  4. Producing a known-good configuration is more important: pip freeze is not designed to figure out minimal requirements. It is designed to enable deploying a complete application to many different environments consistently. That means it will err on the side of caution and list everything which could reasonably affect your project.

For these reasons, you should not try to remove Y and Z from your requirements file.
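
As a rough illustration of what a fully pinned file buys you, the sketch below compares "name==version" pins against what is installed and reports any drift. The helper names parse_pins and find_drift are hypothetical, and the parser assumes only simple name==version lines (no extras, markers, or other specifiers).

```python
from importlib import metadata


def parse_pins(text):
    """Parse simple 'name==version' lines; other lines are ignored."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_drift(pins, installed_version=metadata.version):
    """Return {name: (pinned, installed)} for every package that differs.

    installed is None when the package is not installed at all.
    """
    drift = {}
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            drift[name] = (wanted, found)
    return drift
```

With Y and Z pinned in the file, a check like this can flag an environment where they changed; with them omitted, there is nothing to compare against.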

Upvotes: 13
