Reputation: 362707
How can I download a distribution, possibly an sdist, without executing its setup.py file (which may contain malicious code)?
I don't want to fetch the dependencies recursively; I only want to download one file for the specified distribution. An attempt that doesn't work:
pip download --no-deps mydist
Here is a reproducible example demonstrating that setup.py is still executed in the above case:
$ docker run --rm -it python:3.12 bash
root@c446ce6f3f8f:/# pip --version
pip 24.2 from /usr/local/lib/python3.12/site-packages/pip (python 3.12)
root@c446ce6f3f8f:/# pip download --no-deps suds==0.4
Collecting suds==0.4
Downloading suds-0.4.tar.gz (104 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [8 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-download-6tm69t6f/suds_30769ba44f9648edbed804a678c49d19/setup.py", line 20, in <module>
import suds
File "/tmp/pip-download-6tm69t6f/suds_30769ba44f9648edbed804a678c49d19/suds/__init__.py", line 154, in <module>
import client
ModuleNotFoundError: No module named 'client'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
I cannot use the --only-binary option, because I don't want to exclude source distributions entirely. I just want to avoid executing their source code.
Answers that download directly from PyPI are unacceptable: the filename is not known ahead of time, and the input is not necessarily a pinned project name + version but a requirement specifier. Any custom pip configuration (such as an extra index URL) must also be taken into account. In short, the solution should download the same file that a pip install would have chosen.
Upvotes: 16
Views: 2381
Reputation: 6012
I've been digging into pip, and sadly the code there is pretty convoluted. It seems that there is currently no way to do that, and according to the link provided by @doctaphred there are no plans to make progress in that direction.
The next step depends on your situation. If, for example, you need this "package downloader" for production, I'd suggest you write your own PyPI client. It would be very simple to write, and you could make it much faster and simpler than pip by optimizing it for your needs. To do that you could try to reuse some of the existing code in pip, but I think that will probably be pretty hard (after seeing that code).
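As a rough idea of what such a client could look like, here is a minimal sketch that asks an index for its file list through the JSON form of the Simple Repository API (PEP 691) and picks an sdist by filename, so no project code is ever executed. The helper names (normalize, list_files, pick_sdist) are made up for this illustration, the version match is a naive filename suffix check, and the sketch ignores requirement specifiers and pip configuration, so it does not reproduce pip's own selection logic.

```python
import json
import re
import urllib.request

# Media type for the JSON form of the Simple Repository API (PEP 691).
SIMPLE_JSON = "application/vnd.pypi.simple.v1+json"


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def list_files(project, index_url="https://pypi.org/simple"):
    """Fetch the list of files an index offers for a project.

    Only index metadata is downloaded here; nothing is unpacked or executed.
    """
    url = f"{index_url.rstrip('/')}/{normalize(project)}/"
    request = urllib.request.Request(url, headers={"Accept": SIMPLE_JSON})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["files"]


def pick_sdist(files, version):
    """Hypothetical helper: return the first non-yanked sdist entry whose
    filename ends with the given version, or None if there is none.

    Assumption: sdists are named <name>-<version>.tar.gz or .zip.
    """
    suffixes = (f"-{version}.tar.gz", f"-{version}.zip")
    for entry in files:
        if entry["filename"].endswith(suffixes) and not entry.get("yanked"):
            return entry
    return None
```

With this, something like pick_sdist(list_files("suds"), "0.4") should give an entry whose "url" can then be fetched with urllib.request.urlretrieve, and whose "hashes" can be checked after the download.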
Otherwise, I'd consider quicker, hackier methods to get the job done. The first solution that comes to mind is simply stopping pip whenever it tries to run the egg_info command. To do that you can patch pip's code at runtime using various methods. My favorite is using a usercustomize file.
For example, create a patch file with the following content and place it in a directory of your choosing, e.g. /pypatches/pip_pure_download/usercustomize.py:
from pip._internal.req.req_install import InstallRequirement

print('Applying pure download patch!')

def override_run_egg_info(*args, **kwargs):
    raise KeyboardInterrupt  # Joke's on you, evil hackers! :P

InstallRequirement.run_egg_info = override_run_egg_info
Now, to apply the patch to a Python execution, just add the patch's directory to PYTHONPATH, for example:
PYTHONPATH=/pypatches/pip_pure_download:$PYTHONPATH pip download --no-deps suds
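Since pip's internals are not a stable API, the method patched above may not exist in the pip version you are running, in which case the patch would fail at import time or do nothing useful. A slightly more defensive variant of the same usercustomize.py could look like the sketch below. The helper names patch_method and refuse_to_run are invented for this illustration, and the module, class and method names are the ones from the patch above, not something verified against every pip release. Also keep in mind that usercustomize is only imported when the user site directory is enabled.

```python
import importlib
import sys


def patch_method(module_name, class_name, method_name, replacement):
    """Replace class_name.method_name in module_name, if all of them exist.

    Returns True when the method was replaced and False otherwise, so the
    caller can tell whether the protection is actually in place.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    cls = getattr(module, class_name, None)
    if cls is None or not hasattr(cls, method_name):
        return False
    setattr(cls, method_name, replacement)
    return True


def refuse_to_run(*args, **kwargs):
    """Abort instead of letting setup.py be executed."""
    raise KeyboardInterrupt("refusing to run setup.py for metadata")


# Names taken from the patch above; they are assumptions about pip internals.
applied = patch_method(
    "pip._internal.req.req_install",
    "InstallRequirement",
    "run_egg_info",
    refuse_to_run,
)
if not applied:
    print("pure download patch NOT applied: target not found", file=sys.stderr)
```

If the warning is printed, the download is not protected, and the place where pip generates metadata in that version would have to be located and patched instead.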
Upvotes: 8
Reputation: 2644
This doesn't seem to be possible as of pip 19.3.1 :(
See https://github.com/pypa/pip/issues/1884
Upvotes: 1