wim

Reputation: 362707

pip download without executing setup.py

How can I download a distribution (possibly an sdist) without executing a setup.py file that may contain malicious code?

I don't want to fetch the dependencies recursively; I only want to download one file for the specified distribution. An attempt that doesn't work:

pip download --no-deps mydist

Here is a reproducible example demonstrating that setup.py is still executed in the above case:

$ docker run --rm -it python:3.12 bash
root@c446ce6f3f8f:/# pip --version
pip 24.2 from /usr/local/lib/python3.12/site-packages/pip (python 3.12)
root@c446ce6f3f8f:/# pip download --no-deps suds==0.4
Collecting suds==0.4
  Downloading suds-0.4.tar.gz (104 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [8 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-download-6tm69t6f/suds_30769ba44f9648edbed804a678c49d19/setup.py", line 20, in <module>
          import suds
        File "/tmp/pip-download-6tm69t6f/suds_30769ba44f9648edbed804a678c49d19/suds/__init__.py", line 154, in <module>
          import client
      ModuleNotFoundError: No module named 'client'
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

I cannot use the --only-binary option, because I don't want to exclude source distributions entirely. I just want to avoid executing their source code.

Answers that download directly from PyPI are unacceptable: the filename is not known ahead of time, and the input is not necessarily a pinned project name + version but may be any requirement specifier. Any custom pip configuration (such as an extra index URL) must also be taken into account. In short, the solution should download the same file that a pip install would have chosen.

Upvotes: 16

Views: 2381

Answers (2)

kmaork

Reputation: 6012

I've been digging into pip, and sadly the code there is pretty convoluted. It seems that currently there is no way to do that, and according to the link provided by @doctaphred there are no plans to make progress in that direction.

The next step depends on your situation. If, for example, you need this "package downloader" for production, I'd suggest you write your own PyPI client. It would be fairly simple to write, and you could make it much faster and simpler than pip by optimizing it for your needs. You could try to reuse some of the existing code in pip, but after seeing that code I think it will probably be pretty hard.
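As a rough illustration of the first step such a client would need, here is a minimal sketch that lists the files on a PEP 503 "simple" project page without unpacking or executing anything. The names SimpleIndexParser and list_files, the sample page, and the index.example URL are all made up for the example. It does not pick a version for a requirement specifier, verify hashes, or read pip's configuration, so on its own it does not meet the question's requirements.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SimpleIndexParser(HTMLParser):
    """Collect (filename, absolute URL) pairs from a PEP 503 project page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.files = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            filename = "".join(self._text).strip()
            # Links on the page may be relative to the page URL.
            self.files.append((filename, urljoin(self.base_url, self._href)))
            self._href = None


def list_files(page_html, base_url):
    """Return every (filename, url) pair linked from the page, in page order."""
    parser = SimpleIndexParser(base_url)
    parser.feed(page_html)
    parser.close()
    return parser.files


# Hypothetical page content; a real client would fetch this from the index.
SAMPLE_PAGE = """
<html><body>
<a href="../../packages/suds-0.4.tar.gz#sha256=abc123">suds-0.4.tar.gz</a>
<a href="../../packages/suds-0.3.9.tar.gz#sha256=def456">suds-0.3.9.tar.gz</a>
</body></html>
"""

if __name__ == "__main__":
    for name, url in list_files(SAMPLE_PAGE, "https://index.example/simple/suds/"):
        print(name, url)
```

Choosing among the listed files (version ordering, wheel tag compatibility, yanked releases) is the part pip does for you and the part that is hard to replicate exactly.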

Otherwise, I'd consider quicker, hackier methods to get the job done. The first solution that comes to mind is simply stopping pip whenever it tries to run the egg_info command. To do that, you can patch pip's code at runtime using various methods. My favorite is a usercustomize file.

For example, create a patch file with the following content and place it in a directory of your choosing:

/pypatches/pip_pure_download/usercustomize.py:

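from pip._internal.req.req_install import InstallRequirement

# NOTE: pip._internal is not a public API. run_egg_info is the method name from
# the pip version this was written against; other versions may have renamed
# or removed it, in which case this assignment has no effect on pip.

print('Applying pure download patch!')

def override_run_egg_info(*args, **kwargs):
    # Abort pip before it can execute setup.py to generate metadata.
    raise KeyboardInterrupt  # Joke's on you, evil hackers! :P

InstallRequirement.run_egg_info = override_run_egg_info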

Now, to apply the patch to a Python process, just add the patch's directory to PYTHONPATH, for example:

PYTHONPATH=/pypatches/pip_pure_download:$PYTHONPATH pip download --no-deps suds
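One weakness of a patch like this is that assigning to an attribute that no longer exists succeeds silently, so after a pip upgrade the patch may stop protecting you without any warning. A minimal sketch of a guarded variant follows. block_method and FakeRequirement are hypothetical names used only for illustration; FakeRequirement stands in for pip's InstallRequirement so the example runs without touching pip internals.

```python
def block_method(cls, name):
    """Replace cls.<name> with a function that aborts; refuse to patch blindly."""
    if not callable(getattr(cls, name, None)):
        # Fail loudly instead of installing a patch that nothing will ever call.
        raise AttributeError(
            f"{cls.__name__}.{name} not found; the patch would be a no-op"
        )

    def blocked(*args, **kwargs):
        raise KeyboardInterrupt(f"{cls.__name__}.{name} was blocked")

    setattr(cls, name, blocked)


class FakeRequirement:
    """Hypothetical stand-in for the class being patched."""

    def run_egg_info(self):
        return "setup.py executed"


if __name__ == "__main__":
    block_method(FakeRequirement, "run_egg_info")
    try:
        FakeRequirement().run_egg_info()
    except KeyboardInterrupt as exc:
        print("blocked:", exc)
```

In a real usercustomize.py you would pass the actual pip class and whatever method name your installed pip version uses, which is something to check against that version's source.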

Upvotes: 8

doctaphred

Reputation: 2644

This doesn't seem to be possible as of pip 19.3.1 :(

See https://github.com/pypa/pip/issues/1884

Upvotes: 1
