Oliver

Reputation: 29493

python install without pip on ubuntu

I need to install a Python package in a custom Docker container that I'm building from the official 'ubuntu' Docker image, so I want to minimize how much space this uses. Python3 installs fine and runs, but for some reason, pip is not included.

So I installed it via apt install python3-pip. This works, but it takes a whopping 300 MB and a couple of minutes to install (apparently because it pulls in a load of packages needed to build binary extensions, gcc and so on).

Of course I could uninstall python3-pip from the image after installing the dependencies I want, and additionally use apt autoremove to get rid of 299 megs. However this takes another minute.

So although the above works, it significantly increases the build time of my Docker image. So I tried to see if there was a way of installing the dependency without pip:

I tried downloading the dependency's .tar.gz from PyPI, extracting it, and running python3 setup.py install, but this gets me an odd error:

Traceback (most recent call last):
  File "setup.py", line 59, in <module>
    from distutils import log
ImportError: cannot import name 'log'

I thought perhaps I need to install setuptools, or upgrade distutils.

I tried to use get-pip.py from the official site but that failed too:

Traceback (most recent call last):
  File "get-pip.py", line 20890, in <module>
    main()
  File "get-pip.py", line 197, in main
    bootstrap(tmpdir=tmpdir)
  File "get-pip.py", line 82, in bootstrap
    import pip._internal
  File "/tmp/tmpjpa5gs_x/pip.zip/pip/_internal/__init__.py", line 40, in <module>
  File "/tmp/tmpjpa5gs_x/pip.zip/pip/_internal/cli/autocompletion.py", line 8, in <module>
  File "/tmp/tmpjpa5gs_x/pip.zip/pip/_internal/cli/main_parser.py", line 8, in <module>
  File "/tmp/tmpjpa5gs_x/pip.zip/pip/_internal/cli/cmdoptions.py", line 17, in <module>
  File "/tmp/tmpjpa5gs_x/pip.zip/pip/_internal/locations.py", line 10, in <module>
ImportError: cannot import name 'sysconfig'

which is very weird because if I start python3, import sysconfig works fine.

I also tried apt install python-pyyaml (the dependency I need in my Docker image) but that doesn't seem to exist.

So I'm out of options.

Upvotes: 7

Views: 3487

Answers (4)

Floating Octothorpe

Reputation: 545

I ran into a similar issue and wanted to give an alternative solution.

On Ubuntu 20.04, build-essential and python3-dev are only recommended packages for python3-pip, not hard dependencies, so you can use the --no-install-recommends option to skip them:

RUN apt update -y && \
    apt install python3 python3-pip --no-install-recommends -y && \
    apt clean

This took my image from 420MB to 165MB, and obviously the build time was also quicker.

Note: this will work fine for pure-Python packages, but you will likely need build-essential and python3-dev if you want to compile anything.
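Whether pip needs a compiler depends largely on what it finds to install: a wheel tagged none-any is pure Python, a platform-specific wheel is already built, and only a source archive with C extensions has to be compiled. As a rough sketch (this is not part of pip; the function name and the example filenames are made up for illustration), the tags at the end of a wheel filename show which kind it is:

```python
def is_pure_python_wheel(filename):
    """Return True if a wheel filename ends in the 'none-any' ABI/platform tags.

    Wheel filenames end in {python tag}-{abi tag}-{platform tag}.whl.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: " + filename)
    abi_tag, platform_tag = parts[-2], parts[-1]
    return abi_tag == "none" and platform_tag == "any"


# Hypothetical filenames, used only to show the two cases.
print(is_pure_python_wheel("example_pkg-1.0-py3-none-any.whl"))                   # True
print(is_pure_python_wheel("example_pkg-1.0-cp39-cp39-manylinux_2_17_x86_64.whl"))  # False
```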


Upvotes: 3

Oliver

Reputation: 29493

When a Python package is not available through apt install, here is how to install it. Thanks to @Anthon and @digitalarbeiter, whose answers provided important information for arriving at this solution.

  • To install via a setup.py file (esp. useful in Ubuntu Docker container):

    • it was sufficient for me to

      apt install python3-distutils
      <download package, tar xvf, cd to folder>
      python3 setup.py install
      
    • This method of installation only works for pure-Python packages (which should not be a surprise), so packages that have non-pure-Python dependencies may not install or, if they do, will have some functionality unavailable.

    • Note that even before installing python3-distutils, python3 -m distutils worked. This implies that the built-in distutils that comes with apt install python3 is not the full distutils. I did not know that; is this fact mentioned anywhere?
  • To install pip without the gcc toolchain: it was sufficient for me to

    apt install python3-distutils
    wget https://bootstrap.pypa.io/get-pip.py
    python3 get-pip.py
    

    Then pip install pyyaml completed. It seemed to install from a .tar.gz so it is the pure Python implementation too. Not surprising. This technique is useful if a package is not installable via apt install python3-<package>

The above methods required only a few megs of disk space.
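The tracebacks in the question are consistent with distutils submodules (log, sysconfig) being absent until python3-distutils is installed. A minimal sketch for checking this inside the container, before and after the apt install; the helper name is made up, and on newer Python versions where distutils is no longer part of the standard library the distutils entries will likely be reported as missing regardless:

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError.
            missing.append(name)
    return missing


# The top-level sysconfig is a different module from distutils.sysconfig,
# which is why "import sysconfig" can work while pip's import fails.
print(missing_modules(["sysconfig", "distutils.log", "distutils.sysconfig"]))
```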

A couple other missing pieces of the puzzle for me were:

  • apt install python3-<something>:

    • I missed the fact that many Python packages are distributed this way in Debian, which is handy for packages that have a C implementation (or C dependencies) since no compilation is necessary.
    • AND I didn't know that for packages that are called py<something> on PyPI, the apt install is python3-<something> NOT python3-py<something>. Unfortunately apt search pyyaml was no help here.
  • apt search <something>: I had sort of forgotten about it because bash on (desktop) Ubuntu automatically suggests the right package to download when a command is not found.

    • In particular apt search yaml yields 81 packages that provide YAML read/write in several languages (Python 2, Python 3, nodejs, Java, Go, Ruby, Erlang, Lua, Perl, C, C++, Clojure), linter, schema validator, etc.
    • Multiple search terms are AND'ed so apt search yaml python3 showed the python3-yaml that I missed.
    • Unfortunately apt search pyyaml produces no results, even though the Source and Homepage fields of apt show python3-yaml contain the word "pyyaml". I could not find a way to make search include those fields.

Upvotes: 1

Anthon

Reputation: 76599

I have been installing pip from get-pip.py in Docker (Ubuntu) containers for a few years without problems. For me it is the best way to avoid pip out-of-date warnings or (at some point, some time ago) SSL-related errors.

So the second part of your question is close, but your Python install seems a bit too minimal: you need sysconfig as provided by python3-distutils. You can try this rather minimal Dockerfile:

FROM ubuntu:latest

# MAINTAINER is deprecated; a LABEL carries the same information
LABEL maintainer="Anthon"

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y \
  python3 \
  python3-distutils \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

# this gets you the latest pip
COPY pip/get-pip.py /tmp/get-pip.py
RUN python3 /tmp/get-pip.py

RUN pip3 install pyyaml

which I ran using this Makefile:

doit:   pip/get-pip.py
        docker build .

pip/get-pip.py:
        -@mkdir pip
        curl https://bootstrap.pypa.io/get-pip.py -o pip/get-pip.py

(those need to be TAB characters on the indented lines) to make sure get-pip.py is available from the build context (you can of course download it from within the Dockerfile, but that is not necessary). That ends in a successful PyYAML install, but it will be slow.

I recommend you start using ruamel.yaml (disclaimer: I am the author of that package), by changing the last line of the Dockerfile to read:

RUN pip3 install ruamel.yaml

Apart from many bugfixes to the original PyYAML code it is based upon, ruamel.yaml supports YAML 1.2 as well as YAML 1.1 (the version PyYAML supports, which was superseded in 2009). It also installs the appropriate version from a .whl file, so you'll have the fast C loader available in your container (PyYAML doesn't do that).

You can load a YAML file using the C-loader in ruamel.yaml using:

from pathlib import Path
from ruamel.yaml import YAML


path = Path('yourfile.yaml')
yaml = YAML(typ='safe')
data = yaml.load(path)

Upvotes: 1

digitalarbeiter

Reputation: 2335

The Debian and Ubuntu package for PyYAML is called python-yaml, not python-pyyaml.

sudo apt install python-yaml

or

sudo apt install python3-yaml

for Python 2 and Python 3 respectively.

(It seems to be common in Debian/Ubuntu package names to drop any additional "py" prefix that a Python package might have: it's apt install python-tz instead of python-pytz, too. They don't seem to like the py-redundancy.)
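That naming habit can be turned into a quick guess at package names. This is only a heuristic sketch (the function is hypothetical, and not every package follows the pattern), so confirm any candidate with apt search or apt show before relying on it:

```python
def debian_name_candidates(pypi_name):
    """Guess Debian/Ubuntu package names for a PyPI project (heuristic only)."""
    name = pypi_name.lower().replace("_", "-")
    candidates = []
    # Debian often drops a leading "py" from the project name.
    if name.startswith("py") and len(name) > 2:
        candidates.append("python3-" + name[2:])
    # Fall back to the unmodified name.
    candidates.append("python3-" + name)
    return candidates


print(debian_name_candidates("PyYAML"))  # ['python3-yaml', 'python3-pyyaml']
print(debian_name_candidates("pytz"))    # ['python3-tz', 'python3-pytz']
```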

Upvotes: 2
