logoff

Reputation: 3446

Working Poetry project with private dependencies inside Docker

I have a Python library hosted in Google Cloud Platform Artifact Registry. I also have a Python project, managed with Poetry, that depends on that library.

This is my project file pyproject.toml:

[tool.poetry]
name = "Test"
version = "0.0.1"
description = "Test project."
authors = [
    "Me <[email protected]>"
]

[tool.poetry.dependencies]
python = ">=3.8,<4.0"
mylib = "0.1.1"

[tool.poetry.dev-dependencies]
"keyrings.google-artifactregistry-auth" = "^1.1.0"
keyring = "^23.9.0"

[build-system]
requires = ["poetry-core>=1.1.0"]
build-backend = "poetry.core.masonry.api"

[[tool.poetry.source]]
name = "my-lib"
url = "https://us-east4-python.pkg.dev/my-gcp-project/my-lib/simple/"
secondary = true

To use my private repository, I installed the gcloud CLI and authenticated with my credentials. When I run this command, I see the expected result:

$ gcloud auth list
ACTIVE  ACCOUNT
...
*       <my-account>@appspot.gserviceaccount.com
...

Additionally, I'm using Python keyring together with keyrings.google-artifactregistry-auth, as you can see in the project file.

With this setup, I can run poetry install, and the dependency gets downloaded from my private Artifact Registry repository using the GCP authentication.


The issue comes when I try to apply the same principles inside a Docker container.

I created a Docker file like this:

# syntax = docker/dockerfile:1.3
FROM python:3.9

# Install Poetry
RUN curl -sSL https://install.python-poetry.org | python3 -
ENV PATH "${PATH}:/root/.local/bin"

# Install Google Cloud SDK CLI
ARG GCLOUD_VERSION="401.0.0-linux-x86_64"
RUN wget -q https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-${GCLOUD_VERSION}.tar.gz && \
    tar -xf google-cloud-cli-*.tar.gz && \
    ./google-cloud-sdk/install.sh --quiet && \
    rm google-cloud-cli-*.tar.gz
ENV PATH "${PATH}:/google-cloud-sdk/bin"

# install Google Artifact Registry keyring integration
RUN pip install keyrings.google-artifactregistry-auth
RUN --mount=type=secret,id=GOOGLE_APPLICATION_CREDENTIALS ${GOOGLE_APPLICATION_CREDENTIALS} gcloud auth activate-service-account --key-file=/run/secrets/GOOGLE_APPLICATION_CREDENTIALS
RUN gcloud auth list
RUN keyring --list-backends

WORKDIR /app

# copy Poetry project files and install dependencies
COPY ./.env* ./
COPY ./pyproject.toml ./poetry.lock* ./
RUN poetry install

# copy source files
COPY ./app /app/app

# run the program
CMD poetry run python -m app.main

As you can see, I injected the Google credentials file, following this documentation. This works. I used Docker BuildKit secrets, as explained here (security concerns are out of scope for this question). However, when I try to build the image, I get an authentication error (GOOGLE_APPLICATION_CREDENTIALS is properly set and points to a valid key file):

$ DOCKER_BUILDKIT=1 docker image build --secret id=GOOGLE_APPLICATION_CREDENTIALS,src=${GOOGLE_APPLICATION_CREDENTIALS} -t app-test .

...
#19 66.68 <c1>Source (my-lib):</c1> Authorization error accessing https://us-east4-python.pkg.dev/my-gcp-project/my-lib/simple/mylib/
#19 68.21
#19 68.21   RuntimeError
#19 68.21
#19 68.22   Unable to find installation candidates for mylib (0.1.1)
...

If I execute all the commands in the Dockerfile line by line outside Docker, using the same Google credentials key file, everything works.

I also debugged inside the image, skipping the poetry install and poetry run... commands, and saw this, in case it helps:

# gcloud auth list
                  Credentialed Accounts
ACTIVE  ACCOUNT
*       <my-account>@appspot.gserviceaccount.com

# keyring --list-backends
keyrings.gauth.GooglePythonAuth (priority: 9)
keyring.backends.chainer.ChainerBackend (priority: -1)
keyring.backends.fail.Keyring (priority: 0)

Finally, I even tried following this approach: Using Keyring on headless Linux systems in a Docker container, with the same results:

# apt update
...
# apt install -y gnome-keyring
...
# dbus-run-session -- sh
GNOME_KEYRING_CONTROL=/root/.cache/keyring-MEY1T1
SSH_AUTH_SOCK=/root/.cache/keyring-MEY1T1/ssh
# poetry install
...
  • Installing mylib (0.1.1): Failed

  RuntimeError

  Unable to find installation candidates for mylib (0.1.1)

  at ~/.local/share/pypoetry/venv/lib/python3.9/site-packages/poetry/installation/chooser.py:103 in choose_for
       99│
      100│             links.append(link)
      101│
      102│         if not links:
    → 103│             raise RuntimeError(f"Unable to find installation candidates for {package}")
      104│
      105│         # Get the best link
      106│         chosen = max(links, key=lambda link: self._sort_key(package, link))
      107│
...

I also tried following the advice in this other question, without success.

The gcloud CLI works inside the container; I tested other commands with it. My guess is that the integration with Keyring is not working properly, but I don't know how to debug it.

How can I get my dependency resolved inside a Docker container?

Upvotes: 12

Views: 5673

Answers (4)

martin

Reputation: 1328

Poetry 2.0 has just been released, and it supports specifying plugins in pyproject.toml, so Google's keyring backend can be declared via:

[tool.poetry.requires-plugins]
keyrings-google-artifactregistry-auth = "^1.1.2"

This way, a separate poetry self add is not required any more.

Upvotes: 0

logoff

Reputation: 3446

Finally, I found a solution that worked in my use case.

There are two main parts:

  1. Installing keyrings.google-artifactregistry-auth as a Poetry plugin, using this command:
poetry self add keyrings.google-artifactregistry-auth
  2. Authenticating inside the container using a service account key file:
gcloud auth activate-service-account --key-file=key.json

In my case, I use BuildKit secrets to handle it.

Then, for instance, the Dockerfile would look like this:

FROM python:3.9

# Install Poetry
RUN curl -sSL https://install.python-poetry.org | python3 -
ENV PATH "${PATH}:/root/.local/bin"

# install Google Artifact Registry tools for Python as a Poetry plugin
RUN poetry self add keyrings.google-artifactregistry-auth

# Install Google Cloud SDK CLI
ARG GCLOUD_VERSION="413.0.0-linux-x86_64"
RUN wget -q https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-${GCLOUD_VERSION}.tar.gz && \
    tar -xf google-cloud-cli-*.tar.gz && \
    ./google-cloud-sdk/install.sh --quiet && \
    rm google-cloud-cli-*.tar.gz
ENV PATH "${PATH}:/google-cloud-sdk/bin"

# authenticate with gcloud using a BuildKit secret
RUN --mount=type=secret,id=gac.json \
    gcloud auth activate-service-account --key-file=/run/secrets/gac.json

COPY ./pyproject.toml ./poetry.lock* /
RUN poetry install

# revoke the gcloud credentials once the dependencies are installed
# (note: the earlier layer created by activate-service-account still contains them)
RUN gcloud auth revoke --all

COPY ./app /app

WORKDIR /app

CMD ["whatever", "command", "you", "use"]

And the Docker build command, providing the secret:

DOCKER_BUILDKIT=1 docker image build \
        --secret id=gac.json,src=${GOOGLE_APPLICATION_CREDENTIALS} \
        -t ${YOUR_TAG} .
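
Before running the build, it can help to rule out a bad key file on the host side. The following is only a minimal sketch, not part of my original setup: the function name check_key_file is hypothetical, and it assumes the file is a standard service account key, which is a JSON object containing at least the type, client_email and private_key fields.

```python
# Pre-flight sketch: sanity-check the file that will be passed as the BuildKit secret.
# Assumption: a service account key is a JSON object with the fields listed below.
import json
import os

REQUIRED_FIELDS = ("type", "client_email", "private_key")


def check_key_file(path):
    """Return a list of human-readable problems; an empty list means it looks fine."""
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    if not os.path.isfile(path):
        return [f"{path} is not a file"]
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError) as exc:
        return [f"cannot read {path} as JSON: {exc}"]
    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if "type" in data and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']!r}")
    return problems


if __name__ == "__main__":
    issues = check_key_file(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", ""))
    print("key file looks fine" if not issues else "\n".join(issues))
```

This only checks the shape of the file. It does not prove that the service account has read access to the Artifact Registry repository.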

And with Docker Compose, a similar approach:

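services:
  yourapp:
    build:
      context: .
      secrets:
        # must match the id used in the Dockerfile's RUN --mount=type=secret
        - gac.json
    image: yourapp:yourtag
    ...

# the secret also has to be declared at the top level of the Compose file
secrets:
  gac.json:
    file: ${GOOGLE_APPLICATION_CREDENTIALS}

Then build and start it:

COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker compose up --build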

Upvotes: 5

Da Chucky

Reputation: 781

I think the issue here is that poetry can't get the credentials from the keyring as it doesn't have keyrings.google-artifactregistry-auth installed, and it can't install keyrings.google-artifactregistry-auth because it fails to install the private package. To solve this, you need to bootstrap or pre-install keyrings.google-artifactregistry-auth. tox suffers from a similar issue, but has a documented method to deal with it.
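
One way to check this hypothesis is to ask each Python interpreter which keyring backends it can actually discover. This is a minimal sketch that I have not run against a GCP setup. It assumes keyring finds third-party backends through the keyring.backends entry-point group, and the script name list_backends.py used below is hypothetical.

```python
# Sketch: list the keyring backends that *this* interpreter can discover.
# Assumption: third-party backends register under the "keyring.backends"
# entry-point group.
import sys
from importlib import metadata


def visible_keyring_backends():
    """Return sorted 'name = module' strings for the discoverable backends."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):  # Python 3.10+
        selected = eps.select(group="keyring.backends")
    else:  # Python 3.8 / 3.9: plain dict of group -> entry points
        selected = eps.get("keyring.backends", [])
    return sorted({f"{ep.name} = {ep.value}" for ep in selected})


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for backend in visible_keyring_backends():
        print("  ", backend)
```

Run it once with the image's system interpreter (python3 list_backends.py) and once with the interpreter of Poetry's own virtual environment (going by the traceback in the question, ~/.local/share/pypoetry/venv/bin/python list_backends.py). If the Google backend only shows up in the first listing, Poetry cannot see it, even though keyring --list-backends reports it.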

The documentation for poetry notes a similar issue for Azure in the second info box under the Configuring Credentials section. However, Azure solves the bootstrapping issue by having a package that pre-seeds new virtual environments with the necessary packages for authentication. I couldn't find an equivalent for GCP, so that's not an option unless you develop one yourself.

Alternatively, since you are including a lock file with your project, you could try exploiting the dependency groups option to get poetry to set up the virtual environment and install keyrings.google-artifactregistry-auth before installing everything else. I haven't tried this as I don't have a GCP or Azure account, but I figured I would share it on the off chance it works:

Add the following section to pyproject.toml:

[tool.poetry.group.seed.dependencies]
# the key must be quoted; an unquoted dotted key is parsed by TOML as nested tables
"keyrings.google-artifactregistry-auth" = "^1.1.1"

Then use the following in your Dockerfile to replace your current authentication and install sections:

# copy Poetry project files and install dependencies
WORKDIR /app
COPY ./.env* ./
COPY ./pyproject.toml ./poetry.lock* ./

# install Google Artifact Registry keyring integration
RUN poetry install --only seed
RUN --mount=type=secret,id=GOOGLE_APPLICATION_CREDENTIALS ${GOOGLE_APPLICATION_CREDENTIALS} gcloud auth activate-service-account --key-file=/run/secrets/GOOGLE_APPLICATION_CREDENTIALS
RUN gcloud auth list
RUN poetry run keyring --list-backends

RUN poetry install

Upvotes: 0

Divyessh

Reputation: 2721

You are using ${GOOGLE_APPLICATION_CREDENTIALS} in a Dockerfile command, but you have not defined it anywhere in the Dockerfile using ENV or ARG.

For example in your Dockerfile in this section

RUN wget -q https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-${GCLOUD_VERSION}.tar.gz && \

You are using the variable GCLOUD_VERSION and you have defined it in your Dockerfile over here

ARG GCLOUD_VERSION="401.0.0-linux-x86_64"

So when you are using the variable in this line:

RUN --mount=type=secret,id=GOOGLE_APPLICATION_CREDENTIALS ${GOOGLE_APPLICATION_CREDENTIALS} gcloud auth activate-service-account --key-file=/run/secrets/GOOGLE_APPLICATION_CREDENTIALS

You need to define it as ENV in the Dockerfile.

I hope this helps!

Upvotes: 0
