yoka791

Reputation: 656

How to install google-cloud-bigquery on python-alpine based docker?

I'm trying to build a Docker image with Python 3 and google-cloud-bigquery using the following Dockerfile:

FROM python:3.10-alpine

RUN pip3 install google-cloud-bigquery

WORKDIR /home

COPY *.py /home/

ENTRYPOINT ["python3", "-u", "myscript.py"]

But I'm getting errors on the pip3 install google-cloud-bigquery step (the output is too long to post here).
What is missing to install this on python-alpine?

Upvotes: 0

Views: 1637

Answers (2)

Zeka

Reputation: 21

Actually, the problem does not seem to be numpy, which builds smoothly once all of its dependency libraries are installed, but pyarrow, which does not support building with pip on Alpine. I found a workaround: use Alpine's pre-built pyarrow package. That is much easier than building pyarrow from source. This build works fine for me:

FROM python:3.10.6-alpine3.16

RUN apk add --no-cache build-base linux-headers \
    py3-apache-arrow=8.0.0-r0

# Move pyarrow into the site-packages directory that this image's Python uses.
# Alpine's system Python and the Docker Hub image's Python use different paths.
RUN mv /usr/lib/python3.10/site-packages/*  \
    /usr/local/lib/python3.10/site-packages/
RUN rm -rf /usr/lib/python3.10

RUN --mount=type=cache,target=/root/.cache/pip \
    pip install google-cloud-bigquery==3.3.2

To install later versions, update the Python version, the Alpine version, and the py3-apache-arrow version. These are the latest at the time of writing.

Make sure to remove the build dependencies (build-base, linux-headers) from your release image. I prefer multi-stage builds for this.
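A multi-stage version of the build above might look roughly like the following. This is an untested sketch, not a verified recipe. It assumes the final stage still needs py3-apache-arrow installed through apk so that the Arrow shared libraries pyarrow links against are present, and that copying the builder's site-packages directory is enough to carry over the pip-installed packages. The stage name builder is only illustrative, and the last three lines are taken from the question.

```dockerfile
# Stage 1: has the compilers, so pip can build any wheels it needs.
FROM python:3.10.6-alpine3.16 AS builder

RUN apk add --no-cache build-base linux-headers \
    py3-apache-arrow=8.0.0-r0

# Move the apk-installed packages (pyarrow and its dependencies) into the
# site-packages directory that the Docker Hub Python actually uses.
RUN mv /usr/lib/python3.10/site-packages/* \
    /usr/local/lib/python3.10/site-packages/

RUN pip install --no-cache-dir google-cloud-bigquery==3.3.2

# Stage 2: release image without build-base and linux-headers.
FROM python:3.10.6-alpine3.16

# Assumption: installing the apk package here provides the shared libraries
# that the copied pyarrow needs at runtime.
RUN apk add --no-cache py3-apache-arrow=8.0.0-r0

# Bring over everything installed in the builder, including the moved pyarrow.
COPY --from=builder /usr/local/lib/python3.10/site-packages/ \
    /usr/local/lib/python3.10/site-packages/

WORKDIR /home
COPY *.py /home/
ENTRYPOINT ["python3", "-u", "myscript.py"]
```

The compilers stay in the discarded builder stage, so the release image only carries the runtime packages.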

Upvotes: 1

pmo511

Reputation: 619

This looks like an incompatibility between the latest versions of google-cloud-bigquery (3.x) and numpy:

ERROR: Could not build wheels for numpy, which is required to install pyproject.toml-based projects

Try pinning an earlier version. This works for me:

RUN pip3 install google-cloud-bigquery==2.34.4
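Applied to the Dockerfile from the question, the pin would look like this. Only the version specifier changes; every other line is copied from the question as posted, so treat it as a sketch rather than a tested build.

```dockerfile
FROM python:3.10-alpine

# Pin to the 2.x release reported to install on Alpine in this answer.
RUN pip3 install google-cloud-bigquery==2.34.4

WORKDIR /home

COPY *.py /home/

ENTRYPOINT ["python3", "-u", "myscript.py"]
```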

Upvotes: 4
