Richard

Reputation: 7433

Puppeteer can run on Docker locally, but not on Cloud Run (Puppeteer + Cloud Run)

I am using Puppeteer to convert an HTML template to a PDF that is then saved to Google Cloud Storage. This is the code snippet that does it.

  const puppeteer = require('puppeteer')
  const fs = require('fs')
  const path = require('path')

  static async getCertificate(req) {
    const { id: userProgressId } = req.params
    try {
      // Some logic to handle existing certificate (omitted)

      const filePath = path.resolve(__dirname, './sample_html.html')
      const html = compileCertificateTemplateHtml('certificate_template.hbs') // This will generate HTML as string (working as intended and tested)
      const browser = await puppeteer.launch({
        headless: 'new',
        args: ['--disable-dev-shm-usage', '--no-sandbox'],
      })
      const page = await browser.newPage()
      await page.setContent(html, { waitUntil: 'networkidle0' })
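      const pdfBuffer = await page.pdf({
        landscape: true,
        printBackground: true,
        width: 681,
      })
      // Close the browser once the PDF is rendered so the Chrome process does not linger
      await browser.close()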

      // Some logic to save buffer to cloud (omitted)
    } catch (e) {
      // Some logic to handle errors (omitted)
    }
  }

Initially, this code did not work in Docker because Chrome (or Chromium) wasn't installed. To install it, I followed the troubleshooting section of the Puppeteer documentation. After changing my Dockerfile to the following, I could run the backend service on my local Docker Engine, and hitting the endpoint returned the expected result.

FROM node:16

# Install latest chrome dev package and fonts to support major charsets (Chinese, Japanese, Arabic, Hebrew, Thai and a few others)
# Note: this installs the necessary libs to make the bundled version of Chromium that Puppeteer
# installs, work.
RUN apt-get update \
    && apt-get install -y wget gnupg \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
    && apt-get update \
    && apt-get install -y google-chrome-stable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst fonts-freefont-ttf libxss1 \
      --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

# Create app directory
WORKDIR /usr/src/app

# Uncomment to skip the chromium download when installing puppeteer. If you do,
# you'll need to launch puppeteer with:
#     browser.launch({executablePath: 'google-chrome-stable'})
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true

# Install app dependencies
# A wildcard is used to ensure both package.json AND package-lock.json are copied
# where available (npm@5+)
COPY package*.json ./

RUN npm install
# If you are building your code for production
# RUN npm ci --only=production

# Bundle app source
COPY . .

EXPOSE 8080
CMD [ "node", "./bin/www" ]

However, when I built an image using Cloud Build and deployed it to Cloud Run, hitting the same endpoint returned the initial error message (the one I got before I added the Dockerfile instructions that install the dependencies required to run headless Chrome). The exact error message is the following.

Could not find Chrome (ver. 113.0.5672.63). This can occur if either
 1. you did not perform an installation before running the script (e.g. `npm install`) or
 2. your cache path is incorrectly configured (which is: /home/.cache/puppeteer).
For (2), check out our guide on configuring puppeteer at https://pptr.dev/guides/configuration.
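To narrow down which of the two causes in the message applies, a small diagnostic can be run inside the container. This is only a sketch: `firstExisting` and `chromeCandidates` are hypothetical helper names, and the candidate paths are assumptions (the launcher path used by the `google-chrome-stable` package, and a `chrome` folder under Puppeteer's cache directory).

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical helper: returns the first path that exists, or null.
// `exists` is injectable so the function can be checked without a real browser.
function firstExisting(candidates, exists = fs.existsSync) {
  return candidates.find((p) => exists(p)) || null
}

// Hypothetical helper: locations worth inspecting inside the container.
function chromeCandidates(env = process.env) {
  // Assumption: the cache defaults to <home>/.cache/puppeteer unless PUPPETEER_CACHE_DIR is set.
  const cacheDir = env.PUPPETEER_CACHE_DIR || path.join(os.homedir(), '.cache', 'puppeteer')
  return [
    env.PUPPETEER_EXECUTABLE_PATH,     // explicit override, if one is set
    '/usr/bin/google-chrome-stable',   // assumed launcher path of the apt package
    path.join(cacheDir, 'chrome'),     // assumed location of a Puppeteer-downloaded browser
  ].filter(Boolean)
}

console.info('Chrome candidates:', chromeCandidates())
console.info('First existing:', firstExisting(chromeCandidates()))
```

Logging this at startup, both locally and on Cloud Run, would show whether the two environments resolve the home directory and the browser location differently.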

I looked at similar issues and tried their suggested fixes, but none of them worked even locally, so I didn't bother trying them on Cloud Run. I also tried some other alternatives.

In case it is needed, this is the .yaml file I use to build the image.

...
default:
  image: google/cloud-sdk:alpine

.before_staging_script_template:
  before_script:
    - gcloud config set project $STAGING_GCP_PROJECT_ID
    - gcloud auth activate-service-account
      --key-file $STAGING_GCP_SERVICE_KEY

.before_production_script_template:
  before_script:
    - gcloud config set project $PRODUCTION_GCP_PROJECT_ID
    - gcloud auth activate-service-account
      --key-file $PRODUCTION_GCP_SERVICE_KEY

#### STAGING
staging_build:
  stage: build
  extends: .before_staging_script_template
  script:
    - gcloud builds submit
      --tag gcr.io/$STAGING_GCP_PROJECT_ID/$STAGING_SERVICE_NAME
  only:
    - staging
  when: manual
...

What should I do to make Puppeteer run successfully on the Docker Engine in Cloud Run?

Upvotes: 1

Views: 1179

Answers (1)

koma

Reputation: 6566

Puppeteer had been working fine for me on Cloud Run for over a year until it suddenly stopped working and started hanging on browser.newPage().

After a long search, adding --disable-gpu did the trick. This is my startup sequence now:

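// Imports assumed from the calls below: puppeteer-extra provides `use`,
// and `executablePath` is exported by the puppeteer package.
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
const { executablePath } = require('puppeteer');

// Register the stealth plugin once, not on every browser start
puppeteer.use(StealthPlugin());

async function startBrowser(proxyUrl) {
    const puppeteer_args = [
        '--no-sandbox',
        '--single-process',
        '--disable-gpu',
        '--disable-dev-shm-usage'];
    if (proxyUrl) {
        // Route traffic through the proxy and tolerate its certificates
        console.info(`Using proxy ${proxyUrl}`);
        puppeteer_args.push(`--proxy-server=${proxyUrl}`);
        puppeteer_args.push(`--ignore-certificate-errors`);
    }

    console.info(`puppeteer browser starting ...`)

    const browser = await puppeteer.launch({
        args: puppeteer_args,
        headless: process.env.HEADLESS === "TRUE",
        ignoreHTTPSErrors: true,
        executablePath: executablePath(),
    });
    console.info(`puppeteer browser has been started`)

    return browser;
}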
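
Relating this back to the Dockerfile in the question: it sets PUPPETEER_SKIP_CHROMIUM_DOWNLOAD, and its own comment says the browser then has to be launched with an explicit executablePath. A minimal sketch of that, assuming the apt package places its launcher at /usr/bin/google-chrome-stable (`buildLaunchOptions` and `launchSystemChrome` are hypothetical names, and I have not verified this against the asker's Cloud Run setup):

```javascript
// Hypothetical helper: builds launch options for a container where Chrome was
// installed with apt and Puppeteer's own browser download was skipped.
function buildLaunchOptions(env = process.env) {
  return {
    headless: 'new',
    // Same flags as in the question, plus --disable-gpu from this answer
    args: ['--no-sandbox', '--disable-gpu', '--disable-dev-shm-usage'],
    // Assumption: fall back to the launcher path of the google-chrome-stable package
    executablePath: env.PUPPETEER_EXECUTABLE_PATH || '/usr/bin/google-chrome-stable',
  }
}

// Hypothetical wrapper: puppeteer is required lazily so the options builder
// above can be inspected without the package being installed.
async function launchSystemChrome() {
  const puppeteer = require('puppeteer')
  return puppeteer.launch(buildLaunchOptions())
}

// Usage (inside an async function):
//   const browser = await launchSystemChrome()
```

Pointing at the system Chrome avoids depending on Puppeteer's cache directory, which is the path the error message complains about.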

Upvotes: 1
