John Carraher

Reputation: 63

pyppeteer: Browser closed unexpectedly on Python 3.9 AWS Lambda Function


I have a Python function that scrapes a website's schedule and uploads it to an RDS database. The code works perfectly on my local machine (pyppeteer 2.0.0, Python 3.12). However, I've been trying to port it to AWS Lambda, and the browser keeps failing to launch.

I used this repository's Chromium executable (extracted from the /bin directory of npm i chrome-aws-lambda@~2.0.2 and uploaded to an S3 bucket with appropriate permissions), which corresponds to the pyppeteer version I installed with my Lambda function (pip3 install pyppeteer -t .). The Python code first downloads the Chromium binary into the /tmp directory and then attempts to launch the browser from it with pyppeteer. My Lambda runtime is stuck on Python 3.9 because it's the latest version supported by "psycopg2._psycopg", and I don't think the Python version is the issue.
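For reference, here is a minimal sketch of the download step described above. It assumes a boto3 S3 client is passed in; the `bucket` and `key` parameters are placeholders, and this parameterized signature is an assumption (the `download_chromium()` in the question takes no arguments). One detail worth checking is the executable bit: S3 objects carry no Unix permissions, so a freshly downloaded file is not executable until it is marked as such.

```python
import os
import stat

CHROMIUM_PATH = "/tmp/headless-chromium"


def download_chromium(s3_client, bucket, key, dest=CHROMIUM_PATH):
    """Fetch the Chromium binary into dest (if missing) and make it executable."""
    if not os.path.exists(dest):
        # boto3 S3 client signature: download_file(Bucket, Key, Filename)
        s3_client.download_file(bucket, key, dest)
    # Add execute permission for user, group, and others.
    mode = os.stat(dest).st_mode
    os.chmod(dest, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return dest
```

A call might look like `download_chromium(boto3.client("s3"), "my-bucket", "headless-chromium")`, where both names are hypothetical.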

Does anyone have any ideas as to why my browser fails to launch within my AWS Lambda runtime? I think it might be a problem with my arguments for the launch() function, but I'm unsure where to go from here.

Error Line:

browser = await launch(
    headless=True,
    args=[
        "--no-sandbox",
        "--disable-setuid-sandbox",
        "--disable-gpu",
        "--single-process",
        "--disable-dev-shm-usage",
        "--no-zygote",
    ],
    executablePath="/tmp/headless-chromium",
    userDataDir="/tmp",
)

Error in Lambda Console:

Status: Failed
Test Event Name: testEvent

Response:
{
  "errorMessage": "Browser closed unexpectedly:\n",
  "errorType": "BrowserError",
  "requestId": "a6a20222-6082-4618-b388-fbd4c88bda7d",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 369, in lambda_handler\n    shows = asyncio.run(scrape_all_schedules())\n",
    "  File \"/var/lang/lib/python3.9/asyncio/runners.py\", line 44, in run\n    return loop.run_until_complete(main)\n",
    "  File \"/var/lang/lib/python3.9/asyncio/base_events.py\", line 647, in run_until_complete\n    return future.result()\n",
    "  File \"/var/task/lambda_function.py\", line 225, in scrape_all_schedules\n    browser = await launch(\n",
    "  File \"/var/task/pyppeteer/launcher.py\", line 307, in launch\n    return await Launcher(options, **kwargs).launch()\n",
    "  File \"/var/task/pyppeteer/launcher.py\", line 168, in launch\n    self.browserWSEndpoint = get_ws_endpoint(self.url)\n",
    "  File \"/var/task/pyppeteer/launcher.py\", line 227, in get_ws_endpoint\n    raise BrowserError('Browser closed unexpectedly:\\n')\n"
  ]
}

Function Logs:
Request ID: a6a20222-6082-4618-b388-fbd4c88bda7d
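
Neither the error message nor the function logs contain Chromium's own output, so one way to see why the process exits is to run the binary directly and capture its stderr. The sketch below is only a diagnostic idea; `probe_binary` is a hypothetical helper, not part of pyppeteer. If a shared library is missing, the dynamic loader normally reports "error while loading shared libraries" on stderr. As far as I know, pyppeteer's launch() also accepts dumpio=True, which should pipe the browser's stdout and stderr into the function logs.

```python
import subprocess


def probe_binary(path, timeout=10):
    """Run `<path> --version` and return (returncode, stdout, stderr) as text."""
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except OSError as exc:
        # Missing file, missing execute permission, or wrong binary format.
        return None, "", str(exc)
    except subprocess.TimeoutExpired:
        return None, "", f"timed out after {timeout}s"
    return result.returncode, result.stdout, result.stderr
```

Printing the result of `probe_binary("/tmp/headless-chromium")` before calling launch() would put the real failure reason in the logs.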

FULL CODE:

import asyncio
from datetime import datetime
from pyppeteer import launch
import os
import psycopg2
import boto3


async def scrape_all_schedules():
    current_day = datetime.now()

    download_chromium()
    
    chromium_path = '/tmp/headless-chromium'
    if os.path.exists(chromium_path):
        print("Chromium binary found. Launching browser...")
    else:
        print(f"Error: Chromium binary not found at {chromium_path}")

    browser = await launch(
        headless=True,
        args=[
            "--no-sandbox",
            "--disable-setuid-sandbox",
            "--disable-gpu",
            "--single-process",
            "--disable-dev-shm-usage",
            "--no-zygote",
        ],
        executablePath="/tmp/headless-chromium",
        userDataDir="/tmp",
    )
    page = await browser.newPage()


def lambda_handler(event, context):
    print("Starting scraping process...")
    shows = asyncio.run(scrape_all_schedules())
    

    return {
        'statusCode': 200,
        'body': f"Successfully saved {len(shows)} shows to the database."
    }

Upvotes: 1

Views: 124

Answers (1)

John Carraher

Reputation: 63

From my understanding, AWS Lambda runs "Amazon Linux 2023" and does not allow installing the system-level libraries that headless Chromium needs when driven by pyppeteer (yum install -y libX11 libX11-devel libXcomposite libXcursor libXdamage libXrandr libXi libXtst libXScrnSaver). I was able to get my script running on an EC2 instance instead, and I would recommend the same to others who face this problem.
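
To confirm which libraries are actually missing rather than guessing, the output of ldd can be checked for "not found" entries. This is a rough sketch that assumes ldd is available in the environment where it runs (I have not verified that for Lambda); both function names are hypothetical.

```python
import subprocess


def parse_missing_libs(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # Lines look like: "\tlibX11.so.6 => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_libs(binary_path):
    """Run ldd on binary_path and list its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True
    )
    return parse_missing_libs(result.stdout)
```

For example, `missing_libs("/tmp/headless-chromium")` returns a list of library names, which would be empty if every dependency resolves.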

Upvotes: 0
