pyppeteer: Browser closed unexpectedly on Python 3.9 AWS Lambda Function

Question

I have a Python function that scrapes a website's schedule and uploads it to an RDS database. The code works perfectly on my local machine (pyppeteer version 2.0.0, Python 3.12). However, I've been trying to port it to AWS Lambda, and the browser keeps failing to launch.

I used the Chromium executable from this repository (extracted from the /bin directory of npm i chrome-aws-lambda@~2.0.2 and uploaded to an S3 bucket with appropriate permissions), which corresponds to the pyppeteer version I installed alongside my Lambda function (pip3 install pyppeteer -t .). The Python code first downloads the Chromium binary into the /tmp directory and then attempts to launch the browser from it with pyppeteer. My Lambda runtime is stuck on Python 3.9 because it's the latest version supported by "psycopg2._psycopg", and I don't think the runtime version is the issue.
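
For illustration, the download step described above could look roughly like the following. This is a minimal sketch, not the actual download_chromium() from the function: the bucket name and object key in the usage comment are hypothetical placeholders, and the S3 client is passed in as a parameter so the helper can be exercised without AWS access. It also sets the executable bit, since a freshly downloaded file typically does not have it.

```python
import os


def download_chromium(s3_client, bucket, key, dest="/tmp/headless-chromium"):
    """Fetch the Chromium binary to dest (if missing) and make it executable."""
    if not os.path.exists(dest):
        # download_file(Bucket, Key, Filename) is the boto3 S3 client call.
        s3_client.download_file(bucket, key, dest)
    # Files written by the download usually lack the executable bit.
    os.chmod(dest, 0o755)
    return dest


# Hypothetical usage inside the Lambda function (names are placeholders):
#   import boto3
#   download_chromium(boto3.client("s3"), "my-bucket", "headless-chromium")
```

Skipping the download when the file already exists avoids re-fetching the binary on warm invocations, where /tmp may persist between calls.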

Does anyone have any ideas as to why my browser fails to launch within my AWS Lambda runtime? I think it might be a problem with my arguments for the launch() function, but I'm unsure where to go from here.
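
One way to narrow this down is to run the binary directly, outside pyppeteer, and print whatever it writes to stdout and stderr, since the BrowserError message itself carries no output. The helper below is a hypothetical sketch (probe_binary is not part of pyppeteer); if a shared library is missing from the Lambda environment, the loader's complaint would normally appear in this output.

```python
import subprocess


def probe_binary(executable, args=("--version",), timeout=20):
    """Run the binary directly and return (returncode, combined output).

    returncode is None when the process could not be started or timed out.
    """
    try:
        result = subprocess.run(
            [executable, *args],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the captured output
            timeout=timeout,
            text=True,
        )
    except OSError as exc:
        # Covers a missing file, a missing executable bit, or a wrong format.
        return None, f"{type(exc).__name__}: {exc}"
    except subprocess.TimeoutExpired:
        return None, f"timed out after {timeout}s"
    return result.returncode, result.stdout
```

Calling print(probe_binary("/tmp/headless-chromium")) in the handler just before launch() would put the result into the function logs.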

Error Line:

browser = await launch(
    headless=True,
    args=[
        "--no-sandbox",
        "--disable-setuid-sandbox",
        "--disable-gpu",
        "--single-process",
        "--disable-dev-shm-usage",
        "--no-zygote",
    ],
    executablePath="/tmp/headless-chromium",
    userDataDir="/tmp",
)

Error in Lambda Console:

Status: Failed
Test Event Name: testEvent

Response:
{
  "errorMessage": "Browser closed unexpectedly:
",
  "errorType": "BrowserError",
  "requestId": "a6a20222-6082-4618-b388-fbd4c88bda7d",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 369, in lambda_handler
    shows = asyncio.run(scrape_all_schedules())
",
    "  File \"/var/lang/lib/python3.9/asyncio/runners.py\", line 44, in run
    return loop.run_until_complete(main)
",
    "  File \"/var/lang/lib/python3.9/asyncio/base_events.py\", line 647, in run_until_complete
    return future.result()
",
    "  File \"/var/task/lambda_function.py\", line 225, in scrape_all_schedules
    browser = await launch(
",
    "  File \"/var/task/pyppeteer/launcher.py\", line 307, in launch
    return await Launcher(options, **kwargs).launch()
",
    "  File \"/var/task/pyppeteer/launcher.py\", line 168, in launch
    self.browserWSEndpoint = get_ws_endpoint(self.url)
",
    "  File \"/var/task/pyppeteer/launcher.py\", line 227, in get_ws_endpoint
    raise BrowserError('Browser closed unexpectedly:\n')
"
  ]
}

Function Logs:
Request ID: a6a20222-6082-4618-b388-fbd4c88bda7d

FULL CODE:

import asyncio
from datetime import datetime
from pyppeteer import launch
import os
import psycopg2
import boto3


async def scrape_all_schedules():
    current_day = datetime.now()

    download_chromium()
    
    chromium_path = '/tmp/headless-chromium'
    if os.path.exists(chromium_path):
        print("Chromium binary found. Launching browser...")
    else:
        print(f"Error: Chromium binary not found at {chromium_path}")

    browser = await launch(
        headless=True,
        args=[
            "--no-sandbox",
            "--disable-setuid-sandbox",
            "--disable-gpu",
            "--single-process",
            "--disable-dev-shm-usage",
            "--no-zygote",
        ],
        executablePath="/tmp/headless-chromium",
        userDataDir="/tmp",
    )
    page = await browser.newPage()


def lambda_handler(event, context):
    print("Starting scraping process...")
    shows = asyncio.run(scrape_all_schedules())
    

    return {
        'statusCode': 200,
        'body': f"Successfully saved {len(shows)} shows to the database."
    }
