vboxer00

Reputation: 125

Selenium/Chromedriver/Chromium(86) issues AWS Lambda

I've been dealing with this issue for the past week and can't get my head around it, so I decided to ask for help. I'm trying to run Selenium in AWS Lambda using a Chromium 86 build. The error message I keep getting is the following:

{
  "errorMessage": "Message: unknown error: Chrome failed to start: exited abnormally.\n  (chrome not reachable)\n  (The process started from chrome location /opt/bin/chromium is no longer running, so ChromeDriver is assuming that Chrome has crashed.)\n",
  "errorType": "WebDriverException"
}

Here's my build:

Selenium 3.14
Chromium 86.0.4240.0 (https://github.com/vittorio-nardone/selenium-chromium-lambda/blob/master/chromium.zip) which is forked from (https://github.com/puppeteer/puppeteer)
Chromedriver 86.0.4240.22 (https://chromedriver.storage.googleapis.com/index.html?path=86.0.4240.22/)
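
A browser/driver version mismatch is one thing worth ruling out early. The following is only a minimal sketch of such a check; the helper names `major_version` and `same_major` are made up for illustration, and it assumes that comparing the major version is a good enough first test. For the versions listed above it reports a match, so a mismatch is unlikely to be the cause here.

```python
import re


def major_version(version_string):
    """Return the major number from text containing a dotted version.

    Accepts bare versions ("86.0.4240.22") as well as `--version` style
    output ("Chromium 86.0.4240.0").
    """
    match = re.search(r"(\d+)\.\d+\.\d+(?:\.\d+)?", version_string)
    if match is None:
        raise ValueError(f"no version number found in {version_string!r}")
    return int(match.group(1))


def same_major(browser_version, driver_version):
    """True when browser and driver share the same major version."""
    return major_version(browser_version) == major_version(driver_version)


# Versions taken from the build list above.
print(same_major("Chromium 86.0.4240.0", "ChromeDriver 86.0.4240.22"))  # prints True
```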

Here's my code:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    chrome_options = webdriver.ChromeOptions()
#   chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--headless')
    chrome_options.add_argument("start-maximized")
    chrome_options.add_argument("disable-infobars")
    chrome_options.add_argument('--disable-gpu')
    chrome_options.add_argument('--disable-dev-shm-usage')
    chrome_options.add_argument('--window-size=1024x768')
    chrome_options.add_argument('--user-data-dir=/tmp/user-data')
    chrome_options.add_argument('--profile-directory=/tmp')
    chrome_options.add_argument('--hide-scrollbars')
    chrome_options.add_argument('--enable-logging')
    chrome_options.add_argument('--log-level=0')
    chrome_options.add_argument('--v=99')
#   chrome_options.add_argument('--single-process')
    chrome_options.add_argument('--data-path=/tmp/data-path')
    chrome_options.add_argument('--ignore-certificate-errors')
    chrome_options.add_argument('--homedir=/tmp')
    chrome_options.add_argument('--disk-cache-dir=/tmp/cache-dir')
    chrome_options.add_argument('user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.3163.100 Safari/537.36')
    chrome_options.add_argument('--remote-debugging-port=9222')
    chrome_options.binary_location = "/opt/bin/chromium"

    driver = webdriver.Chrome(executable_path="/opt/bin/chromedriver",options=chrome_options)
    driver.get('https://www.google.com/')

The things I have tried so far:

  1. Tried various runtimes (Python 3.6, 3.7, 3.8) with no success.
  2. Tried with and without Lambda layers. With a Lambda layer, my folder structure is relatively simple:
.
├── bin
│   ├── chromedriver (binary)
│   └── chromium (binary)
└── python
    ├── selenium
    ├── selenium-3.14.0.dist-info
    ├── urllib3
    └── urllib3-1.26.7.dist-info
  3. Gone through the majority of the answers here on SO where similar issues have been discussed, for example:

Chrome Driver and Chromium Binaries are not working on aws lambda

WebDriverException: Message: unknown error: Chrome failed to start: crashed error using ChromeDriver Chrome through Selenium Python on Amazon Linux

  4. Tried almost all combinations of the arguments that I'm passing to Chromedriver, e.g. with and without --disable-dev-shm-usage, with and without --disable-gpu, etc.

The only thing I noticed is that when I play with certain arguments, it sometimes throws selenium.common.exceptions.WebDriverException: Message: unknown error: unable to discover open window in chrome instead of the Chrome failed to start: exited abnormally error. As a last resort, I was thinking of compiling my own Chromium 86 build. Has anyone managed to get build 86 or higher running on AWS Lambda?
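
When ChromeDriver only reports "exited abnormally", it can help to start the browser binary directly and look at what it writes to stderr. The sketch below is one hypothetical way to do that from inside the handler; `probe_binary` is an illustrative helper, not part of Selenium or of the setup above. If a shared library is missing, the dynamic loader's message ("error while loading shared libraries: ...") would show up in the captured stderr.

```python
import subprocess
import sys


def probe_binary(path, args=("--version",), timeout=20):
    """Run a binary directly and return its exit code and captured output.

    A returncode of None means the process could not be started at all or
    did not finish within `timeout` seconds; the reason is put in "stderr".
    """
    try:
        completed = subprocess.run(
            [path, *args],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode output as text
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return {"returncode": None, "stdout": "", "stderr": repr(exc)}
    return {
        "returncode": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


if __name__ == "__main__":
    # Demonstration against the Python interpreter itself; in Lambda the
    # path would be the browser binary, e.g. "/opt/bin/chromium".
    print(probe_binary(sys.executable))
```

Logging the returned dictionary for `/opt/bin/chromium` before creating the driver gives the browser's own error text, which is usually more specific than the WebDriverException.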

Upvotes: 2

Views: 5394

Answers (1)

vboxer00

Reputation: 125

UPDATE 1/2/2022

I spent the last couple of days trying to figure out what could be wrong with my setup. Is it the code? The way I use Lambda layers? The binaries? The runtime environment? There were too many moving parts, and I didn't want to fall back to Chromium 6x (my last working setup) because it is ancient and lacks certain features I needed, such as features of the Chrome DevTools Protocol.

Then I stumbled across this repository, which shows how to run Selenium on Lambda from a container image:

https://github.com/umihico/docker-selenium-lambda

Basically, within a couple of minutes I was able to set up my container image linked to Lambda, and it's running:

  1. Python 3.9.8
  2. Chromium 96.0.4664.0
  3. Chromedriver 96.0.4664.45
  4. Selenium 4.1.0

Then I ported over my function code and, with a couple of changes, finally got it working. Here are my working args:

from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--disable-dev-tools')
chrome_options.add_argument('--remote-debugging-port=9222')
chrome_options.add_argument('--window-size=1280x1696')
# /tmp is the writable location in Lambda, so the profile lives there
chrome_options.add_argument('--user-data-dir=/tmp/chrome-user-data')
chrome_options.add_argument('--single-process')
chrome_options.add_argument("--no-zygote")
chrome_options.add_argument('--ignore-certificate-errors')
# Browser binary shipped inside the container image
chrome_options.binary_location = "/opt/chrome/chrome"

# First positional argument is the path to the chromedriver binary
driver = webdriver.Chrome("/opt/chromedriver", options=chrome_options)
driver.get('https://www.google.com/')

The main difference between this setup and a pure Lambda one is that here you deploy a container image (stored in Amazon ECR) instead of a zip package with layers, and you are not running headless-chrome or serverless-chrome; you are running the browser from the Chromium snapshot builds:

https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html
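
For completeness, here is a rough sketch of how the working arguments could be wrapped in a handler that always shuts the browser down. It is an illustration under assumptions, not the repository's code: `build_chrome_args` and the `url` event key are hypothetical, the paths are the ones from this answer, and the `Service` object is the Selenium 4 way of passing the driver path.

```python
CHROME_BINARY = "/opt/chrome/chrome"        # path from the answer above
CHROMEDRIVER_BINARY = "/opt/chromedriver"   # path from the answer above


def build_chrome_args(user_data_dir="/tmp/chrome-user-data"):
    """Return the Chrome flags listed in the answer as a plain list."""
    return [
        "--no-sandbox",
        "--headless",
        "--disable-gpu",
        "--disable-dev-shm-usage",
        "--disable-dev-tools",
        "--remote-debugging-port=9222",
        "--window-size=1280x1696",
        f"--user-data-dir={user_data_dir}",
        "--single-process",
        "--no-zygote",
        "--ignore-certificate-errors",
    ]


def lambda_handler(event, context):
    # Imported here so the module can be loaded (and the argument list
    # inspected) on machines where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    options.binary_location = CHROME_BINARY
    for arg in build_chrome_args():
        options.add_argument(arg)

    driver = webdriver.Chrome(service=Service(CHROMEDRIVER_BINARY), options=options)
    try:
        # "url" is a hypothetical event key used only for this example.
        driver.get(event.get("url", "https://www.google.com/"))
        return {"title": driver.title}
    finally:
        # Quit even on errors so no browser process lingers in a reused
        # execution environment.
        driver.quit()
```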

Upvotes: 3
