Reputation: 31
I'm running a web app on Azure from a Docker container based on a Selenium image (selenium/standalone-chrome:latest). It ran fine until, after an unrelated change to the data-handling code (separate from my scraper), it started failing with the following error: "Unable to discover proper chromedriver version in offline mode".
The weird thing is that my API is still running fine online; I can send GET and POST requests, and from my logs I can see they're received and handled properly right up until the Chrome driver is instantiated, which is where it fails.
The error occurs here during the instantiation of the driver:
# import chromedriver_binary
from datetime import date

from selenium.webdriver import Chrome, ChromeOptions


def _GetDriver() -> Chrome:
    options = ChromeOptions()
    options.add_argument("--headless")
    options.add_argument('--disable-gpu')
    options.add_argument('--no-sandbox')
    return Chrome(options=options)  # <--- Error happens here.


def _EnrichAtomicAddress(info: dict) -> dict:
    with _GetDriver() as driver:  # <--- Only place _GetDriver is called.
        data = XXXXXX(driver, info)
        data['lastScrapedDate'] = date.today()
        data['retrievalDate'] = date.today()
        if 'errorMessage' in data:
            return data
        data.update(XXXXX(driver, data))
        return data
My Dockerfile:
FROM selenium/standalone-chrome:latest
LABEL authors="Robert"
# Set the working directory to /app
WORKDIR /app
# Copy the current directory contents into the container at /app
COPY . /app
# Install any needed packages specified in requirements.txt
RUN sudo apt-get update && sudo apt-get install -y python3 python3-pip
RUN sudo pip install --no-cache-dir -r requirements.txt
# Ports
EXPOSE 443
EXPOSE 80
# Define environment variable
ENV FLASK_APP function_app.py
# Run the Flask app
# CMD ["flask", "run", "--host=0.0.0.0"]
CMD ["flask", "run"]
# ENTRYPOINT ["top", "-b"]
I've tried:
yet to no avail.
What is causing this and how can I fix this? Thanks in advance! <3
Upvotes: 3
Views: 1127
Reputation: 1
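According to the image source, the docker-selenium images put Selenium Manager into offline mode by default. In offline mode it will not go online to look up or download a matching chromedriver, which is what the error message is complaining about.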
Variable:
SE_OFFLINE="true" \
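One way to work around this without touching the image defaults might be to hand Selenium the chromedriver path yourself, so it does not need to discover one. The following is only a minimal sketch: `resolve_chromedriver`, `is_offline` and the `CHROMEDRIVER_PATH` variable are hypothetical names made up for illustration, and it assumes a chromedriver binary is already on the container's `PATH` (I believe the standalone-chrome image ships one, but check yours).

```python
import os
import shutil


def resolve_chromedriver(env=None):
    """Return a chromedriver path, or None if none can be found.

    CHROMEDRIVER_PATH is a hypothetical override variable used only in
    this sketch; it is not something Selenium itself reads.
    """
    env = os.environ if env is None else env
    explicit = env.get("CHROMEDRIVER_PATH")
    if explicit and os.path.isfile(explicit):
        return explicit
    # Fall back to searching the directories listed in PATH.
    return shutil.which("chromedriver", path=env.get("PATH"))


def is_offline(env=None):
    """Report whether SE_OFFLINE is set to "true" in the environment."""
    env = os.environ if env is None else env
    return env.get("SE_OFFLINE", "").strip().lower() == "true"


def get_driver():
    # Imported lazily so the helpers above work without selenium installed.
    from selenium.webdriver import Chrome, ChromeOptions
    from selenium.webdriver.chrome.service import Service

    options = ChromeOptions()
    options.add_argument("--headless")
    options.add_argument("--disable-gpu")
    options.add_argument("--no-sandbox")

    driver_path = resolve_chromedriver()
    if driver_path is None:
        raise RuntimeError(
            "No chromedriver found on PATH (SE_OFFLINE=%s)" % is_offline()
        )
    # An explicit executable_path means the driver does not have to be discovered.
    return Chrome(service=Service(executable_path=driver_path), options=options)
```

With this, `_GetDriver()` in the question could return `get_driver()` instead of calling `Chrome(options=options)` directly. Alternatively, overriding the variable in your own Dockerfile (for example `ENV SE_OFFLINE=false`) should allow the online lookup again, though I have not verified that on Azure.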
Upvotes: 0