Reputation: 379
I am using the code below to run Chrome in --headless mode, but it does not execute correctly. The same code works fine in normal (non-headless) mode.
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def instagram_login():
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("--headless")
    driver = webdriver.Chrome("/home/chromedriver", options=chrome_options)
    driver.get('https://www.instagram.com/')
    driver.maximize_window()
    driver.implicitly_wait(20)
    # Fill in the login form
    form = driver.find_element_by_xpath("//*[@class='HmktE']")
    usrinput = form.find_element_by_name("username")
    usrinput.clear()
    usrinput.send_keys("xxxxxx")
    usrpwd = form.find_element_by_name("password")
    usrpwd.clear()
    usrpwd.send_keys("xxxxx")
    time.sleep(2)
    loginbt = form.find_elements_by_tag_name('button')
    loginbt[1].click()
    time.sleep(5)
    # Dismiss the two dialogs shown after login
    wait = WebDriverWait(driver, 10)
    wait.until(EC.element_to_be_clickable((By.XPATH, "/html/body/div[1]/section/main/div/div/div/div/button"))).click()
    time.sleep(2)
    wait = WebDriverWait(driver, 10)
    wait.until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Not Now']"))).click()
    return driver
Please find the error below:
Traceback (most recent call last):
  File "/home/Instagram/insta.py", line 539, in <module>
    (driver, postauth, hlist) = get_instalinks(x)
  File "/home//PycharmProjects(SEP)/Instagram/insta.py", line 76, in get_instalinks
    driver = instagram_login()
  File "/home/Instagram/insta_.py", line 56, in instagram_login
    wait.until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Not Now']"))).click()
  File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/support/wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Upvotes: 1
Views: 5256
Reputation: 737
The problem is your User-Agent. Some websites check your user agent to cut down on scrapers. If they notice anything suspicious, they limit (or fully block) your activity on the page. Compare the two user agents below: the first is what Chrome sends in normal mode, the second is what it sends in headless mode (note the HeadlessChrome token).
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/96.0.4664.45 Safari/537.36
Instagram recognizes the HeadlessChrome user agent and restricts access. Add the following Chrome option to get around this restriction:
chrome_options.add_argument("--user-agent=USER AGENT")
Replace "USER AGENT" above with the string shown at this link: My User Agent
Furthermore, as an additional precaution, I recommend following this article on how to make your scraper as undetectable as possible when browsing in headless mode.
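As a rough illustration of the user-agent fix, here is a minimal sketch that puts the pieces together. The helper names (regular_user_agent, make_headless_options) and the HEADLESS_UA constant are made up for this example, and the user agent string is simply the one quoted above; in practice you would use the string your own browser reports. Whether this alone gets past Instagram's checks is not guaranteed.

```python
# Example headless user agent, copied from the answer above (an assumption:
# your own Chrome version and platform will differ).
HEADLESS_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/96.0.4664.45 Safari/537.36"
)


def regular_user_agent(user_agent: str) -> str:
    """Replace the 'HeadlessChrome' token with 'Chrome' (hypothetical helper)."""
    return user_agent.replace("HeadlessChrome", "Chrome")


def make_headless_options(user_agent: str):
    """Build ChromeOptions for headless mode with an overridden user agent.

    Selenium is imported lazily so regular_user_agent() can be used and
    tested without Selenium installed.
    """
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    # Chrome's --user-agent switch overrides the default User-Agent header.
    options.add_argument(f"--user-agent={user_agent}")
    return options


if __name__ == "__main__":
    print(regular_user_agent(HEADLESS_UA))
```

The resulting options object would then be passed to the driver the same way as in the question, for example webdriver.Chrome("/home/chromedriver", options=make_headless_options(regular_user_agent(HEADLESS_UA))). You can check what the page actually sees by running driver.execute_script("return navigator.userAgent") after the driver starts.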
Upvotes: 4