echo

Reputation: 813

Python Selenium web driver with chrome driver gets detected

I assumed that the Chrome session opened by Selenium would behave the same as my local Google Chrome installation. But when I search on this website, even if I only open it with Selenium and then drive the search manually, I get an error message, whereas the same search works fine in regular Chrome with my own profile or in an incognito window. Whenever I look into this issue, I find answers saying that mouse movements or click patterns give automation away. That cannot be the cause here, because I controlled the browser manually after opening it. Something in the request itself seems to give it away. Is there any way to overcome that? The website in question is: https://www.avnet.com/wps/portal/us

The error message shown in the automated session:

Upvotes: 4

Views: 8326

Answers (2)

undetected Selenium

Reputation: 193298

Regarding the website in question, https://www.avnet.com/wps/portal/us, I am not sure what exact issue you are facing; your code block would have given us more leads on what is going wrong. However, I am able to access the URL just fine:

Code Block :

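from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
# Selenium 4 API: the driver path is passed via Service and the options via options=
service = Service(executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver = webdriver.Chrome(service=service, options=options)
driver.get('https://www.avnet.com/wps/portal/us')
# Print the page title to confirm that the page loaded
print("Page Title is : %s" % driver.title)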

Console Output :

Page Title is : Avnet: Quality Electronic Components & Services

Snapshot :

Avnet: Quality Electronic Components & Services


Update

I took another look at the issue you are facing. I read through the entire HTML DOM and found no traces of bot-detection mechanisms. Had any been implemented, the website would not even have allowed you to traverse/scrape the DOM tree to find the search box.

On debugging further, these are my observations:

  • Through your automated script you can successfully proceed up to sending the search text to the search box.

searchbox

  • When you search for a valid product manually, the auto-suggestions are displayed through a <span> tag, as in the HTML below, and you can click any of them to browse to the specific product.

  • Auto Suggestions :

span-yes-parts

  • SPAN tag HTML :

<span id="auto-suggest-parts-dspl">
    <p class="heading">Recommended Parts</p>
    <dl class="suggestion">
        <div id="list_1" onmouseover="hoverColor(this)" onmouseout="hoverColorOut(this)" class="">
            <div class="autosuggestBox">
                <a href="/shop/us/products/aimtec/am8tw-4805dz-3074457345627076774/?categoryId=&amp;fromPage=autoSuggest" rel="nofollow" id="autosuggest_1" class="autosuggest_link" onkeydown="scrollDown(event,this)">AM8TW-4805DZ</a>
                <p class="desc1">Aimtec</p>
                <p class="desc2">Module DC-DC 2-OUT 5V/-5V 0.8A/-0.8A 8W 9-Pin DIP Tube</p>
            </div>
        </div>

  • This <span> is not generated when the search is driven through WebDriver.

spanTagNotTraggered

  • If you force a search in the absence of the auto-suggestions, it results in the error page.

Conclusion

The main issue seems to lie either with the form-control class or with the scrollDown(event,this) function attached to the onkeydown event.
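To check the observation about the missing auto-suggestions programmatically, one option is to look at the page HTML after typing and count the suggestion links inside the span. The following is only a rough sketch using the standard library: the id auto-suggest-parts-dspl and the class autosuggest_link come from the HTML above, while the names AutoSuggestProbe and count_auto_suggestions are made up for this illustration.

```python
from html.parser import HTMLParser


class AutoSuggestProbe(HTMLParser):
    """Counts suggestion links that appear inside the auto-suggest <span>."""

    def __init__(self, span_id="auto-suggest-parts-dspl", link_class="autosuggest_link"):
        super().__init__()
        self.span_id = span_id
        self.link_class = link_class
        self.depth = 0        # greater than 0 while we are inside the target <span>
        self.link_count = 0   # number of suggestion links seen inside it

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span":
            if self.depth > 0:
                self.depth += 1            # nested <span> inside the target
            elif attrs.get("id") == self.span_id:
                self.depth = 1             # entering the target <span>
        elif tag == "a" and self.depth > 0:
            classes = (attrs.get("class") or "").split()
            if self.link_class in classes:
                self.link_count += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth > 0:
            self.depth -= 1


def count_auto_suggestions(page_html):
    """Return how many auto-suggest links the given page HTML contains."""
    probe = AutoSuggestProbe()
    probe.feed(page_html)
    probe.close()
    return probe.link_count
```

In a live session you could call count_auto_suggestions(driver.page_source) a moment after sending the search text; a result of 0 under WebDriver, compared with a non-zero count for the same text typed in a regular browser, would match the behaviour described above.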

Upvotes: 5

James W.

Reputation: 3065

#TooLongForComment

To reproduce this issue

from random import randint
from time import sleep
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions()
options.add_argument('--disable-infobars')
options.add_argument('--disable-extensions')
options.add_argument('--profile-directory=Default')
options.add_argument('--incognito')
options.add_argument('--disable-plugins-discovery')
options.add_argument('--start-maximized')
# Selenium 4 API: the driver path is passed via Service and the options via options=
browser = webdriver.Chrome(service=Service(executable_path='./chromedriver'), options=options)
browser.get('https://www.avnet.com/wps/portal/us')

try:
    search_box_id = 'searchInput'
    # Wait up to 10 seconds for the search box; until() returns the located element
    elem = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.ID, search_box_id)))
    sleep(randint(1, 5))
    s = 'CA51-SM'
    for c in s:  # type one character at a time with a random pause between key presses
        elem.send_keys(c)
        sleep(randint(1, 3))
    elem.send_keys(Keys.RETURN)
except TimeoutException:
    pass  # the search box never appeared, so there is nothing to type into
finally:
    browser.quit()  # close the window and stop the chromedriver process

reproduced

I used hexedit to change the chromedriver key from $cdc_ to fff..

hexedit
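The hexedit step can also be scripted. This is a minimal sketch, assuming the marker is the literal byte string $cdc_ mentioned above. The replacement $fff_, the function names patch_marker and patch_file, and the file paths are placeholders of mine, not something the answer specifies. The replacement is kept the same length as the marker so that nothing in the binary shifts, and the script writes a patched copy rather than touching the original.

```python
def patch_marker(data, marker=b"$cdc_", replacement=b"$fff_"):
    """Return (patched_bytes, hit_count) with every marker occurrence overwritten."""
    # A same-length replacement keeps every other byte at its original offset.
    if len(marker) != len(replacement):
        raise ValueError("replacement must have the same length as marker")
    return data.replace(marker, replacement), data.count(marker)


def patch_file(src_path, dst_path, marker=b"$cdc_", replacement=b"$fff_"):
    """Write a patched copy of src_path to dst_path and return the hit count."""
    with open(src_path, "rb") as src:
        data = src.read()
    patched, hits = patch_marker(data, marker, replacement)
    with open(dst_path, "wb") as dst:
        dst.write(patched)
    return hits
```

For example, patch_file('./chromedriver', './chromedriver_patched') returns how many occurrences were replaced; a result of 0 would mean the marker is not present in that build. On Linux or macOS the copy still needs its execute permission set before Selenium can launch it.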

  • Investigate how the detection is done by reading every JavaScript block; look at this answer for a detection example.

  • Try adding an extension that modifies headers, and masquerade as Googlebot by changing the user-agent and referrer: options.add_extension('/path-to-modify-header-extension')
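For the first suggestion, reading every JavaScript block by hand gets tedious, so a crude filter can narrow down where to look. The sketch below is only an assumption about what might be worth searching for: the hint list is my own guess at strings commonly tied to automation checks ($cdc_ is the chromedriver key mentioned above, navigator.webdriver is the standard WebDriver flag), and find_detection_hints is a made-up helper name. It only sees inline scripts; external .js files would have to be downloaded and scanned separately.

```python
import re

# Hypothetical watch list; extend it with whatever looks suspicious on the target site.
DETECTION_HINTS = ("navigator.webdriver", "$cdc_", "__webdriver", "callPhantom", "_phantom")

# Captures the body of each inline <script>...</script> block.
SCRIPT_RE = re.compile(r"<script\b[^>]*>(.*?)</script>", re.IGNORECASE | re.DOTALL)


def find_detection_hints(page_html, hints=DETECTION_HINTS):
    """Map each hint to the indices of the inline script blocks that contain it."""
    found = {}
    for index, body in enumerate(SCRIPT_RE.findall(page_html)):
        for hint in hints:
            if hint in body:
                found.setdefault(hint, []).append(index)
    return found
```

Calling find_detection_hints(browser.page_source) gives a dictionary of hint to script-block indices, which tells you which blocks to read first. An empty result does not prove there is no detection, since the check could live in an external or obfuscated script.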

Upvotes: 3
