violetl

Reputation: 99

Python Selenium: cannot select dependent second dropdown after selecting the first dropdown

I am trying to scrape https://www.autotrader.com/model-information and collect every make-model combination from its dropdowns (such as Audi-a4, Audi-a6, etc.).

I can click the first dropdown (the car make dropdown) and select its values without a problem. But when I try to click the second dropdown (the model dropdown) and select a value, a TimeoutException is raised.

Screenshot of the error: https://i.sstatic.net/mLM5h.png

It looks like the second dropdown cannot be clicked, so its option values never show up when using chromedriver. When I browse the website and click the first dropdown myself, the second dropdown becomes clickable immediately and all of its values show up.

I do not know how to fix this, since nothing (URL, ID, XPath) changes after choosing a value in the first dropdown.

It may have something to do with anti-bot blocking. Maybe the website recognizes that I am a bot and blocks me from scraping.

Here is my code:

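import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

### paste chromedriver path as executable_path here
driver = webdriver.Chrome(executable_path='/opt/anaconda3/bin/chromedriver')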
driver.maximize_window()
url = "https://www.autotrader.com/model-information" 
driver.get(url)

makelist=[]
modellist=[]
#### First work with the drop down menus with car makes 
driver.find_element_by_xpath('//*[@id="makeCode"]').click()
make = driver.find_elements_by_xpath('//*[@id="makeCode"]/optgroup[@label="All Makes"]/option')
#### Loop through all makes in the drop down menus
for makeele in make:
    makelist.append(makeele.get_attribute('text'))
    makeele.click()
    time.sleep(3)
    
    #### Work with the drop down menus with car models of specific make
    wait = WebDriverWait(driver, 20)
    wait.until(EC.element_to_be_clickable((By.XPATH,'//*[@id="ModelCode"]'))).click()
    model = driver.find_elements_by_xpath('//*[@id="ModelCode"]/optgroup[2]/option')
    #### Loop through all models in the drop down menus
    for modelele in model:
        modellist.append(modelele.get_attribute('text'))
        modelele.click()
        time.sleep(3)

I would appreciate any suggestions.

Upvotes: 2

Views: 389

Answers (2)

violetl

Reputation: 99

I figured out how to solve this problem.

The second dropdown cannot be clicked because the website detects that I am a scraper and blocks me. My approach is to attach Selenium to a real Chrome browser started with remote debugging, instead of letting chromedriver launch an automated one.

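import subprocess
import time

from fake_useragent import UserAgent  # assumed source of UserAgent (fake-useragent package)
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

## Launch a real Chrome with remote debugging, then generate a changing userAgent to prevent blocking
cmd1  = '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome'
cmd2 = '--remote-debugging-port=9222'
subprocess.Popen([cmd1, cmd2])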
time.sleep(3)
options = Options()
ua = UserAgent()
userAgent = ua.random
print(userAgent)
options.add_argument(f'user-agent={userAgent}')
options.add_experimental_option('debuggerAddress', '127.0.0.1:9222')
driver = webdriver.Chrome(options = options, executable_path='/opt/anaconda3/bin/chromedriver')    ### paste chromedriver path as executable_path here
driver.maximize_window()
url = "https://www.autotrader.com/model-information" 
driver.get(url)

Other parts of the code remain the same.
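The fixed time.sleep(3) after launching Chrome could probably be replaced by polling the debugging port until it accepts connections. The following is only a minimal sketch using the standard library; wait_for_port is a hypothetical helper name, and the host and port are assumed to match the 127.0.0.1:9222 address used above.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.2):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet (or the attempt timed out); retry shortly.
            time.sleep(interval)
    return False


# Hypothetical usage right after subprocess.Popen([...]):
#     if not wait_for_port("127.0.0.1", 9222):
#         raise RuntimeError("Chrome did not open the debugging port in time")
```

This only confirms that the port is open, not that the page has loaded, so the explicit waits on the dropdowns are still needed afterwards.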

Upvotes: 0

vitaliis

Reputation: 4212

I think the site prevents this dropdown from being enabled. So the problem is not your code, but the fact that you are opening the site as a robot.

When I debugged it, this field was almost never enabled (perhaps only once), even after a car make was selected. It simply remains disabled.

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--disable-blink-features")
chrome_options.add_argument("--disable-blink-features=AutomationControlled")
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(executable_path='/snap/bin/chromium.chromedriver', options=chrome_options)

driver.maximize_window()
url = "https://www.autotrader.com/model-information"
driver.get(url)
wait = WebDriverWait(driver, 15)

makelist = []
modellist = []
#### First work with the drop down menus with car makes

make_dd = driver.find_element_by_xpath('//*[@id="makeCode"]')
model_dd = driver.find_element_by_xpath('//*[@id="ModelCode"]')
makes = driver.find_elements_by_xpath('//*[@id="makeCode"]/optgroup[@label="All Makes"]/option')
models = driver.find_elements_by_xpath('//*[@id="ModelCode"]/optgroup[2]/option')

#### Loop through all makes in the drop down menus
# for i in range(len(makes)):
#     n = 1

wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="makeCode"]')))
make_dd.click()


wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="makeCode"]/optgroup[1]/option[1]')))
make = driver.find_element_by_xpath('//*[@id="makeCode"]/optgroup[1]/option[1]')
make.click()

wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ModelCode>optgroup[Label='All Models']")))
model_dd.click()

What this code does:

  1. Clicks the first dropdown
  2. Selects the first option (Acura)
  3. Waits for the model dropdown and tries to click it. I could develop this further if someone finds a workaround for the bot detection.

Also note that I removed every time.sleep() call and used explicit waits instead.
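Since the model dropdown stays disabled when the site blocks the session, an explicit wait could check for exactly that state rather than for mere presence. Below is a minimal sketch of a custom wait condition; select_is_populated and wait_for_models are hypothetical names, the ModelCode ID is taken from the code above, and the assumption is that a usable dropdown is both enabled and contains at least one option.

```python
def select_is_populated(locator, min_options=1):
    """Build a wait condition: the <select> at `locator` is enabled and has options."""

    def _condition(driver):
        element = driver.find_element(*locator)
        if not element.is_enabled():
            return False  # still disabled, so WebDriverWait keeps polling
        # "tag name" is the string value behind By.TAG_NAME.
        options = element.find_elements("tag name", "option")
        # Return the element itself so WebDriverWait.until() hands it back.
        return element if len(options) >= min_options else False

    return _condition


def wait_for_models(driver, timeout=15):
    """Block until the model dropdown is usable, or raise TimeoutException."""
    # Imported lazily so the condition above can be exercised without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    return WebDriverWait(driver, timeout).until(
        select_is_populated((By.ID, "ModelCode"))
    )
```

If the site keeps the dropdown disabled, wait_for_models(driver) would still time out, but the timeout then points directly at the disabled state instead of at a failed click.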

P.S. I suggest dividing big problems into smaller ones and solving them one at a time.
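As an illustration of splitting the scrape into small steps, here is a rough sketch that keeps each make paired with its models. The function names are hypothetical, the XPaths are the ones from the question, and it only produces results if the site actually populates the model dropdown. The wait_ready hook is a placeholder for whatever explicit wait you use.

```python
# XPaths copied from the question; "xpath" is the string value behind By.XPATH.
MAKE_OPTIONS = '//*[@id="makeCode"]/optgroup[@label="All Makes"]/option'
MODEL_OPTIONS = '//*[@id="ModelCode"]/optgroup[2]/option'


def option_texts(driver, xpath):
    """Step 1: read the text of every <option> matched by `xpath`."""
    return [opt.get_attribute("text") for opt in driver.find_elements("xpath", xpath)]


def select_make(driver, index):
    """Step 2: click the make at `index`, re-locating it to avoid stale references."""
    option = driver.find_elements("xpath", MAKE_OPTIONS)[index]
    option.click()
    return option.get_attribute("text")


def collect_pairs(driver, wait_ready=lambda d: None):
    """Step 3: combine the steps into a list of (make, model) tuples."""
    pairs = []
    make_count = len(driver.find_elements("xpath", MAKE_OPTIONS))
    for index in range(make_count):
        make = select_make(driver, index)
        wait_ready(driver)  # hook for an explicit wait on the model dropdown
        pairs.extend((make, model) for model in option_texts(driver, MODEL_OPTIONS))
    return pairs
```

Each function can be tried on its own in a debugger session, which makes it easier to see which step the bot detection breaks.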

At first I did not want to answer after seeing that I was blocked, but I finally decided to add some suggestions.

Upvotes: 1
