Reputation: 99
I am trying to scrape https://www.autotrader.com/model-information and collect every make-model combination from the dropdown values (such as Audi-a4, Audi-a6, etc.).
I can click and select values in the first dropdown (car make) without a problem. But when I try to click and select a value in the second dropdown (model), a TimeoutException is raised.
Error screenshot: https://i.sstatic.net/mLM5h.png
It looks like the second dropdown cannot be clicked, so its option values never appear when using chromedriver. When I browse the website and click the first dropdown myself, the second dropdown becomes clickable immediately and all of its values show up.
I do not know how to fix this, since the URL, ID, and XPath do not change after choosing a value in the first dropdown.
It may have something to do with bot blocking. Maybe the website recognizes that I am a bot and blocks me from scraping.
Here is my code:
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Paste your chromedriver path as executable_path here
driver = webdriver.Chrome(executable_path='/opt/anaconda3/bin/chromedriver')
driver.maximize_window()
url = "https://www.autotrader.com/model-information"
driver.get(url)

makelist = []
modellist = []

# First, work with the dropdown menu of car makes
driver.find_element_by_xpath('//*[@id="makeCode"]').click()
make = driver.find_elements_by_xpath('//*[@id="makeCode"]/optgroup[@label="All Makes"]/option')

# Loop through all makes in the dropdown menu
for makeele in make:
    makelist.append(makeele.get_attribute('text'))
    makeele.click()
    time.sleep(3)

    # Work with the dropdown menu of car models for this make
    wait = WebDriverWait(driver, 20)
    wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="ModelCode"]'))).click()
    model = driver.find_elements_by_xpath('//*[@id="ModelCode"]/optgroup[2]/option')

    # Loop through all models in the dropdown menu
    for modelele in model:
        modellist.append(modelele.get_attribute('text'))
        modelele.click()
        time.sleep(3)
I appreciate any suggestion.
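For reference, the code above appends makes and models to two flat lists, which loses track of which model belongs to which make. A minimal sketch of the output I am after, assuming the scraped values are first gathered into a dict keyed by make (`make_model_pairs` and the sample data are hypothetical, not real scraped values):

```python
from typing import Dict, Iterable, List


def make_model_pairs(models_by_make: Dict[str, Iterable[str]]) -> List[str]:
    """Flatten {make: [models]} into "make-model" strings, keeping insertion order."""
    pairs = []
    for make, models in models_by_make.items():
        for model in models:
            # Strip stray whitespace that dropdown option text often carries
            pairs.append(f"{make.strip()}-{model.strip()}")
    return pairs


# Hypothetical sample data for illustration only
sample = {"Audi": ["A4", "A6"], "BMW": ["X5"]}
print(make_model_pairs(sample))
```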
Upvotes: 2
Views: 389
Reputation: 99
I figured out how to work around this problem.
The second dropdown cannot be clicked because the website detects that I am a scraper and blocks me. My approach is to attach Selenium to a real Chrome browser instance instead of one launched by chromedriver.
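import subprocess
import time

from fake_useragent import UserAgent  # assumed source of UserAgent; the original post omits its imports
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Start a real Chrome instance with remote debugging enabled
cmd1 = '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome'
cmd2 = '--remote-debugging-port=9222'
subprocess.Popen([cmd1, cmd2])
time.sleep(3)

# Generate a changing user agent to prevent blocking
options = Options()
ua = UserAgent()
userAgent = ua.random
print(userAgent)
options.add_argument(f'user-agent={userAgent}')
# Attach Selenium to the Chrome instance started above
options.add_experimental_option('debuggerAddress', '127.0.0.1:9222')
# Paste your chromedriver path as executable_path here
driver = webdriver.Chrome(options=options, executable_path='/opt/anaconda3/bin/chromedriver')
driver.maximize_window()
url = "https://www.autotrader.com/model-information"
driver.get(url)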
Other parts of the code remain the same.
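The fixed time.sleep(3) after launching Chrome could probably be replaced with a check that the debugging port is actually accepting connections. A minimal sketch, assuming the port 9222 used above; `wait_for_port` is a hypothetical helper, not part of Selenium:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, poll: float = 0.2) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=poll):
                return True
        except OSError:
            # Not listening yet (or refused); wait briefly and retry
            time.sleep(poll)
    return False
```

After subprocess.Popen(...), something like `if not wait_for_port('127.0.0.1', 9222): raise RuntimeError('Chrome did not open the debugging port')` would then fail fast instead of attaching to a browser that is not ready.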
Upvotes: 0
Reputation: 4212
I think the site keeps this dropdown disabled when it detects automation. So the problem is not your code, but the fact that you are opening the site as a robot.
I debugged it and the field was almost never enabled for me (perhaps only once), even after a car make was selected. It just remains disabled.
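One way to see whether the browser advertises itself as automated is to read the standard `navigator.webdriver` property from the page. This is only a rough sketch: `advertises_automation` is a hypothetical helper, and a site may rely on other signals, so a False result does not prove the browser goes undetected.

```python
def advertises_automation(driver) -> bool:
    """Return True if page JavaScript sees navigator.webdriver as truthy.

    `driver` is assumed to be a Selenium WebDriver (anything that offers
    an execute_script method works).
    """
    return bool(driver.execute_script("return navigator.webdriver"))
```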
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# Chrome options that hide the usual automation markers
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--disable-blink-features")
chrome_options.add_argument("--disable-blink-features=AutomationControlled")
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_options.add_experimental_option('useAutomationExtension', False)

driver = webdriver.Chrome(executable_path='/snap/bin/chromium.chromedriver', options=chrome_options)
driver.maximize_window()
url = "https://www.autotrader.com/model-information"
driver.get(url)
wait = WebDriverWait(driver, 15)

makelist = []
modellist = []

# Locate both dropdowns and their options
make_dd = driver.find_element_by_xpath('//*[@id="makeCode"]')
model_dd = driver.find_element_by_xpath('//*[@id="ModelCode"]')
makes = driver.find_elements_by_xpath('//*[@id="makeCode"]/optgroup[@label="All Makes"]/option')
models = driver.find_elements_by_xpath('//*[@id="ModelCode"]/optgroup[2]/option')

# Loop through all makes in the dropdown menu (left disabled; only the first make is selected below)
# for i in range(len(makes)):
# n = 1
wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="makeCode"]')))
make_dd.click()
wait.until(EC.element_to_be_clickable((By.XPATH, f'//*[@id="makeCode"]/optgroup[1]/option[{1}]')))
make = driver.find_element_by_xpath('//*[@id="makeCode"]/optgroup[1]/option[1]')
make.click()

# Wait for the model option group to appear, then open the model dropdown
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ModelCode>optgroup[Label='All Models']")))
model_dd.click()
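What this code does: it starts Chrome with the automation flags disabled, waits until the make dropdown is clickable, selects the first make, waits for the "All Models" option group to be present in the model dropdown, and then clicks the model dropdown.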
Also note that I removed every time.sleep() call and used explicit waits instead.
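As a further illustration of explicit waits, WebDriverWait.until accepts any callable that takes the driver and returns a truthy value once the condition holds. A minimal sketch of a custom condition, assuming the model options live under #ModelCode; `options_populated` is a hypothetical name, not a built-in expected condition:

```python
from typing import Callable, Tuple

Locator = Tuple[str, str]  # e.g. ("css selector", "#ModelCode option")


def options_populated(locator: Locator, minimum: int = 2) -> Callable:
    """Build a wait condition that passes once `locator` matches at least `minimum` elements."""
    def _condition(driver):
        elements = driver.find_elements(*locator)
        # Return the elements when enough are present, otherwise False so the wait keeps polling
        return elements if len(elements) >= minimum else False
    return _condition
```

It could then be used as `wait.until(options_populated(("css selector", "#ModelCode option")))`, which raises TimeoutException if the dropdown never fills, as happens when the site keeps it disabled.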
P.S. I suggest dividing big problems into smaller ones and solving them one at a time.
At first I did not want to answer after seeing that I was blocked, but I finally decided to add some suggestions.
Upvotes: 1