gkasap

Reputation: 21

Selenium (Python) returns values for elements that do not exist

I am scraping a supermarket site (https://www.sklavenitis.gr/eidi-artozacharoplasteioy/psomi-typopoiimeno/). I keep the product URLs in a file, read them, and iterate through them. However, Selenium returns values for some elements that do not exist on the page, e.g.:

The second link is a product on sale, and its price is read correctly. The third or fourth link, however, is not discounted, yet the script still prints the value from the second link. Whenever a product is discounted everything is fine, but a product that is not discounted gets the previous discounted value. Should I close the driver in every iteration and reopen it? The posts I have seen do not do that, which is why I am asking.

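from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

# user_agent and links are defined elsewhere (links is read from the file).
delay = 10  # seconds to wait for the product title
options = webdriver.ChromeOptions()
options.add_argument(f'user-agent={user_agent}')
options.add_argument("start-maximized")
options.add_argument('--no-sandbox')
options.add_argument('--disable-infobars')
options.add_argument("--headless")
options.add_argument('--disable-dev-shm-usage')
options.add_experimental_option('useAutomationExtension', False)
options.add_argument('--disable-blink-features=AutomationControlled')
s = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=s, options=options)
for link in links:
    driver.get(link)
    # Wait until the product title is present before reading the price.
    myElem = WebDriverWait(driver, delay).until(
        EC.presence_of_element_located((By.XPATH, "//h1[@class='product-detail__title']")))
    left = driver.find_element(by=By.XPATH, value="//div[@class='product-detail__left']")
    print(left.find_element(by=By.XPATH, value="//div[@class='deleted__price']").text.replace(",", "."))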

Edit #1: Sorry for the confusion. I want to scrape the price before the discount and the price after the discount (if there is no discount, just print the current price). My problem is with the items that are not discounted: Selenium somehow reports a discount for them. That is why I added the product-detail__left class, to make the lookup stricter, but it didn't work.

Upvotes: 0

Views: 42

Answers (1)

undetected Selenium

Reputation: 193308

To extract the prices of all the items with Selenium in a single line of code, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR and text attribute:

    driver.get('https://www.sklavenitis.gr/eidi-artozacharoplasteioy/psomi-typopoiimeno/')
    WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.nvcookies__right button.nvcookies__button.nvcookies__button--primary.consent-give"))).click()
    print([my_elem.text for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.price")))])
    
  • Using XPATH and text attribute:

    driver.get('https://www.sklavenitis.gr/eidi-artozacharoplasteioy/psomi-typopoiimeno/')
    WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.nvcookies__right button.nvcookies__button.nvcookies__button--primary.consent-give"))).click()
    print([my_elem.text for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='price']")))])
    
  • Console Output:

    ['1,29 €/τεμ.', '1,45 €/τεμ.', '0,78 €/τεμ.', '1,82 €/τεμ.', '1,82 €/τεμ.', '1,57 €/τεμ.', '2,24 €/τεμ.', '2,62 €/τεμ.', '2,38 €/τεμ.', '1,32 €/τεμ.', '1,88 €/τεμ.', '2,24 €/τεμ.', '2,08 €/τεμ.', '1,18 €/τεμ.', '1,67 €/τεμ.', '2,08 €/τεμ.', '0,78 €/τεμ.', '2,57 €/τεμ.', '1,52 €/τεμ.', '1,62 €/τεμ.', '2,28 €/τεμ.', '2,05 €/τεμ.', '2,38 €/τεμ.', '2,40 €/τεμ.']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
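
For the per-product pages in the question, one likely cause of the stale value is the nested lookup: an XPath that starts with `//` searches the whole document even when it is called on an element, so it can match a `deleted__price` outside `product-detail__left`. Below is a minimal sketch of a scoped lookup, not a tested scraper for this site. It assumes the current price on a product page sits in a `div` with class `price` (that class is taken from the listing page above), and `parse_price` and `read_prices` are hypothetical helper names. It passes the plain string "xpath", which is the value of `By.XPATH`, so the snippet itself does not need to import Selenium.

```python
import re

XPATH = "xpath"  # same string as selenium.webdriver.common.by.By.XPATH


def parse_price(text):
    """Turn text such as '1,29 €/τεμ.' into 1.29; return None if no number is found.

    Assumes a decimal comma and no thousands separator.
    """
    match = re.search(r"\d+(?:[.,]\d+)?", text)
    if match is None:
        return None
    return float(match.group().replace(",", "."))


def read_prices(driver):
    """Return (price_before_discount, current_price) for the loaded product page.

    price_before_discount is None when the product is not discounted.
    """
    panel = driver.find_element(XPATH, "//div[@class='product-detail__left']")
    # The leading '.' keeps the search inside `panel`;
    # a bare '//' would search the whole page again.
    deleted = panel.find_elements(XPATH, ".//div[@class='deleted__price']")
    # Assumption: the current price uses the 'price' class on product pages too.
    current = panel.find_elements(XPATH, ".//div[@class='price']")
    # find_elements returns an empty list instead of raising when nothing matches.
    before = parse_price(deleted[0].text) if deleted else None
    now = parse_price(current[0].text) if current else None
    return before, now
```

Inside the loop from the question you would call `before, now = read_prices(driver)` after the wait on the title. With this approach there should be no need to close and reopen the driver on every iteration.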
    

Upvotes: 1
