SIM

Reputation: 22440

Scraper clicking on the same link cyclically?

I've written a script in Python using Selenium to scrape the names and prices of products from the Redmart website. My goal is to click each of the 10 category links at the top of the main page and parse all the products on the page each link leads to. After a category is clicked, the browser is on the newly opened page, so it has to go back to the main page before it can click the next category link. Instead, my scraper clicks a link, goes to its target page, parses the data there, returns to the main page, and then clicks the same link again, repeating this over and over. Here is the script I'm using:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://redmart.com/bakery")
wait = WebDriverWait(driver, 10)

while True:
    try:
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "li.image-facets-pill")))
        driver.find_element_by_css_selector('img.image-facets-pill-image').click()          
    except:
        break

    for elems in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "li.productPreview"))):
        name = elems.find_element_by_css_selector('h4[title] a').text
        price = elems.find_element_by_css_selector('span[class^="ProductPrice__"]').text
        print(name, price)

    driver.back()

driver.quit()   

By the way, I think the try/except block in this script needs to be adjusted to get the desired output.

Upvotes: 1

Views: 776

Answers (1)

Andersson

Reputation: 52665

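The single-element lookup in your script always returns the first matching category, so the loop clicks the same link on every pass. You can use a simple counter to step through the list of categories instead, as below: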

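# Reuses the imports, driver, and wait from the question's script.
counter = 0

while True:
    try:
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "li.image-facets-pill")))
        # Look up the category images again on every pass and click the one at the current index.
        driver.find_elements(By.CSS_SELECTOR, 'img.image-facets-pill-image')[counter].click()
        counter += 1
    except IndexError:
        # No category at this index, so every link has been visited.
        break

    for elems in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "li.productPreview"))):
        name = elems.find_element(By.CSS_SELECTOR, 'h4[title] a').text
        price = elems.find_element(By.CSS_SELECTOR, 'span[class^="ProductPrice__"]').text
        print(name, price)

    # Go back to the main page before clicking the next category.
    driver.back()

driver.quit()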
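
To make the counter pattern easier to see on its own, here is a minimal sketch with the browser calls factored out. The names visit_each, find_links, and handle are hypothetical helpers introduced only for this illustration, and the category names in the demo are made-up placeholders, not values taken from the site.

```python
def visit_each(find_links, handle):
    """Call handle() once per link, looking the links up again before each call.

    find_links: callable returning the current list of links
                (hypothetical stand-in for a fresh element lookup).
    handle:     callable that processes one link (click, parse, go back).
    Returns the number of links handled.
    """
    index = 0
    while True:
        links = find_links()      # fresh lookup on every pass
        if index >= len(links):   # nothing left at this index: done
            break
        handle(links[index])
        index += 1
    return index


if __name__ == "__main__":
    # Stand-in data so the sketch runs without a browser.
    categories = ["bread", "cakes", "pastries"]
    visited = []
    visit_each(lambda: list(categories), visited.append)
    print(visited)  # ['bread', 'cakes', 'pastries']

    # With Selenium, assuming the driver from the question, the two callables
    # might look like this (parse_products is a hypothetical function):
    #   find_links = lambda: driver.find_elements(By.CSS_SELECTOR, "img.image-facets-pill-image")
    #   def handle(pill):
    #       pill.click()
    #       parse_products()
    #       driver.back()
```

Checking the index against the list length replaces the IndexError handling, so the loop ends without relying on an exception. Whether that reads better than the try/except version is a matter of taste.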

Upvotes: 1
