giulio di zio

Reputation: 301

Web scraping with selenium and python - xpath with contains text

I'll keep this short. I am trying to click on a product returned by a search on a website. The search yields a list of matching products, and I want to click the first one whose title contains the product name I searched for. Here is the link to the website so you can inspect its DOM structure: https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and In this case many results contain my query string, and I simply want to click the first one.

Here is the snippet of code I wrote for this:

def click_on_first_matching_product(self):
        first_product = WebDriverWait(self.driver, 6).until(
            EC.visibility_of_all_elements_located((By.XPATH, f"//a[@class='df-card__main']/div/div[@class=df-card__title] and contains(text(), '{self.product_code}')"))
        )[0]
        first_product.click()

The problem is that the 6 seconds elapse without finding any element that satisfies the XPath condition I wrote, and I can't figure out how to make it work. I am trying to get a search-result a element and check whether the title nested inside it contains the query string I searched for. Can I have some help and an explanation, please? I am quite new to Selenium and XPath.

Could I also have a link to reliable Selenium documentation? I am having a hard time finding a good one, ideally one that also explains how to write conditions in XPath.

Upvotes: 3

Views: 410

Answers (2)

undetected Selenium

Reputation: 193318

You need to consider a couple of things. Your use case is either to click on the first search result or to click on the item identified by its card title. Since you click only a single WebElement, inducing WebDriverWait for visibility_of_all_elements_located() is more expensive than necessary: it collects every matching element when you need just one.


To click on the item by its card title, induce WebDriverWait for element_to_be_clickable() and use one of the following Locator Strategies:

  • Using the text CE285A Toner Compatibile Per Hp LaserJet P1102 directly:

    driver.get('https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and')
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[text()='CE285A Toner Compatibile Per Hp LaserJet P1102']"))).click()
    
  • Using a variable for the text through format():

    driver.get('https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and')
    text = "CE285A Toner Compatibile Per Hp LaserJet P1102"
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[text()='{}']".format(text)))).click()
    
  • Using a variable for the text through %s:

    driver.get('https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and')
    text = "CE285A Toner Compatibile Per Hp LaserJet P1102"
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[text()='%s']"% str(text)))).click()
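
One caveat with the format() and %s variants above: they break as soon as the text itself contains a quote character, because XPath 1.0 string literals have no escape sequence. A minimal sketch of a workaround follows; the helper names xpath_literal and title_xpath are hypothetical, not part of Selenium.

```python
def xpath_literal(text: str) -> str:
    """Return `text` as an XPath 1.0 string literal (hypothetical helper)."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote kinds are present: split on single quotes and
    # rejoin the pieces with concat(), quoting each ' inside "...".
    pieces = []
    for i, part in enumerate(text.split("'")):
        if i > 0:
            pieces.append('"\'"')
        if part:
            pieces.append(f"'{part}'")
    return "concat(" + ", ".join(pieces) + ")"


def title_xpath(text: str) -> str:
    """Build the same //div[text()=...] locator as above, quote-safe."""
    return f"//div[text()={xpath_literal(text)}]"
```

The resulting string can be passed to By.XPATH exactly like the hard-coded locators, for example EC.element_to_be_clickable((By.XPATH, title_xpath(text))).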
    

To click on the first search result, induce WebDriverWait for element_to_be_clickable() and use either of the following Locator Strategies:

  • CSS_SELECTOR:

    driver.get('https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and')
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.df-card>a"))).click()
    
  • XPATH:

    driver.get('https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and')
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='df-card']/a"))).click()
    

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Upvotes: 1

KunduK

Reputation: 33384

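Your XPath is incorrect: the class value df-card__title is not quoted, and the contains() condition sits outside the square brackets, so the whole expression evaluates to a boolean instead of selecting elements. Try the following XPath to click on the product.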

driver.get("https://www.tonercartuccestampanti.it/#/dfclassic/query=CE285A&query_name=match_and")
def click_on_first_matching_product(product_code):
    first_product = WebDriverWait(driver, 6).until(EC.visibility_of_all_elements_located((By.XPATH,"//div[@class='df-card__title' and contains(text(), '{}')]".format(product_code))))[0]
    first_product.click()
click_on_first_matching_product("CE285A")
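
If you would rather click the a element that wraps the card than the title div, the same condition can be moved into a predicate on the anchor. This is only a sketch: it assumes the title div is nested inside a.df-card__main, as the XPath in the question suggests, and that product_code contains no single quote. The function names are hypothetical.

```python
def first_matching_card_xpath(product_code: str) -> str:
    """XPath for the first a.df-card__main whose title contains product_code.

    Assumes product_code contains no single quote.
    """
    title = f"div[@class='df-card__title' and contains(text(), '{product_code}')]"
    # (...)[1] keeps only the first match in document order.
    return f"(//a[@class='df-card__main'][.//{title}])[1]"


def click_first_matching_card(driver, product_code: str, timeout: int = 6) -> None:
    """Wait until the first matching card is clickable, then click it."""
    # Imported here so the XPath builder above works without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, first_matching_card_xpath(product_code))
    WebDriverWait(driver, timeout).until(EC.element_to_be_clickable(locator)).click()
```

With an existing driver that has already loaded the search page, the call would be click_first_matching_card(driver, "CE285A").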

Upvotes: 1
