Reputation: 61
I am currently using Python and Selenium to scrape data, export it to a CSV, and then manipulate it as needed. I am having trouble grasping how to build XPath expressions to access specific text elements on a dynamically generated page.
https://dutchie.com/embedded-menu/revolutionary-clinics-somerville/menu
From the above page I would like to export the category (not part of each product, but a parent element) followed by all the text fields associated with a product card.
The following statement lets me pull all the titles (sort of) under the "Flower" category, but from there I cannot access all the child text elements within each product, only an odd variation of the title. The XPath approach seems ideal because it lets me pull this data without having to scroll the page with key presses or JavaScript.
products = driver.find_elements_by_xpath("//div[text()='Flower']/following-sibling::div/div")
for product in products:
    print("Flower", product.text)
What would I add to the above statement to pull the full set of elements that contain text for all children within the 'consumer-product-card__InViewContainer', for each category (Flower, Pre-Rolls, and so on)? I experimented last night with different paths, nodes, and predicates, building off the above code, to try to access this information, but ultimately failed.
Also, is there a way for me to test or visualize "where I am" in terms of the scope of a given XPath statement?
Thank you in advance!
Upvotes: 0
Views: 102
Reputation: 1836
I have tried some code for you. Please take a look and let me know if it resolves your problem.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 60)
driver.get('https://dutchie.com/embedded-menu/revolutionary-clinics-somerville/menu')

# Wait (up to 60 seconds) until every category heading is visible.
all_headings = wait.until(
    EC.visibility_of_all_elements_located(
        (By.XPATH, '//div[contains(@class,"products-grid__ProductGroupTitle")]')))

for heading in all_headings:
    # Scroll the heading into view before reading its text.
    driver.execute_script("return arguments[0].scrollIntoView(true);", heading)
    print("------------- " + heading.text + " -------------")
    # Step up to the heading's parent, then select the product cards beneath it.
    # find_elements(By.XPATH, ...) replaces the older find_elements_by_xpath call.
    product_cards = heading.find_elements(By.XPATH, "./../div/div")
    for card in product_cards:
        driver.execute_script("return arguments[0].scrollIntoView(true);", card)
        print(card.text)
The code above prints each category heading, followed by the text of every product card under it.
I hope this is what you are looking for. If it solves your query, please mark it as the answer.
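Since the question also asks about exporting to CSV and about checking what an XPath is scoped to, here is a minimal sketch of both. It assumes that a card's `.text` comes back as its visible child text separated by newlines, which is typical for Selenium but not guaranteed for every layout. The helper names `card_to_row`, `describe`, and `write_rows` are hypothetical, and the sample card text is made-up placeholder data so the sketch runs without a browser.

```python
import csv
import io


def card_to_row(category, card_text):
    """Build one CSV row: the category, then each non-empty line of the card's text."""
    fields = [line.strip() for line in card_text.splitlines() if line.strip()]
    return [category] + fields


def describe(element, limit=120):
    """Summarise what an XPath matched: tag name, class attribute, start of outerHTML.

    `element` is expected to behave like a Selenium WebElement
    (a `tag_name` attribute and a `get_attribute` method).
    """
    html = element.get_attribute("outerHTML") or ""
    css_class = element.get_attribute("class") or ""
    return f"<{element.tag_name} class='{css_class}'> {html[:limit]}"


def write_rows(rows, handle):
    """Write rows (which may differ in length) to an open text file or buffer."""
    csv.writer(handle).writerows(rows)


# Placeholder card text standing in for `card.text`; the values are invented.
sample_text = "Product A\nHybrid\n1/8 oz\n\n$0.00"
buffer = io.StringIO()
write_rows([card_to_row("Flower", sample_text)], buffer)
print(buffer.getvalue())
```

In the loop above, you could collect `card_to_row(heading.text, card.text)` for each card and pass the list to `write_rows` with a file opened using `newline=""`. Printing `describe(card)` for each match is one way to see "where you are" for a given XPath. Another is to evaluate the expression in the Chrome DevTools console with `$x("...")` and inspect the returned nodes.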
Upvotes: 1