Reputation: 964
I am working on a Selenium project in which I am trying to scrape a particular element from a website. The element has no class or ID associated with it, so I am stuck on how to extract that detail.
This is the website
On the website, if you look at the HTML markup for the specifications, there is a div with the contents <b>Form</b>: Liquid. I want to extract the 'Liquid'.
This is my code so far:
def extract():
    form_element = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//b[text()='Form']/")))
    form_text = form_element.text
    return form_text
This results in a TimeoutException. I am not sure what I am doing wrong.
PS: I was able to click the "Show more" button on the page with Selenium to display the specifications area. In case you are wondering, that is not the problem.
Upvotes: 0
Views: 1587
Reputation: 978
When locating elements, an ID is the best choice because it is unique. If the element has no ID, you can fall back on the class name, an XPath, or the link text.
Use this XPath:
//*[contains(text(),'Liquid')]
Upvotes: 1
Reputation: 5075
You can do that by creating the driver with driver = webdriver.Chrome() (assuming you are using Chrome and have its WebDriver installed) and then, on the next line, calling driver.find_element_by_tag_name("h1") (for example, if you wanted the h1 element) and working with the element it returns. I hope I understood your question correctly.
Upvotes: 0
Reputation: 33384
To get the value Liquid, you need to click the "Show more" button first and then wait for visibility_of_element_located() on the element. You can use either of the following approaches to get the value.
Using split()
driver.get("https://www.target.com/p/hawaiian-punch-fruit-juicy-red-1-gal-bottle/-/A-13051948")
WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//button[@data-test='toggleContentButton' and contains(.,'Show more')]"))).click()
print(WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.XPATH,"//div[./b[text()='Form:']]"))).text.split("Form:")[-1])
Using the JavaScript executor
driver.get("https://www.target.com/p/hawaiian-punch-fruit-juicy-red-1-gal-bottle/-/A-13051948")
WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//button[@data-test='toggleContentButton' and contains(.,'Show more')]"))).click()
print(driver.execute_script('return arguments[0].lastChild.textContent;', WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.XPATH,"//div[./b[text()='Form:']]")))))
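Both approaches work because the value is a bare text node that follows the <b> label inside the div. As a rough offline sketch of that structure (no browser involved), the snippet below parses a hypothetical fragment shaped like the markup in the question with the standard library's ElementTree. The markup string and the two helper names are assumptions made for illustration; they are not taken from the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup mirroring the structure described in the question.
SNIPPET = "<div><b>Form:</b> Liquid</div>"


def value_from_full_text(div, label="Form:"):
    """Mimic the split() approach: take the div's whole text and drop the label."""
    full_text = "".join(div.itertext())  # label text plus the trailing value
    return full_text.split(label)[-1].strip()


def value_from_last_text_node(div):
    """Mimic lastChild.textContent: read the text node that trails the <b> element."""
    bold = div.find("b")
    return (bold.tail or "").strip()


div = ET.fromstring(SNIPPET)
print(value_from_full_text(div))
print(value_from_last_text_node(div))
```

Note the strip() calls: the raw value starts with the space that follows the label, so the Selenium versions above would likely benefit from a .strip() on the result as well.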
Upvotes: 0
Reputation: 7563
Get the parent div of the element you want using this XPath:
//b[text()='Form:']//parent::div
To grab the text, it seems you have to use .get_attribute('innerHTML') instead of .text.
Try the following code:
def extract():
    form_element = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//b[text()='Form:']//parent::div")))
    form_text = form_element.get_attribute('innerHTML').split("</b>",1)[1]
    return form_text
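The string handling in that last step can be pulled out and checked without a browser. The following is a minimal sketch, assuming the innerHTML looks like "<b>Form:</b> Liquid"; text_after_bold is a hypothetical helper name, not part of Selenium. It also trims whitespace and decodes HTML entities, which the one-line split() above does not do.

```python
import html


def text_after_bold(inner_html):
    """Return the text that follows the first closing </b> tag, unescaped and trimmed."""
    _, sep, tail = inner_html.partition("</b>")
    if not sep:
        # No label tag at all, so there is nothing meaningful to return.
        raise ValueError("no </b> tag found in innerHTML")
    return html.unescape(tail).strip()


print(text_after_bold("<b>Form:</b> Liquid"))
```

Inside extract(), this could be applied as text_after_bold(form_element.get_attribute('innerHTML')).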
Upvotes: 1