Reputation: 87
I am trying to extract the text "Margareta Osborn" from the HTML below using Python with Selenium, but I keep getting blank values when I print. I have also tried get_attribute and still get blank values.
<div class="author-info hidden-md">
By (author)
<span itemprop="author" itemtype="http://schema.org/Person" itemscope="Margareta Osborn">
<a href="/author/Margareta-Osborn" itemprop="url">
<span itemprop="name">
Margareta Osborn</span>
</a>
</span>
</div>
Below is my Python code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
# Raw string so the backslashes in the Windows path are not treated as escapes
PATH = r"C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://www.bookdepository.com/")
keyword = "9781925324402"
Search = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, '//*[@id="book-search-form"]/div[1]/input[1]'))
)
Search.clear()
Search.send_keys(keyword)
Search.send_keys(Keys.RETURN)
try:
    authors = driver.find_element_by_xpath("//div[@class='author-info hidden-md']/span/a/span").text
    print(authors)
    driver.quit()
except Exception:
    authors = "Not Available"
    print(authors)
    driver.quit()
Upvotes: 1
Views: 1072
Reputation: 29382
You need to read the .text property, which the Selenium Python bindings expose on every web element:
authors = driver.find_element_by_xpath("//div[@class='author-info hidden-md']/span/a/span").text
print(authors)
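or, to read the inner HTML of the author link (this returns the nested span markup as well as the name):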
authors = driver.find_element_by_xpath("//a[contains(@href,'/author/Margareta-Osborn')]").get_attribute('innerHTML')
print(authors)
Update 1: load the product page directly and wait for the author element to become visible.
driver.maximize_window()
wait = WebDriverWait(driver, 30)
driver.get("https://www.bookdepository.com/Rose-River-Margareta-Osborn/9781925324402")
authors = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.author-info.hidden-md span[itemprop='author'] span"))).text
print(authors)
Upvotes: 1
Reputation: 33384
To get the value from the span, use WebDriverWait() to wait for visibility_of_element_located() with the following CSS selector, then read either .text or .get_attribute("textContent"):
driver.get('https://www.bookdepository.com/Rose-River-Margareta-Osborn/9781925324402')
print(WebDriverWait(driver,5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.author-info.hidden-md [ itemprop="author"]'))).text)
print(WebDriverWait(driver,5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.author-info.hidden-md [ itemprop="author"]'))).get_attribute("textContent"))
You need the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
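The choice between .text and textContent matters because .text only returns text that is rendered as visible, while textContent is read straight from the DOM. The hidden-md class on the div suggests the author block may be hidden at some window sizes, which would explain the blank values, though that is an assumption about the site's CSS. A minimal sketch of a fallback helper follows; read_text is a hypothetical name, and it accepts any object that has a .text property and a get_attribute() method, as a Selenium WebElement does.

```python
def read_text(element):
    """Return the element's text, falling back to textContent if .text is empty."""
    # .text only returns text that is rendered as visible on the page.
    visible = (element.text or "").strip()
    if visible:
        return visible
    # textContent comes from the DOM, so it also works for hidden elements.
    # It keeps the whitespace from the page source, hence the strip().
    raw = element.get_attribute("textContent") or ""
    return raw.strip()
```

It would be called on the element returned by the wait, for example read_text(WebDriverWait(driver, 5).until(...)), using presence_of_element_located instead of visibility_of_element_located if the element may be hidden.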
Upvotes: 0
Reputation: 174
You may be missing .text when reading the value. Without it, find_element returns a WebElement object, and printing that shows only an element reference rather than the text.
Using .text -
# XPath locator for the author name span
element = "//span[@itemprop='name']"
# Find the element with the driver and read its text
author = driver.find_element_by_xpath(element).text
# Print the text
print(author)
Using JavaScriptExecutor -
driver.execute_script('return arguments[0].innerText;', driver.find_element_by_xpath(element))
Using Get Attribute -
driver.find_element_by_xpath(element).get_attribute('innerText')
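Note that the find_element_by_xpath helper used in these snippets was removed in Selenium 4.3; newer versions take the locator strategy as the first argument of find_element. A minimal sketch of the same lookup in that style follows. get_author and AUTHOR_XPATH are hypothetical names, and the strategy is passed as the string "xpath" (the value of By.XPATH) only to keep the sketch free of imports.

```python
AUTHOR_XPATH = "//span[@itemprop='name']"


def get_author(driver, xpath=AUTHOR_XPATH):
    """Look up the author span with the Selenium 4 locator style and return its text."""
    # In real code, prefer: from selenium.webdriver.common.by import By
    # and pass By.XPATH, which is the plain string "xpath".
    element = driver.find_element("xpath", xpath)
    # Strip the newline that precedes the name in the page source.
    return element.text.strip()
```

With a live session this would be called as print(get_author(driver)) once the product page has loaded.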
Upvotes: 0