Reputation: 25
I wrote a script that uses the selenium and pyautogui modules to login and scrape a value from an element and print it, but it's printing two dashes --
.
Here is the HTML that contains the value 417 which I want to retrieve:
<p id="totReqCountVal" class="trailer-0 avenir-regular font-size-4 text-green js-total-requests">417</p>
This is the relevant code I have tried:
from selenium import webdriver
from selenium.webdriver.common.by import By
browser.get('website_to_be_scraped')
browser.find_element(By.ID, 'totReqCountVal')
I then tried:
views = browser.find_element(By.ID, 'totReqCountVal')
print(views)
which returns:
(session="12e48df447f7df855a1ee596ba609a30", element="1027ec31-8cb8-4758-b4b0-82b85628ed6c")
With some help I have also tried the following:
Using CSS_SELECTOR and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p#totReqCountVal[class$='js-total-requests']"))).text)
Using XPATH and get_attribute("innerHTML"):
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//p[@id='totReqCountVal' and contains(@class, 'js-total-requests')]"))).get_attribute("innerHTML"))
added the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
I have checked through devtools if the locator strategies identifies the element uniquely, checked for iframes, and shadow root.
How do I retrieve the 417 value?
Upvotes: 1
Views: 1494
Reputation: 193058
views
is the WebElement which on printing rightly prints:
(session="12e48df447f7df855a1ee596ba609a30", element="1027ec31-8cb8-4758-b4b0-82b85628ed6c")
To print the text 417 you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p#totReqCountVal[class$='js-total-requests']"))).text)
Using XPATH and get_attribute("innerHTML")
:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//p[@id='totReqCountVal' and contains(@class, 'js-total-requests')]"))).get_attribute("innerHTML"))
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
Link to useful documentation:
get_attribute()
method Gets the given attribute or property of the element.
text
attribute returns The text of the element.
Upvotes: 1