cerebrou

Reputation: 5540

Get text inside the href link inside the span marker using Selenium

How can I extract the text that is displayed as part of the link inside the span tag?

<span class="pull-left w-100 font30 medium_blue_type mb10"><a href='/XLY'>XLY</a></span> <span class="w-100">Largest Allocation</span>

Output:

XLY

I've tried several approaches, including:

elems = driver.find_elements_by_class_name("span.pull-left.w-100.font30.medium_blue_type.mb10")
elems = driver.find_element_by_xpath('.//span[@class = "pull-left w-100 font30 medium_blue_type mb10"]')

but I can't get either of them to work. The website is https://www.etf.com/stock/TSLA.

EDIT: Is it possible to do this without opening a browser window, e.g. using the "headless" option?

op = webdriver.ChromeOptions()
op.add_argument('headless')
driver = webdriver.Chrome(CHROME_DRIVER_PATH, options=op)

Upvotes: 0

Views: 973

Answers (3)

KunduK

Reputation: 33384

Use the following XPath to locate the link.

//div[./span[text()='Largest Allocation']]//a

The element may not be available immediately after the page loads, so you need to wait for it. Use WebDriverWait() and wait until the element is visible.

To get the text:

print(WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.XPATH, "//div[./span[text()='Largest Allocation']]//a"))).text)

To get the href:

print(WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.XPATH, "//div[./span[text()='Largest Allocation']]//a"))).get_attribute("href"))

You need the following imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
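
To see what this XPath selects without opening a browser, here is a minimal offline sketch that runs a similar query against the snippet from the question, using Python's standard-library xml.etree.ElementTree. The enclosing body and div wrappers are assumptions (the real page structure may differ). ElementTree supports only a subset of XPath, so the predicate is written as div[span='Largest Allocation'] instead of using text().

```python
import xml.etree.ElementTree as ET

# Snippet from the question; the <body>/<div> wrappers are assumed for illustration.
html = (
    "<body><div>"
    "<span class=\"pull-left w-100 font30 medium_blue_type mb10\">"
    "<a href='/XLY'>XLY</a></span> "
    "<span class=\"w-100\">Largest Allocation</span>"
    "</div></body>"
)
root = ET.fromstring(html)

# ElementTree counterpart of //div[./span[text()='Largest Allocation']]//a
link = root.find(".//div[span='Largest Allocation']//a")

print(link.text)         # XLY
print(link.get("href"))  # /XLY
```

Note that in Selenium, get_attribute("href") usually returns the resolved absolute URL rather than the raw attribute value shown here.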

Upvotes: 0

cruisepandey

Reputation: 29362

If you prefer a text-based locator, you can use the following:

//span[text()='Largest Allocation']/../span

  1. Click the "I Understand" button on the cookie banner first.
  2. Use explicit waits.

So the complete code would be:

driver = webdriver.Chrome(driver_path)
driver.maximize_window()
wait = WebDriverWait(driver, 30)

driver.get("https://www.etf.com/stock/TSLA")

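try:
    # Dismiss the cookie banner if it appears
    wait.until(EC.element_to_be_clickable((By.LINK_TEXT, "I Understand"))).click()
    print("Clicked on I understand button")
except Exception:
    # The banner did not show up within the timeout; continue without it
    pass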

txt = wait.until(EC.visibility_of_element_located((By.XPATH, "//span[text()='Largest Allocation']/../span"))).text
print(txt)

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Output:

Clicked on I understand button
XLY

Process finished with exit code 0

If you want a locator that is not based on text, use the following line:

txt = wait.until(EC.visibility_of_element_located((By.XPATH, "(//span[contains(@class,'medium_blue_type')]//a)[2]"))).text
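
For intuition about what the explicit waits above do, here is a simplified, browser-free sketch of the polling idea: call a condition repeatedly until it returns a truthy value or a timeout expires. The names wait_until and fake_lookup are hypothetical, and this is not Selenium's implementation; WebDriverWait additionally deals with driver-specific exceptions.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() every `poll` seconds until it is truthy or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Hypothetical stand-in for a page whose element only appears on the third lookup.
attempts = {"count": 0}


def fake_lookup():
    attempts["count"] += 1
    return "XLY" if attempts["count"] >= 3 else None


text = wait_until(fake_lookup, timeout=2.0, poll=0.1)
print(text)
```

The point of polling is that the script continues as soon as the element is ready, instead of always sleeping for a fixed time.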

Upvotes: 1

Prophet

Reputation: 33361

There are several possible problems here:

  1. You may be missing a wait (delay).
  2. The locator you are using may not be unique.
  3. You still need to extract a value (the text or an attribute) from the returned web element.
  4. The web element may be inside an iframe, etc.

Based on the information shared so far, you can try adding a wait and extracting the value from the web element as follows:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 20)

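# Use single quotes inside the XPath string, and target the <a> child, which is the element that carries the href
href = wait.until(EC.visibility_of_element_located((By.XPATH, "//span[@class='pull-left w-100 font30 medium_blue_type mb10']/a"))).get_attribute("href")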
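
As a quick offline check of where the href attribute lives, the sketch below parses the snippet from the question with the standard-library xml.etree.ElementTree. Using ElementTree is an assumption made so that no browser is needed, and the div wrapper is added only for illustration. With this markup the attribute sits on the a child, so a locator meant to read href should point at the a element and not at the span.

```python
import xml.etree.ElementTree as ET

# Snippet from the question; the <div> wrapper is assumed for illustration.
snippet = (
    "<div>"
    "<span class='pull-left w-100 font30 medium_blue_type mb10'>"
    "<a href='/XLY'>XLY</a></span>"
    "</div>"
)
root = ET.fromstring(snippet)

span = root.find(".//span[@class='pull-left w-100 font30 medium_blue_type mb10']")
anchor = span.find("a")

print(span.get("href"))    # None: the span itself has no href attribute
print(anchor.get("href"))  # /XLY
```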

Upvotes: 0
