Joyce

Reputation: 435

Python Selenium get attribute 'href' error

I am trying to get the href from a link. Here is my code:

url ='http://money.finance.sina.com.cn/bond/notice/sz149412.html'
link = driver.find_element_by_xpath("//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']").get_attribute('href')
print(link)

Error:

 invalid selector: Unable to locate an element with the xpath expression 
SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//div[@class='blk01'])//ul/li[3]//a[contains(text(),'发行信息']' is not a valid XPath expression.

It seems the XPath is not valid, but I cannot figure out what is wrong with it. Any help will be appreciated!

Thanks

Upvotes: 1

Views: 363

Answers (5)

chitown88

Reputation: 28595

Is there a particular reason to use Selenium here? The link is present in the HTML source, so it would be more efficient to use requests and BeautifulSoup.

import requests
from bs4 import BeautifulSoup

url = 'http://money.finance.sina.com.cn/bond/notice/sz149412.html'
response = requests.get(url)

soup = BeautifulSoup(response.text, 'html.parser')


a_tag = soup.select_one('a:contains("发行信息")')
#a_tag = soup.select_one('a:-soup-contains("发行信息")') # <- depending on which version of bs4 you have, the line above may throw an error since :contains is deprecated

link = a_tag['href']

Output:

print(link)
http://money.finance.sina.com.cn/bond/issue/sz149412.html
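If installing bs4 is not an option, the same idea (read the href straight out of the static HTML) can be sketched with only the standard library. This is a rough illustration, not a drop-in replacement: LinkFinder is a hypothetical helper name, and sample_html is a made-up fragment standing in for the real page, whose markup may differ.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collect the href of every <a> whose text contains a given substring."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.hrefs = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text chunks seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href = dict(attrs).get('href')
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            if self.needle in ''.join(self._text):
                self.hrefs.append(self._href)
            self._href = None


# Hypothetical fragment for illustration only; not copied from the real page.
sample_html = '''
<div class="blk01"><ul>
  <li><a href="/bond/quotes/sz149412.html">行情</a></li>
  <li><a href="/bond/issue/sz149412.html">发行信息</a></li>
</ul></div>
'''

finder = LinkFinder('发行信息')
finder.feed(sample_html)
print(finder.hrefs)
```

For the real page you would feed response.text instead of sample_html.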

Upvotes: 0

vitaliis

Reputation: 4212

//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']

does not seem to be a stable XPath, and it is also malformed: there is a stray ) after [@class='blk01'], and contains(...) is closed with ] instead of )]. This is the main problem.

Try this first:

find_element_by_xpath('//div[@class="blk01"]//ul//li[3]//a[contains(text(),"发行信息")]')

If it works, try just:

find_element_by_xpath('//a[contains(text(),"发行信息")]')

The goal is to make xpath as short as possible.
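A quick way to catch this kind of mistake before handing the expression to Selenium is to check that its brackets pair up. Below is a minimal sketch in plain Python. find_bracket_error is a hypothetical helper (not part of Selenium), and it only checks bracket and quote pairing, not full XPath syntax.

```python
def find_bracket_error(xpath):
    """Return a message describing the first bracket mismatch, or None if balanced."""
    pairs = {')': '(', ']': '['}
    stack = []    # open brackets seen so far, with their positions
    quote = None  # the quote character we are currently inside, if any
    for pos, ch in enumerate(xpath):
        if quote:
            # Inside a string literal: ignore everything until the closing quote.
            if ch == quote:
                quote = None
            continue
        if ch in ('"', "'"):
            quote = ch
        elif ch in '([':
            stack.append((ch, pos))
        elif ch in pairs:
            if not stack or stack[-1][0] != pairs[ch]:
                return f"unexpected {ch!r} at position {pos}"
            stack.pop()
    if quote:
        return f"unterminated {quote} quote"
    if stack:
        ch, pos = stack[-1]
        return f"unclosed {ch!r} opened at position {pos}"
    return None


broken = "//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']"
fixed = "//div[@class='blk01']//ul//li[3]//a[contains(text(),'发行信息')]"

print(find_bracket_error(broken))
print(find_bracket_error(fixed))
```

For the expression from the question it reports the stray ) that follows the first predicate; for the corrected expression it returns None.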

Upvotes: 0

Adil kasbaoui

Reputation: 663

# Importing necessary modules
from seleniumwire import webdriver
from webdriver_manager.chrome import ChromeDriverManager
import time

# WebDriver Chrome
driver = webdriver.Chrome(ChromeDriverManager().install())

# Target URL
url = 'http://money.finance.sina.com.cn/bond/notice/sz149412.html'
driver.get(url)
time.sleep(5)
link = driver.find_element_by_xpath('//*[@class="blue" and contains(text(),"发行信息")]').get_attribute('href')
print(link)

Upvotes: 0

Arundeep Chohan

Reputation: 9969

//a[contains(text(),'发行信息')]

Even this would work.

print(link.get_attribute("href"))
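To make that concrete, here is a small sketch that wraps the lookup in a function. Newer Selenium 4 releases no longer provide find_element_by_xpath, so it uses driver.find_element instead. get_href is a hypothetical helper name, and driver is assumed to be an already started WebDriver that has loaded the page.

```python
def get_href(driver, xpath):
    """Return the href attribute of the first element matching the XPath."""
    # In Selenium 4 the usual spelling is driver.find_element(By.XPATH, xpath),
    # with By imported from selenium.webdriver.common.by. By.XPATH is just the
    # string "xpath", which is used directly here to keep the sketch import-free.
    element = driver.find_element("xpath", xpath)
    return element.get_attribute("href")
```

Usage would then be print(get_href(driver, "//a[contains(text(),'发行信息')]")).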

Upvotes: 2

George Imerlishvili

Reputation: 1957

Try this instead:

link = driver.find_element_by_xpath('//div[@class="blk01"]//ul//li[3]//a[contains(text(), "发行信息")]')
print(link.get_attribute("href"))


Upvotes: 1
