Reputation: 435
I am trying to get the href from a link. Here is my code:
url ='http://money.finance.sina.com.cn/bond/notice/sz149412.html'
link = driver.find_element_by_xpath("//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']").get_attribute('href')
print(link)
Error:
invalid selector: Unable to locate an element with the xpath expression
SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//div[@class='blk01'])//ul/li[3]//a[contains(text(),'发行信息']' is not a valid XPath expression.
It seems the XPath is not valid, but I cannot figure out the error. Any help will be appreciated!
Thanks
Upvotes: 1
Views: 363
Reputation: 28595
Is there any particular reason to use Selenium here? The link is present in the HTML source, so it would be more efficient to use requests and beautifulsoup.
import requests
from bs4 import BeautifulSoup
url = 'http://money.finance.sina.com.cn/bond/notice/sz149412.html'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
a_tag = soup.select_one('a:contains("发行信息")')
#a_tag = soup.select_one('a:-soup-contains("发行信息")') # <- use this form instead on newer bs4 versions, where :contains is deprecated and may warn or throw an error
link = a_tag['href']
print(link)
Output:
http://money.finance.sina.com.cn/bond/issue/sz149412.html
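If installing third-party packages is not an option, the same text-based lookup can be sketched with the standard library's html.parser instead of beautifulsoup. This is only a rough sketch under stated assumptions: the class name LinkByTextFinder and the SAMPLE_HTML snippet are made up for illustration, and the first two hrefs in the snippet are invented stand-ins, not the page's real markup.

```python
from html.parser import HTMLParser


class LinkByTextFinder(HTMLParser):
    """Collect the href of every <a> whose text contains a given substring."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.matches = []        # hrefs of matching anchors, in document order
        self._in_anchor = False  # True while we are between <a> and </a>
        self._href = None        # href of the currently open <a>, if any
        self._text = []          # text fragments seen inside the open <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_anchor:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            if self._href and self.needle in "".join(self._text):
                self.matches.append(self._href)
            self._in_anchor = False
            self._href = None


# Hypothetical sample markup (an assumption, not copied from the real page).
SAMPLE_HTML = """
<div class="blk01">
  <ul>
    <li><a class="blue" href="/example/first.html">行情</a></li>
    <li><a class="blue" href="/example/second.html">基本资料</a></li>
    <li><a class="blue" href="http://money.finance.sina.com.cn/bond/issue/sz149412.html">发行信息</a></li>
  </ul>
</div>
"""

finder = LinkByTextFinder("发行信息")
finder.feed(SAMPLE_HTML)
finder.close()
link = finder.matches[0] if finder.matches else None
print(link)
```

For the real page, the text of response.text from the requests call above could be passed to feed() in place of SAMPLE_HTML. beautifulsoup remains the more convenient choice when it is available.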
Upvotes: 0
Reputation: 4212
//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']
does not seem to be a stable XPath, and it is also not valid syntax: there is a stray ) after [@class='blk01'], and the closing ) of contains(...) is missing before the final ]. This is the main problem.
Try this first:
find_element_by_xpath('//div[@class="blk01"]//ul//li[3]//a[contains(text(),"发行信息")]')
If it works, try just:
find_element_by_xpath('//a[contains(text(),"发行信息")]')
The goal is to make the XPath as short as possible.
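Since the error in the question is a syntax error, a quick delimiter check can help locate it before the expression ever reaches the browser. The following is a minimal sketch in plain Python; the helper name first_unbalanced is hypothetical, and it is only a rough sanity check for ( ) and [ ], not an XPath parser.

```python
def first_unbalanced(expr):
    """Return (index, char) of the first delimiter problem, or None if balanced.

    Only () and [] are tracked, and delimiters inside quoted strings are
    ignored. This is a rough sanity check, not a real XPath parser.
    """
    closers = {")": "(", "]": "["}
    stack = []    # (index, opening char) for every delimiter still open
    quote = None  # the quote character we are currently inside, if any
    for i, ch in enumerate(expr):
        if quote:
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch in "([":
            stack.append((i, ch))
        elif ch in closers:
            # A closer with nothing open, or with the wrong opener on top.
            if not stack or stack[-1][1] != closers[ch]:
                return i, ch
            stack.pop()
    # Anything left on the stack was opened but never closed.
    return stack[-1] if stack else None


broken = "//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']"
fixed = "//div[@class='blk01']//ul//li[3]//a[contains(text(),'发行信息')]"

print(first_unbalanced(broken))  # (21, ')')
print(first_unbalanced(fixed))   # None
```

The check stops at the first problem, so after removing the stray ) it has to be run again to catch the missing ) of contains(...).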
Upvotes: 0
Reputation: 663
# Importing necessary modules
from seleniumwire import webdriver
from webdriver_manager.chrome import ChromeDriverManager
import time
# WebDriver Chrome
driver = webdriver.Chrome(ChromeDriverManager().install())
# Target URL
url = 'http://money.finance.sina.com.cn/bond/notice/sz149412.html'
driver.get(url)
time.sleep(5)
link = driver.find_element_by_xpath('//*[@class="blue" and contains(text(),"发行信息")]').get_attribute('href')
print(link)
Upvotes: 0
Reputation: 9969
//a[contains(text(),'发行信息')]
Even this shorter XPath would work:
link = driver.find_element_by_xpath("//a[contains(text(),'发行信息')]")
print(link.get_attribute("href"))
Upvotes: 2
Reputation: 1957
Try this instead:
link = driver.find_element_by_xpath('//div[@class="blk01"]//ul//li[3]//a[contains(text(), "发行信息")]')
print(link.get_attribute("href"))
Upvotes: 1