Reputation: 699
I am using Selenium's Chrome webdriver to scrape data from a website, but I don't know how to extract the value of an href attribute.
HTML:
<div class="buySearchResultContent">
<ul id="CARS_LIST_DATA">
<li class="seo_list" data-seo_name="440285">
<div class="buySearchResultContentImg">
<a href="carinfo-333285.php">
<img src="carpics/9400180056/290x200/20180305101502854_4567823.jpg" srcset="carpics/9400180056/290x200/20180305101502854_9098765.jpg 290w, carpics/9400180056/435x300/20180305101502854_00000.jpg 435w , carpics/9400180056/720x520/20180305101502854_00001.jpg 720w" sizes="(min-width: 992px) 75vw, 90vw" alt="auto">
</a>
My code:
driver = webdriver.Chrome("C:/chromedriver.exe")
url = "https://www.asdf.com.tw/price-02.php?v=3&brand=lisa&model=lulu&year1=2009&year2=2018&page=1"
driver.get(url)
content=driver.find_element_by_class_name('buySearchResultContentImg')
print(content)
What I want to extract is "carinfo-333285.php". Thanks!
Upvotes: 1
Views: 295
Reputation: 4739
I don't know much about Python, but please try this:
Jpg_href= driver.find_element_by_xpath("//div[@class='buySearchResultContentImg']/a[@href='carinfo-333285.php']").get_attribute("href")
Upvotes: 0
Reputation: 2003
Try the following code:
from selenium.common.exceptions import NoSuchElementException
try:
    a_element = driver.find_element_by_xpath('//div[contains(@class, "buySearchResultContentImg")]/a[@href]')
    link = a_element.get_attribute("href")
except NoSuchElementException:
    link = None
Upvotes: 4
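A note for readers on newer Selenium releases: the find_element_by_* helpers used in this thread were deprecated in Selenium 4 and later removed, so the same lookup is now written as find_element(by, value) or find_elements(by, value). Below is a minimal sketch of that style that collects the href of every matching link on the page. The function name collect_hrefs and the default selector are illustrative assumptions, not part of the original answers. The locator is passed as the plain string "css selector", which is the value of By.CSS_SELECTOR, so the sketch itself needs no Selenium import.

```python
# Locator strategy string; identical to selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"


def collect_hrefs(driver, selector="div.buySearchResultContentImg > a"):
    """Return the href of every element matching `selector`.

    `driver` is assumed to be a Selenium 4 WebDriver (or any object with a
    compatible find_elements(by, value) method). find_elements returns an
    empty list when nothing matches, so no exception handling is needed here.
    """
    links = driver.find_elements(CSS_SELECTOR, selector)
    return [link.get_attribute("href") for link in links]


# Hypothetical usage, assuming `driver` has already loaded the results page:
#     for href in collect_hrefs(driver):
#         print(href)
```

Using find_elements rather than find_element trades the NoSuchElementException for an empty list, which may be more convenient when a results page can legitimately have no cars listed.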
Reputation: 193308
As per the HTML you have provided, to extract the href attribute you can use either of the following locator strategies:
css_selector:
myHref = driver.find_element_by_css_selector("div.buySearchResultContentImg > a").get_attribute("href")
xpath:
myHref = driver.find_element_by_xpath("//div[@class='buySearchResultContentImg']/a").get_attribute("href")
Upvotes: 1
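One thing to keep in mind with all of the answers above: get_attribute("href") normally returns the link resolved against the page URL, not the literal text in the markup, so you would likely get something like "https://www.asdf.com.tw/carinfo-333285.php" rather than "carinfo-333285.php". If only the file name is wanted, a small sketch using the standard library could look like this. The helper name last_path_segment and the sample absolute URL are assumptions made for illustration.

```python
from posixpath import basename
from urllib.parse import urlsplit


def last_path_segment(url):
    """Return the final path component of a URL, ignoring query and fragment."""
    return basename(urlsplit(url).path)


# Assumed value of get_attribute("href") for the link in the question.
absolute = "https://www.asdf.com.tw/carinfo-333285.php"
print(last_path_segment(absolute))  # carinfo-333285.php
```

In Selenium 4 the WebElement method get_dom_attribute("href") is another option worth checking, since it is meant to return the attribute as written in the HTML.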