Aakash Sharma

Reputation: 67

Unable to fetch HREF using beautiful soup

I am unable to fetch the link from the href attribute using Beautiful Soup.

I have provided the HTML structure below. I tried several ways of extracting it, but the code returns blank every time. Please advise.

<div class="review_list_pagination">
<p class="page_link review_next_page">
      <a href="/reviews/in/hotel/best-western-star-residency.html" 
           id="review_next_page_link">Next page </a>
 </p>
</div>

Tried

link = soup.find_all(attrs={"class": "page_link review_next_page"})

link = soup.find_all('p', attrs = {'class': 'page_link review_next_page'})

Result:

[<p class="page_link review_next_page"><a href="/reviews/in/hotel/best-western-star-residency.html?page=2&amp;" id="review_next_page_link">Next page</a></p>, 
<p class="page_link review_next_page"> <a href="/reviews/in/hotel/best western-star-residency.html?page=2&amp;" id="review_next_page_link">Next page</a></p>]

But print(link[0].get('href'))

Result: Blank

Expected: /reviews/in/hotel/best-western-star-residency.html?page=2&amp;

Upvotes: 2

Views: 161

Answers (3)

Adam Williamson

Reputation: 295

There are lots of different ways to tackle this one; I landed on the one below. Hope that helps.

link = soup.find("p",{"class":"page_link review_next_page"}).a['href']

Upvotes: 0

Jack Fleeting

Reputation: 24930

For the sake of future generations (:D), you can also use either of these:

soup3.select('a[id="review_next_page_link"]')[0]['href']

  #or

soup3.select_one('a[id="review_next_page_link"]')['href']

  #or

soup3.select('#review_next_page_link')[0]['href']

... and I'm sure there are more ways to do this. They all output:

'/reviews/in/hotel/best-western-star-residency.html'
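One thing worth guarding against when following "next page" links: select_one returns None when nothing matches, so indexing ['href'] on the result would raise a TypeError. Below is a minimal sketch of a defensive version. The helper name next_page_href and the assumption that the last page simply has no such link are mine, not something confirmed by the question.

```python
from bs4 import BeautifulSoup


def next_page_href(html):
    """Return the href of the next-page link, or None if the link is absent.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    anchor = soup.select_one("#review_next_page_link")
    # select_one gives None when no element matches the selector
    return anchor.get("href") if anchor is not None else None


# Markup taken from the question
with_link = (
    '<p class="page_link review_next_page">'
    '<a href="/reviews/in/hotel/best-western-star-residency.html" '
    'id="review_next_page_link">Next page </a></p>'
)
# Assumed shape of a page with no next-page link
without_link = '<div class="review_list_pagination"></div>'

found = next_page_href(with_link)
missing = next_page_href(without_link)
print(found)
print(missing)
```

A scraping loop could then stop as soon as the helper returns None.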

Upvotes: 0

nimishxotwod

Reputation: 335

Try the following:

link = soup.find('a', {"id": "review_next_page_link"})["href"]

What you are getting from the soup is a p tag. The href attribute belongs to the inner a tag, not to the p tag, so calling get('href') on the p tag returns None.

The line above finds the a tag with id="review_next_page_link" directly, so you can simply read its href value.
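To make the difference concrete, here is a small self-contained sketch built on the markup from the question. It assumes the built-in "html.parser" backend, and the variable names are only illustrative. It shows that get('href') on the p tag gives None, while stepping down to the a tag gives the link.

```python
from bs4 import BeautifulSoup

# Markup copied from the question
html = """
<div class="review_list_pagination">
<p class="page_link review_next_page">
      <a href="/reviews/in/hotel/best-western-star-residency.html"
           id="review_next_page_link">Next page </a>
 </p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Same search the question used: this returns the <p> tags
paragraphs = soup.find_all("p", attrs={"class": "page_link review_next_page"})

# The <p> tag has no href attribute of its own, so .get() returns None
p_href = paragraphs[0].get("href")

# Step down to the first <a> inside the <p> and read its href
a_href = paragraphs[0].a.get("href")

# With several matching <p> tags, collect the href from each inner <a>
hrefs = [p.a["href"] for p in paragraphs if p.a is not None]

print(p_href)
print(a_href)
```

The first print shows None, which is the "blank" result from the question; the second shows the path from the href attribute.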

Upvotes: 2
