BeautifulSoup .get not returning 'href'

Question

I am working on creating a web-scraping tool that will download articles to txt files. I have created the soup with bs4 and pulled out the specific piece of html that contains the desired url for the article I want to download:

>>> prevLink = soup2.select('.previous_post')
>>> prevLink
[Previous Post: An Interview With The Man Who Never Needed a Real Job]

So far so good (I think). Then I try to use .get('href') to pull out the link, but it returns None.

>>> print(prevLink[0].get('href'))
None

When I use .get('class') to select for the class, however, it seems to work.

>>> print(prevLink[0].get('class'))
['previous_post']

I don't understand why .get('class') is acting differently than .get('href'). Thanks for looking.

alecxe · Accepted Answer

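prevLink does not actually reference a link; it references the enclosing span element. The span has a class attribute but no href attribute, which is why .get('class') works and .get('href') returns None.

Go one level deeper with your selector, down to the a element: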

prevLink = soup2.select_one('.previous_post > a')
print(prevLink.get('href'))
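
To see the difference end to end, here is a minimal sketch. The HTML string and the example.com URL are made up for illustration, since the real page's markup is not shown in the question; only the previous_post class name and the soup2 variable come from the original post.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (assumption, not from the question).
html = (
    '<span class="previous_post">Previous Post: '
    '<a href="https://example.com/previous-article">An Interview</a></span>'
)

soup2 = BeautifulSoup(html, "html.parser")

# Selecting by class alone returns the span, which has no href attribute.
span = soup2.select_one(".previous_post")
span_href = span.get("href")    # missing attribute, so .get() returns None
span_class = span.get("class")  # class is multi-valued, so bs4 returns a list

# The child combinator reaches the a element that actually carries the href.
link = soup2.select_one(".previous_post > a")
link_href = link.get("href") if link is not None else None

print(span_href, span_class, link_href)
```

The None check on link is there because select_one returns None when nothing matches, which would otherwise raise an AttributeError on pages that have no previous post.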
