Reputation: 90
I am trying to scrape all href attributes that match ^/album$. When I print out my result, I get an empty list. I have tried find() and findAll() with re.compile and re.search, but I am unable to get anything other than an empty list.
Code:
vk_urls = soup.find_all('a')
vk_albums = soup.findAll(text='^/album$')
print(vk_albums)
Result:
[]
Desired Result:
/album...
/album...
/album...
Upvotes: 0
Views: 67
Reputation: 76
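You need to use href= instead of text= (or string=, its newer name in Beautiful Soup 4) to filter by the content of the href attribute. The text and string arguments search the strings inside tags, not their attributes. Note also that text='^/album$' is a plain string, so it is compared literally rather than treated as a regular expression.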
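To make the difference concrete, here is a minimal sketch on a made-up HTML snippet (the markup and variable names are assumptions, not the page from the question). It compares filtering on a tag's text with filtering on its href attribute:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<a href="/album123">Holiday photos</a>
<a href="/album456">/album</a>
<a href="/video789">Clips</a>
"""
soup = BeautifulSoup(html, "html.parser")

# string= compares against the text inside each tag, not its attributes.
by_text = soup.find_all("a", string="/album")

# href= compares against the value of the href attribute.
by_href = soup.find_all("a", href=re.compile("^/album"))

text_hrefs = [a["href"] for a in by_text]
attr_hrefs = [a["href"] for a in by_href]
print(text_hrefs)
print(attr_hrefs)
```

Only the link whose visible text is exactly /album passes the string= filter, while the href= filter picks up both album links regardless of their text.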
To find all anchor tags with an href attribute that starts with /album, you need to do the following:
import re
# Keep only <a> tags whose href starts with /album
vk_albums = soup.find_all("a", href=re.compile("^/album"))
print(vk_albums)
You can then loop through this list to print just the href attributes:
for album in vk_albums:
    print(album['href'])
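A note on the pattern itself: the question's ^/album$ ends with $, which anchors the end of the string, so as a regular expression it would only match an href that is exactly /album. Dropping the $ matches anything that starts with /album. Beautiful Soup applies a compiled pattern through its search() method, so the behaviour can be checked with the standard re module alone. The paths below are made up for illustration:

```python
import re

# Hypothetical href values, not taken from the real page.
hrefs = ["/album", "/album123_456", "/albums/list", "/video/album"]

exact = re.compile(r"^/album$")   # whole string must be "/album"
prefix = re.compile(r"^/album")   # string only has to start with "/album"

exact_matches = [h for h in hrefs if exact.search(h)]
prefix_matches = [h for h in hrefs if prefix.search(h)]

print(exact_matches)
print(prefix_matches)
```

With these sample values, the anchored pattern keeps only /album, while the prefix pattern keeps every path beginning with /album and still rejects /video/album.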
Upvotes: 1