Reputation: 171
I want to scrape all the links on a web page that are <a> tags with class="author track". There are multiple links like this on the page, but when I run the program, my list is empty.
Example of one such element:
<a class="author track" href="/nileshkikuuchise" data-gaq="author" data-dmc="entry-artist">
<img class="avatar" src="https://ctl.s6img.com/cdn/s6-original-art-uploads/society6/uploads/u/nileshkikuuchise/avatar_asset/5323d6c4d92143e8b37f0fa644d7044f_p3.jpg" width="20" height="20" data-dmc="entry-photo">
Nileshkikuuchise </a>
My code:
discover_page = BeautifulSoup(r.text, 'html.parser')
finding_accounts = discover_page.find_all("a", "[class~=author track]")
print(finding_accounts)
and the output is an empty list.
How do I get the href values into a list? I can do the for loop later, but I need to get the basics correct first.
Upvotes: 2
Views: 54
Reputation: 7206
You seem to have mixed the CSS-selector style expected by select() with the argument style expected by find_all().
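To illustrate why the original call comes back empty, here is a minimal sketch. It assumes a trimmed-down version of the anchor from the question (the img child and data attributes are left out for brevity). As far as I understand it, a plain string passed as the second positional argument to find_all() is treated as a class name to match, not as a CSS selector.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the anchor from the question (assumption: extra attributes omitted).
html = '<a class="author track" href="/nileshkikuuchise">Nileshkikuuchise</a>'
soup = BeautifulSoup(html, "html.parser")

# The string is compared against the tag's class values, so this looks for a
# class literally named "[class~=author track]" and matches nothing.
as_class_name = soup.find_all("a", "[class~=author track]")

# Selector syntax belongs in select(); a.author.track means "an <a> with both classes".
as_selector = soup.select("a.author.track")

print(as_class_name)     # []
print(len(as_selector))  # 1
```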
These two methods work for me:
>>> from bs4 import BeautifulSoup
>>> r = '''
<a class="author track" href="/nileshkikuuchise" data-gaq="author" data-dmc="entry-artist">
<img class="avatar" src="https://ctl.s6img.com/cdn/s6-original-art-uploads/society6/uploads/u/nileshkikuuchise/avatar_asset/5323d6c4d92143e8b37f0fa644d7044f_p3.jpg" width="20" height="20" data-dmc="entry-photo">
Nileshkikuuchise </a>
'''
>>> discover_page = BeautifulSoup(r, 'html.parser')
>>> discover_page.find_all("a", class_="author track")
[<a class="author track" data-dmc="entry-artist" data-gaq="author" href="/nileshkikuuchise">
<img class="avatar" data-dmc="entry-photo" height="20" src="https://ctl.s6img.com/cdn/s6-original-art-uploads/society6/uploads/u/nileshkikuuchise/avatar_asset/5323d6c4d92143e8b37f0fa644d7044f_p3.jpg" width="20"/>
Nileshkikuuchise </a>]
>>> discover_page.select('a[class="author track"]')
[<a class="author track" data-dmc="entry-artist" data-gaq="author" href="/nileshkikuuchise">
<img class="avatar" data-dmc="entry-photo" height="20" src="https://ctl.s6img.com/cdn/s6-original-art-uploads/society6/uploads/u/nileshkikuuchise/avatar_asset/5323d6c4d92143e8b37f0fa644d7044f_p3.jpg" width="20"/>
Nileshkikuuchise </a>]
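To get the href values into a list, as the question asks, something along these lines should work. This is only a sketch: the second and third anchors in the sample HTML are made up for illustration, and the variable names are my own.

```python
from bs4 import BeautifulSoup

# Sample markup; only the first link comes from the question, the other two are hypothetical.
html = '''
<a class="author track" href="/nileshkikuuchise">Nileshkikuuchise</a>
<a class="author track" href="/another-artist">Another artist</a>
<a class="other" href="/not-an-author">Skip me</a>
'''
soup = BeautifulSoup(html, "html.parser")

# a.author.track matches <a> tags that carry both classes, in any order.
links = soup.select("a.author.track")

# Tag.get() returns None when the attribute is missing, so such tags are skipped.
hrefs = [a.get("href") for a in links if a.get("href")]

print(hrefs)  # ['/nileshkikuuchise', '/another-artist']
```

The hrefs here are relative paths, so they would still need to be joined with the site's base URL (for example with urllib.parse.urljoin) before being requested.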
Upvotes: 1