Reputation: 427
The HTML looks like this:
<p class="rating item-rating">
<picture>
<source srcset="/assets/img/ratings/rating-4_5.svg" type="image/svg+xml"/>
<img src="/assets/img/ratings/rating-4_5.png"/>
</picture>
<span>
260
</span>
</p>
And I would like to get
/assets/img/ratings/rating-4_5.png
How should I improve the following code?
img = soup.findAll('p', attrs={'class': 'rating item-rating'})
for i in img:
    print(i.picture)
Upvotes: 0
Views: 1134
Reputation: 4744
You can get the src value of the img tag like this:
from bs4 import BeautifulSoup
r = """<p class="rating item-rating">
<picture>
<source srcset="/assets/img/ratings/rating-4_5.svg" type="image/svg+xml"/>
<img src="/assets/img/ratings/rating-4_5.png"/>
</picture>
<span>
260
</span>
</p>"""
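# Name the parser explicitly so the result does not depend on which parsers are installed
source = BeautifulSoup(r, 'html.parser')
img = source.find_all('p', attrs={'class': 'rating item-rating'})
for parsing in img:
    # .img finds the first <img> descendant of the <p>; ['src'] reads its attribute
    print(parsing.img['src'])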
Upvotes: 1
Reputation:
You need to get to the img tag, since that holds the information you want in its src attribute.
from bs4 import BeautifulSoup
s = '''<p class="rating item-rating">
<picture>
<source srcset="/assets/img/ratings/rating-4_5.svg" type="image/svg+xml"/>
<img src="/assets/img/ratings/rating-4_5.png"/>
</picture>
<span>
260
</span>
</p>'''
soup = BeautifulSoup(s, 'html.parser')
for p in soup.select('p.rating'):
    print(p.picture.img['src'])
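If some of the rating paragraphs on the real page might lack a picture or an img tag, the attribute chain above would raise an error on those. A minimal sketch of a more tolerant variant, assuming the page structure matches the snippet in the question: it selects the img directly with a CSS selector and only matches tags that actually have a src attribute. The function name rating_image_urls and the variable html are made up for illustration.

```python
from bs4 import BeautifulSoup

html = '''<p class="rating item-rating">
<picture>
<source srcset="/assets/img/ratings/rating-4_5.svg" type="image/svg+xml"/>
<img src="/assets/img/ratings/rating-4_5.png"/>
</picture>
<span>
260
</span>
</p>'''


def rating_image_urls(markup):
    """Return the src of every <img> nested in a rating paragraph's <picture>."""
    soup = BeautifulSoup(markup, 'html.parser')
    # 'img[src]' only matches <img> tags that carry a src attribute,
    # so paragraphs without an image simply contribute nothing.
    selector = 'p.rating.item-rating picture img[src]'
    return [img['src'] for img in soup.select(selector)]


print(rating_image_urls(html))
```

With the sample markup this prints a one-element list containing /assets/img/ratings/rating-4_5.png, and a paragraph with an empty picture yields an empty list instead of an exception.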
Upvotes: 2