Reputation: 93
<div class="cont">
<p style="text-align: center; "><img alt="" src="/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg"></p>
<p style="text-align: center; "><img alt="" src="/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg"></p>
<p style="text-align: center; "><img alt="" src="/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg"></p>
</div>
I am trying to get all the src values from this HTML.
My code is:
from bs4 import BeautifulSoup
# source holds the HTML shown above
soup = BeautifulSoup(source, "html.parser")
div = soup.find("div", {"class": "cont"})
imgs = div.find_all("img", {"src":True})
print(imgs)
The list returned by this code contains the whole tags, including other attributes such as "alt". How can I extract only the values of the src attributes (e.g., '/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg')?
Upvotes: 1
Views: 71
Reputation: 8302
Use find_all and read the src attribute of each tag in a list comprehension:
from bs4 import BeautifulSoup
soup = BeautifulSoup(source, "html.parser")
div = soup.find("div", {"class": "cont"})
print([img['src'] for img in div.find_all("img")])
Output:
['/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg',
'/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg',
'/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg']
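One caveat with indexing img['src']: it raises KeyError for an <img> tag that has no src attribute. Here is a minimal sketch of two ways to guard against that. The html string and the file name a.jpg are made up for illustration and are not from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the second <img> has no src attribute.
html = """
<div class="cont">
<p><img alt="" src="/web/upload/a.jpg"></p>
<p><img alt="no source"></p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", {"class": "cont"})

# Option 1: only match <img> tags that actually carry a src attribute.
srcs_filtered = [img["src"] for img in div.find_all("img", src=True)]

# Option 2: use .get(), which returns None instead of raising KeyError.
srcs_with_none = [img.get("src") for img in div.find_all("img")]

print(srcs_filtered)
print(srcs_with_none)
```

The first option is usually enough when you only want the values. The second keeps one entry per tag, so you can see which tags had no src.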
Upvotes: 2
Reputation: 12499
Try adding a for loop, for example:
for img in imgs:
    print(img['src'])
Or, to make it simpler, use a CSS selector:
from bs4 import BeautifulSoup
html = """
<div class="cont">
<p style="text-align: center; "><img alt="" src="/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg"></p>
<p style="text-align: center; "><img alt="" src="/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg"></p>
<p style="text-align: center; "><img alt="" src="/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg"></p>
</div>
"""
soup = BeautifulSoup(html, features='html.parser')
elements = soup.select('div.cont > p > img')
for element in elements:
    print(element['src'])
This prints:
/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg
/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg
/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg
If you are trying to download the images, see this example:
https://stackoverflow.com/a/61531668/4539709
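The src values in the question are site-relative paths, so they probably need to be turned into absolute URLs before anything can be downloaded. A rough sketch using urllib.parse.urljoin from the standard library. BASE_URL is a hypothetical placeholder, since the question does not say which page the HTML came from.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical address of the page the HTML was fetched from.
BASE_URL = "https://example.com/shop/page.html"

html = """
<div class="cont">
<p style="text-align: center; "><img alt="" src="/web/upload/NNEditor/20200409/1_1_shop1_143320.jpg"></p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# img[src] skips any <img> without a src; urljoin resolves each path against the page URL.
absolute_urls = [urljoin(BASE_URL, img["src"]) for img in soup.select("div.cont img[src]")]

print(absolute_urls)
```

With a real page, replace BASE_URL with the URL you actually requested.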
Upvotes: 3