Reputation: 3176
I want to extract the pictures' widths and heights using Beautiful Soup. All pictures have the same code format:
<img src="http://somelink.com/somepic.jpg" width="200" height="100">
I can extract the links easily with
for pic in soup.find_all('img'):
    print(pic['src'])
But
for pic in soup.find_all('img'):
    print(pic['width'])
is not working for extracting sizes. What am I missing?
EDIT: One of the pictures on the page does not have the width and height in its HTML code. I did not notice this at the time of the initial post, so any solution must take this into account.
Upvotes: 4
Views: 5138
Reputation: 19763
Try this:
>>> from bs4 import BeautifulSoup
>>> html = '<img src="http://somelink.com/somepic.jpg" width="200" height="100">'
>>> soup = BeautifulSoup(html, 'html.parser')
>>> for tag in soup.find_all('img'):
...     print(tag.attrs.get('height'), tag.attrs.get('width'))
...
100 200
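You can use the attrs attribute of a tag. It is a dict whose keys are the tag's attribute names and whose values are the corresponding attribute values, so get() returns None for an attribute that is missing.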
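As a small sketch of what attrs holds, the snippet below parses two img tags, one with sizes and one without. The second URL and the variable names are made up for illustration, and the built-in html.parser is assumed as the parser.

```python
from bs4 import BeautifulSoup

# Two images: the second one (a hypothetical URL) has no width/height.
html = (
    '<img src="http://somelink.com/somepic.jpg" width="200" height="100">'
    '<img src="http://somelink.com/other.jpg">'
)
soup = BeautifulSoup(html, "html.parser")
first, second = soup.find_all("img")

# attrs behaves like a regular dict of attribute name -> value.
first_attrs = dict(first.attrs)
second_attrs = dict(second.attrs)

# dict.get() returns None for a missing key instead of raising KeyError.
missing_height = second.attrs.get("height")

print(first_attrs)
print(second_attrs)
print(missing_height)
```

Because attrs is an ordinary dict, the usual dict idioms (get, in, items) apply to it.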
Upvotes: 1
Reputation: 474191
The dictionary-like attribute access should work for width and height as well, if they are specified. You might encounter images that don't have these attributes explicitly set; your current code would throw a KeyError in that case. You can use get() and provide a default value instead:
for pic in soup.find_all('img'):
    print(pic.get('width', 'n/a'))
Or, you can find only img elements that have the width and height specified:
for pic in soup.find_all('img', width=True, height=True):
    print(pic['width'], pic['height'])
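Putting the two approaches together, here is a minimal sketch that collects the sizes of every image and tolerates missing attributes. The helper names to_int and image_sizes, the sample URLs, and the choice of html.parser are assumptions for illustration, not part of the original question. Attribute values come back as strings, so the sketch converts them to integers where possible.

```python
from bs4 import BeautifulSoup


def to_int(value):
    """Return value as an int, or None if it is missing or not a plain number."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # TypeError: value is None (attribute absent).
        # ValueError: value is something like "100%" or "auto".
        return None


def image_sizes(html):
    """Return a list of (src, width, height) tuples for every img tag."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (img.get("src"), to_int(img.get("width")), to_int(img.get("height")))
        for img in soup.find_all("img")
    ]


# Hypothetical sample page: the second image has no size attributes.
sample = (
    '<img src="http://somelink.com/somepic.jpg" width="200" height="100">'
    '<img src="http://somelink.com/nosize.jpg">'
)
sizes = image_sizes(sample)
for src, width, height in sizes:
    print(src, width, height)
```

Images without sizes show up with None for width and height, so the caller can decide whether to skip them or fall back to a default.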
Upvotes: 4
Reputation: 2088
Attributes that may be missing are safer to read with get(), which returns None instead of raising a KeyError:
for pic in soup.find_all('img'):
    print(pic.get('width'))
Upvotes: 1