Jake Nguyen

Reputation: 71

Can't extract src attribute from "img" tag with BeautifulSoup

I'm working on a project where I need to extract image URLs from a website. I'm new to this, so please bear with me. Based on the HTML, the class of the images I want is "fotorama__img". However, when I run my code, it prints nothing. Does anyone know why? Also, why does the src attribute contain only part of the URL instead of the whole thing? Example: the link to the image is https://www.supermicro.com/files_SYS/images/System/SYS-120U-TNR_callout_front.jpg but the src attribute of the img tag is "/files_SYS/images/System/sysThumb/SYS-120U-TNR_main.png".

Here is my code:

from bs4 import BeautifulSoup
import requests

page = requests.get("https://www.supermicro.com/en/products/system/Ultra/1U/SYS-120U-TNR")
soup = BeautifulSoup(page.content,'lxml')
images = soup.find_all("img", {"class": "fotorama__img"})
for image in images:
    print(image.get("src"))

And here is a screenshot of the page's HTML (image not included).

Thank you for your help!

Upvotes: 1

Views: 892

Answers (1)

Andrej Kesely

Reputation: 195438

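The "fotorama__img" class is added dynamically by JavaScript after the page loads, so it is not in the HTML that requests downloads and BeautifulSoup never sees it. The image links are still in the static HTML, as the href of the a elements directly inside the element with class "fotorama", so you can extract them like this: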

import requests
from bs4 import BeautifulSoup

page = requests.get(
    "https://www.supermicro.com/en/products/system/Ultra/1U/SYS-120U-TNR"
)
soup = BeautifulSoup(page.content, "lxml")
images = [
    "https://www.supermicro.com" + a["href"]
    for a in soup.select(".fotorama > a")
]

print(*images, sep="\n")

Prints:

https://www.supermicro.com/files_SYS/images/System/SYS-120U-TNR_main.png
https://www.supermicro.com/files_SYS/images/System/SYS-120U-TNR_callout_angle.jpg
https://www.supermicro.com/files_SYS/images/System/SYS-120U-TNR_callout_top.jpg
https://www.supermicro.com/files_SYS/images/System/SYS-120U-TNR_callout_front.jpg
https://www.supermicro.com/files_SYS/images/System/SYS-120U-TNR_callout_rear.jpg
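
On the second part of the question: a src or href that starts with "/" is a root-relative URL, which the browser resolves against the site the page came from. The string concatenation above works for that case. A slightly more general option is urljoin from the standard library, which also leaves already-absolute URLs unchanged. The sketch below is only an illustration: the to_absolute helper and the sample list are hypothetical, and the paths are the ones quoted in the question.

```python
from urllib.parse import urljoin

# URL of the page the links were scraped from (taken from the question).
PAGE_URL = "https://www.supermicro.com/en/products/system/Ultra/1U/SYS-120U-TNR"


def to_absolute(base_url, links):
    """Resolve each link against base_url, the way a browser would."""
    return [urljoin(base_url, link) for link in links]


# Hypothetical sample: one root-relative path and one already-absolute URL.
sample_links = [
    "/files_SYS/images/System/sysThumb/SYS-120U-TNR_main.png",
    "https://www.supermicro.com/files_SYS/images/System/SYS-120U-TNR_callout_front.jpg",
]

absolute_links = to_absolute(PAGE_URL, sample_links)
print(*absolute_links, sep="\n")
# https://www.supermicro.com/files_SYS/images/System/sysThumb/SYS-120U-TNR_main.png
# https://www.supermicro.com/files_SYS/images/System/SYS-120U-TNR_callout_front.jpg
```

In the scraping code, this would mean replacing "https://www.supermicro.com" + a["href"] with urljoin(page.url, a["href"]).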

Upvotes: 1
