komko

Reputation: 112

How to get the src attribute from an <img/> tag with Python

I am scraping data from a site and need to extract one image. I can find the tag, but the output is not what I need.

I have searched online for solutions and tried changing the code, but nothing worked.

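import requests
from bs4 import BeautifulSoup

# baseurl: URL of the page being scraped (defined elsewhere)
r = requests.get(baseurl)
content = r.content
soup = BeautifulSoup(content, "html.parser")

# Take the second <img> tag on the page
images = soup.findAll('img')[1]
print(images)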

Output I get:

<img src="https://cdn.rubyrealms.com/images/WKpivrdGBJJ9p6etIY2aJpixikFj4vnpmpPR9pXjK4Y8K.png" style="border-radius: 5px"/>

Output I need:

cdn.rubyrealms.com/images/WKpivrdGBJJ9p6etIY2aJpixikFj4vnpmpPR9pXjK4Y8K.png

(I also tried print(images.text), without success.)

Upvotes: 3

Views: 16861

Answers (2)

0xPrateek

Reputation: 1188

You can get the img tag's src value using:

images = soup.findAll('img')[1]
print(images.get("src"))

or

images = soup.findAll('img')[1]
print(images['src'])

Output

https://cdn.rubyrealms.com/images/WKpivrdGBJJ9p6etIY2aJpixikFj4vnpmpPR9pXjK4Y8K.png

The problem with print(images.text) is that .text returns the text between a tag's opening and closing tags, whereas src is an attribute inside the tag itself, so it has to be read with .get("src") or ['src'].
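
Note that the src value above still carries the https:// prefix, while the desired output in the question omits it. A minimal sketch, assuming the scheme should simply be dropped, using the standard library's urllib.parse; strip_scheme is a hypothetical helper name for illustration, not part of BeautifulSoup:

```python
from urllib.parse import urlsplit


def strip_scheme(url: str) -> str:
    """Return the URL without its scheme, e.g. 'https://a/b' -> 'a/b'."""
    parts = urlsplit(url)
    rest = parts.netloc + parts.path
    if parts.query:
        rest += "?" + parts.query
    return rest


# In the code above, this value would come from images["src"].
src = "https://cdn.rubyrealms.com/images/WKpivrdGBJJ9p6etIY2aJpixikFj4vnpmpPR9pXjK4Y8K.png"
print(strip_scheme(src))
```

Parsing the URL rather than slicing off a fixed number of characters means the same helper also handles http:// and protocol-relative (//host/path) values.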

Hope this helps you :)

Upvotes: 4

João Teixeira

Reputation: 69

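Here's a sample from the html.parser documentation that you can adapt. It assumes parser is an instance of an HTMLParser subclass whose handle_starttag prints each tag and its attributes: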

parser.feed('<img src="python-logo.png" alt="The Python logo">')
Start tag: img
attr: ('src', 'python-logo.png')

REFERENCE: https://docs.python.org/3/library/html.parser.html
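
The sample above leaves out the parser definition. One way to adapt it is a small HTMLParser subclass that collects src values instead of printing them; this is only a sketch, and the class name ImgSrcParser and its sources attribute are made up for illustration:

```python
from html.parser import HTMLParser


class ImgSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, e.g. [('src', 'python-logo.png'), ...]
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value is not None:
                    self.sources.append(value)


parser = ImgSrcParser()
parser.feed('<img src="python-logo.png" alt="The Python logo">')
print(parser.sources)  # ['python-logo.png']
```

Self-closing tags such as <img .../> are also covered, because HTMLParser's default handle_startendtag calls handle_starttag.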

Upvotes: 1
