Sadman Hasan

Reputation: 140

Scraped img src outputting into base64

I'm trying to scrape just the https:// link:

src="https://static.daraz.com.bd/p/apple-1088-5942-1-catalog.jpg"

from the HTML below, using the BeautifulSoup4 Python library.

<div class="image-wrapper default-state">
      <img class="lazy image -loaded" alt="Macbook Air (MD711ZA/B) - Aluminum - Laptop - Dual-Core Intel Core i5 - 4GB RAM - 128GB HDD - 11.6&amp;#039;&amp;#039; LED - Intel HD Graphics 5000 - Mac OS X Mountain Lion 10.8" data-image-vertical="1" width="176" height="220" src="https://static.daraz.com.bd/p/apple-1088-5942-1-catalog.jpg" data-sku="AP113ELAA1XBNAFAMZ" data-placeholder="placeholder_daraz.jpg" style="display: inline-block;">
      <noscript>&lt;img src="https://static.daraz.com.bd/p/apple-1088-5942-1-catalog.jpg" width="176" height="220" class="image" /&gt;
      </noscript>
</div>

But I'm getting output like this:

data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7

Is there any way to get the original src link?

BeautifulSoup code:

for image in soup.findAll('div', attrs={'class': 'image-wrapper default-state'}):
    print(image.img['src'])

The same code works on other sites and returns the src link. Only on this site does it return a base64 data URI instead.

Upvotes: 2

Views: 2179

Answers (1)

Sadman Hasan

Reputation: 140

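I converted the whole img tag to a string and found that, in the HTML the scraper actually receives, the real URL is stored in the tag's data-src attribute (data-src="..."), while src only holds the base64 placeholder.

So I read that attribute instead and got the expected output: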

for image in soup.findAll('div', attrs={'class': 'image-wrapper'}):
    print(image.img['data-src'])
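
If some images on a page are lazy-loaded and others are not, a fallback may be more robust than reading data-src unconditionally. Below is a minimal sketch, not anything built into BeautifulSoup: pick_image_url is a hypothetical helper name, and the attribute values are copied from the question on the assumption that the raw HTML carries the placeholder in src and the real URL in data-src. It takes anything with a dict-style .get(), so a plain dict is used here to keep the example dependency-free.

```python
def pick_image_url(attrs):
    """Return the first usable image URL from an <img> tag's attributes.

    `attrs` is anything with a dict-style .get(), for example a plain
    dict or a BeautifulSoup Tag. Hypothetical helper, not part of bs4.
    """
    # Lazy-loading pages often keep the real URL in data-src and leave
    # a tiny inline placeholder in src until JavaScript swaps them.
    for key in ("data-src", "src"):
        url = attrs.get(key)
        # Skip missing values and inline data: URIs (the placeholder).
        if url and not url.startswith("data:"):
            return url
    return None


# Attributes as they are assumed to appear in the raw, pre-JavaScript HTML.
raw_img = {
    "src": "data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7",
    "data-src": "https://static.daraz.com.bd/p/apple-1088-5942-1-catalog.jpg",
}
print(pick_image_url(raw_img))
```

Inside the loop above, pick_image_url(image.img) should work the same way, since a BeautifulSoup Tag also exposes .get() for its attributes.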

Upvotes: 1
