Fazlul Hoque Sawrav

Reputation: 127

How can I get all of the img src values using Scrapy?

I tried this in the Scrapy shell:

$ scrapy shell 'https://www.trendyol.com/trendyolmilla/cok-renkli-desenli-elbise-twoss20el0573-p-36294862'
>>> response.css("div.slick-slide img").xpath("@src").getall()

The output is:

['/Content/images/defaultThumb.jpg', '/Content/images/defaultThumb.jpg', '/Content/images/defaultThumb.jpg', '/Content/images/defaultThumb.jpg', '/Content/images/defaultThumb.jpg', 'https://cdn.dsmcdn.com/mnresize/415/622/ty124/product/media/images/20210602/12/94964657/64589619/1/1_org_zoom.jpg', 'https://cdn.dsmcdn.com/mnresize/415/622/ty124/product/media/images/20210602/12/94964657/64589619/1/1_org_zoom.jpg']

This collects only one real image (the rest are the default thumbnail), but the linked page has 5 images. Please help me solve this problem. How can I find all of the image src values?

Upvotes: 0

Views: 1025

Answers (1)

Shivam

Reputation: 620

Explanation

You are reading the src attributes from img tags, and only one of those tags contains a real image link. To grab all of the links, read them from the script tag of type application/ld+json instead.

This returns a JSON string, which is stored in the text variable:

text = response.xpath("//p/script[contains(@type,'application/ld+json')]/text()").extract_first()

Load it with json.loads to convert it into a Python dictionary:

json_text = json.loads(text)

Now call json_text.get('image') to get the images.
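One caveat with the steps above: extract_first() returns None when the XPath matches nothing, and in schema.org data the image value can be a single URL string, a list, or an ImageObject rather than always a list. Here is a minimal sketch of a defensive helper, assuming you want a flat list of URLs back. The function names and the example.com URLs are hypothetical and not part of the page or of Scrapy.

```python
import json


def _as_urls(value):
    """Flatten a JSON-LD image value into a list of URL strings."""
    if isinstance(value, str):
        return [value]
    if isinstance(value, dict):
        # Assumption: an ImageObject keeps its address in "contentUrl" or "url".
        return _as_urls(value.get("contentUrl") or value.get("url"))
    if isinstance(value, list):
        urls = []
        for item in value:
            urls.extend(_as_urls(item))
        return urls
    # None or any unexpected type yields no URLs.
    return []


def image_urls_from_ld_json(text):
    """Return the image URLs found in a JSON-LD string, or [] if there are none."""
    if not text:
        # extract_first() gives None when the XPath matched nothing.
        return []
    data = json.loads(text)
    if not isinstance(data, dict):
        return []
    return _as_urls(data.get("image"))


if __name__ == "__main__":
    sample = '{"@type": "Product", "image": ["https://example.com/1.jpg", "https://example.com/2.jpg"]}'
    print(image_urls_from_ld_json(sample))
```

In the spider you could then call image_urls_from_ld_json(text) in place of json.loads(text) followed by .get('image').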

Code

Run this code with Scrapy. The output will give you all 5 links.

import json

import scrapy
from scrapy import Request


class Trendyol(scrapy.Spider):
    name = 'test'

    def start_requests(self):
        url = 'https://www.trendyol.com/trendyolmilla/cok-renkli-desenli-elbise-twoss20el0573-p-36294862'
        yield Request(url=url, callback=self.parse)

    def parse(self, response):
        # Read the JSON-LD script tag instead of the img tags.
        text = response.xpath("//p/script[contains(@type,'application/ld+json')]/text()").extract_first()
        # Convert the JSON string into a Python dictionary.
        json_text = json.loads(text)

        print(json_text.get('image'))
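The XPath in the spider assumes the script tag sits inside a p element, and a page may carry more than one application/ld+json block. If you want to check the parsing logic offline, without Scrapy or a network request, here is a rough sketch that uses Python's standard html.parser module instead of Scrapy selectors. The class and function names are hypothetical, and the HTML in the usage part is a made-up stand-in for the real page.

```python
import json
from html.parser import HTMLParser


class LdJsonCollector(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_ld_json = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_ld_json = True
            self._buffer = []

    def handle_data(self, data):
        # Script text may arrive in several chunks, so buffer it.
        if self._in_ld_json:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_ld_json:
            self._in_ld_json = False
            self.blocks.append(json.loads("".join(self._buffer)))


def collect_ld_json(html):
    """Return a list with one parsed object per JSON-LD script tag in html."""
    parser = LdJsonCollector()
    parser.feed(html)
    parser.close()
    return parser.blocks


if __name__ == "__main__":
    # Made-up HTML, only for illustration.
    html = (
        '<p><script type="application/ld+json">'
        '{"@type": "Product", "image": ["https://example.com/1.jpg"]}'
        "</script></p>"
    )
    for block in collect_ld_json(html):
        print(block.get("image"))
```

Inside a spider you would pass response.text to collect_ld_json and keep the block that has an image key.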

Upvotes: 1
