saraherceg

Reputation: 341

Why can't I get all the image links via response.css?

I am trying to extract all the image links, but I can only get the main picture on each property page, using:

response.css('div.col-sm-12 img.visible-print-block::attr(src)').get()

When I try to extract the rest of the images with the code below, I get an empty list. How can I fix this?


import scrapy


class WebBox2Spider(scrapy.Spider):
    def parse(self, response):
        # Follow the link to each property's detail page
        for prop in response.css('div.grid-item'):
            link = prop.css('div.property-image a::attr(href)').get()
            yield scrapy.Request(
                link,
                callback=self.get_loc,
                meta={'item': {
                    'url': link,
                }},
            )

    def get_loc(self, response):
        item = response.meta.get('item')

        # This is the selector that comes back empty
        pics_link = response.css('div.gallery img::attr(src)').getall()

        item['images'] = pics_link

        return item

--------------------------------------------------------------------

class CapeWaterfrontSpider(WebBox2Spider):
    name = "cape_waterfront_estates"
    start_urls = ['https://www.capewaterfrontestates.co.za/template/Properties.vm/listingtype/SALES',
                  'https://www.capewaterfrontestates.co.za/template/Properties.vm/listingtype/MONTHLY_RENTAL']

Upvotes: 0

Views: 74

Answers (1)

Wim Hermans

Reputation: 2116

You can use scrapy shell to check what the HTML looks like to Scrapy. The content you're trying to get is loaded dynamically, so you'll have to adapt your selector, for example:

pics_link = response.xpath('//*[@data-nav="thumbs"]//@data-full').extract()
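
To illustrate what that XPath is matching, here is a minimal standalone sketch. It is not Scrapy code: it uses Python's standard-library ElementTree as a stand-in so it runs without any install. The markup in SAMPLE_HTML, the image paths, and the helper name extract_full_image_urls are all hypothetical; the real page's structure may differ, so check it in scrapy shell first.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup (an assumption, not copied from the real site):
# the main image is a normal <img>, while the gallery entries only carry their
# full-size URL in a data-full attribute under a data-nav="thumbs" container.
SAMPLE_HTML = """
<div>
  <img class="visible-print-block" src="/images/main.jpg" />
  <div data-nav="thumbs">
    <a data-full="/images/full-1.jpg"></a>
    <a data-full="/images/full-2.jpg"></a>
  </div>
</div>
"""


def extract_full_image_urls(markup):
    """Collect data-full values from descendants of data-nav="thumbs" elements."""
    root = ET.fromstring(markup)
    urls = []
    # Roughly mirrors //*[@data-nav="thumbs"]//@data-full using the XPath
    # subset that ElementTree supports (it cannot select attributes directly).
    for container in root.findall(".//*[@data-nav='thumbs']"):
        for node in container.findall(".//*[@data-full]"):
            urls.append(node.get("data-full"))
    return urls


image_urls = extract_full_image_urls(SAMPLE_HTML)
print(image_urls)  # ['/images/full-1.jpg', '/images/full-2.jpg']
```

In the spider itself you would keep the response.xpath(...) call from above. If the values turn out to be relative paths, as in this made-up sample, response.urljoin() can turn them into absolute URLs.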

Upvotes: 1
