Reputation: 341
I am trying to extract all the image links, but I can only extract the main picture on each property page, using:
response.css('div.col-sm-12 img.visible-print-block::attr(src)').get()
When I try to extract the rest of the images using the code below, I get an empty list. How can I fix this?
import scrapy


class WebBox2Spider(scrapy.Spider):
    def parse(self, response):
        # Follow the detail-page link of every property on the listing page
        for prop in response.css('div.grid-item'):
            link = prop.css('div.property-image a::attr(href)').get()
            yield scrapy.Request(
                link,
                callback=self.get_loc,
                meta={'item': {
                    'url': link,
                }},
            )

    def get_loc(self, response):
        item = response.meta.get('item')
        # This selector is the one that comes back empty
        pics_link = response.css('div.gallery img::attr(src)').getall()
        item['images'] = pics_link
        return item
--------------------------------------------------------------------
class CapeWaterfrontSpider(WebBox2Spider):
    name = "cape_waterfront_estates"
    start_urls = [
        'https://www.capewaterfrontestates.co.za/template/Properties.vm/listingtype/SALES',
        'https://www.capewaterfrontestates.co.za/template/Properties.vm/listingtype/MONTHLY_RENTAL',
    ]
Upvotes: 0
Views: 74
Reputation: 2116
You can use scrapy shell to check what the HTML looks like to Scrapy. The content you're trying to get is loaded dynamically, so you'll have to adapt your selector to the markup Scrapy actually receives, for example:
pics_link = response.xpath('//*[@data-nav="thumbs"]//@data-full').extract()
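As a rough sketch of how that selector could be used from the callback, here is a small helper built around it. The function name `extract_image_urls` is made up for illustration. The `urljoin` step and the de-duplication are assumptions on my part: I haven't checked whether the site's `data-full` values are relative or repeated. `.getall()` is used here and returns the same list as `.extract()`.

```python
def extract_image_urls(response):
    """Collect full-size image URLs from the thumbnail gallery.

    `response` is expected to behave like a Scrapy response:
    it needs .xpath() and .urljoin().
    """
    # XPath from the answer: every data-full attribute under the
    # element marked data-nav="thumbs".
    raw_urls = response.xpath('//*[@data-nav="thumbs"]//@data-full').getall()

    urls = []
    seen = set()
    for raw in raw_urls:
        # Assumption: some values might be relative, so resolve them
        # against the page URL; absolute URLs pass through unchanged.
        url = response.urljoin(raw.strip())
        if url not in seen:  # keep the first occurrence, preserve order
            seen.add(url)
            urls.append(url)
    return urls
```

In the spider from the question, `get_loc` would then set `item['images'] = extract_image_urls(response)` instead of using the `div.gallery img` selector.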
Upvotes: 1