Reputation: 888
So I'm making a bot to get the price and name from Zara products. I managed to get the product name, but the price is returning [].
Here is my code:
#!/usr/bin/python3
#-*- coding: utf-8 -*-
import scrapy

class Zara(scrapy.Spider):
    name = 'Zara'

    def start_requests(self, url='https://www.zara.com/pt/pt/casaco-l%C3%A3-quadrados-p02092540.html?v1=42984974&v2=1445646'):
        yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        try:
            name = response.xpath('//*[@id="product"]/div[1]/div/div[2]/header/h1/text()').get()
            price = response.xpath('//*[@id="product"]/div[1]/div/div[2]/div[1]/span/text()').get()
        except:
            print('Fail')
        print(name)
        print(price)
What it returns:
CASACO LÃ QUADRADOS
[]
What it's supposed to return:
CASACO LÃ QUADRADOS
149,00 EUR
Everything I tried:
price = response.xpath('//*[@id="product"]/div[1]/div/div[2]/div[1]/span').get()
price = response.xpath('//*[@id="product"]/div[1]/div/div[2]/div[1]/span/text()').get()
price = response.xpath('//*[@id="product"]/div[1]/div/div[2]/div[1]/span[@class="main-price"]').get()
price = response.xpath('//*[@id="product"]/div[1]/div/div[2]/div[1]/span[@class="main-price"]/text()').get()
I think that's all I tried! I'm using Scrapy 1.8 with Python 3.7.
Upvotes: 0
Views: 473
Reputation: 632
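The reason you don't get the price with the usual XPath/CSS approach is that the price field isn't directly available to your crawler. Your crawler sees the page differently from how the browser shows it, so the XPaths are completely different. The price does appear inside one of the page's script blocks, though, so you can pull it from there.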
Try this approach:
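from re import search
# Grab the first <script> element whose text mentions 'price'
_script = response.xpath("//script[contains(text(),'price')][1]")[0].extract()
# Capture the digits that follow the 'price' key in that script text
price = search(r",.price.:(\d+)", _script).group(1)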
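To see how the regex step above behaves outside the spider, here is a minimal standalone sketch. The `sample_script` string and the `extract_price` helper are made up for illustration; the real script payload on the product page will be larger and may be shaped differently. The pattern is also slightly stricter than the one above (it matches the quotes explicitly and allows whitespace), which is a choice made for this sketch, and the helper returns None rather than raising when nothing matches.

```python
import re

# Hypothetical stand-in for the <script> text; the value 14900 is invented
# for this example and is not taken from the real page.
sample_script = 'window.data = {"name":"CASACO","price":14900,"currency":"EUR"};'

def extract_price(script_text):
    """Return the first integer that follows a quoted 'price' key, or None."""
    match = re.search(r'["\']price["\']\s*:\s*(\d+)', script_text)
    return int(match.group(1)) if match else None

print(extract_price(sample_script))      # 14900
print(extract_price("no price here"))    # None
```

If the captured number turns out to be in minor units (cents), it would need to be divided by 100 to get a value like 149,00; check this against the actual page data before relying on it.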
Also, it's better to wrap each field in its own try...except, so you know exactly which one raised the error and can fix it.
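One possible way to get a separate try...except per field without repeating the boilerplate is a small helper. This is only a sketch: `extract_field` and the `page` dict are hypothetical names used to keep the example runnable without Scrapy. In the spider, each lambda would wrap one `response.xpath(...).get()` call instead.

```python
def extract_field(label, extractor):
    """Run one extractor and report which field failed instead of hiding it."""
    try:
        return extractor()
    except Exception as exc:  # broad on purpose in this sketch
        print(f"Failed to extract {label}: {exc!r}")
        return None

# Hypothetical stand-in for the parsed page; 'price' is missing on purpose.
page = {"name": "CASACO LÃ QUADRADOS"}

name = extract_field("name", lambda: page["name"])
price = extract_field("price", lambda: page["price"])

print(name)
print(price)
```

Note that `.get()` on a Scrapy selector returns None when nothing matches instead of raising, so in the real spider it is worth checking each field for None as well.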
Upvotes: 2