Can't get text from span with Scrapy (Python)

Question

I'm making a bot to get the price and name of Zara products. I managed to get the product name, but the price comes back as [].

Here is my code:

#!/usr/bin/python3
# -*- coding: utf-8 -*-

import scrapy

class Zara(scrapy.Spider):
    name = 'Zara'

    def start_requests(self, url='https://www.zara.com/pt/pt/casaco-l%C3%A3-quadrados-p02092540.html?v1=42984974&v2=1445646'):
        yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        try:
            name = response.xpath('//*[@id="product"]/div[1]/div/div[2]/header/h1/text()').get()
            price = response.xpath('//*[@id="product"]/div[1]/div/div[2]/div[1]/span/text()').get()
        except:
            print('Fail')

        print(name)
        print(price)

What it returns:

CASACO LÃ QUADRADOS
[]

What it's supposed to return:

CASACO LÃ QUADRADOS
149,00 EUR

Everything I tried:

price = response.xpath('//*[@id="product"]/div[1]/div/div[2]/div[1]/span').get()
price = response.xpath('//*[@id="product"]/div[1]/div/div[2]/div[1]/span/text()').get()
price = response.xpath('//*[@id="product"]/div[1]/div/div[2]/div[1]/span[@class="main-price"]').get()
price = response.xpath('//*[@id="product"]/div[1]/div/div[2]/div[1]/span[@class="main-price"]/text()').get()

I think that's everything I tried. I'm using Scrapy 1.8 with Python 3.7.

Janib Soomro · Accepted Answer

The reason you don't get the price with the usual XPath/CSS approach is that the price field isn't directly available to your crawler. Your crawler sees the page differently than your browser does, so the XPaths are completely different.

Try this approach:

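from re import search

# Take the first <script> element whose text mentions 'price'
_script = response.xpath("//script[contains(text(),'price')][1]")[0].extract()
# Capture the digits that follow the "price" key in that script
price = search(r",.price.:(\d+)", _script).group(1)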
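
To see what the regular expression above does without requesting the page, here is a minimal sketch that runs it against a made-up script body. The sample string, the `productData` variable and the `extract_price` helper are all hypothetical and are not taken from Zara's actual markup. The only change from the snippet above is a guard for the case where nothing matches.

```python
import re

# Hypothetical inline script content; the real page's data will look different.
SAMPLE_SCRIPT = 'var productData = {"name":"CASACO","price":14900,"currency":"EUR"};'

def extract_price(script_text):
    """Return the digits that follow a quoted price key, or None if there is no match."""
    match = re.search(r",.price.:(\d+)", script_text)
    return match.group(1) if match else None

print(extract_price(SAMPLE_SCRIPT))
print(extract_price("no price here"))
```

The captured value is a string of digits only. Whether it needs further conversion (for example, if the site stores the amount in cents) depends on the real page data, so check the script contents before formatting it.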

Moreover, it's better to use a separate try...except for each field, so that you know exactly which one produced the error and can fix it.
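
One possible way to structure that per-field error handling is sketched below. The `extract_fields` helper and the stand-in lambdas are hypothetical; in a real spider each lambda would wrap one of the `response.xpath(...)` calls.

```python
def extract_fields(extractors):
    """Run each extractor in its own try/except and record failures by field name."""
    values, errors = {}, {}
    for field, extractor in extractors.items():
        try:
            values[field] = extractor()
        except Exception as exc:
            # Keep going so one failing field does not hide the others.
            values[field] = None
            errors[field] = repr(exc)
    return values, errors

# Stand-in extractors (assumptions for illustration only).
values, errors = extract_fields({
    "name": lambda: "CASACO LÃ QUADRADOS",
    "price": lambda: [][0],  # simulates indexing an empty selector list
})
print(values)
print(errors)
```

With this layout the `errors` dictionary names the field that failed, instead of a single generic 'Fail' message covering both.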
