Justcurious

Reputation: 2064

Item serializers don't work. Function never gets called

I'm trying to use the serializer attribute in an Item, just like the example in the documentation: https://docs.scrapy.org/en/latest/topics/exporters.html#declaring-a-serializer-in-the-field

The spider runs without any errors, but the serialization doesn't happen, and the print inside the function never prints either. It's as if remove_pound is never called.

import scrapy

def remove_pound(value):
    print('Am I a joke to you?')
    return value.replace('£', '')

class BookItem(scrapy.Item):
    title = scrapy.Field()
    price = scrapy.Field(serializer=remove_pound)

class BookSpider(scrapy.Spider):
    name = 'bookspider'
    start_urls = ['https://books.toscrape.com/']

    def parse(self, response):
        books = response.xpath('//ol/li')
        for i in books:
            yield BookItem(
                title=i.xpath('article/h3/a/text()').get(),
                price=i.xpath('article/div/p[@class="price_color"]/text()').get(),
            )

Am I using it wrong?

P.S. I know there are other ways to do this; I just want to learn how to use this approach.

Upvotes: 0

Views: 107

Answers (1)

gangabass

Reputation: 10666

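My first guess was that your XPath expression was wrong and needed an explicit relative prefix:

price=i.xpath('./article/div/p[@class="price_color"]/text()').get()

Update: It's not the XPath. Both forms are evaluated relative to the selected li, so they return the same result. The real cause is that field serializers are applied only by item exporters. From the documentation: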

you can customize how each field value is serialized before it is passed to the serialization library.

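So if you run the spider with a feed export, for example scrapy crawl bookspider -o BookSpider.csv, remove_pound is called for each item and the exported file contains the serialized prices. The items shown in the log are not passed through an exporter, so they still show the raw values.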

Upvotes: 1
