Reputation: 2064
I'm trying to use the serializer attribute in an Item, just like the example in the documentation: https://docs.scrapy.org/en/latest/topics/exporters.html#declaring-a-serializer-in-the-field
The spider works without any errors, but the serialization doesn't happens, the print in the function doesn't print too. It's like the function remove_pound is never called.
import scrapy
def remove_pound(value):
print('Am I a joke to you?')
return value.replace('£', '')
class BookItem(scrapy.Item):
title = scrapy.Field()
price = scrapy.Field(serializer=remove_pound)
class BookSpider(scrapy.Spider):
name = 'bookspider'
start_urls = ['https://books.toscrape.com/']
def parse(self, response):
books = response.xpath('//ol/li')
for i in books:
yield BookItem(
title=i.xpath('article/h3/a/text()').get(),
price=i.xpath('article/div/p[@class="price_color"]/text()').get(),
)
Am I using it wrong?
PS.: I know there are other ways to do it, I just want to learn to use this way.
Upvotes: 0
Views: 107
Reputation: 10666
The only reason it doesn't work is because your XPath expression is not right. You need to use relative XPath:
price=i.xpath('./article/div/p[@class="price_color"]/text()').get()
Update It's not XPath. The serialization works only for item exporters:
you can customize how each field value is serialized before it is passed to the serialization library.
So if you run this command scrapy crawl bookspider -o BookSpider.csv
you'll get a correct (serialized) output.
Upvotes: 1