Reputation: 1575
I found several solutions on how to extract data in a specific order by overriding the item class with an OrderedItem class:
import json
from collections import OrderedDict

import scrapy
import six


class OrderedItem(scrapy.Item):
    def __init__(self, *args, **kwargs):
        # Keep field values in an OrderedDict so they retain assignment order
        self._values = OrderedDict()
        if args or kwargs:
            for k, v in six.iteritems(dict(*args, **kwargs)):
                self[k] = v

    def __repr__(self):
        return json.dumps(OrderedDict(self), ensure_ascii=False)

I have more data being extracted, and the order is different every time.
class NewItem(OrderedItem):
    title = scrapy.Field()
    price = scrapy.Field()
Then, inside the crawler script, I defined an instance of the NewItem object:
def parse(self, response):
    items = NewItem()
    items['title'] = response.xpath(
        "//span[@class='pdp-mod-product-badge-title']/text()").extract_first()
    items['price'] = response.xpath(
        "//span[contains(@class, 'pdp-price')]/text()").extract_first()
    yield items
Upvotes: 0
Views: 133
Reputation: 10666
You need to define your field order in settings.py:
FEED_EXPORT_FIELDS = ["title", "price"]
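To illustrate why an explicit field list helps, here is a minimal stdlib-only sketch. It does not use Scrapy's exporter; `EXPORT_FIELDS` and `export_csv` are hypothetical names standing in for the setting and the export step. The point is only that when the column order comes from a fixed list, the order in which each item's fields were populated no longer matters.

```python
import csv
import io

# Hypothetical stand-in for FEED_EXPORT_FIELDS; Scrapy does not read this name.
EXPORT_FIELDS = ["title", "price"]


def export_csv(items, fields):
    """Write dict-like items to CSV text, with columns in the order of `fields`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buffer.getvalue()


# The keys are deliberately populated in a different order for each item.
scraped = [
    {"price": "10.00", "title": "First product"},
    {"title": "Second product", "price": "12.50"},
]

output = export_csv(scraped, EXPORT_FIELDS)
print(output)
```

Both rows come out with title first and price second, regardless of how each dict was built. With the setting in place, the custom OrderedItem class should not be needed just to get a stable column order in the exported feed.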
Upvotes: 1