GFix

Reputation: 31

Scrapy Dynamic Item Class Creation

I'm unsure how to create a dynamic item class as described in the Scrapy documentation: http://scrapy.readthedocs.org/en/latest/topics/practices.html#dynamic-creation-of-item-classes

I'm not sure where the code from the documentation should go. Should I put it in pipelines.py or items.py and call it from the spider's parse function, or should it go in the main script that runs the Scrapy spider?

Upvotes: 0

Views: 1029

Answers (1)

Anzel

Reputation: 20553

I would place the code snippet in items.py and import it in the spider wherever a dynamic item is needed (this largely comes down to personal preference). For example:

from scrapy.spiders import CrawlSpider

from myproject.items import create_item_class

# Based on one of the Scrapy examples...
class MySpider(CrawlSpider):
    # ... name, allowed_domains ...
    def parse_item(self, response):
        self.log('Hi, this is an item page! %s' % response.url)
        # Build the item class at runtime from a list of field names
        field_list = ['id', 'name', 'description']
        DynamicItem = create_item_class('DynamicItem', field_list)
        item = DynamicItem()
        # Then populate it like any statically declared item
        item['id'] = response.xpath('//td[@id="item_id"]/text()').re(r'ID: (\d+)')
        item['name'] = response.xpath('//td[@id="item_name"]/text()').extract()
        item['description'] = response.xpath('//td[@id="item_description"]/text()').extract()
        return item

For more background, you may want to read the discussion "Dynamic Creation of Item Classes" (#398).

Upvotes: 1
