Reputation: 31
I'm unsure how to create a dynamic item class: http://scrapy.readthedocs.org/en/latest/topics/practices.html#dynamic-creation-of-item-classes
I'm not quite sure where to use the code provided in the documentation. Should I put it in pipelines.py or items.py and call it from the spider's parse function, or does it belong in the main script file that runs the Scrapy spider?
Upvotes: 0
Views: 1029
Reputation: 20553
I would place the code snippet in items.py and use it in the spider for any dynamic item I need (this is probably down to personal preference), for example:
from scrapy.spiders import CrawlSpider

from myproject.items import create_item_class


# based on one of the Scrapy examples...
class MySpider(CrawlSpider):
    # ... name, allowed_domains ...

    def parse_item(self, response):
        self.log('Hi, this is an item page! %s' % response.url)

        # build the item class on the fly when a dynamic item is needed
        field_list = ['id', 'name', 'description']
        DynamicItem = create_item_class('DynamicItem', field_list)
        item = DynamicItem()

        # then you can use it like any other item...
        item['id'] = response.xpath('//td[@id="item_id"]/text()').re(r'ID: (\d+)')
        item['name'] = response.xpath('//td[@id="item_name"]/text()').extract()
        item['description'] = response.xpath('//td[@id="item_description"]/text()').extract()
        return item
You may also be interested in reading Dynamic Creation of Item Classes #398 for a better understanding.
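For reference, here is a rough sketch of the mechanism that a helper like create_item_class relies on: Python's three-argument type() call, which builds a class at runtime. To keep it runnable without Scrapy installed, Field and BaseItem below are hypothetical stand-ins, not Scrapy's classes. In a real items.py you would import the Scrapy equivalents instead, as the documentation snippet does.

```python
# Hypothetical stand-ins (assumptions) so this sketch runs without Scrapy.
# In a real project, use the item and field classes that Scrapy provides.
class Field(dict):
    """Placeholder for per-field metadata."""


class BaseItem(dict):
    """Dict-like item that only accepts keys declared in `fields`."""

    fields = {}

    def __setitem__(self, key, value):
        if key not in self.fields:
            raise KeyError(
                "%s does not support field: %s" % (type(self).__name__, key)
            )
        super().__setitem__(key, value)


def create_item_class(class_name, field_list):
    """Build a new item class whose declared fields come from field_list."""
    fields = {field_name: Field() for field_name in field_list}
    # type(name, bases, namespace) creates a class at runtime, much like a
    # `class` statement with the same name, base classes and attributes.
    return type(class_name, (BaseItem,), {"fields": fields})


# Usage, mirroring the spider code above
DynamicItem = create_item_class("DynamicItem", ["id", "name", "description"])
item = DynamicItem()
item["name"] = "example"
```

Because the function only builds and returns a class, it can live in items.py next to any static item definitions and be imported wherever a dynamic item is needed.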
Upvotes: 1