Vitaliy_P

Reputation: 21

How to pass arguments (for FEED_URI) to a Scrapy spider instance to dynamically name the output file

I want to pass arguments to a spider and get the output file (JSON or CSV) named after those arguments.
For example,
$ scrapy crawl spider_name -a category=category1 -a subcategory=subcategory1

should produce:
category1_subcategory1.json (or .csv, the format doesn't matter).
In other words, the output file name must be built from the spider's arguments.

import scrapy


class MySpider(scrapy.Spider):
    name = 'my_spider'

    # How can I get the arguments into this setting?
    custom_settings = {
        'FEED_URI': 'some_name.json',
    }

    def __init__(self, category, subcategory, *args, **kwargs):
        super(MySpider, self).__init__(*args, **kwargs)
        self.category = category
        self.subcategory = subcategory

Upvotes: 2

Views: 4828

Answers (1)

mizhgun

Reputation: 1887

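You can pull those parameters out of the kwargs passed to __init__, store them as spider attributes, and reference them in FEED_URI. Scrapy replaces each %(attr)s placeholder in the feed URI with the spider attribute of the same name: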

import scrapy


class MySpider(scrapy.Spider):
    name = 'my_spider'

    custom_settings = {
        # %(category)s and %(subcategory)s are filled in from spider attributes
        'FEED_URI': '%(category)s_%(subcategory)s.json',
    }

    def __init__(self, *args, **kwargs):
        # Arguments passed with -a arrive as keyword arguments
        self.category = kwargs.pop('category', '')
        self.subcategory = kwargs.pop('subcategory', '')
        super(MySpider, self).__init__(*args, **kwargs)

Docs: https://doc.scrapy.org/en/latest/topics/feed-exports.html#storage-uri-parameters
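
To see why the placeholders resolve, here is a minimal standalone sketch of the substitution described in the linked docs: a named parameter in the URI is replaced by the spider attribute of the same name. It uses plain Python %-formatting and does not import Scrapy. FakeSpider and expand_feed_uri are hypothetical names made up for this illustration, not part of Scrapy's API, and Scrapy's internal implementation may differ in detail.

```python
# Standalone illustration (no Scrapy needed): printf-style substitution
# of named placeholders from an object's attributes.


class FakeSpider:
    """Hypothetical stand-in for a spider; not a Scrapy class."""

    name = "my_spider"

    def __init__(self, category="", subcategory=""):
        self.category = category
        self.subcategory = subcategory


def expand_feed_uri(template, spider):
    """Fill %(attr)s placeholders in template from the spider's attributes."""
    # Collect public attributes, both class-level (name) and instance-level.
    params = {
        key: getattr(spider, key)
        for key in dir(spider)
        if not key.startswith("_")
    }
    # Extra keys in the mapping are ignored by %-formatting.
    return template % params


spider = FakeSpider(category="category1", subcategory="subcategory1")
uri = expand_feed_uri("%(category)s_%(subcategory)s.json", spider)
print(uri)  # category1_subcategory1.json
```

Note that with the empty-string defaults used in the answer, running the spider without -a arguments would produce a file named "_.json", so you may want to validate the arguments or pick more descriptive defaults.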

Upvotes: 6
