Yohan Obadia

Reputation: 2672

Scrapy: Pass argument to pipeline from CrawlerProcess

I have a CrawlerProcess that launches the spider I want, but I would also like it to pass the parameter freq to the pipeline.

process = CrawlerProcess(get_project_settings())
process.crawl(spider, freq=freq)
process.start()

I know that the way to get a parameter should be to use:

@classmethod
def from_crawler(cls, crawler):

But I have no idea how to get the freq parameter from there. Any ideas?

Upvotes: 1

Views: 1001

Answers (1)

Yohan Obadia

Reputation: 2672

It took me some time to figure it out, but everything was actually in the Core API description of the method.

This solution is probably not the optimal one, since I get the freq parameter from the spider, but it might be possible to grab it from the crawler directly if anyone has a better solution.

So the pipeline looks like:

class Pipeline(object):

    def __init__(self, freq):
        self.freq = freq

    @classmethod
    def from_crawler(cls, crawler):
        # Read the attribute the spider stored in its own __init__
        return cls(freq=crawler.spider.freq)

    def open_spider(self, spider):
        return

    def process_item(self, item, spider):
        print("Freq:{}\n".format(self.freq))

    def close_spider(self, spider):
        return

What you have to do is encapsulate the variables you want to pass to the pipeline in cls, give them a name, and in __init__ store them as instance attributes. To be able to grab it from the spider, I had to store it as an attribute on the spider as well:

class TestSpider(scrapy.Spider):
    name = "test"

    def __init__(self, freq):
        self.freq = freq

If you have some improvements on this solution, feel free to comment or offer a better one. I know it is not optimal.
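To make the wiring above concrete without needing a running Scrapy project, here is a minimal, self-contained sketch of the same `from_crawler` pattern. The `FakeSpider` and `FakeCrawler` classes are hypothetical stand-ins for Scrapy's real objects, included only so the flow (spider attribute → crawler → pipeline) can be followed end to end:

```python
# Hypothetical stand-ins for Scrapy's spider and crawler objects,
# used only to illustrate how from_crawler reaches the spider's attribute.

class FakeSpider:
    def __init__(self, freq):
        # Mirrors TestSpider.__init__ from the answer: store freq on the spider
        self.freq = freq


class FakeCrawler:
    def __init__(self, spider):
        # In Scrapy, crawler.spider points at the running spider instance
        self.spider = spider


class Pipeline:
    def __init__(self, freq):
        self.freq = freq

    @classmethod
    def from_crawler(cls, crawler):
        # Pull the parameter off the spider, exactly as in the answer's pipeline
        return cls(freq=crawler.spider.freq)


# Simulate what Scrapy does: build the spider, attach it to a crawler,
# then construct the pipeline through from_crawler.
crawler = FakeCrawler(FakeSpider(freq="daily"))
pipeline = Pipeline.from_crawler(crawler)
print(pipeline.freq)  # → daily
```

In a real project the same idea applies: `process.crawl(spider, freq=freq)` hands `freq` to the spider's `__init__`, and the pipeline's `from_crawler` retrieves it from `crawler.spider`.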

Upvotes: 4
