yangmillstheory

Reputation: 1065

Scrapy - keep spider open indefinitely

I'm planning to have a daemon CrawlWorker (subclassing multiprocessing.Process) that monitors a queue for scrape requests.

The responsibility of this worker is to take scrape requests from the queue and feed them to spiders. To avoid implementing batching logic (like waiting for N requests before creating a new spider), would it make sense to keep all my spiders alive, feed more scrape requests to each spider whenever it's idle, and keep them open even when there are no more scrape requests?

What would be the best, simplest, and most elegant way to implement this? Given the start_urls attribute, it seems that a spider is meant to be instantiated with an initial work list, do its work, then die.

I'm thinking of listening for spider_closed, but is there an exception I can raise to keep the spider open?
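For reference, here's a rough sketch of the kind of worker I have in mind. CrawlWorker and MySpider are just placeholder names, and the spider doesn't actually consume the queue yet:

```python
import multiprocessing

import scrapy
from scrapy.crawler import CrawlerProcess


class MySpider(scrapy.Spider):
    # Placeholder spider; in practice it would pull work from the queue.
    name = "my_spider"

    def __init__(self, request_queue=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.request_queue = request_queue


class CrawlWorker(multiprocessing.Process):
    """Daemon process that owns a queue of scrape requests and a crawl."""

    def __init__(self, request_queue):
        super().__init__(daemon=True)
        self.request_queue = request_queue

    def run(self):
        # The Twisted reactor has to run inside this process, so the
        # CrawlerProcess is created here rather than in __init__().
        process = CrawlerProcess(settings={"LOG_LEVEL": "INFO"})
        process.crawl(MySpider, request_queue=self.request_queue)
        process.start()  # blocks until the crawl finishes
```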

Upvotes: 1

Views: 774

Answers (1)

yangmillstheory

Reputation: 1065

So I think the best way is to connect to the signals.spider_idle signal and raise DontCloseSpider from the handler; the reference is in the Scrapy signals documentation.
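A minimal sketch of that approach, assuming the spider is handed something like a multiprocessing.Queue of URLs (the queue wiring and names are illustrative, not part of Scrapy itself):

```python
import scrapy
from scrapy import signals
from scrapy.exceptions import DontCloseSpider


class QueueSpider(scrapy.Spider):
    """Spider that stays alive and pulls new URLs from a shared queue."""

    name = "queue_spider"

    def __init__(self, request_queue=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.request_queue = request_queue  # e.g. a multiprocessing.Queue of URLs

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super().from_crawler(crawler, *args, **kwargs)
        # Ask to be notified whenever the spider runs out of pending requests.
        crawler.signals.connect(spider.handle_idle, signal=signals.spider_idle)
        return spider

    def handle_idle(self):
        # Feed any queued URLs back into the engine...
        while self.request_queue is not None and not self.request_queue.empty():
            url = self.request_queue.get()
            # Note: on Scrapy versions before 2.10, engine.crawl() also takes
            # the spider as a second argument.
            self.crawler.engine.crawl(scrapy.Request(url, callback=self.parse))
        # ...and refuse to close, even if the queue was empty.
        raise DontCloseSpider

    def parse(self, response):
        yield {"url": response.url, "title": response.css("title::text").get()}
```

Because the idle handler raises DontCloseSpider unconditionally, the spider never shuts down on its own; it just waits for the next idle signal and checks the queue again.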

Upvotes: 2
