WildCat

Reputation: 2031

Scrapy 1.0: How to run crawler in Celery?

I tried the example from the documentation at http://doc.scrapy.org/en/stable/topics/practices.html, but it raises a ReactorNotRestartable error when run a second time.

from twisted.internet import reactor, defer
from scrapy.crawler import CrawlerRunner
from scrapy.utils.project import get_project_settings

settings = get_project_settings()
runner = CrawlerRunner(settings=settings)

@defer.inlineCallbacks
def crawl():
    # LatestNewsSpider is defined elsewhere in the project
    yield runner.crawl(LatestNewsSpider)
    reactor.stop()

def run_spider():
    crawl()
    reactor.run()
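
For context, run_spider() is invoked from a Celery task along these lines (the app name and broker URL are placeholders):

from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')  # placeholder broker URL

@app.task
def crawl_task():
    # Succeeds the first time; the second call handled by the same
    # worker process raises ReactorNotRestartable.
    run_spider()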

Upvotes: 3

Views: 1442

Answers (1)

Artur Gaspar

Reputation: 4552

Set CELERYD_MAX_TASKS_PER_CHILD to 1 in your Celery settings. Each child process will then run only one task before being replaced, so the reactor is never started more than once in the same process.
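
A minimal sketch of that configuration (the module name is a placeholder; newer Celery versions spell this setting worker_max_tasks_per_child):

# celeryconfig.py -- placeholder settings module
# Replace each worker child after a single task, so the Twisted
# reactor is started at most once per process.
CELERYD_MAX_TASKS_PER_CHILD = 1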

Alternatively, you could run the reactor in a thread and never stop it. I have no idea whether that would work, but crochet might be of use.
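
Untested, but a sketch of that approach with crochet might look like the following (the timeout value is arbitrary, and LatestNewsSpider is assumed to be importable from your project):

from crochet import setup, wait_for
from scrapy.crawler import CrawlerRunner
from scrapy.utils.project import get_project_settings

setup()  # start the Twisted reactor in a background thread, once per process

@wait_for(timeout=600)
def run_spider():
    # runner.crawl() returns a Deferred; wait_for blocks the calling
    # thread until it fires, without ever stopping the shared reactor.
    runner = CrawlerRunner(settings=get_project_settings())
    return runner.crawl(LatestNewsSpider)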

Upvotes: 1
