freeloader

Reputation: 73

retrieve scraped items from scrapy crawl when triggered via CrawlerRunner

I have 2 spiders in a scrapy project. They work just fine and produce the required output items.

I want to execute these spiders in a background job in a web application.

Everything is set up: a Flask app with a Redis-backed background job, and a frontend that waits for the results. All is well.

Except I can't seem to work out how to get the resulting items from the spiders when they execute.

The closest I've come is the answer to this question:

Get Scrapy crawler output/results in script file function

but it seems to refer to an older version of Scrapy (I'm using 1.4.0), and I get this deprecation warning:

'ScrapyDeprecationWarning: Importing from scrapy.xlib.pydispatch is deprecated and will no longer be supported in future Scrapy versions. If you just want to connect signals use the from_crawler class method, otherwise import pydispatch directly if needed. See: https://github.com/scrapy/scrapy/issues/1762'

Checking that GitHub issue suggests this approach hasn't worked since around v1.1.0.

So, can anyone tell me how to do this now?

Upvotes: 1

Views: 480

Answers (1)

freeloader

Reputation: 73

Turns out it's pretty easy; it must have been too late at night for me.

Replace

from scrapy.xlib.pydispatch import dispatcher

with

from pydispatch import dispatcher

as the deprecation warning itself clearly says:

"otherwise import pydispatch directly if needed."
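For completeness, here is a sketch of how the corrected import fits into the item-collection pattern from the linked answer. The `ItemCollector` class is plain Python and is my own naming; the wiring in the comments assumes Scrapy 1.4 with `pydispatch` installed, and `MySpider` is a placeholder for your own spider class.

```python
class ItemCollector:
    """Accumulates every item any spider yields during a crawl."""

    def __init__(self):
        self.items = []

    def item_scraped(self, item, response=None, spider=None):
        # Signature matches the arguments Scrapy sends with the
        # item_scraped signal.
        self.items.append(item)


# Wiring inside the background job (requires scrapy and pydispatch):
#
#   from pydispatch import dispatcher            # NOT scrapy.xlib.pydispatch
#   from scrapy import signals
#   from scrapy.crawler import CrawlerRunner
#   from scrapy.utils.project import get_project_settings
#   from twisted.internet import reactor
#
#   collector = ItemCollector()
#   dispatcher.connect(collector.item_scraped, signal=signals.item_scraped)
#   runner = CrawlerRunner(get_project_settings())
#   d = runner.crawl(MySpider)                   # MySpider: your spider class
#   d.addBoth(lambda _: reactor.stop())
#   reactor.run()
#   # collector.items now holds everything the spiders produced
```

The collector itself has no Scrapy dependency, so the same object can gather items from both spiders in one run and hand the list back to the Flask job when the reactor stops.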

Upvotes: 3
