Kurt Peek

Reputation: 57651

How to yield a Scrapy Request to another spider with different settings?

This question is essentially the same as "Pass scraped URL's from one spider to another", but I'd like to double-check whether there really is no 'Scrapy-native' way to do this.

I'm scraping web pages which 99% of the time can be scraped successfully without rendering JavaScript. Sometimes, however, this fails and certain fields are missing. I'd like to write a Scrapy extension with an item_scraped method that checks whether all expected fields are populated and, if not, yields a SplashRequest to a different spider whose custom_settings include the Splash settings (cf. https://blog.scrapinghub.com/2015/03/02/handling-javascript-in-scrapy-with-splash/).

Is there a Scrapy-native way to do this without using an external service (like Redis)?

Upvotes: 1

Views: 394

Answers (1)

Mikhail Korobov

Reputation: 22238

Enabling scrapy-splash only makes SplashRequest work; it does not affect regular scrapy.Request objects (as long as there is no 'splash' key in request.meta).

You can include the Splash settings and still yield scrapy.Request objects; they will be processed without Splash.

Upvotes: 4
