Hatshepsut

Reputation: 6662

Scrapy middleware to replace single request with multiple requests

I want a middleware that will take a single Request and transform it into a generator of two different requests. As far as I can tell, the downloader middleware process_request() method can only return a single Request, not a generator of them. Is there a nice way to split an arbitrary request into multiple requests?
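For reference, the downloader middleware contract is that process_request must return None (to continue processing), a single Response, a single Request, or raise IgnoreRequest; it cannot return an iterable. A minimal sketch of the shape I mean (the class name and URL are placeholders of mine):

from scrapy import Request

class SplitRequestMiddleware:
    def process_request(self, request, spider):
        # The only allowed outcomes here are: return None, return a
        # Response, return a single Request, or raise IgnoreRequest.
        # Returning a generator of Requests is not supported:
        return Request('https://example.com/other')  # one request only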

It seems that the spider middleware method process_start_requests actually runs after the start_requests Requests have already been sent through the downloader. For example, if I set start_urls = ['https://localhost/'] and

from scrapy import Request

def process_start_requests(self, start_requests, spider):
    yield Request('https://stackoverflow.com')

it fails with a ConnectionRefusedError, having first tried (and failed) the localhost request.
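For comparison, here is a minimal sketch of how I understand process_start_requests is normally used, forwarding the original start requests and then adding new ones (the middleware class name is a placeholder):

from scrapy import Request

class ExtraStartRequestsMiddleware:
    def process_start_requests(self, start_requests, spider):
        # pass the original start requests through unchanged
        for r in start_requests:
            yield r
        # then append an extra request of our own
        yield Request('https://stackoverflow.com')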

Upvotes: 2

Views: 1373

Answers (1)

eLRuLL

Reputation: 18799

I don't know what the logic would be behind transforming a request (before it is sent) into multiple requests, but you can still schedule several requests (or even items) from a middleware like this:

from scrapy import Request

def process_request(self, request, spider):
    # schedule the extra requests directly on the engine; they do not
    # pass through this middleware's return value
    for a in range(10):
        spider.crawler.engine.crawl(
            Request(url='myurl', callback=callback_method),
            spider)
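If the goal is to replace the original request rather than add to it, one option is to drop the original after scheduling the replacements by raising IgnoreRequest. A sketch under those assumptions (the class name and fan-out count are mine; note that in newer Scrapy versions engine.crawl may take only the request, without the spider argument):

from scrapy import Request
from scrapy.exceptions import IgnoreRequest

class FanOutMiddleware:
    def process_request(self, request, spider):
        # skip requests we scheduled ourselves, or we would fan out forever
        if request.meta.get('fanned_out'):
            return None
        for i in range(2):
            spider.crawler.engine.crawl(
                Request(url=request.url,
                        callback=request.callback,
                        meta={'fanned_out': True},
                        dont_filter=True),
                spider)
        # drop the original request so only the replacements run
        raise IgnoreRequest()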

Upvotes: 5
