shubham

Reputation: 479

How to receive the response of scrapy.Request() in the same function it is called from?

I am quite new to Scrapy and need the response of a Scrapy Request to be returned to the function that issued it. So far the only solution I have found is the scrapy-inline-requests library.

Is there any native way for this in Scrapy?

For example:

def parse(self, response):
    item = spiderItem()

    # Extract some items here from this response using CSS Selectors
    # ....
    # ....

    # Now extract URL from the response
    new_url = response.css("div.urls::text").get()
    yield scrapy.Request(new_url, callback=self.parse_more)

    # Receive the response from parse_more() here. Is it possible?
    resp = ...  # placeholder: the response I would like to have here

def parse_more(self, response):
    # This function should be able to return the response back to the parse() function for further processing.
    ...

Something like what we can do with the requests library:

response = requests.get(url)

Upvotes: 0

Views: 320

Answers (2)

yeqiuuu

Reputation: 147

Yes, you can do this by defining your parse callback as a coroutine (async def) and awaiting the download inside it. See the Scrapy documentation on coroutines.

For example:

import scrapy
from scrapy.utils.defer import maybe_deferred_to_future


async def parse(self, response):
    item = spiderItem()

    # Extract some items here from this response using CSS Selectors
    # ....
    # ....

    # Now extract URL from the response
    new_url = response.css("div.urls::text").get()
    # The response is awaited below, so no callback is set on the request
    req = scrapy.Request(new_url)
    deferred = self.crawler.engine.download(req)
    # parse() resumes on this line once the download has finished
    resp = await maybe_deferred_to_future(deferred)
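
To see why awaiting does not block the rest of the crawl, here is a minimal sketch of the same idea using only the standard library's asyncio instead of Scrapy and Twisted. The names fake_download, parse, main and the URLs "a" and "b" are hypothetical stand-ins, not Scrapy APIs. Each parse() pauses at its await, the event loop runs the other one in the meantime, and each gets its response back in the same function.

```python
import asyncio


async def fake_download(url, delay):
    # Hypothetical stand-in for a network download; not a Scrapy API.
    await asyncio.sleep(delay)
    return f"response for {url}"


async def parse(url, log):
    log.append(f"start {url}")
    # This coroutine pauses here while the event loop runs other tasks.
    resp = await fake_download(url, delay=0.05)
    # The response is now available in the same function.
    log.append(f"done {url}")
    return resp


async def main():
    log = []
    # Run two parse() coroutines concurrently; results keep argument order.
    results = await asyncio.gather(parse("a", log), parse("b", log))
    return log, results


log, results = asyncio.run(main())
```

Both "start" entries are logged before either "done" entry, which shows that waiting for one response did not hold up the other coroutine.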

Upvotes: 0

Krisz

Reputation: 2264

Since Scrapy is built on top of the Twisted asynchronous library, I don't think it's possible. The callback method is invoked with the HTTP response without blocking the calling thread.

Upvotes: 0
