Reputation: 479
I am quite new to Scrapy and have a requirement where I need the response of a Scrapy Request back in the function it was issued from. So far the only solution I have found is the scrapy-inline-requests library.
Is there a native way to do this in Scrapy?
For example:
def parse(self, response):
    item = spiderItem()
    # Extract some items here from this response using CSS Selectors
    # ....
    # ....
    # Now extract URL from the response
    new_url = response.css("div.urls::text").get()
    yield scrapy.Request(new_url, callback=self.parse_more)
    # Receive the response from parse_more() here. Is it possible?
    resp =

def parse_more(self, response):
    # This function should be able to return the response back to
    # the parse() function for further processing.
Something like what we are able to do with the requests library:

    response = requests.get(url)
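For reference, this is roughly how the scrapy-inline-requests solution I found looks (a sketch based on that library's decorator API; details may differ by version):

    import scrapy
    from inline_requests import inline_requests

    class MySpider(scrapy.Spider):
        name = "example"

        @inline_requests
        def parse(self, response):
            new_url = response.css("div.urls::text").get()
            # Under @inline_requests, yielding a Request resumes this
            # generator with the Response instead of calling a callback
            resp = yield scrapy.Request(new_url)
            # ... continue processing resp here ...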
Upvotes: 0
Views: 320
Reputation: 147
Yes, you can do this by defining your parse callback as a coroutine. See the coroutine support section of the Scrapy documentation.
For example:
from scrapy.utils.defer import maybe_deferred_to_future

async def parse(self, response):
    item = spiderItem()
    # Extract some items here from this response using CSS Selectors
    # ....
    # ....
    # Now extract URL from the response
    new_url = response.css("div.urls::text").get()
    # No callback is needed: the engine downloads the request directly
    # and the response is awaited right here
    req = scrapy.Request(new_url)
    deferred = self.crawler.engine.download(req)
    resp = await maybe_deferred_to_future(deferred)
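Here is a fuller sketch of how this fits into a complete spider (the class name, start URL, and CSS selectors are placeholders, not from the original post):

    import scrapy
    from scrapy.utils.defer import maybe_deferred_to_future

    class DetailSpider(scrapy.Spider):
        name = "detail"
        start_urls = ["https://example.com/listing"]  # placeholder URL

        async def parse(self, response):
            item = {}
            item["title"] = response.css("h1::text").get()
            new_url = response.css("div.urls::text").get()
            if new_url:
                req = scrapy.Request(response.urljoin(new_url))
                deferred = self.crawler.engine.download(req)
                resp = await maybe_deferred_to_future(deferred)
                # resp is a normal Response; use selectors on it directly
                item["detail"] = resp.css("p.detail::text").get()
            yield item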
Upvotes: 0
Reputation: 2264
Since Scrapy is built on top of the Twisted async library, I don't think this is possible. The callback method is invoked with the HTTP response without blocking the calling thread.
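The idiomatic pattern is to finish the work in the callback instead, carrying partial state forward via cb_kwargs (a sketch; the field names are illustrative):

    def parse(self, response):
        item = spiderItem()
        # ... populate item from this response ...
        new_url = response.css("div.urls::text").get()
        # Pass the partially-built item along to the next callback
        yield scrapy.Request(new_url, callback=self.parse_more,
                             cb_kwargs={"item": item})

    def parse_more(self, response, item):
        # Finish populating the item with data from the second response
        # item["more"] = response.css("p.detail::text").get()
        yield item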
Upvotes: 0