jsjc

Reputation: 1023

Scrapy: return the item if a Request errors

I got stuck trying to work out a solution...

My Scrapy spider crawls a site and scrapes some data into an item, then returns a Request built from that data to crawl a second site and complete the item.

The problem is that the second URL sometimes returns an error, in which case the item is never output at all.

How can I carry the item to the errback function?

Thanks in advance.

Upvotes: 1

Views: 1926

Answers (1)

warvariuc

Reputation: 59574

From the docs:

errback (callable) – a function that will be called if any exception was raised while processing the request. This includes pages that failed with 404 HTTP errors and such. It receives a Twisted Failure instance as first parameter.

Try using a lambda that binds the item as a default argument:

        ...
        yield Request(..., errback=lambda failure, item=item: self.on_error(failure, item))

    def on_error(self, failure, item):
        ...
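
The `item=item` default argument is what pins each item to its own errback. Here is a minimal sketch of why that matters, using only the standard library so it runs without Scrapy. The names `on_error` and `build_errbacks` are hypothetical, and a plain string stands in for the Twisted Failure that Scrapy would pass:

```python
from functools import partial


def on_error(failure, item):
    # Hypothetical errback: return a copy of the partial item, marked with the error.
    return {**item, "error": str(failure)}


def build_errbacks(items):
    late, default_arg, with_partial = [], [], []
    for item in items:
        # Late binding: `item` is looked up when the lambda is called,
        # so every lambda ends up seeing the last item of the loop.
        late.append(lambda failure: on_error(failure, item))
        # Default argument: `item` is evaluated now, once per iteration.
        default_arg.append(lambda failure, item=item: on_error(failure, item))
        # functools.partial binds the current item in the same way.
        with_partial.append(partial(on_error, item=item))
    return late, default_arg, with_partial


late, default_arg, with_partial = build_errbacks([{"id": 1}, {"id": 2}])
print(late[0]("boom"))          # carries the wrong (last) item
print(default_arg[0]("boom"))   # carries the first item
print(with_partial[0]("boom"))  # carries the first item
```

If requests are yielded in a loop, a bare `lambda failure: ...` would hand every errback the same item, which is why the default argument (or `functools.partial`) is the safer form. Newer Scrapy versions also expose the original request as `failure.request`, so data stored in the request's `meta` or `cb_kwargs` can be read from there instead.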

Upvotes: 4
