Vishnubly

Reputation: 739

Retry in case of Data Loss in Scrapy

I am using Scrapy for web scraping, but I am getting a data loss warning for a few requests. Each time I run the same spider, the data loss error appears on different URLs, so I believe those requests just need to be retried. Does anyone know how I can do that? I am getting the following warning a few times:

 [scrapy.core.downloader.handlers.http11] WARNING: Got data loss in <failed link>  If you want to process broken responses set the setting DOWNLOAD_FAIL_ON_DATALOSS = False -- This message won't be shown in further requests

Upvotes: 1

Views: 1173

Answers (1)

Poiuy

Reputation: 361

Just as the warning message says, you will need to configure Scrapy to handle failed downloads. The settings reference is a great resource for this, depending on how you decide to run or configure your program:

https://docs.scrapy.org/en/latest/topics/settings.html

As long as the servers are not misconfigured and these are temporary issues, you can set the RETRY_ENABLED flag to True and the DOWNLOAD_FAIL_ON_DATALOSS flag to False in order to keep broken responses from failing the crawl.
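A minimal sketch of what that could look like in your project's settings.py (the setting names `RETRY_ENABLED` and `DOWNLOAD_FAIL_ON_DATALOSS` are from the Scrapy docs; the retry count shown is an illustrative value, not a recommendation):

```python
# settings.py (sketch)

# Retry failed downloads. Note that retrying is enabled by default in
# Scrapy, so this line mostly documents intent.
RETRY_ENABLED = True
RETRY_TIMES = 3  # retries in addition to the first attempt (illustrative)

# Don't fail the request outright on a partial/truncated response;
# Scrapy will pass the (possibly broken) response on to your spider
# callback instead of raising an error.
DOWNLOAD_FAIL_ON_DATALOSS = False
```

If you only want this behavior for one spider rather than the whole project, you can set the same keys in that spider's `custom_settings` dict instead.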

Upvotes: 1
