Reputation: 739
I am using Scrapy for web scraping, but I am getting a data loss warning for a few requests. Each time I run the same spider, the warning appears for different URLs, so I believe these requests just need to be retried. Does anyone know how I can do that? This is the warning I get a few times:
[scrapy.core.downloader.handlers.http11] WARNING: Got data loss in <failed link> If you want to process broken responses set the setting DOWNLOAD_FAIL_ON_DATALOSS = False -- This message won't be shown in further requests
Upvotes: 1
Views: 1173
Reputation: 361
Just as the error message says, you will need to configure Scrapy to handle failed downloads. The settings reference is a great resource for doing so, depending on how you decide to run or configure your program:
https://docs.scrapy.org/en/latest/topics/settings.html
As long as the servers are not misconfigured and these are only temporary issues, you can set the RETRY_ENABLED flag to True and the DOWNLOAD_FAIL_ON_DATALOSS flag to False in order to retry failed scrapes.
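For example, a minimal sketch of how these settings might look, either project-wide in settings.py or scoped to one spider via its custom_settings attribute (the spider name and the RETRY_TIMES value below are illustrative assumptions, not taken from the question):

# settings.py -- project-wide configuration (values are a sketch)
RETRY_ENABLED = True                 # retry middleware on (this is Scrapy's default)
RETRY_TIMES = 2                      # maximum retries per request (Scrapy's default is 2)
DOWNLOAD_FAIL_ON_DATALOSS = False    # process broken (truncated) responses instead of failing

# Alternatively, scope it to a single spider with custom_settings:
import scrapy

class MySpider(scrapy.Spider):       # "MySpider" is a hypothetical example name
    name = "my_spider"
    custom_settings = {
        "RETRY_ENABLED": True,
        "DOWNLOAD_FAIL_ON_DATALOSS": False,
    }

    def parse(self, response):
        pass  # your parsing logic here

Note that RETRY_ENABLED is already True by default, so in practice the setting that changes behaviour here is DOWNLOAD_FAIL_ON_DATALOSS.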
Upvotes: 1