Reputation: 3518
I'm using the Requests-Cache library to cache results from Requests. It appears to install a cache just fine; requesting a URL creates a .sqlite
cache file, and subsequent requests retrieve that data, even if the remote page changes.
My internet connection is rather poor today, and I noticed my script (which makes many supposedly cached requests) was running slowly. As a quick sanity check, I wrote a test script that builds a cache, then ran it again after disconnecting my computer from wifi. However, it errors out:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='magicplugin.normalitycomics.com', port=80): Max retries exceeded with url: /update/updatelist.txt (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x110390d68>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
Why is the request even trying to connect to the remote site if Requests-Cache is redirecting it to use the local cached data? Is there a way to avoid this? I don't want to slow down my script (particularly when my connection is poor) by making unnecessary requests to the server.
Upvotes: 5
Views: 1515
Reputation: 3518
I figured it out!
My actual code makes requests that sometimes successfully get pages, and sometimes get a 404.
The only reason my simple test script replicated the problem was that I had made a typo in the URL I was requesting, so Requests received a 404. Even though Requests-Cache created a cache file, it did not store this result in it.
It turns out that, by default, Requests-Cache only caches responses with a 200 status code, but this is configurable:
requests_cache.install_cache('example_cache', allowable_codes=(200, 404))
And now it works fine!
Upvotes: 5