In requests_cache, does the sqlite file contain the cached request time and age?

Question

Consider the following code:

import requests
import requests_cache

requests_cache.install_cache(expire_after=7200)

url = 'http://www.example.com'
with requests.Session() as sess:
    response = sess.get(url)
    print response.text

First run

When I first run this code, I am sure that the GET request is sent out to www.example.com, since no cache has been set up yet. I will then see a file named cache.sqlite in the working directory, which contains the request being cached inside it.

The first process will then exit, erasing all traces of it from RAM.

Second run, maybe 2000 seconds later

What else does requests_cache.install_cache do? Aside from "installing" a cache, does it also tell the present Python session that "Hey, there's a cache present right now, you might want to look into it before sending out new requests".

So, my question is, does the new instance of my script process respect the existing cache.sqlite or does it create an entirely new one from scratch?

If not, how do I make sure that it will look up the existing cache first before sending out new requests, and also consider the age of the cached requests?

Jordan · Accepted Answer

Here's what's going on under the hood:

requests_cache.install_cache() globally patches out requests.Session with caching behavior.
install_cache() takes a number of optional arguments to tell it where and how to cache responses, but by default it will create a SQLite database in the current directory, as you noticed.
A cached response will be stored along with its expiration time, in response.expires
The next time you run your script, install_cache() will load the existing database instead of making a new one
The next time you make that request, the expiration time will be checked against the current time. If it's expired, a new request will be sent and the old cached response will be overwritten with the new one.

Here's an example that makes it more obvious what's going on:

from requests_cache import CachedSession

session = CachedSession('~/my_project/requests_cache.db', expire_after=7200)
session.get('http://www.example.com')
session.get('http://www.example.com')

# Show if the response came from the cache, when it was created, and when it expires
print(f'From cache: {response.from_cache}')
print(f'Created: {response.created_at}')
print(f'Expires: {response.expires}')

# You can also get a summary from just printing the cached response object
print(response)

# Show where the cache is stored, and currently cached request URLs
print(session.cache.db_path)
for url in session.cache.urls:
    print(url)

And for reference, there is now more thorough user documentation that should answer most questions about how requests-cache works and how to make it behave the way you want: https://requests-cache.readthedocs.io

In requests_cache, does the sqlite file contain the cached request time and age?

First run

The first process will then exit, erasing all traces of it from RAM.

Second run, maybe 2000 seconds later

Answers (1)

Related Questions