Reputation: 5578
I have to put a caching service in front of a very slow API. The API responds in around 40 seconds, which is way too long for my users. I want to cache responses using something like Varnish.
Here is my problem:
When the cache service gets hit for the first time, it will take around 40 seconds to get a response from the upstream API. All subsequent requests will be served directly from cache. When the cache TTL has expired, the cache service has to hit the slow API again and wait 40 seconds, which is not acceptable. Is there a way to avoid this with some sort of asynchronous background cache update? If so, can Varnish do it?
For simplicity, let's assume that all the client requests are the same.
The situation is actually much worse if the caching service is being hit hundreds of times per second. A 40-second wait to refresh the cache will queue up thousands of client requests, which may cause various problems, including dropped connections. I assume that Varnish is smart enough to fire the upstream API call only once and queue the other requests until it gets a response.
Does Varnish, or any alternative, keep a Last Known Good Copy (LKGC)? If my slow API has gone down and is not responding at all, is it possible to serve the LKGC from cache even after it has expired?
What is the best software to achieve all this?
Upvotes: 0
Views: 425
Reputation: 9855
Varnish absolutely supports this through grace mode, which takes a bit of VCL to enable. In grace mode, once an object's TTL has expired, Varnish keeps serving the stale copy to clients while it fetches a fresh one from the backend in the background.
Thus, only the initial request will be slow. Clients will get refreshed responses without any delay.
Basically, you want to set the "healthy backend grace period" to at least the maximum time that your backend takes to generate a response. Example:
# std.healthy() comes from the std VMOD, so it must be imported.
import std;

# Varnish 4.0 syntax; Varnish 4.1 renamed return (fetch) in vcl_hit to return (miss).
sub vcl_hit {
    if (obj.ttl >= 0s) {
        # Normal hit: the object is still fresh.
        return (deliver);
    }
    # We have no fresh fish. Let's look at the stale ones.
    if (std.healthy(req.backend_hint)) {
        # Backend is healthy. Limit age to 45s.
        if (obj.ttl + 45s > 0s) {
            set req.http.grace = "normal(limited)";
            return (deliver);
        } else {
            # No candidate for grace. Fetch a fresh object.
            return (fetch);
        }
    } else {
        # Backend is sick: use full grace.
        if (obj.ttl + obj.grace > 0s) {
            set req.http.grace = "full";
            return (deliver);
        } else {
            # No graced object.
            return (fetch);
        }
    }
}
In this example, I've set it to 45 seconds. This makes sure to cover the 40 seconds it might take for the slow API request.
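The vcl_hit logic above can only serve a stale object if Varnish still has it, and std.healthy() only reports something meaningful if the backend has a health probe. Here is a minimal sketch of those two pieces, meant to sit in the same VCL file. The host, port, probe URL, TTL and grace values are all assumptions for illustration, not taken from your setup:

```vcl
# Hypothetical backend definition: host, port and probe URL are placeholders.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
    # Without a probe, Varnish always treats the backend as healthy.
    # Point the probe at an endpoint that answers quickly, not the slow API call.
    .probe = {
        .url = "/health";
        .interval = 5s;
        .timeout = 2s;
        .window = 5;
        .threshold = 3;
    }
}

sub vcl_backend_response {
    # Illustrative values: treat the response as fresh for 5 minutes...
    set beresp.ttl = 5m;
    # ...and keep it for up to 6 more hours as a stale fallback.
    # This is the obj.grace that vcl_hit checks when the backend is sick.
    set beresp.grace = 6h;
}
```

With something like this in place, the long beresp.grace acts as your Last Known Good Copy window: while the backend is marked sick, the "full grace" branch keeps delivering the expired object until the grace period runs out.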
Upvotes: 1