Nic Cottrell

Reputation: 9665

Is it possible for Varnish to examine the content of a response (not just headers) in vcl_fetch and react?

I know that the default Varnish vcl_fetch looks at beresp.ttl and beresp.http.* to reference the HTTP headers returned from the backend, but is it possible to examine the content of the response also? Our backend sometimes fails with junk HTML but with a status of 200 OK. We'd like to be able to run a regex on the result and retry if possible.

I understand that versions of Varnish <= 3.0 don't stream anyway and download the entire object before passing it to the client, but I can't find the appropriate field in beresp in the documentation. I'm looking for something like beresp.http.content.

Upvotes: 2

Views: 607

Answers (1)

lexnihilo

Reputation: 31

Yes and no. It's accessible, but only through inline C, not VCL configuration (to the best of my knowledge). However, it's not easy to do and not really recommended due to the additional overhead of parsing body text. That said, you can see an attempt at something like what you're looking for here: rewrite vmod for varnish 3

If your junk HTML responses are of a specific length, you can retry the request based on the response's Content-Length header. Alternatively, you might consider adding client-side JS to evaluate the HTML and make an AJAX request to a URL that clears the cache of any junk pages. Lastly, if you know that only a specific subset of your site returns invalid results, you can try proxying those URLs through something like OpenResty with LuaJIT or nginx with the subs module enabled, and do the body parsing there.
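
As a rough illustration of the Content-Length idea, here is a minimal sketch in Varnish 3 VCL. The length "512" and the retry limit of 2 are placeholders I made up, not values from the question; substitute whatever size your junk pages actually have. It also assumes the backend sends a Content-Length header at all, which will not be the case for chunked responses.

```vcl
sub vcl_fetch {
    # Assumption: junk pages come back as 200 OK with a known, fixed size.
    # "512" is a hypothetical value; replace it with the size you observe.
    if (beresp.status == 200
        && beresp.http.Content-Length == "512"
        && req.restarts < 2) {
        # Discard this response and run the request again from vcl_recv.
        return (restart);
    }
}
```

The req.restarts check keeps a persistently broken backend from causing a restart loop; Varnish also enforces its own cap through the max_restarts parameter.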

Upvotes: 2
