Reputation: 9665
I know that the default Varnish vcl_fetch
looks at beresp.ttl
and beresp.http.*
to reference the HTTP headers returned from the backend, but is it possible to examine the content of the response also? Our backend sometimes fails with junk HTML but with a status of 200 OK. We'd like to be able to run a regex on the result and retry if possible.
I understand that versions of Varnish <= 3.0 don't stream anyway and download the entire object before passing to the client, but I can't find the appropriate field in beresp
in the documentation - I'm looking for something like beresp.http.content
Upvotes: 2
Views: 607
Reputation: 31
Yes and no. It's accessible, but only through inline C, not VCL configuration (to the best of my knowledge). However, it's not easy to do and not really recommended due to the additional overhead of parsing body text. That said, you can see an attempt at something like what you're looking for here: rewrite vmod for varnish 3
If your junk HTML responses are of a specific length, you can retry the request based on the response's Content-Length header. Alternatively, you might consider adding client-side JS to evaluate the HTML and make an AJAX request to a URL to clear the cache of any junk pages. Lastly, if you know that only a specific subset of your site that returns invalid results, you can try proxying those URLs through something like OpenResty with LuaJIT or nginx with the subs module enabled, and do the body parsing there.
Upvotes: 2