Recently I added a Varnish instance to a Rails application stack. Varnish in it's default configuration can be convinced from caching a certain resource using the Cache-Control Header like so:
Cache-Control: max-age=86400, public=true
I achieved that one using the expires_in statement in my controllers:
def index
expires_in 24.hours, public: true
respond_with 'some content'
end
That worked well. What I did not expect is, that the Cache-Control header ALSO affects the browser. That leads to the problem that both - Varnish and my users browser cache a certain resource. The resource is purged from varnish correctly, but the browser does not attempts to request it again unless max-age is reached.
So I wonder wether I should use 'expires_in' in combination with Varnish at all? I could filter the Cache-Control header in a Nginx or Apache instance in front of Varnish, but that seems odd.
Can anyone enlighten me?
Regards Felix
That is actually a very good and valid question, and a very common one with reverse proxies.
The problem is that there's only one Cache-Control property and it is intended for the client browser (private cache) and/or a proxy server (shared cache). If you don't want 3rd party proxies to cache your content at all, and want every request to be served by your Varnish (or by your Rails backend), you must send appropriate Cache-Control header from Varnish.
Modifying Cache-Control header sent by the backend is discussed in detail at https://www.varnish-cache.org/trac/wiki/VCLExampleLongerCaching
You can approach the solution from two different angles. If you wish to define max-age at your Rails backend, for instance to specify different TTL for different objects, you can use the method described in the link above.
Another solution is to not send Cache-Control headers at all from the backend, and instead define desirable TTLs for objects in varnish vcl_fetch(). This is the approach we have taken.
We have a default TTL of 600 seconds in Varnish, and define longer TTLs for pages that are definitely explicitly purged when changes are made. Here's our current vcl_fetch() definition:
sub vcl_fetch {
if (req.http.Host ~ "(forum|discus)") {
# Forum pages are purged explicitly, so cache them for 48h
set beresp.ttl = 48h;
}
if (req.url ~ "^/software/") {
# Software pages are purged explicitly, so cache them for 48h
set beresp.ttl = 48h;
}
if (req.url ~ "^/search/forum_search_results" ) {
# We don't want forum search results to be cached for longer than 5 minutes
set beresp.ttl = 300s;
}
if(req.url == "/robots.txt") {
# Robots.txt is updated rarely and should be cached for 4 days
# Purge manually as required
set beresp.ttl = 96h;
}
if(beresp.status == 404) {
# Cache 404 responses for 15 seconds
set beresp.http.Cache-Control = "max-age=15";
set beresp.ttl = 15s;
set beresp.grace = 15s;
}
}
In our case we don't send Cache-Control headers at all from the web backend servers.