Prefetching? At This Age?

A while back, I wrote about using Netlify and SpeedCurve to A/B test performance changes. The one I specifically mentioned was testing out Instant.Page on my site.

While the bulk of the post is about the A/B testing setup (which I am very happy with), I did note at the end that I was seeing some small improvements from Instant.Page, though the results were far from conclusive yet.

Alexandre, the creator of Instant.Page, suggested on Twitter that the gains I was seeing were small because Netlify passes an Age header that messes with prefetching.

Tim, Netlify sends a Age header that conflicts with prefetching. A prefetched page will get fetched again on navigation if its Age header is over 300. The small gain you are seeing are due to the navigation request being a 304 and not a 200.

This led me down an interesting little rabbit hole and, eventually, to a bug. I learned a few new things as I dug in, so I figured it was worth sharing for others as well (and for me to come back to when I inevitably forget the details).

First, before we dive in, let’s zero in on the critical components of what’s happening on my site specifically.

For all HTML responses, I pass a Cache-control: max-age=900, must-revalidate header. This tells the browser to cache the response for 15 minutes (900/60). After that, it has to revalidate: basically, talk to the server again to make sure the asset is still valid and a newer version isn’t available. As soon as the resource is revalidated, the 15 minutes starts over.
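To make the arithmetic concrete, here is a minimal sketch of pulling the directives out of that header value. The `parse_cache_control` helper is hypothetical and deliberately simplified (it ignores quoted values, for instance); it is not how any browser actually parses the header.

```python
from typing import Dict, Union


def parse_cache_control(value: str) -> Dict[str, Union[int, str, bool]]:
    """Split a Cache-Control value into a dict of directives (simplified)."""
    directives: Dict[str, Union[int, str, bool]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        # "max-age=900" has an argument; "must-revalidate" does not.
        name, _, arg = part.partition("=")
        name = name.strip().lower()
        arg = arg.strip()
        if arg.isdigit():
            directives[name] = int(arg)
        else:
            directives[name] = arg or True
    return directives


directives = parse_cache_control("max-age=900, must-revalidate")
print(directives)                  # {'max-age': 900, 'must-revalidate': True}
print(directives["max-age"] / 60)  # 15.0 (minutes)
```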

Netlify also passes along an ‘Age’ header, indicating how long they’ve been caching the resource themselves (more on that in a bit). So, for example, if they’ve had the resource on their servers for 14 minutes, that would look like this:

age: 840

And finally, as a recap, Instant.Page works by using the prefetch resource hint to fetch links early, when someone hovers over the link, instead of waiting for the next navigation to start.

Now let’s dive into each part of that and how they fit together.

The Age Header

The ‘Age’ header is used by upstream caching layers (Varnish, CDNs, other proxies, etc.) to indicate how long it’s been since a response was either generated or validated at the origin server. In other words, it tells you how long that resource has been sitting in that upstream cache.

It’s not just something that Netlify does—open just about any site and you’ll find resources with the ‘Age’ header set. That’s because if you’ve got something sitting between your origin and the browser caching your content, setting the ‘Age’ header is exactly what you’re supposed to be doing. It’s important information.

Let’s say you’re using a CDN to cache content on their edge servers instead of making visitors wait while assets are requested from wherever your origin server resides. The first time a resource is requested, the CDN is going to have to go out and make a connection to your origin server to get it. At that point, since the CDN just got the resource from the origin server, the age is ‘0’.

Then, depending on what you’ve set up at the CDN level and assuming the CDN can cache the resource, the CDN will start serving that resource as it’s requested without talking to the origin again. As it does this, the age of the resource gets older and older.

Eventually, the CDN needs to talk to the origin server again.

Let’s say your CDN is set to cache a resource for 15 minutes before it needs to validate that the resource is still fresh. After 15 minutes, the CDN talks to the origin and will either get a new version of that resource or verification that the resource is still valid. At that point, the age of the resource resets to ‘0’—we’ve got a fresh start since we know what we have on the CDN is the latest version.
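The lifecycle above can be sketched as a tiny model. `EdgeCacheEntry` is a hypothetical class, not any CDN’s real implementation, and it assumes revalidation is instantaneous and happens only when a request arrives after the TTL has run out.

```python
from dataclasses import dataclass


@dataclass
class EdgeCacheEntry:
    """One cached response on a hypothetical CDN edge node."""

    validated_at: float  # seconds; when the origin last produced or confirmed it
    ttl: float = 900.0   # seconds the edge may serve it without asking the origin

    def age(self, now: float) -> int:
        """Value the edge would send in the Age header at time `now`."""
        return int(now - self.validated_at)

    def serve(self, now: float) -> int:
        """Serve a request, revalidating with the origin first if the TTL is up."""
        if now - self.validated_at >= self.ttl:
            # A new copy or a "still valid" answer both reset the clock.
            self.validated_at = now
        return self.age(now)


entry = EdgeCacheEntry(validated_at=0.0)
ages = [entry.serve(t) for t in (0, 300, 840, 900, 960)]
print(ages)  # [0, 300, 840, 0, 60]
```

The Age climbs while the edge keeps serving its copy, then drops back to 0 at the 15 minute mark when the edge checks in with the origin.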

The browser’s primary mechanisms for determining what to cache and for how long are headers like Expires (which provides an expiration date for the resource being served), Cache-control (a ton of stuff here, but the directive that matters for duration is max-age), Last-Modified (the date at which the resource was last modified), and ETag (a unique version identifier for the object). (For more detail on all of those, Paul Calvano’s post on Heuristic Caching and Harry’s post about Cache-Control are both top-notch resources.)

Age, too, factors in.

Let’s say that your CDN is set to cache a resource for 15 minutes, and you’ve also told the browser to cache that resource for 15 minutes using the Cache-control header (Cache-control: max-age=900, must-revalidate). With two layers of caching, each at 15 minutes, that means we have a potential Time to Live (the time a resource is stored in a cache before it’s deleted or updated) of up to 30 minutes—if the browser requests the resource just before the CDN’s version expires, then it’s been sitting in cache for 15 minutes on the CDN and will sit in the browser cache for another 15 minutes—so 30 minutes total.

For any sort of remotely dynamic content, this could be problematic. If the content changes in that upstream cache, we could still be serving an old version of the resource for 15 more minutes until the browser cache expires.

The Age header helps protect against this.

Let’s go back to our example, where the browser requests the resource just before the CDN’s version expires. Only this time, let’s say the CDN communicates how long it’s had the asset in cache by providing an Age header of 840 (14 minutes). The browser knows from the max-age directive that it’s ok to serve an asset that is 15 minutes old, and it knows that the asset has been sitting on the CDN for 14 minutes. So, the browser adjusts the TTL to 1 minute (15 minutes of browser TTL minus 14 minutes it’s already been on the CDN), protecting against this problem of cache layers stacking on top of each other.
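Here is that calculation as a minimal sketch. Both function names are made up for illustration, and the model is simplified: real browsers also factor in things like the Date header and how long the response took to arrive.

```python
def remaining_freshness(max_age: int, age: int = 0) -> int:
    """Seconds the browser may still reuse a response: max-age minus Age."""
    return max(0, max_age - age)


def worst_case_total_ttl(time_on_cdn: int, max_age: int, age_header_sent: bool) -> int:
    """Seconds since the origin last vouched for the copy when the browser lets go of it."""
    age = time_on_cdn if age_header_sent else 0
    return time_on_cdn + remaining_freshness(max_age, age)


print(remaining_freshness(900, age=840))                      # 60
print(worst_case_total_ttl(840, 900, age_header_sent=False))  # 1740 (29 minutes)
print(worst_case_total_ttl(840, 900, age_header_sent=True))   # 900 (15 minutes)
```

Without Age, the two cache layers stack to almost half an hour. With it, the total stays capped at the 15 minutes the max-age directive asked for.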

This can all get a bit funky if the max-age directive you’re passing to the browser doesn’t align with how long you’re caching the resource upstream.

For example, if you’re telling your CDN to cache a file for a week, but you’re only telling the browser to cache that resource for 15 minutes, then as soon as the Age of that resource exceeds 900 (15*60) the browser will no longer consider that resource safe to cache. Every time it sees the response, it will note that the age is past the maximum TTL it’s been told to pay attention to, so it goes back out to the servers to try to find a new version.
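To get a feel for how lopsided that mismatch is, here is a rough sketch. It assumes the Age on the CDN climbs steadily from 0 up to the CDN’s TTL and then resets; the helper names are hypothetical.

```python
WEEK = 7 * 24 * 60 * 60  # 604800 seconds


def browser_can_cache(age: int, max_age: int) -> bool:
    """True while the Age reported by the CDN is still under the browser's max-age."""
    return age < max_age


def cacheable_fraction(cdn_ttl: int, max_age: int) -> float:
    """Share of the CDN's TTL during which a fresh fetch is still cacheable by the browser."""
    return min(max_age, cdn_ttl) / cdn_ttl


print(browser_can_cache(age=600, max_age=900))    # True
print(browser_can_cache(age=3600, max_age=900))   # False
print(f"{cacheable_fraction(WEEK, 900):.2%}")     # 0.15%
```

Under those assumptions, the browser can cache the file for only the first 15 minutes of each week-long cycle on the CDN.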

There are times when having mismatched TTLs at a caching layer and at the browser may make sense. It’s pretty quick to purge the cache (basically, empty it out) for most CDNs. So sometimes what you’ll see is folks set a long TTL at the CDN layer and a short one at the browser level. Then, if the content does need to change, they can purge the CDN cache quickly and all they have to wait for is the browser to get past whatever short TTL they’ve set there. In those cases, it makes sense from a performance standpoint not to pass the Age header so that the browser can keep caching.

How prefetch works

When you use the prefetch resource hint (which is what Instant.Page does), you’re telling the browser to go grab that resource even though it hasn’t been requested by the current page, and put it into cache.

For example, the following tells the browser to grab the about page and store it.

<link rel="prefetch" href="/about" as="document" />

The browser will request the resource at a very low priority during idle time so that the resource doesn’t compete with anything needed for the current navigation.

As with any request that gets cached, how long it’s cached depends on the caching headers. But with prefetch, there’s an added wrinkle.

The entire point of prefetch is to have something stored for the next navigation: making a prefetch for a resource that expires before that next navigation is wasted work and wasted bytes.

For this reason, Chromium-based browsers have a period of five minutes where they’ll cache any prefetched resources regardless of any other caching indicators (unless no-store has explicitly been set in the Cache-control header). After that window has expired, the normal Cache-control directives kick in, minus that initial window.

In my case, I serve HTML documents with a max-age of 15 minutes. That means Chrome will save that prefetched resource for 15 minutes so this 5 minute window doesn’t really do anything special.

But if you served an asset with a max-age of 0, then Chrome is still going to hold that resource for 5 minutes before having to revalidate it. The main takeaway here is that to avoid wasted work, the browser ignores the usual indicators of freshness for a period of time.

Firefox, on the other hand, does not have this little extra window for prefetched resources—it treats them like any other cached object, paying attention to the caching headers as normal. So, if (for example) the max-age is 0 for a prefetched resource, Firefox will make the request as directed using prefetch and then make the request again once it discovers it on the next navigation.
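The difference between the two browsers can be sketched as a pair of decision functions. This is a simplified model of the behavior described above, not code from either browser: the function names are hypothetical, Age is left out entirely, and after the five-minute window the sketch simply falls back to the plain max-age check.

```python
PREFETCH_WINDOW = 300  # seconds: the five-minute window described above


def fresh_by_headers(max_age: int, seconds_since_prefetch: int) -> bool:
    """Ordinary freshness check (Age is ignored to keep the sketch small)."""
    return seconds_since_prefetch < max_age


def chromium_reuses_prefetch(max_age: int, seconds_since_prefetch: int,
                             no_store: bool = False) -> bool:
    """Chromium-style: anything prefetched is reusable inside the window."""
    if no_store:
        return False
    if seconds_since_prefetch < PREFETCH_WINDOW:
        return True
    return fresh_by_headers(max_age, seconds_since_prefetch)


def firefox_reuses_prefetch(max_age: int, seconds_since_prefetch: int,
                            no_store: bool = False) -> bool:
    """Firefox-style: a prefetched response is treated like any other cached one."""
    if no_store:
        return False
    return fresh_by_headers(max_age, seconds_since_prefetch)


# A page served with max-age=0, navigated to 30 seconds after the prefetch.
print(chromium_reuses_prefetch(max_age=0, seconds_since_prefetch=30))  # True
print(firefox_reuses_prefetch(max_age=0, seconds_since_prefetch=30))   # False
```

A `False` here means the navigation triggers a second request for the same page.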

Bringing it all together

Phew. Ok. So we know what the Age header does, we know how the browser uses it to determine caching, and we know that Chromium-based browsers ignore all the usual freshness indicators when it comes to prefetch, at least for a short period of time, and Firefox does not.

All of this means that in Firefox, if the Age exceeds the max-age directive, then the prefetched resource is going to result in two requests: once for the actual prefetch and, because the asset is older than the TTL, once again on the next navigation.

In Chromium-based browsers, it seems like Age shouldn’t impact prefetch behavior at all. If Chromium ignores other caching directives, why is Age any different? It seems like a bug.

Which is exactly the conclusion Yoav came to:

To clarify, sounds like a Chromium bug. Sending Age headers for cached resources is what caches are supposed to do. And indeed, the 5 minutes calculation includes the Age header, which IMO makes little sense https://source.chromium.org/chromium/chromium/src/+/master:net/http/http_cache_transaction.cc;l=2716;drc=2f11470d7ad8963a9add116df64d2edd1b85d3a4;bpv=1;bpt=1?originalUrl=https:%2F%2Fcs.chromium.org%2F

The bug is the source of what Alexandre was noting. Since Age is being included in the prefetch caching considerations, any prefetched resource in Chrome with an Age higher than either that 5 minute window or the max-age (whichever is longer) can’t be cached, so the request happens twice: once on prefetch and once on the next navigation.
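As a rough illustration, the sketch below contrasts the behavior described here with what one might expect instead. Neither function is taken from Chromium’s source. The "expected" version in particular is only an assumption about what a prefetch window that ignores Age would look like.

```python
PREFETCH_WINDOW = 300  # seconds: Chromium's five-minute prefetch window


def reusable_with_age_counted(age: int, max_age: int, seconds_since_prefetch: int) -> bool:
    """Behavior as described: Age counts against the window and the max-age."""
    current_age = age + seconds_since_prefetch
    return current_age < max(PREFETCH_WINDOW, max_age)


def reusable_with_age_ignored(age: int, max_age: int, seconds_since_prefetch: int) -> bool:
    """Assumed expectation: the window is measured from the prefetch itself."""
    if seconds_since_prefetch < PREFETCH_WINDOW:
        return True
    return age + seconds_since_prefetch < max_age


# Prefetched page with max-age=900 whose copy has sat upstream for 20 minutes,
# navigated to 10 seconds after the prefetch.
print(reusable_with_age_counted(age=1200, max_age=900, seconds_since_prefetch=10))  # False
print(reusable_with_age_ignored(age=1200, max_age=900, seconds_since_prefetch=10))  # True
```

With Age counted, a page that was prefetched 10 seconds ago is already treated as too old, which is where the second request comes from.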

In my specific case, while the bug’s behavior is definitely not ideal, it also doesn’t jump out in the metrics in aggregate because of my service worker. When the request gets prefetched, the service worker caches it. On that next navigation, the request gets made again, but the service worker has it at the ready, which accounts for why I’m seeing some performance improvements even with the bug.

Now, if we ignore the prefetch-specific issues here, we do still have an issue with the way Netlify handles the Age header. Netlify is, interestingly, both the CDN and the origin here. Typically, whenever the CDN has to revalidate that a resource is still fresh with the origin, it will reset the Age header back to 0.

In this case, because Netlify essentially is our origin, there’s no other layer somewhere for Netlify to revalidate with. The buck stops here, or something like that.

By passing the Age header along, and only updating it when the content is changed or cache is explicitly cleared, Netlify creates a situation where the browser will always have to go back to the server (Netlify) to see if the resource is fresh, regardless of that max-age window. The only way around this is to set a very long max-age or make sure to clear your Netlify cache on a semi-regular basis.

I suspect Netlify shouldn’t be passing the Age header down at all. Or, if that header is being applied at their edge layer (I’m not 100% clear on their architecture), then whenever their edge layer has to revalidate with the original source, they should be updating that Age at that point to avoid the issue of an ever-increasing Age.

Where do we go from here?

So, how do we make sure that our prefetched resources are as performant as possible?

First things first: measure. I tried to emphasize this in my last post, but the data about the impact of this approach on my site says nothing about the impact on other sites. In my situation, I’m seeing a small improvement in most situations even with the bug in place. Your mileage may vary. Testing performance changes is good.

From the Chromium side of things, don’t worry about it. Yoav was all over it, and a fix has already landed.

Firefox, however, is another story. It seems they’ve been contemplating making this change for a while now, so it’s a matter of prioritizing the work. In the meantime, there are a few things to keep in mind.

One, if you have a service worker in place and you’re using an approach where the service worker serves from the cached version first, that helps to offset the double request penalty you might otherwise pay. The first request puts it in the service worker cache, the second gets pulled from there before it has to go any further.

If you don’t have a service worker in place, then you’re going to have to make a decision regarding the Age header.

If you don’t pass the Age header, then Firefox can cache the resource according to your cache headers regardless of whether the age of the resource on the CDN (or proxy) is longer than the max-age communicated to the browser, but it does introduce the risk of extending the total TTL as we saw above. If your max-age directive is set to a short duration and you can quickly purge the upstream cache, you reduce the pain here a little.

If you do pass the Age header along, you avoid longer total TTL issues, but you now risk issuing double requests for every prefetched resource as the age of the cached resource gets older. If the resource changes frequently in the upstream cache, or if you are passing a long max-age directive to the browser, the severity of this risk is reduced a little.

In the end, this comes down to a combination of what services and tools you’re using for those upstream caches, and how frequently your prefetched resources may change.