
Cache Me, Maybe? Shopify Caches and TTFB

If your Shopify store has moments where the page feels like it has "stalled", it might not be you. It might be a subtle misstep in a crucial piece of infrastructure between your customer and your website - the Shopify caches. If your site is returned not by the high-speed global network of "server caches" but by the Shopify backend itself, you might have a problem.

For the past few months, we have been tracking cache miss events on Shopify based stores and how they impact performance. The short story is this - when you update your theme or otherwise cause the Shopify servers to invalidate your cache (so customers see the new content), you may pay with a 2-20x increase in the time it takes the page to respond.

Read on for more details.

Server cache misses are the silent killers of page performance.

Jake Loveless, Co-Founder @ Edgemesh

Let's Talk About Server Caches 

Caching servers have, basically, one job: return a copy of the requested page (as fast as possible), and if they do not have the latest version, go get it from the backend server and then return that. When the caching server has the page locally, we call that a cache hit, and the page returns really quickly. But when the server doesn't have the page, it needs to go get it. This is called a cache miss, and it's (sometimes painfully) slow.
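To make the hit/miss idea concrete, here is a toy read-through cache in Python. It is only a sketch of the general pattern, not how Shopify's caches are actually built: the `ReadThroughCache` class, the `slow_backend` function and its 50ms render cost are all made up for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Toy caching server: serves stored copies, falls back to the backend on a miss."""

    def __init__(self, fetch_from_backend: Callable[[str], str]) -> None:
        self._fetch = fetch_from_backend
        self._store: Dict[str, str] = {}

    def get(self, path: str) -> Tuple[str, str]:
        """Return (status, body), where status is 'hit' or 'miss'."""
        if path in self._store:
            return "hit", self._store[path]
        body = self._fetch(path)   # slow path: ask the backend to render the page
        self._store[path] = body   # keep a copy for the next visitor
        return "miss", body

    def invalidate(self) -> None:
        """Drop every cached copy, e.g. after a theme update."""
        self._store.clear()


def slow_backend(path: str) -> str:
    """Hypothetical backend; the sleep stands in for render time."""
    time.sleep(0.05)
    return f"<html>rendered {path}</html>"


cache = ReadThroughCache(slow_backend)
print(cache.get("/")[0])   # first visitor pays for the backend render
print(cache.get("/")[0])   # next visitor is served from the cache
cache.invalidate()         # a site update drops the cache
print(cache.get("/")[0])   # so the next visitor pays again
```

The last three lines are the whole story of this post in miniature: every invalidation means somebody gets the slow path.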

Cache HITS (fast responses) vs. Cache Misses (slow responses requiring the backend)

We can actually see if we got a cache hit or miss by looking at the x-cache header. This is a response from the server telling you if the page was returned by the fast cache server (hit, server) or returned by the slower backend server (miss). As an example, let's have a look at a fast Shopify based store and one of my favorite brands: https://www.allbirds.com.
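If you would rather check the header from a script than from the browser, something like the following Python sketch could work. The helper names are made up, and treating any value that contains a `hit` token as a hit is an assumption based only on the two values seen in this post (`hit, server` and `miss`); other CDNs may report something different.

```python
from typing import Optional


def x_cache_from_headers(raw_headers: str) -> Optional[str]:
    """Pull the x-cache value out of a raw block of response headers."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "x-cache":
            return value.strip()
    return None


def classify_x_cache(value: Optional[str]) -> str:
    """Map an x-cache value to 'hit', 'miss' or 'unknown'.

    Assumption: the value is a comma separated list of tokens and a
    'hit' token anywhere in it means the cache answered.
    """
    if value is None:
        return "unknown"
    tokens = [token.strip().lower() for token in value.split(",")]
    if "hit" in tokens:
        return "hit"
    if "miss" in tokens:
        return "miss"
    return "unknown"


sample_headers = "HTTP/2 200\ncontent-type: text/html\nx-cache: hit, server\n"
print(classify_x_cache(x_cache_from_headers(sample_headers)))
```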

Allbirds Example: Some Birds Don't Fly

Allbirds is a New Zealand-American footwear company which uses a direct-to-consumer approach and is aimed at designing environmentally friendly footwear. They're inexpensive, absurdly comfortable and sold direct to consumers on Shopify. I love my Allbirds so much - I will literally wear them until they fall apart (and even then for a few more days).

Allbirds worn until the limit
The best shoe ever made. Even when you wear them out!

 

Let's head over to the Allbirds website and see if we get a cache miss. There are legitimate reasons to get a cache miss, namely if the site has been updated recently and I was the first person to request the page since the update. That's not very likely, since this is an extremely high traffic site! I go to https://www.allbirds.com and we see ...
Looking at the x-cache Header
Searching for cache misses with the x-cache header (Google Chrome Network view)

A cache miss? What's the effect on the page load? The key metric here is the Time to First Byte, or TTFB. We can see the TTFB by looking at the timing details tab. Here we see a 500ms TTFB (half a second!). Ouch!

Looking at Time to First Byte in the TIMING view
Looking at the Time to First Byte (TTFB) in the Network Timing View

How Bad is it really?

I took a tour around the site, and sure enough I was able to induce a stall more than once. On a heavily visited site like this, we'd expect the caches to be "hot", meaning they are VERY likely to have all the page assets in memory and at the ready. I wrote a little script to test the TTFB using curl (works on macOS or Linux).

while true; do curl -I --silent --show-error --write-out 'lookup:        %{time_namelookup}\nconnect:       %{time_connect}\nappconnect:    %{time_appconnect}\npretransfer:   %{time_pretransfer}\nredirect:      %{time_redirect}\nstarttransfer: %{time_starttransfer}\ntotal:         %{time_total}\n' 'https://www.allbirds.com/' -H 'authority: www.allbirds.com' -H 'upgrade-insecure-requests: 1' -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36' -H 'sec-fetch-user: ?1' -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3' -H 'sec-fetch-site: same-origin' -H 'sec-fetch-mode: navigate' -H 'accept-encoding: gzip, deflate, br' -H 'accept-language: en-US,en;q=0.9,bg;q=0.8' --compressed | grep -iE 'x-cache|total|etag' ; date; sleep $(( ( RANDOM % 10 ) + 1 )) ; done >> allbirds.txt

This loop curls the Allbirds homepage every 1-10 seconds and extracts the etag (essentially the checksum of the page), the x-cache header and the total time for the response. Because -I asks for the headers only, that total is effectively the time to first byte for the HTML, with no other page assets involved. Collating the results, we can see that of the 631 page requests, 37.9% (239) resulted in a cache miss. The misses were about 1.6x slower on average and roughly 2x slower at the 90th and 95th percentiles.

x_cache     | samples avgTTFB(s) p90TTFB(s) p95TTFB(s)
------------|------------------------------------------
hit, server | 392     0.3177815  0.569994   0.7486215
miss        | 239     0.51752    1.110882   1.413967
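If you want to collate your own log, here is a rough Python sketch of how it could be done. It is not the tool we used for the table above. It assumes the log looks like what the curl loop writes (an `x-cache:` line followed by a `total:` line for each request), and it uses a simple nearest-rank percentile, so the p90/p95 values may differ slightly from other percentile methods. The function names are made up.

```python
import math
from collections import defaultdict
from typing import Dict, List


def parse_samples(log_text: str) -> Dict[str, List[float]]:
    """Group the 'total:' timings by the x-cache value seen just before them."""
    samples: Dict[str, List[float]] = defaultdict(list)
    current = None
    for line in log_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip().lower(), value.strip()
        if key == "x-cache":
            current = value
        elif key == "total" and current is not None:
            samples[current].append(float(value))
            current = None  # wait for the next request's x-cache line
    return dict(samples)


def percentile(values: List[float], p: float) -> float:
    """Nearest-rank percentile (an assumption, other definitions exist)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Sample count, average, p90 and p95 (in seconds) per x-cache value."""
    return {
        status: {
            "samples": len(times),
            "avg": sum(times) / len(times),
            "p90": percentile(times, 90),
            "p95": percentile(times, 95),
        }
        for status, times in samples.items()
    }
```

With the log from the loop above, `summarize(parse_samples(open("allbirds.txt").read()))` would give one row per x-cache value. Lines the parser doesn't recognise, like the etag and the date stamp, are simply skipped.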

Looking around a few other sites with less traffic than Allbirds, we quickly realized this penalty can be much more severe when your cache isn't hot. 

Cache misses can make responses 2-20x slower. This same page took ~5 seconds before the first byte of data came back (ouch!)

How do we fix this?

Unfortunately, we haven't found a straightforward solution to this problem. We do know a few things you can do that seem to help:

  1. Batch updates to your Shopify store: Each time you update your site, you instruct Shopify to 'drop the cache'. This is both expected AND desired. However, you will experience higher cache miss rates after updates, so try to make one push to production with multiple changes.
  2. Update your site during off traffic hours: When you do update the site, try to do so during low traffic periods and on low traffic days.
  3. Make the backend render faster (to deal with cache misses): This one is a bit harder, but you can use the new Shopify Chrome extension (see here) to analyze the render speed of your store.

If you have done all this, and you still experience slow Times to First Byte - you might be out of luck. When looking at the difference between subsequent page loads (to see if the HTML indeed DID change, making the cache miss the right response) we could only find a single common thread. In each site we analyzed, there was always the following difference in the HTML - a changed `reqid` (along with the `u` value next to it) in a Shopify performance tracking script (based on boomerang).

It is completely possible that this code is injected AFTER the site is cached, in which case it would not cause a cache miss - but it is a possible source of trouble.

< <script id="__st">var __st={"a":11044168,"offset":-28800,"reqid":"b313e50c-bc8a-4c64-bdf9-2ac30e247037","pageurl":"www.allbirds.com\/","u":"301c81c55a75","p":"home"};</script>
---
> <script id="__st">var __st={"a":11044168,"offset":-28800,"reqid":"23aecfa9-74d1-4c7b-9101-434dc85294d8","pageurl":"www.allbirds.com\/","u":"ad03ccee603f","p":"home"};</script>
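One way to check whether two page loads differ only in these tracking values is to blank them out before comparing. The Python sketch below does that. The choice of `reqid` and `u` as the fields to ignore comes from the diff above and is an assumption, not a documented list of per-request fields, and the helper names are made up.

```python
import re
from typing import Iterable

# Assumption: these are the only JSON keys that vary per request (taken from the diff above).
PER_REQUEST_KEYS = ("reqid", "u")


def strip_per_request_fields(html: str, keys: Iterable[str] = PER_REQUEST_KEYS) -> str:
    """Replace the values of the given JSON string fields with a fixed placeholder."""
    names = "|".join(re.escape(key) for key in keys)
    pattern = re.compile(r'"(%s)":"[^"]*"' % names)
    return pattern.sub(lambda m: '"%s":"<ignored>"' % m.group(1), html)


def same_apart_from_tracking(first_html: str, second_html: str) -> bool:
    """True when two snapshots match once the per-request values are ignored."""
    return strip_per_request_fields(first_html) == strip_per_request_fields(second_html)
```

If `same_apart_from_tracking` returns True for two loads of the same page, the only thing that changed between them was the tracking script, which is the pattern we kept seeing. Note that the pattern would also blank out any other `"u":"..."` pair on the page, so treat it as a rough filter.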

At the moment, Edgemesh engineering is looking at solutions to enable client side caching of HTML. Once this feature is available, you will be able to hide these server side delays. Client side caches like Edgemesh don't succumb to these issues, as they return the HTML from the in-browser cache (look Ma no Servers!).

The reality is that caching is hard, both client side (like Edgemesh) and server side (like the Shopify caches). There is an old joke in computer science:

There are 2 hard problems in computer science:

cache invalidation, naming things, and off-by-1 errors. 

-- Phil Karlton