Web Caching Explained: Browser, CDN, and Server

I once cut a client's median page load from 2.4s to 380ms without touching a single line of application code. The fix was three HTTP headers and one CDN setting. That is the dirty secret of web performance: the biggest wins almost never come from faster code. They come from not running the code at all, because something upstream already had the answer cached.

The trouble is that "caching" is not one thing. It is four layers stacked on top of each other, each with its own rules, each capable of serving you stale garbage if you configure it wrong. Most engineers I work with understand maybe one and a half of these layers. Let me map all four, then give you the directives and decision rules that actually matter in production.

The four layers, from client to origin

A request for an asset can be answered at four different places. The closer to the user it is answered, the faster and cheaper it is.

Layer	Lives where	Controlled by	Typical hit latency
Browser cache	The user's disk/memory	`Cache-Control`, `ETag`	~0ms (no network)
CDN / edge cache	~100+ PoPs near the user	Origin headers + CDN config	10-40ms
Reverse proxy	In front of your origin (nginx, Varnish)	Proxy config	1-5ms on the LAN, plus the round trip to origin
Application / data cache	Inside your app (Redis, in-memory)	Your code	sub-ms to a few ms, plus the round trip to origin

The first two are governed largely by HTTP response headers. The last two you control directly in config and code. The whole game is answering each request as high up in that table as you safely can, without ever serving something that should have changed.

The browser cache, and the directives that matter

When the browser stores a response, it later faces one question: can I reuse this without asking the server? The Cache-Control response header answers it. Everything else is detail around that decision.

Here is the directive set that earns its keep, and the ones people misuse.

max-age=<seconds> is the freshness lifetime. For max-age=3600, the browser reuses the response for an hour with zero network traffic. After that the response is stale and must be revalidated.

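immutable tells the browser the bytes will never change for this URL, so it should not revalidate while the response is fresh, even when the user reloads the page. This is only safe with content-hashed filenames (more on that below). Without immutable, a reload can trigger a conditional revalidation request for every subresource, which costs a round trip each even though the cached copy is fine. Support varies: Firefox and Safari honor immutable, while Chrome instead changed its reload behavior to skip revalidating subresources.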

no-cache is the most misnamed directive in HTTP. It does not mean "do not cache". It means "cache it, but revalidate with the origin before every reuse". It is perfect for HTML.

no-store is the real "do not cache anywhere" directive. Use it for sensitive responses (authenticated account pages, anything with PII). It is the only directive that keeps a response off disk.

private vs public: private permits the browser to cache but forbids shared caches (CDN, proxy) from storing it. public allows shared caches to store the response even when it would normally not be cacheable (e.g., it answers a request that carried an Authorization header).

stale-while-revalidate=<seconds> (RFC 5861) is the one I reach for most. It lets a cache serve a stale response immediately while it revalidates in the background. The user gets an instant response; the cache quietly freshens for the next visitor.
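To make the interplay between max-age and stale-while-revalidate concrete, here is a minimal sketch of the decision a cache makes for a stored response. The names (classify, Policy, CacheState) are hypothetical, and the model is deliberately simplified: it ignores heuristics, the Age header, and directives such as must-revalidate.

```typescript
// Simplified model of the freshness decision; all names are illustrative.
type CacheState = "fresh" | "stale-while-revalidate" | "revalidate-first";

interface Policy {
  maxAge: number;               // seconds, from max-age
  staleWhileRevalidate: number; // seconds, 0 when the directive is absent
}

function classify(ageSeconds: number, policy: Policy): CacheState {
  // Within the freshness lifetime: reuse with no network traffic.
  if (ageSeconds < policy.maxAge) return "fresh";
  // Past max-age but inside the extra window: serve the stale copy now
  // and refresh it in the background.
  if (ageSeconds < policy.maxAge + policy.staleWhileRevalidate) {
    return "stale-while-revalidate";
  }
  // Otherwise the cache has to ask the origin before answering.
  return "revalidate-first";
}
```

With max-age=3600 and stale-while-revalidate=600, a response that is 4000 seconds old falls in the middle branch: the user gets the stored copy immediately and the cache refreshes behind the scenes.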

For a content-hashed JS bundle, this is the canonical header:

# app.a3f9c2e1.js — the hash IS the version, so cache forever
Cache-Control: public, max-age=31536000, immutable

One year, never revalidate. When the file changes, its name changes, so this URL is genuinely immutable. For HTML, you want the opposite posture:

# index.html — content changes under a stable URL, so always check
Cache-Control: no-cache

no-cache here means the browser keeps a copy but issues a conditional request every time, getting a cheap 304 Not Modified when nothing changed. That gives you instant updates and avoids re-downloading unchanged HTML.

Conditional revalidation: ETag and Last-Modified

When a response goes stale, the cache does not blindly re-download. It asks "has this changed?" using a validator. There are two.

An ETag is an opaque version token the origin computes for the response body, often a hash. The client echoes it back on the next request via If-None-Match. A Last-Modified date works the same way via If-Modified-Since, but at one-second granularity and only for time-based changes. ETag is stronger; prefer it.
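The one-second granularity is easier to see in code. Below is a rough sketch of the If-Modified-Since comparison an origin might perform; notModifiedSince is a hypothetical helper, not part of any framework.

```typescript
// Returns true when the client's copy is still current, i.e. a 304 is appropriate.
// Hypothetical helper for illustration only.
function notModifiedSince(
  ifModifiedSince: string | undefined,
  lastModified: Date,
): boolean {
  if (!ifModifiedSince) return false;
  const clientTime = Date.parse(ifModifiedSince);
  // Unparseable date: play safe and send the full response.
  if (Number.isNaN(clientTime)) return false;
  // HTTP dates carry no milliseconds, so truncate before comparing.
  const resourceTime = Math.floor(lastModified.getTime() / 1000) * 1000;
  return resourceTime <= clientTime;
}
```

Note the blind spot: if the resource changes twice within the same second, the second change truncates to the same timestamp the client already holds and the function reports "not modified". A hash-based ETag does not have that problem.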

Here is the full revalidation round trip, written as an Express handler so the flow is explicit:

import express, { Request, Response } from "express";
import { createHash } from "node:crypto";
 
const app = express();
 
app.get("/api/profile/:id", async (req: Request, res: Response) => {
  const data = await loadProfile(req.params.id);
  const body = JSON.stringify(data);
 
  // Strong validator derived from the body bytes.
  const etag = `"${createHash("sha256").update(body).digest("base64url")}"`;
 
  res.setHeader("Cache-Control", "private, no-cache");
  res.setHeader("ETag", etag);
 
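  // Did the client send back a matching validator? If-None-Match may carry
  // several comma-separated ETags, weak (W/) prefixes, or "*".
  const candidates = (req.headers["if-none-match"] ?? "")
    .split(",")
    .map((value) => value.trim().replace(/^W\//, ""));
  if (candidates.includes(etag) || candidates.includes("*")) {
    // Nothing changed: send 304 with no body. Saves the payload.
    res.status(304).end();
    return;
  }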
 
  res.status(200).type("application/json").send(body);
});
 
async function loadProfile(id: string) {
  return { id, name: "Pavle", updatedAt: Date.now() };
}
 
app.listen(3000);

The win on the 304 path is that you skip sending the body. This handler still loads and serializes the data in order to compute the ETag, so the saving is bandwidth and transfer time, not origin work. For a 200KB JSON response that revalidates often, that can be the difference between a 4ms response and a 40ms one. You still pay one round trip, which is why immutable (zero round trips) beats revalidation whenever you can content-hash the URL.

Cache-busting: the naming half of the hard problem

There is a famous Phil Karlton line: there are only two hard things in computer science, cache invalidation and naming things. Web caching makes you solve both at once, and they are the same problem viewed from two sides.

The clean solution for static assets is to make the name encode the content. Every bundler does this. With a content hash in the filename ([contenthash] in webpack, [hash] in Rollup and Vite), a byte change produces a new filename, which is a new URL, which the cache has never seen and must fetch. You never invalidate anything; you just stop referencing the old name.

// vite.config.ts — Vite 6 hashes asset filenames by default,
// but this makes the pattern explicit and stable.
import { defineConfig } from "vite";
 
export default defineConfig({
  build: {
    rollupOptions: {
      output: {
        entryFileNames: "assets/[name].[hash].js",
        chunkFileNames: "assets/[name].[hash].js",
        assetFileNames: "assets/[name].[hash][extname]",
      },
    },
  },
});

This is why the immutable, max-age=31536000 header on hashed assets is safe and the HTML that references them must be no-cache. The HTML is your source of truth for "which version of the world is live". It must always revalidate so that a deploy is visible immediately; the assets it points to can cache forever because their URLs are unique per build.
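If you want to see why hashed names make invalidation unnecessary, here is a rough sketch of what a bundler does when it names an output file. hashedName is a hypothetical helper, and the 8-character SHA-256 prefix is an assumption for illustration; real bundlers choose their own hash function and length.

```typescript
import { createHash } from "node:crypto";
import { basename, extname } from "node:path";

// Derive a filename from the file's bytes, so the name changes
// exactly when the content does. Illustrative only.
function hashedName(fileName: string, contents: string | Uint8Array): string {
  const ext = extname(fileName);        // e.g. ".js"
  const stem = basename(fileName, ext); // e.g. "app"
  const hash = createHash("sha256")
    .update(contents)
    .digest("hex")
    .slice(0, 8);                       // short prefix keeps URLs readable
  return `${stem}.${hash}${ext}`;
}
```

Identical bytes always map to the same name, so an unchanged file keeps its cached URL across deploys, while any edit yields a URL no cache has seen before.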

In Next.js (15/16), this split is handled for you: the framework emits hashed filenames under /_next/static/ and serves them with Cache-Control: public, max-age=31536000, immutable, while HTML and data are revalidated according to your caching config. If you serve your own static directory, set it yourself:

// next.config.ts — custom long-cache headers for a self-hosted /public asset dir
import type { NextConfig } from "next";
 
const config: NextConfig = {
  async headers() {
    return [
      {
        // Only apply to content-hashed files you control.
        source: "/assets/:path*.:hash.:ext(js|css|woff2|png|svg)",
        headers: [
          {
            key: "Cache-Control",
            value: "public, max-age=31536000, immutable",
          },
        ],
      },
    ];
  },
};
 
export default config;

The CDN layer: cache keys and purging

A CDN caches your origin's responses at edge locations worldwide. It mostly obeys the same Cache-Control headers as the browser, with two critical additions you have to understand.

First, the cache key. Many CDNs default to keying the cache on the request method, host, and full path plus query string (defaults vary by vendor, so check yours). That means /products?ref=twitter and /products?ref=email are two separate cache entries serving identical HTML, fragmenting your hit rate. Configure the CDN to strip or ignore tracking params from the key. Conversely, if a response varies by some input, that input must be in the key, either via query string or via a Vary response header. Vary: Accept-Encoding is standard; Vary: Cookie is almost always a mistake because it shatters the cache per user.
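CDNs implement key normalization in their own config languages, but the logic is roughly the following. This is a sketch, not any vendor's API: cacheKey and the TRACKING_PARAMS list are hypothetical, and you would tune the list to whatever your marketing tools append.

```typescript
// Hypothetical list of parameters that never change the response body.
const TRACKING_PARAMS = new Set(["ref", "fbclid", "gclid"]);

function cacheKey(method: string, rawUrl: string): string {
  const url = new URL(rawUrl);
  const kept: [string, string][] = [];
  url.searchParams.forEach((value, name) => {
    // Drop tracking params, including the whole utm_* family.
    if (TRACKING_PARAMS.has(name) || name.startsWith("utm_")) return;
    kept.push([name, value]);
  });
  // Sort by name so ?a=1&b=2 and ?b=2&a=1 share one entry.
  kept.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  const query = new URLSearchParams(kept).toString();
  return `${method.toUpperCase()} ${url.host}${url.pathname}${query ? `?${query}` : ""}`;
}
```

Under this scheme, both /products?ref=twitter and /products?ref=email collapse to the same key, while /products?page=2 stays distinct because page changes what the origin returns.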

Second, separate edge TTLs. Most CDNs let you set a different TTL at the edge than in the browser, using s-maxage (the shared-cache directive) alongside max-age. This is the pattern for dynamic-but-cacheable pages:

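# Browser copy is stale immediately (max-age=0); the CDN holds it fresh for 60s,
# then serves stale for up to a day while it refetches in the background.
# Browsers that implement stale-while-revalidate may also reuse their stale copy.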
Cache-Control: public, max-age=0, s-maxage=60, stale-while-revalidate=86400

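That single header gives you near-static performance for a page that updates every minute. Nearly every user gets an instant response from the edge; only a request that hits a cold edge, or arrives after the stale window has run out, waits on the origin.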
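To show how one header yields two different lifetimes, here is a minimal sketch of how a cache might read it. parseCacheControl and freshnessLifetime are hypothetical helpers, and the parser is intentionally naive: it does not handle quoted values and treats no-cache as a lifetime of zero.

```typescript
// Naive Cache-Control parser: splits on commas, no quoted-string support.
function parseCacheControl(header: string): Map<string, string | true> {
  const directives = new Map<string, string | true>();
  for (const part of header.split(",")) {
    const trimmed = part.trim();
    if (!trimmed) continue;
    const eq = trimmed.indexOf("=");
    if (eq === -1) directives.set(trimmed.toLowerCase(), true);
    else directives.set(trimmed.slice(0, eq).toLowerCase(), trimmed.slice(eq + 1));
  }
  return directives;
}

// How long (in seconds) a cache may reuse the response without revalidating.
function freshnessLifetime(header: string, sharedCache: boolean): number {
  const d = parseCacheControl(header);
  if (d.has("no-store") || d.has("no-cache")) return 0;
  // Shared caches must not store private responses at all.
  if (sharedCache && d.has("private")) return 0;
  // s-maxage overrides max-age, but only for shared caches.
  const sMaxAge = d.get("s-maxage");
  if (sharedCache && typeof sMaxAge === "string") return Number(sMaxAge);
  const maxAge = d.get("max-age");
  return typeof maxAge === "string" ? Number(maxAge) : 0;
}
```

For the header above, a browser (sharedCache = false) computes a lifetime of 0 seconds and a CDN (sharedCache = true) computes 60.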

Now, invalidation, the genuinely hard half. You have three tools:

TTL expiry: the lazy default. Set a short s-maxage and let entries age out. Simple, but you trade freshness for it.
Purge by URL: explicit, surgical, slow at scale. Fine for a handful of pages.
Tag/surrogate-key purge: the professional answer. Tag responses with a Surrogate-Key (Fastly) or Cache-Tag (Cloudflare) header, then purge everything bearing a tag in one call. CloudFront has no native tag purge; there you invalidate by path or wildcard.

# Origin tags a product page with the entities it depends on.
Surrogate-Key: product-8842 collection-shoes homepage
 
# When product 8842 changes, purge every cached response that touched it,
# across the whole edge network, in one request.
curl -X POST "https://api.fastly.com/service/$SERVICE_ID/purge/product-8842" \
  -H "Fastly-Key: $FASTLY_API_TOKEN"

Tag-based purge is what lets you cache aggressively and stay correct. Your write path emits a purge for the affected tags whenever data changes, so the edge can hold content for hours yet reflect a database update within seconds.
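The mechanism behind tag purging is just a reverse index from tag to cached URLs. Here is a toy in-memory model of it; TaggedCache is a hypothetical class for illustration and says nothing about how any CDN implements this internally.

```typescript
// Toy model of a tag-indexed cache. Illustrative only.
class TaggedCache {
  private entries = new Map<string, string>();        // url -> body
  private urlsByTag = new Map<string, Set<string>>(); // tag -> urls

  // surrogateKey is a space-separated tag list, as in the header above.
  set(url: string, body: string, surrogateKey: string): void {
    this.entries.set(url, body);
    for (const tag of surrogateKey.split(/\s+/).filter(Boolean)) {
      let urls = this.urlsByTag.get(tag);
      if (!urls) {
        urls = new Set<string>();
        this.urlsByTag.set(tag, urls);
      }
      urls.add(url);
    }
  }

  get(url: string): string | undefined {
    return this.entries.get(url);
  }

  // Evict every entry that was stored with this tag; returns how many were removed.
  purgeTag(tag: string): number {
    const urls = this.urlsByTag.get(tag);
    if (!urls) return 0;
    let purged = 0;
    urls.forEach((url) => {
      if (this.entries.delete(url)) purged++;
    });
    this.urlsByTag.delete(tag);
    return purged;
  }
}
```

The write path never needs to know which URLs rendered product 8842. It only names the entity that changed, and the index finds every page that depended on it.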

What to cache, and what never to

Here is the decision framework I apply on every project.

Content-hashed static assets (JS, CSS, fonts, images with a hash in the name): cache as hard as physically possible. public, max-age=31536000, immutable. There is no downside; the URL changes when the bytes do.
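HTML: cache carefully. The browser should always revalidate: use no-cache when only the browser caches the page, or max-age=0 plus a short s-maxage when a CDN shares it across users (no-cache binds shared caches too, so it would force the CDN to revalidate on every request). Never immutable HTML; a stale shell is how users end up loading a deleted bundle.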
API / JSON responses: case by case. Public, read-heavy, slowly-changing data (a product catalog, a blog feed) caches beautifully with s-maxage plus tag purge. Per-user or write-heavy endpoints get private, no-cache with an ETag, or no-store if sensitive.
Anything authenticated that carries PII or other sensitive data: Cache-Control: no-store. No exceptions. A shared cache serving one user's account page to another is a data breach, not a performance bug. The OWASP guidance on sensitive data exposure is unambiguous here.
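The framework above can be written down as a single lookup, which is a handy thing to have in one place in a codebase. This is a sketch under assumptions: the ResponseKind categories and cacheControlFor are hypothetical names, and the 60-second s-maxage for public API data is a placeholder taken from the earlier example, not a recommendation.

```typescript
// Hypothetical categories mirroring the decision framework.
type ResponseKind =
  | "hashed-asset"
  | "html"
  | "public-api"
  | "per-user-api"
  | "sensitive";

function cacheControlFor(kind: ResponseKind): string {
  switch (kind) {
    case "hashed-asset":
      // URL changes when bytes change, so cache for a year.
      return "public, max-age=31536000, immutable";
    case "html":
      // Stable URL, changing content: always revalidate.
      return "no-cache";
    case "public-api":
      // Placeholder TTL; pair with tag purge on the write path.
      return "public, max-age=0, s-maxage=60, stale-while-revalidate=86400";
    case "per-user-api":
      // Browser may keep it, shared caches may not; send an ETag too.
      return "private, no-cache";
    case "sensitive":
      // Never stored anywhere.
      return "no-store";
  }
}
```

Centralizing the mapping means a new endpoint has to declare what kind of response it is, which forces the caching question to be answered at review time instead of after an incident.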

The checklist

Before you ship, walk this list:

Are all static assets content-hashed and served immutable, max-age=31536000?
Is HTML served no-cache (or max-age=0 with a short s-maxage behind a CDN) so deploys are visible instantly?
Do cacheable API responses carry an ETag and return 304 on revalidation?
Is anything authenticated or PII-bearing served no-store? Grep your responses to be sure.
Does your CDN cache key strip tracking params and avoid Vary: Cookie?
Do you use stale-while-revalidate so users never wait on a background refresh?
Does your write path purge by cache tag, not by guessing URLs?

Get those seven right and you will have done more for your users' experience than any amount of code micro-optimization. Caching is leverage: a few headers, applied at the right layer, beat a rewrite.