Complete Guide to CDN & Edge Caching Strategies 2025: Cache Invalidation, ETag, Stale-While-Revalidate

TL;DR

  • Two hard things in computer science: cache invalidation and naming things (Phil Karlton); a popular variant of the joke adds off-by-one errors
  • Master HTTP cache headers: Cache-Control, ETag, Vary, If-None-Match
  • CDN cache = globally distributed cache: response from the PoP closest to the user, sub-50ms latency
  • Stale-While-Revalidate (SWR): instant return of stale responses while revalidating in the background — feels like 100% cache hit ratio
  • Edge computing: goes beyond caching to run code — Cloudflare Workers, Fastly Compute@Edge

1. Why Is Caching Hard?

1.1 Phil Karlton's Quote

"There are only two hard things in Computer Science: cache invalidation and naming things."

The joke is half true. Cache invalidation really is hard.

1.2 Two Fundamental Difficulties

1. When do you invalidate?

  • Too soon → no cache benefit
  • Too late → stale data problem

2. How do you invalidate?

  • Single key? Pattern? Everything?
  • Synchronous? Asynchronous?
  • What if it fails?

1.3 The Consistency/Performance Trade-off

Strong consistency  ←─────────────────→  High performance
    (real-time)                              (cache hit)

Every system sits somewhere on this spectrum. You cannot have both at 100%.


2. Mastering HTTP Cache Headers

2.1 Cache-Control — The Core of Caching

Cache-Control: public, max-age=3600, s-maxage=86400

Directives:

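| Value | Meaning |
| --- | --- |
| public | Any cache (CDN and browser) may store the response |
| private | Browser cache only (user-specific data); shared caches must not store it |
| no-cache | Cacheable, but must be revalidated with origin before each reuse |
| no-store | Never stored by any cache (sensitive data) |
| max-age=N | Freshness lifetime in seconds (browser and, without s-maxage, CDN) |
| s-maxage=N | Freshness lifetime in seconds for shared caches (CDN); overrides max-age there |
| must-revalidate | Once stale, must be revalidated before reuse |
| immutable | Will not change while fresh, so no revalidation is needed (CSS/JS with hashed filenames) |
| stale-while-revalidate=N | Stale response may be served for N seconds after expiration while revalidating in the background |
| stale-if-error=N | If origin returns an error, stale response may be served for N seconds after expiration |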
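
To make the max-age versus s-maxage distinction concrete, here is a minimal Python sketch that parses a Cache-Control value and computes the freshness lifetime for a shared cache (CDN) and a private cache (browser). The function names are illustrative, and the logic is deliberately simplified: it ignores Expires, heuristic freshness, and the stale-* extensions.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in header.split(","):
        name, sep, value = part.strip().partition("=")
        if not name:
            continue
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(header: str, shared: bool) -> int:
    """Seconds a response may be reused without revalidation (simplified)."""
    d = parse_cache_control(header)
    # no-store forbids storage; no-cache forces revalidation before every reuse.
    if "no-store" in d or "no-cache" in d:
        return 0
    # A shared cache must not reuse a private response.
    if shared and "private" in d:
        return 0
    # s-maxage applies to shared caches only and takes precedence over max-age.
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0


header = "public, max-age=60, s-maxage=300"
print(freshness_lifetime(header, shared=True))   # CDN lifetime
print(freshness_lifetime(header, shared=False))  # browser lifetime
```

With the header above, the CDN keeps the response for 300 seconds while the browser keeps it for 60.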

2.2 Common Patterns

HTML (changes frequently):

Cache-Control: public, max-age=0, must-revalidate

API JSON (can change):

Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=86400

Static assets (with hash in filename):

Cache-Control: public, max-age=31536000, immutable

Sensitive data (user info):

Cache-Control: private, no-store

2.3 ETag — Efficient Validation

The server responds with a unique identifier (hash) for the content:

HTTP/1.1 200 OK
ETag: "abc123"
Cache-Control: max-age=3600

After cache expiry the client sends:

GET /api/users/123 HTTP/1.1
If-None-Match: "abc123"

If the content is unchanged:

HTTP/1.1 304 Not Modified
ETag: "abc123"

304 Not Modified = no body. Saves bandwidth.
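
The server side of this exchange can be sketched in a few lines of Python. This is an illustration under assumptions, not a framework API: make_etag and conditional_response are hypothetical names, and the tag is simply a truncated SHA-256 of the body.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive an ETag from the body; HTTP requires the value to be quoted."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_response(body: bytes, if_none_match: Optional[str]):
    """Return (status, headers, body), answering 304 when the client's tag still matches."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=3600"}
    if if_none_match is not None:
        candidates = [t.strip() for t in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a leading W/ prefix is ignored.
        candidates = [t[2:] if t.startswith("W/") else t for t in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # unchanged: send headers only
    return 200, headers, body


status, headers, _ = conditional_response(b'{"id": 123}', None)
status_again, _, body_again = conditional_response(b'{"id": 123}', headers["ETag"])
print(status, status_again, len(body_again))
```

The second call carries the tag from the first response, so the server can skip the body entirely.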

2.4 Last-Modified — Time-Based

HTTP/1.1 200 OK
Last-Modified: Tue, 15 Apr 2025 10:00:00 GMT

On revalidation the client sends:

GET /api/users/123 HTTP/1.1
If-Modified-Since: Tue, 15 Apr 2025 10:00:00 GMT

Weaker than ETag: can't detect sub-second changes, depends on clock synchronization.

2.5 Vary — Defining Cache Equivalence

HTTP/1.1 200 OK
Cache-Control: public, max-age=3600
Vary: Accept-Encoding, Accept-Language

Meaning: "Even with the same URL, different Accept-Encoding (gzip vs br) or Accept-Language (en vs ko) means a different cache entry."

Warning: Vary: User-Agent yields nearly infinite variants → cache becomes useless. Never do this.
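
One way to picture Vary is as extra fields in the cache key. The Python sketch below is a simplified illustration (the cache_key function is hypothetical); many real CDNs additionally normalize header values such as Accept-Encoding before building the key.

```python
def cache_key(method: str, url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the URL plus the request headers named in Vary."""
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    # A missing header is treated as an empty value so it still forms a distinct variant.
    varied = tuple((name, lowered.get(name, "")) for name in names)
    return (method.upper(), url, varied)


vary = "Accept-Encoding, Accept-Language"
gzip_en = cache_key("GET", "https://example.com/", {"Accept-Encoding": "gzip", "Accept-Language": "en"}, vary)
br_en = cache_key("GET", "https://example.com/", {"Accept-Encoding": "br", "Accept-Language": "en"}, vary)
print(gzip_en == br_en)  # different encodings produce different entries
```

Each distinct combination of varied header values becomes its own entry, which is why a high-cardinality header like User-Agent fragments the cache.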

2.6 Common Mistakes

1. no-cache does NOT mean "no caching"

  • no-cache: Cacheable, revalidation required before use
  • no-store: Absolutely not cached

2. Pragma: no-cache is a legacy header

  • Leftover from HTTP/1.0
  • Mostly ignored, though some keep it for compatibility

3. Vary: * effectively disables caching

  • A stored response with Vary: * never matches a later request, so every request goes back to origin

3. How CDNs Work

3.1 What Is a CDN?

Content Delivery Network — a network of cache servers distributed globally.

User (Seoul) → [CDN PoP Seoul] (cache hit!) → response
                        ↓ miss
                    [Origin server (US)]

Impact:

  • Seoul user: 5ms (CDN PoP)
  • Direct to US origin: 200ms (round-trip + processing)

3.2 CDN Components

| Component | Role |
| --- | --- |
| PoP (Point of Presence) | City-level cache server (Cloudflare has 300+) |
| Origin | The source server (yours) |
| Edge Cache | Cache storage inside the PoP |
| DNS | Routes users to the nearest PoP |
| Anycast | One IP routes to multiple PoPs |

3.3 Cache Key

What does the CDN use to identify a cached object?

Default: the full URL, typically host, path, and query string.

GET /api/users/123  → cache entry
GET /api/users/123?lang=en  → different cache entry (query string included)

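Cloudflare Workers:

// Inside a fetch handler: key the cache on the URL alone, ignoring request headers
const cacheKey = new Request(request.url, { method: 'GET' })
const cache = caches.default
const cached = await cache.match(cacheKey)
if (cached) return cached  // cache hit: skip the origin fetch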
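
Because the query string is part of the key, parameters that do not affect the response split one object into many entries. A hedged Python sketch of key normalization follows; the TRACKING_PARAMS set is an illustrative assumption, and sorting parameters is only safe when the origin does not depend on their order.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative list of parameters assumed not to change the response.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"}


def normalize_url(url: str) -> str:
    """Drop tracking parameters and sort the rest so equivalent URLs share a key."""
    parts = urlsplit(url)
    params = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in TRACKING_PARAMS
    ]
    params.sort()
    # The fragment is never sent to the server, so it is dropped from the key.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, urlencode(params), ""))


print(normalize_url("https://Example.com/api/users/123?utm_source=mail&lang=en"))
```

Most CDNs offer equivalent settings (ignore or allowlist query parameters), so code like this is usually only needed in an edge function.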

3.4 Cache Hit Ratio

The single most important metric.

hit_ratio = cache_hits / (cache_hits + cache_misses)

| Hit Ratio | Meaning |
| --- | --- |
| 95%+ | Excellent |
| 90–95% | Good |
| 80–90% | Room for improvement |
| Below 80% | Cache strategy needs rethinking |

A one-point improvement matters a lot: going from 96% to 97% cuts the miss rate from 4% to 3%, which is 25% fewer origin requests.
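
The arithmetic is worth spelling out, because origin load tracks the miss ratio rather than the hit ratio. A small Python sketch (the function name is illustrative):

```python
def origin_reduction(old_hit_ratio: float, new_hit_ratio: float) -> float:
    """Relative drop in origin requests when the hit ratio improves."""
    old_miss = 1.0 - old_hit_ratio
    new_miss = 1.0 - new_hit_ratio
    return (old_miss - new_miss) / old_miss


for old, new in [(0.80, 0.81), (0.95, 0.96), (0.96, 0.97)]:
    print(f"{old:.0%} -> {new:.0%}: {origin_reduction(old, new):.0%} fewer origin requests")
```

The same one-point gain is worth about 5% at an 80% hit ratio but 25% at 96%, which is why the last few points are the most valuable.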

3.5 CDN Comparison

| | Cloudflare | Fastly | CloudFront | Akamai | Vercel/Netlify |
| --- | --- | --- | --- | --- | --- |
| PoPs | 300+ | 80+ | 600+ | 4000+ | 100+ |
| Edge compute | Workers (V8) | Compute@Edge (Wasm) | Lambda@Edge | EdgeWorkers | Edge Functions |
| Free tier | Generous | No | No | No | Yes |
| Price | Very cheap | Expensive | Moderate | Expensive | Moderate |
| Cache invalidation | Instant | 150ms (Instant Purge) | Minutes | Minutes | Minutes |
| DDoS protection | Excellent | Good | Good | Excellent | Moderate |

  • Cloudflare: the price/performance champion; a good fit for most use cases.
  • Fastly: fast invalidation, strong for media and news.
  • CloudFront: strong AWS integration.
  • Akamai: enterprise-focused, with the most PoPs.


4. Cache Invalidation — The Hardest Part

4.1 Four Invalidation Strategies

1. TTL-based: expires automatically after a while

Cache-Control: max-age=3600   (expires in 1 hour)

Pros: simple. Cons: up to 1 hour of staleness.

2. Push (immediate invalidation): notify the CDN when content changes

curl -X POST https://api.cloudflare.com/client/v4/zones/.../purge_cache \
  -H "Authorization: Bearer ..." \
  -H "Content-Type: application/json" \
  -d '{"files":["https://example.com/api/users/123"]}'

Pros: immediate. Cons: complexity, cost.

3. ETag-based: revalidate on every request (304 response)

  • Saves bandwidth, still has latency

4. Versioned URLs: a new URL on every change

/static/main.abc123.js → /static/main.def456.js

Pros: no invalidation required. Standard for CSS/JS.
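
Bundlers normally generate these names for you, but the idea fits in a few lines. A minimal Python sketch, assuming the hash is taken from the file contents (versioned_path is a hypothetical helper):

```python
import hashlib
from pathlib import PurePosixPath


def versioned_path(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a content hash before the file extension: main.js -> main.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


print(versioned_path("/static/main.js", b"console.log('v1')"))
print(versioned_path("/static/main.js", b"console.log('v2')"))
```

Any change to the content yields a new URL, so the old entry can stay cached for a year without ever being served for the new version.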

4.2 Cloudflare's Invalidation API

# Single URL
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
  -H "Authorization: Bearer {api_token}" \
  -H "Content-Type: application/json" \
  -d '{"files":["https://example.com/page1"]}'

# Tag-based (Enterprise)
curl -X POST ... -d '{"tags":["user-123"]}'

# Everything
curl -X POST ... -d '{"purge_everything":true}'

4.3 Cache Tags — Elegant Invalidation

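Supported by Fastly (as surrogate keys) and Cloudflare Enterprise (as cache tags). The origin labels each response with a header.

Cloudflare (comma-separated Cache-Tag header):

HTTP/1.1 200 OK
Cache-Tag: user-123, post-456, blog-list

Fastly (space-separated Surrogate-Key header):

HTTP/1.1 200 OK
Surrogate-Key: user-123 post-456 blog-list

Invalidate (Fastly):

curl -X POST -H "Fastly-Key: ..." https://api.fastly.com/service/{id}/purge/user-123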

→ Instantly invalidates every cache entry tagged user-123.

Example: user updates their profile → invalidate every page tagged user-123.
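
Conceptually, tag purging needs only a reverse index from tag to cache keys. The in-memory Python sketch below illustrates the idea; it is not how any particular CDN implements it, and TaggedCache is a hypothetical class.

```python
from collections import defaultdict


class TaggedCache:
    """Toy cache that supports purging every entry carrying a given tag."""

    def __init__(self):
        self._entries = {}
        self._keys_by_tag = defaultdict(set)  # reverse index: tag -> cache keys

    def set(self, key, value, tags):
        self._entries[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key):
        return self._entries.get(key)

    def purge_tag(self, tag):
        """Remove all entries with this tag; return how many were removed."""
        removed = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if key in self._entries:
                del self._entries[key]
                removed += 1
        return removed


cache = TaggedCache()
cache.set("/post/456", "<html>post</html>", ["user-123", "post-456"])
cache.set("/user/123", "<html>profile</html>", ["user-123"])
cache.set("/about", "<html>about</html>", ["static"])
print(cache.purge_tag("user-123"), cache.get("/about") is not None)
```

One purge call removes both pages that mention the user and leaves unrelated entries untouched.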

4.4 Tricky Invalidation Scenarios

Scenario 1: user posts a comment

Affected pages:
- /post/123 (the post)
- /post/123/comments
- /user/abc/comments (author page)
- /sidebar/recent-comments
- ... and many more

→ Invalidate each one by hand? Tag-based invalidation is the answer.

Scenario 2: database consistency

DB transaction commit → invalidate cache
                ↓ what if invalidation fails?
            stale data!

Retry + dead letter queue. Or accept eventual consistency.
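
A hedged Python sketch of the retry-then-dead-letter idea: purge stands for whatever call invalidates the CDN (a hypothetical callable here), and the dead-letter queue is a plain list standing in for a durable queue.

```python
import time


def invalidate_with_retry(purge, key, dead_letters, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Try to purge a key with exponential backoff; park it in dead_letters on failure."""
    for attempt in range(attempts):
        try:
            purge(key)
            return True
        except Exception:
            # Back off before the next attempt, but not after the final one.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    # All attempts failed: record the key so a background job can retry later.
    dead_letters.append(key)
    return False


def failing_purge(key):
    raise ConnectionError("CDN API unavailable")


dlq = []
ok = invalidate_with_retry(failing_purge, "user-123", dlq, sleep=lambda seconds: None)
print(ok, dlq)
```

Keeping a TTL on every entry as a backstop means that even a purge lost forever only causes bounded staleness.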

Scenario 3: invalidation storm

Black Friday = frequent price changes = invalidation storm → CDN API rate limits.

Mitigations: batch purge requests, purge by tag instead of by URL, or shorten TTLs so fewer explicit purges are needed.


5. Stale-While-Revalidate — The Game Changer

5.1 The Problem

Traditional caching:

[cache expires] → [origin request] → [wait] → [response]
                          ↑ user is waiting (slow!)

5.2 Stale-While-Revalidate

Cache-Control: max-age=60, stale-while-revalidate=86400

Meaning:

  • For 60 seconds: serve fresh cache
  • From 60s to 86460s: serve stale immediately + refresh in the background
  • After 86460s: expired (go to origin)
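
The three windows can be expressed as a small classifier over the entry's age. A Python sketch (swr_state is an illustrative name; a response counts as fresh only while its age is strictly below max-age):

```python
def swr_state(age: int, max_age: int, swr: int) -> str:
    """Classify a cached response by age under max-age + stale-while-revalidate."""
    if age < max_age:
        return "fresh"    # serve from cache, no origin contact
    if age < max_age + swr:
        return "stale"    # serve from cache now, refresh in the background
    return "expired"      # must fetch from origin before responding


for age in (30, 61, 3600, 86461):
    print(age, swr_state(age, max_age=60, swr=86400))
```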

5.3 User Experience

User 1 (60s):   [cache hit, fresh]   instant
User 2 (61s):   [cache hit, stale]   instant (background refresh starts)
User 3 (62s):   [cache hit, fresh]   instant (just refreshed)

Every user gets an instant response. Origin is called only occasionally.

5.4 Next.js ISR (Incremental Static Regeneration)

export async function getStaticProps() {
  const data = await fetchData()
  return {
    props: { data },
    revalidate: 60  // background regeneration after 60s
  }
}

Internally uses the SWR pattern. Static speed with dynamic data.

5.5 stale-if-error

Cache-Control: max-age=60, stale-if-error=86400

Meaning: on origin error, serve stale for up to 24 hours.

Effect: site stays up even when origin is down. Core of resilience.
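
To illustrate the fallback, here is a minimal in-memory Python sketch. StaleIfErrorCache is a hypothetical class, fetch stands for the origin call, and the clock is injectable so the behaviour can be exercised without waiting.

```python
import time


class StaleIfErrorCache:
    """Toy cache that serves a stale copy when the origin fetch fails."""

    def __init__(self, fetch, max_age, stale_if_error, clock=time.monotonic):
        self._fetch = fetch
        self._max_age = max_age
        self._stale_if_error = stale_if_error
        self._clock = clock
        self._store = {}  # key -> (value, stored_at)

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[1] < self._max_age:
            return entry[0]  # still fresh
        try:
            value = self._fetch(key)
        except Exception:
            # Origin failed: fall back to the stale copy if it is within the error window.
            if entry is not None and now - entry[1] < self._max_age + self._stale_if_error:
                return entry[0]
            raise
        self._store[key] = (value, now)
        return value


now = [0]
origin_up = [True]

def fetch(key):
    if not origin_up[0]:
        raise ConnectionError("origin down")
    return f"page rendered at t={now[0]}"

cache = StaleIfErrorCache(fetch, max_age=60, stale_if_error=86400, clock=lambda: now[0])
cache.get("/")                           # populates the cache at t=0
now[0], origin_up[0] = 3600, False
print(cache.get("/"))                    # origin is down, stale copy is served
```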


6. Cache Patterns — At the Code Level

6.1 Cache-Aside (Lazy Loading)

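def get_user(user_id):
    # cache and db are assumed client objects (e.g. Redis and a SQL driver)
    user = cache.get(f"user:{user_id}")
    if user is None:
        # Cache miss: load from the database with a parameterized query
        user = db.query("SELECT * FROM users WHERE id = %s", (user_id,))
        cache.set(f"user:{user_id}", user, ttl=3600)
    return user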

The most common pattern. Simple and robust. Drawback: the first request on a miss is slow.

6.2 Read-Through

The cache calls origin directly (the app only talks to the cache).

# The cache library handles this automatically
user = cache.get(f"user:{user_id}", loader=lambda: db.query(...))

Pros: simple code. Cons: depends on the cache library.
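
If your cache library has no loader hook, a thin wrapper gives the same shape. A minimal Python sketch under assumptions: ReadThroughCache is a hypothetical class, storage is a plain dict, and the loader stands in for the database query.

```python
import time


class ReadThroughCache:
    """Toy read-through cache: callers only call get(); misses invoke the loader."""

    def __init__(self, loader, ttl, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl
        self._clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]
        # Miss or expired: the cache itself loads from the source and stores the result.
        value = self._loader(key)
        self._store[key] = (value, now + self._ttl)
        return value


db_reads = []

def load_user(key):
    db_reads.append(key)  # stands in for a database query
    return {"key": key}

users = ReadThroughCache(load_user, ttl=3600)
users.get("user:123")
users.get("user:123")
print(len(db_reads))  # the second call did not reach the loader
```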

6.3 Write-Through

Update the cache at the same time as the write.

def update_user(user_id, data):
    # Write to the database first, then refresh the cache entry
    db.execute("UPDATE users SET ... WHERE id = %s", (user_id,))
    cache.set(f"user:{user_id}", data, ttl=3600)

Pros: cache always fresh. Cons: slow writes, caches data that may never be read.

6.4 Write-Behind (Write-Back)

Write to the cache only → flush to the DB asynchronously.

def update_user(user_id, data):
    cache.set(f"user:{user_id}", data, ttl=3600)
    queue.push({"action": "update", "id": user_id, "data": data})

Pros: very fast writes. Cons: risk of data loss if the cache fails.
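
The data-loss window is easier to see with the flush step written out. A hedged in-memory Python sketch (WriteBehindCache and write_to_db are hypothetical; a real system would use a durable queue and a background worker):

```python
from collections import deque


class WriteBehindCache:
    """Toy write-behind cache: writes hit the cache first and reach the DB on flush()."""

    def __init__(self, write_to_db):
        self._cache = {}
        self._pending = deque()
        self._write_to_db = write_to_db

    def update(self, key, value):
        self._cache[key] = value
        self._pending.append((key, value))  # not yet durable

    def get(self, key):
        return self._cache.get(key)

    def flush(self):
        """Drain pending writes in order; return how many were written."""
        flushed = 0
        while self._pending:
            key, value = self._pending[0]
            self._write_to_db(key, value)  # if this raises, the write stays queued
            self._pending.popleft()
            flushed += 1
        return flushed


database = {}
cache = WriteBehindCache(database.__setitem__)
cache.update("user:123", {"name": "Kim"})
print(database)        # still empty: a crash here would lose the write
cache.flush()
print(database)
```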

6.5 Preventing Cache Stampede

Problem: when a hot key expires, thousands of requests hit origin at once.

Solutions:

1. Mutex (distributed lock)

def get_user(user_id):
    user = cache.get(f"user:{user_id}")
    if user is None:
        with cache.lock(f"lock:user:{user_id}", timeout=10):
            user = cache.get(f"user:{user_id}")  # double-check
            if user is None:
                user = db.query(...)
                cache.set(f"user:{user_id}", user, ttl=3600)
    return user

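2. Probabilistic Early Expiration

Shortly before the TTL runs out, a small random fraction of requests refresh the value ahead of time:

import random

def get_with_recompute(key):
    value, ttl = cache.get_with_ttl(key)  # ttl: seconds until expiry
    # ttl_factor is an assumed helper returning a probability that grows as ttl nears 0
    if random.random() < ttl_factor(ttl):
        # only some requests recompute early
        recompute_async(key)
    return value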
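
One well-known way to choose that probability is the exponential rule sometimes called XFetch: refresh when now - delta * beta * ln(u) >= expiry, where delta is how long a recompute takes and u is a uniform random number. The Python sketch below shows that rule in isolation; it is a swapped-in formulation rather than necessarily what the snippet above intends, and the parameter names are assumptions.

```python
import math
import random
import time


def should_recompute_early(expiry, delta, beta=1.0, now=None, rand=random.random):
    """Decide whether this request should refresh the value before it expires.

    expiry: absolute expiry time (seconds)
    delta:  how long a recompute typically takes (seconds)
    beta:   values above 1 refresh earlier, below 1 later
    """
    now = time.time() if now is None else now
    u = 1.0 - rand()  # random() is in [0, 1), so u is in (0, 1] and log(u) is defined
    # -log(u) is a non-negative random head start, scaled by the recompute cost.
    return now - delta * beta * math.log(u) >= expiry


expiry = 1000.0
early = sum(should_recompute_early(expiry, delta=2.0, now=995.0) for _ in range(10_000))
print(f"{early} of 10000 requests refreshed 5s before expiry")
```

Requests far from expiry almost never refresh, while the probability rises steeply in the last few multiples of delta, so a hot key is usually refreshed by one request just before it would have expired.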

3. Stale-While-Revalidate (as covered above)


7. Edge Computing — Beyond Caching

7.1 Running Code at the Edge

The CDN is no longer just a cache. Code runs there too:

  • Cloudflare Workers (V8 isolate)
  • Fastly Compute@Edge (WebAssembly)
  • AWS Lambda@Edge (Node.js, Python)
  • Vercel Edge Functions (V8)
  • Deno Deploy (V8)

7.2 Use Cases

1. A/B testing

export default {
  async fetch(request) {
    const variant = Math.random() < 0.5 ? 'a' : 'b'
    return fetch(`https://origin.com/page-${variant}`)
  }
}

2. Geo routing

const country = request.cf.country
// fetch() needs an absolute URL, so include the scheme
const origin = country === 'KR' ? 'https://asia.api.com' : 'https://us.api.com'
return fetch(origin + new URL(request.url).pathname)

3. Auth/authorization

const token = request.headers.get('Authorization')
const user = await verifyJWT(token)
if (!user) return new Response('Unauthorized', { status: 401 })
return fetch(origin)

4. HTML transformation

const response = await fetch(origin)
return new HTMLRewriter()
  .on('h1', { element(el) { el.setInnerContent('Modified!') } })
  .transform(response)

5. API aggregation

const [user, posts] = await Promise.all([
  fetch('https://api.com/user/123').then(r => r.json()),
  fetch('https://api.com/posts?user=123').then(r => r.json())
])
return new Response(JSON.stringify({ user, posts }), {
  headers: { 'Content-Type': 'application/json' }
})

7.3 Edge Computing Limits

  • CPU time limit: typically 10–50ms per request (Cloudflare Workers: up to 50ms, depending on plan)
  • Memory limit: 128MB (Cloudflare), 50MB (Fastly); limits change, so check current provider docs
  • Cold start: almost zero on V8 isolates (under 5ms)
  • Database access: calling origin DB from the edge is slow → need an edge DB (Turso, D1, Neon)

7.4 Edge Data Stores

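| | Cloudflare | Fastly | Vercel |
| --- | --- | --- | --- |
| KV | Workers KV | Object Store | KV |
| DB | D1 (SQLite) | - | Neon, Turso |
| Object | R2 | - | Blob |
| Cache | Cache API | Cache | - |
| Vector | Vectorize | - | - |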

8. CDN Cache Best Practices

8.1 Static Assets

# CSS, JS (hashed filenames)
Cache-Control: public, max-age=31536000, immutable

# Images (rarely change)
Cache-Control: public, max-age=86400, stale-while-revalidate=604800

8.2 HTML

# Changes often, but cacheable
Cache-Control: public, max-age=0, s-maxage=300, stale-while-revalidate=86400

  • max-age=0: browser revalidates every time
  • s-maxage=300: CDN caches for 5 minutes
  • stale-while-revalidate=86400: after that, stale may be served for up to 24 hours while the CDN refreshes in the background

8.3 API JSON

# Per-user data (private)
Cache-Control: private, max-age=60, must-revalidate

# Public API
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=86400

8.4 Per-User Content

# Never cached
Cache-Control: private, no-store

Or include Cookie in the cache key (risky — cardinality explosion).

8.5 Compression + Caching

HTTP/1.1 200 OK
Content-Encoding: gzip
Cache-Control: public, max-age=3600
Vary: Accept-Encoding

Vary: Accept-Encoding is required — gzip vs br vs none are different cache entries.


9. Monitoring and Debugging

9.1 Checking Response Headers

curl -I https://example.com/api/users/123

HTTP/2 200
cf-cache-status: HIT
age: 234
cache-control: public, max-age=3600

Key headers:

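  • cf-cache-status: Cloudflare's cache result (HIT, MISS, EXPIRED, BYPASS, REVALIDATED, ...)
  • age: seconds since the response was fetched from (or revalidated with) origin
  • x-cache: the CloudFront equivalent (for example "Hit from cloudfront")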

9.2 Cache States

HIT           ← served from cache
MISS          ← went to origin
EXPIRED       ← expired, refreshing
REVALIDATED   ← validated via 304
BYPASS        ← cache bypassed (no-cache, etc.)
DYNAMIC       ← uncacheable

9.3 Tracking Hit Ratio

Most CDNs expose it in their dashboard:

  • Cloudflare Analytics
  • Fastly Insights
  • CloudFront Reports

9.4 Debugging Flow

Cache not hitting? Check:

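  1. Cache-Control header: any no-store, no-cache, or private?
  2. Set-Cookie: many CDNs will not cache a response that sets a cookie
  3. Vary: too many variants
  4. Method: POST is usually not cached
  5. Status code: by default CDNs cache only certain statuses, commonly 200, 301, 302, 404, and 410
  6. Query string: are varying query parameters (tracking IDs, cache busters) creating separate entries? If so, configure the CDN to ignore or normalize them.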
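
The checklist can be turned into a quick diagnostic. The Python sketch below is a rough illustration: why_not_cached is a hypothetical helper, the status list mirrors the one above, and real CDN defaults differ by provider and configuration.

```python
# Status codes assumed cacheable by default, taken from the checklist above.
CACHEABLE_STATUSES = {200, 301, 302, 404, 410}


def why_not_cached(method: str, status: int, response_headers: dict) -> list:
    """Return human-readable reasons a CDN would likely refuse to cache this response."""
    headers = {k.lower(): v for k, v in response_headers.items()}
    reasons = []

    directives = {
        d.strip().split("=")[0].lower()
        for d in headers.get("cache-control", "").split(",")
        if d.strip()
    }
    for directive in ("no-store", "no-cache", "private"):
        if directive in directives:
            reasons.append(f"Cache-Control contains {directive}")

    if "set-cookie" in headers:
        reasons.append("response sets a cookie")

    vary = {v.strip().lower() for v in headers.get("vary", "").split(",") if v.strip()}
    if vary & {"*", "user-agent", "cookie"}:
        reasons.append("Vary creates too many variants")

    if method.upper() not in ("GET", "HEAD"):
        reasons.append(f"{method.upper()} requests are usually not cached")

    if status not in CACHEABLE_STATUSES:
        reasons.append(f"status {status} is not cached by default")

    return reasons


print(why_not_cached("GET", 200, {"Cache-Control": "private, max-age=60", "Set-Cookie": "sid=1"}))
```

Feeding it the headers from curl -I gives a first guess before digging into CDN-specific rules.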

10. Real-World — Caching a Blog Site

10.1 Scenario

  • WordPress or Next.js blog
  • 1 million page views per day
  • Posts get updated frequently
  • Comment system

10.2 Caching Strategy

HTML pages:

Cache-Control: public, max-age=0, s-maxage=3600, stale-while-revalidate=86400

Static assets:

Cache-Control: public, max-age=31536000, immutable

API JSON:

Cache-Control: public, max-age=300, stale-while-revalidate=86400

10.3 Invalidation

On post update:

  1. Tag invalidation: post-id, category-slug, homepage
  2. New URLs: hash changes for static assets

// Next.js + Cloudflare (cf.purgeByTags is a hypothetical wrapper around the purge-by-tag API)
async function updatePost(id, data) {
  await db.update(...)
  await cf.purgeByTags([`post-${id}`, 'homepage'])
}

10.4 Results

  • Cache hit ratio: 95%+
  • Origin requests: 50k/day (95% reduction)
  • Response time: 50ms (CDN) vs 500ms (origin)
  • Cost: 95% reduction

Quiz

1. What's the difference between no-cache and no-store?

Answer: no-cache means the response may be stored, but the cache must revalidate it with origin (for example with If-None-Match) before every reuse. If origin answers 304, the stored copy is served, so the cache still saves bandwidth. no-store means caching is forbidden outright: neither the request nor the response may be stored anywhere (memory or disk). Use it for sensitive data. A common mistake is to send no-cache when you actually want "don't cache"; in reality the item is stored and simply revalidated.

2. Why is Stale-While-Revalidate a game changer?

Answer: It gives users an instant response every time. Even after the cache expires, stale data is returned immediately while the refresh happens in the background. Users don't wait, and the next request sees fresh data. Result: a perceived cache hit ratio of 100% while origin traffic stays low. Next.js ISR uses this pattern. SWR largely resolves the classic trade-off between freshness and speed.

3. Name three ways to prevent cache stampede.

Answer: (1) Mutex / distributed lock — for an expired key, only one request goes to origin while the others wait (or read stale), (2) Probabilistic Early Expiration — some requests refresh ahead of TTL, (3) Stale-While-Revalidate — serve the expired entry and refresh in the background. The hotter the key, the more important these protections are. Reddit and Facebook use similar patterns.

4. What happens when CDN cache hit ratio improves by one percentage point?

Answer: Origin traffic drops by far more than 1%. 95% → 96% means origin requests go from 5% to 4% of traffic, a 20% reduction. 96% → 97% means 4% → 3%, a 25% reduction. Every extra point is worth more as hit ratio climbs. This affects (1) server cost, (2) bandwidth cost, (3) latency, and (4) reliability. Even a 0.5-point improvement carries real business value.

5. What does "edge computing goes beyond caching" mean?

Answer: CDNs no longer just cache static files. They run code on nodes close to the user. Use cases: A/B testing, geo routing, auth, HTML transformation, API aggregation. Result: even dynamic content responds under 50ms. Cloudflare Workers runs on V8 isolates with cold starts under 5ms. Fastly Compute@Edge runs Wasm. The origin server's role shrinks while the edge does more of the work.

