Caching Strategies for Node.js APIs: Redis, In-Memory, and Edge
Five caching layers I use in production Node.js + Next.js APIs: in-memory LRU, Redis, edge CDN, stale-while-revalidate, and request coalescing, plus when each one matters.
Cache or scale?
Most "we need to scale our backend" conversations end with "you needed a cache."
A well-cached API can serve on the order of 100x its uncached throughput from the same hardware when hit rates are high. Knowing where to cache, what to cache, and how to invalidate it is a senior-level skill that many mid-level engineers haven't formalized.
Five layers I reach for, in order of latency win.
Layer 1: In-memory LRU cache
A size-bounded Map inside your Node.js process. It is the cheapest and fastest cache you can have, but it survives only as long as the process and is not shared between workers.
Use for:
Don't use for:
Library: lru-cache (the canonical one). 10 minutes to integrate.
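To make the "Map inside your process" idea concrete, here is a minimal hand-rolled sketch of LRU eviction. The class name TinyLRU is made up for illustration and this is not lru-cache's API; in production the library adds TTLs, size accounting, and more.

```typescript
// Minimal LRU cache that relies on Map's insertion-order iteration.
// Illustrative only; TinyLRU is a hypothetical name, not part of lru-cache.
class TinyLRU<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    // Deleting first moves an existing key to the "newest" end.
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Reads count as "use" here, so a key that keeps getting hit stays in the cache while cold keys fall out once the bound is reached.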
Layer 2: Redis
The workhorse of API caching. Network-attached, shared across all your workers.
Use for:
Don't use for:
Library: ioredis for the client. Hosted: Upstash (serverless-friendly, generous free tier), Redis Cloud (more mature), or self-hosted on Railway.
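A common way to use Redis here is the cache-aside pattern: check the cache, fall back to the source, then populate the cache. The sketch below is written against a small hypothetical StringStore interface that mirrors the shape of ioredis's get(key) and set(key, value, "EX", seconds) calls, with an in-memory stand-in (FakeStore) so it runs without a server. The helper name cached is also an assumption.

```typescript
// Subset of a Redis-style client used by this sketch (hypothetical interface).
interface StringStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, mode: "EX", ttlSeconds: number): Promise<unknown>;
}

// Cache-aside: try the cache, fall back to the loader, then fill the cache.
async function cached<T>(
  store: StringStore,
  key: string,
  ttlSeconds: number,
  load: () => Promise<T>,
): Promise<T> {
  const hit = await store.get(key);
  if (hit !== null) return JSON.parse(hit) as T;

  const fresh = await load();
  // "EX" asks the store to expire the key after ttlSeconds.
  await store.set(key, JSON.stringify(fresh), "EX", ttlSeconds);
  return fresh;
}

// In-memory stand-in for demonstration; it ignores the TTL arguments.
class FakeStore implements StringStore {
  private readonly data = new Map<string, string>();

  async get(key: string): Promise<string | null> {
    return this.data.get(key) ?? null;
  }

  async set(key: string, value: string): Promise<unknown> {
    this.data.set(key, value);
    return "OK";
  }
}
```

Values are serialized as JSON because the store holds strings, so anything that does not survive JSON.stringify (Dates, Maps, class instances) would need its own encoding.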
Layer 3: Edge or CDN caching
Cache responses at the CDN layer (Cloudflare, Vercel's edge cache, CloudFront). This is often the biggest end-to-end latency win: on a hit, the response is served from the user's geographic region and never reaches your origin.
Use for:
Don't use for:
Configure via Cache-Control headers (s-maxage, stale-while-revalidate). Vercel and Cloudflare both respect these.
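As a small sketch, the header value can be built from a typed policy so the numbers live in one place. The names EdgeCachePolicy and edgeCacheControl are hypothetical, and the 60 and 300 second values in the usage note are placeholders rather than recommendations.

```typescript
interface EdgeCachePolicy {
  // How long shared caches (CDNs) may treat the response as fresh.
  sMaxAgeSeconds: number;
  // Extra window in which a stale copy may be served while it is refreshed.
  staleWhileRevalidateSeconds: number;
}

// Builds a Cache-Control value aimed at shared caches.
function edgeCacheControl(policy: EdgeCachePolicy): string {
  return [
    "public",
    `s-maxage=${policy.sMaxAgeSeconds}`,
    `stale-while-revalidate=${policy.staleWhileRevalidateSeconds}`,
  ].join(", ");
}
```

In an Express or plain Node handler this would be set with something like res.setHeader("Cache-Control", edgeCacheControl({ sMaxAgeSeconds: 60, staleWhileRevalidateSeconds: 300 })).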
Layer 4: Stale-while-revalidate (SWR)
A pattern, not a layer. When data is stale, serve the stale version immediately AND kick off a background refresh.
Use for:
Don't use for:
Implementation: in Next.js, the revalidate config gives you this behavior natively. To do it manually, return the cached value and start the refresh without awaiting it (fire-and-forget).
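The manual version can be sketched in a few dozen lines. SwrCache is a hypothetical name, the injectable clock exists only to make the behavior easy to test, and swallowing refresh errors is one possible policy, not the only one.

```typescript
interface SwrEntry<T> {
  value: T;
  fetchedAt: number; // epoch milliseconds
}

class SwrCache<T> {
  private readonly entries = new Map<string, SwrEntry<T>>();
  private readonly refreshing = new Set<string>();
  private readonly maxAgeMs: number;
  private readonly now: () => number;

  constructor(maxAgeMs: number, now: () => number = Date.now) {
    this.maxAgeMs = maxAgeMs;
    this.now = now;
  }

  async get(key: string, load: () => Promise<T>): Promise<T> {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      // Nothing to serve yet, so this caller has to wait for the load.
      return this.refresh(key, load);
    }
    const isStale = this.now() - entry.fetchedAt > this.maxAgeMs;
    if (isStale && !this.refreshing.has(key)) {
      // Fire-and-forget: not awaited. Errors are swallowed here so the
      // stale value keeps being served until a later refresh succeeds.
      void this.refresh(key, load).catch(() => undefined);
    }
    return entry.value;
  }

  private async refresh(key: string, load: () => Promise<T>): Promise<T> {
    this.refreshing.add(key);
    try {
      const value = await load();
      this.entries.set(key, { value, fetchedAt: this.now() });
      return value;
    } finally {
      this.refreshing.delete(key);
    }
  }
}
```

The refreshing set keeps a stale key from triggering more than one background refresh at a time. Concurrent callers on a completely cold key still each run the loader in this sketch.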
Layer 5: Request coalescing
When 100 concurrent requests ask for the same uncached data, naively you fire 100 DB queries. Coalescing means you fire one query, and 99 requests wait for its result.
Use for:
Implementation: wrap your fetch function with a "dedupe map" — same key, same in-flight promise. promise-memoize or hand-rolled in 20 lines.
This is the difference between a graceful cache miss and a thundering herd that takes down your DB.
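A hand-rolled dedupe map might look like the sketch below. The function name coalesce and the module-level inFlight map are assumptions for illustration; this is not promise-memoize's API.

```typescript
// Maps a cache key to the promise of the load that is currently running.
const inFlight = new Map<string, Promise<unknown>>();

// Returns the in-flight promise for `key` if there is one; otherwise starts
// `load` and shares its promise with every caller that arrives before it settles.
function coalesce<T>(key: string, load: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing !== undefined) return existing as Promise<T>;

  const pending = load().finally(() => {
    // Clear the slot on success or failure so later calls can retry.
    inFlight.delete(key);
  });
  inFlight.set(key, pending);
  return pending;
}
```

Coalescing only deduplicates work that overlaps in time. Once the promise settles the slot is cleared, so it complements a cache rather than replacing one. It also only applies within a single process.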
A real-world stacking example
A SaaS dashboard endpoint that returns the user's recent activity:
Result: from around 80ms per request (cache miss) to around 3ms (in-memory hit), with the DB seeing maybe 1 query per 30 seconds per user.
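One plausible way to stack layers for an endpoint like this is a read-through lookup that checks the fastest layer first and backfills on the way out. This is only a sketch: CacheLayer, readThrough, and MapLayer are hypothetical names, and TTLs and eviction are left out to keep the layering logic visible.

```typescript
// Hypothetical layer interface; a real setup might back one layer with
// an in-process LRU and another with Redis.
interface CacheLayer<T> {
  get(key: string): Promise<T | undefined>;
  set(key: string, value: T): Promise<void>;
}

// Checks layers from fastest to slowest, then falls back to the database.
async function readThrough<T>(
  layers: CacheLayer<T>[],
  key: string,
  loadFromDb: () => Promise<T>,
): Promise<T> {
  const missed: CacheLayer<T>[] = [];
  for (const layer of layers) {
    const hit = await layer.get(key);
    if (hit !== undefined) {
      // Backfill the faster layers that missed.
      await Promise.all(missed.map((m) => m.set(key, hit)));
      return hit;
    }
    missed.push(layer);
  }
  const fresh = await loadFromDb();
  await Promise.all(missed.map((m) => m.set(key, fresh)));
  return fresh;
}

// Map-backed layer for demonstration (no TTL, no eviction).
class MapLayer<T> implements CacheLayer<T> {
  private readonly data = new Map<string, T>();

  async get(key: string): Promise<T | undefined> {
    return this.data.get(key);
  }

  async set(key: string, value: T): Promise<void> {
    this.data.set(key, value);
  }
}
```

Called as readThrough([memoryLayer, sharedLayer], key, queryDb), a restarted worker with an empty memory layer would still be served from the shared layer without touching the database.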
Invalidation: the hard part
Caching is easy. Invalidating correctly is hard.
TTL-based (lazy)
The simplest option: each entry expires after N seconds. You trade bounded staleness for zero invalidation logic. Use it when data up to one TTL old is acceptable.
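A minimal sketch of lazy TTL expiry, assuming a hypothetical TtlCache class with an injectable clock so the expiry is easy to exercise in tests:

```typescript
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Lazy expiry: the entry is dropped when someone next asks for it.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Because expiry is checked only on read, keys that are never read again stay in memory, which is one reason to pair a TTL with a size bound.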
Event-driven (eager)
When data mutates, you publish an invalidation event. Listeners flush related cache keys. Use when you need consistency within seconds of a write.
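The publish-and-flush flow can be sketched with a tiny in-process bus. Everything named here (InvalidationBus, the user.updated event, the activity:<id> key scheme, updateUser) is hypothetical, and a multi-instance deployment would need a shared channel such as Redis pub/sub instead of an in-process one.

```typescript
type InvalidationListener = (entityId: string) => void;

// Tiny in-process event bus, for illustration only.
class InvalidationBus {
  private readonly listeners = new Map<string, InvalidationListener[]>();

  on(event: string, listener: InvalidationListener): void {
    const current = this.listeners.get(event) ?? [];
    this.listeners.set(event, [...current, listener]);
  }

  publish(event: string, entityId: string): void {
    for (const listener of this.listeners.get(event) ?? []) {
      listener(entityId);
    }
  }
}

const activityCache = new Map<string, string[]>();
const bus = new InvalidationBus();

// Listener: flush the cache key derived from the mutated user.
bus.on("user.updated", (userId) => {
  activityCache.delete(`activity:${userId}`);
});

// Write path: persist first, then announce the change.
function updateUser(userId: string): void {
  // ... write to the database here (omitted in this sketch) ...
  bus.publish("user.updated", userId);
}
```

Publishing after the write matters: flushing before the write leaves a window in which a concurrent read can repopulate the cache with the old value.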
Cache tags (Next.js)
Next.js's revalidateTag is the cleanest invalidation primitive I've used. Tag a fetch with a label, then later call revalidateTag(label) to invalidate everything carrying that tag.
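To show why tags are convenient, here is a framework-free sketch of the idea: keep an index from tag to keys and drop every key under a tag in one call. This is not how Next.js implements revalidateTag, and TaggedCache and invalidateTag are hypothetical names.

```typescript
class TaggedCache<V> {
  private readonly values = new Map<string, V>();
  private readonly keysByTag = new Map<string, Set<string>>();

  set(key: string, value: V, tags: string[]): void {
    this.values.set(key, value);
    for (const tag of tags) {
      const keys = this.keysByTag.get(tag) ?? new Set<string>();
      keys.add(key);
      this.keysByTag.set(tag, keys);
    }
  }

  get(key: string): V | undefined {
    return this.values.get(key);
  }

  // Drops every entry that was stored with the given tag.
  invalidateTag(tag: string): void {
    const keys = this.keysByTag.get(tag) ?? new Set<string>();
    for (const key of keys) {
      this.values.delete(key);
    }
    this.keysByTag.delete(tag);
  }
}
```

The write path only needs to know the tag (for example "user:42"), not every cache key that happens to depend on that user.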
What I don't cache
Things I see over-cached:
TL;DR
If your Node.js or Next.js API is hitting scaling limits and you want a senior engineer to architect the caching strategy, contact me.