Do You Actually Need Redis? Caching Decisions for Real Apps
Redis gets added to architectures by reflex. Here is a senior dev decision framework for when a cache earns its place, and when Postgres is enough.
I have reviewed maybe forty architecture diagrams in the last two years, and roughly thirty of them had a Redis box wired into the middle of everything. When I ask the team what it does, the answer is usually some variant of "caching" followed by a pause. Then we look at the dashboards and Redis is serving four hundred requests a minute against a database doing thirty thousand. The cache is not earning its keep. It is a second stateful system you now have to monitor, secure, fail over, and reason about during incidents, and it bought you nothing.
Redis is a genuinely great piece of software. That is exactly why it gets added by reflex. This post is the decision framework I actually use: when a cache earns its place, when Postgres is already enough, and how to run a cache correctly if you do need one.
Start by assuming you do not need it
The default answer for a new app should be no Redis. Postgres at a modest instance size handles a staggering amount of load before it breaks a sweat, and it has caching layers most people forget exist.
The first is the OS page cache. Your hot rows and indexes live in RAM whether you asked for it or not. The second is shared_buffers, Postgres's own buffer pool. A primary-key lookup that hits a warm buffer returns in well under a millisecond. You do not need a network round trip to Redis to beat that; you need the query to be indexed and the working set to fit in memory.
Before you reach for Redis, exhaust these in order:
1. Index the query and check the plan. Most "we need a cache" problems are actually a sequential scan nobody noticed.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, title, price
FROM products
WHERE category_id = 42
ORDER BY popularity DESC
LIMIT 20;

If you see Seq Scan over a large table, or Buffers: read= numbers in the thousands, you have a missing index, not a missing cache. A composite index on (category_id, popularity DESC) can turn a 200ms query into a 2ms one. That is a bigger win than Redis and it removes work instead of adding a system.
2. Use a materialized view for expensive aggregates. If the slow thing is a dashboard rollup or a leaderboard computed over millions of rows, precompute it inside the database.
CREATE MATERIALIZED VIEW category_stats AS
SELECT category_id,
COUNT(*) AS product_count,
AVG(price) AS avg_price,
MAX(updated_at) AS last_updated
FROM products
GROUP BY category_id;
CREATE UNIQUE INDEX ON category_stats (category_id);
-- Refresh without locking readers:
REFRESH MATERIALIZED VIEW CONCURRENTLY category_stats;

You refresh it on a schedule or after a relevant write. Reads are now a cheap indexed scan over a tiny table. No extra infrastructure, and each refresh is a transactionally consistent snapshot of the rest of your database, stale only by the time since the last refresh.
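One way to handle the "after a relevant write" case is to coalesce refresh requests, so a burst of writes triggers at most one extra refresh instead of a queue of them. The sketch below assumes a runSql function that executes a statement against your database; that name, makeViewRefresher, and the coalescing policy are illustrative choices, not part of Postgres or any driver.

```typescript
// Stand-in for whatever executes SQL in your app (for example a pool.query wrapper).
type RunSql = (sql: string) => Promise<unknown>;

// Returns a function you can call after every relevant write.
// At most one refresh runs at a time; calls made while one is running
// are collapsed into a single follow-up refresh.
function makeViewRefresher(runSql: RunSql, viewName: string): () => Promise<void> {
  let running = false;
  let queued = false;

  return async function refresh(): Promise<void> {
    if (running) {
      queued = true; // remember that the data changed again, then return at once
      return;
    }
    running = true;
    try {
      do {
        queued = false;
        // viewName is interpolated as an identifier: pass only trusted, hard-coded names.
        await runSql(`REFRESH MATERIALIZED VIEW CONCURRENTLY ${viewName}`);
      } while (queued);
    } finally {
      running = false;
    }
  };
}
```

Calling the returned function from a write path is cheap when a refresh is already in flight: it sets a flag and returns. The guard is per process, so with several app instances you would still want a scheduler or a database-side lock to keep them from refreshing concurrently.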
3. Cache in-process. If you have a small set of values that are read constantly and changed rarely (feature flags, currency rates, a config table), an LRU map in application memory is faster than any network cache, and on a single instance there is no fleet-wide stampede to worry about. The tradeoff is per-instance staleness; bound it with a short TTL and move on.
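A minimal sketch of such an in-process cache, assuming a size cap plus a single TTL for every entry. TtlLruCache and its constructor parameters are names made up for this example; a production app might prefer an established LRU library.

```typescript
// In-process cache with LRU eviction and a per-entry TTL.
class TtlLruCache<K, V> {
  // Map preserves insertion order, so the first key is the least recently used.
  private entries = new Map<K, { value: V; expiresAt: number }>();
  private maxEntries: number;
  private ttlMs: number;
  private now: () => number;

  constructor(maxEntries: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, mainly for tests
  }

  get(key: K): V | undefined {
    const hit = this.entries.get(key);
    if (hit === undefined) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    // Re-insert to mark this key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (first key in iteration order).
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }
}
```

A read does not extend an entry's TTL here, so staleness stays bounded by ttlMs no matter how hot the key is.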
4. Cache at the HTTP layer. For read-heavy public endpoints, the right cache is often a CDN or a Cache-Control header, not a key-value store your app has to manage. Let the edge serve it.
// Next.js route handler: let the CDN do the caching
export async function GET() {
  const data = await getCategoryStats();
  return Response.json(data, {
    headers: {
      "Cache-Control": "public, s-maxage=60, stale-while-revalidate=300",
    },
  });
}

s-maxage=60 means the edge serves a cached copy for a minute; stale-while-revalidate=300 means for the next five minutes it serves the stale copy instantly while it refreshes in the background. Your origin sees one request per minute per region instead of thousands. You did not write a single line of cache code.
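To make the two windows concrete, here is a small sketch that classifies a cached response by its age. edgeState is a hypothetical helper written for this post, not a CDN API, and real CDNs differ in how fully they honor stale-while-revalidate.

```typescript
type EdgeState = "fresh" | "stale-while-revalidate" | "expired";

// Classify a cached response by age, given the two directive values in seconds.
function edgeState(ageSec: number, sMaxAge: number, swr: number): EdgeState {
  if (ageSec < sMaxAge) return "fresh"; // served from the edge, origin untouched
  if (ageSec < sMaxAge + swr) return "stale-while-revalidate"; // served stale, refreshed in the background
  return "expired"; // the request waits on the origin
}

// With the header above (s-maxage=60, stale-while-revalidate=300):
const examples = [30, 61, 400].map((age) => edgeState(age, 60, 300));
```

The three sample ages land in the fresh, stale-while-revalidate, and expired windows respectively.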
When you genuinely do need Redis
There is a real list, and the common thread is that these are things Postgres is bad at or that require shared state across instances at high write rates.
| Use case | Why Postgres struggles | Redis fit |
|---|---|---|
| Rate limiting | Per-request writes hammer WAL; row contention | Atomic counters, native TTL |
| Sessions at scale | High write churn, short-lived rows bloat tables | Volatile by design |
| Hot-key caching | One row read 50k times/sec saturates a backend | In-memory, microsecond reads |
| Queues / pub-sub | LISTEN/NOTIFY and SKIP LOCKED work but cap out | Streams, lists, pub-sub built in |
| Leaderboards | ORDER BY score over millions per request is costly | Sorted sets, O(log n) rank |
| Distributed locks | pg_advisory_lock ties up a connection | SET NX PX, lightweight |
If your need is on this list and at meaningful scale, Redis is the right tool and you should stop apologizing for it. Rate limiting is the clearest case: you want an atomic increment-and-expire that does not write to disk on every API call. Here is a token-bucket limiter that runs the whole decision in one round trip with a Lua script, so it is atomic without a transaction:
import { Redis } from "ioredis";

const redis = new Redis(process.env.REDIS_URL!);

// Token bucket: `capacity` tokens, refilled at `refillPerSec`.
// Returns true if the request is allowed.
const SCRIPT = `
  local key = KEYS[1]
  local capacity = tonumber(ARGV[1])
  local refill = tonumber(ARGV[2])
  local now = tonumber(ARGV[3])
  local cost = tonumber(ARGV[4])

  local b = redis.call("HMGET", key, "tokens", "ts")
  local tokens = tonumber(b[1]) or capacity
  local ts = tonumber(b[2]) or now

  local delta = math.max(0, now - ts)
  tokens = math.min(capacity, tokens + delta * refill)

  local allowed = tokens >= cost
  if allowed then tokens = tokens - cost end

  redis.call("HSET", key, "tokens", tokens, "ts", now)
  redis.call("EXPIRE", key, math.ceil(capacity / refill) + 1)
  return allowed and 1 or 0
`;

export async function allow(userId: string): Promise<boolean> {
  const res = await redis.eval(
    SCRIPT, 1,
    `rl:${userId}`,
    "100", // capacity
    "10", // 10 tokens/sec
    (Date.now() / 1000).toString(),
    "1", // cost per request
  );
  return res === 1;
}

Doing this in Postgres means a write per request and lock contention on the counter row under load. Redis does it in memory with native key expiry. This is the kind of workload it was built for.
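The bucket arithmetic is easy to get subtly wrong, so it can help to mirror it in a pure function you can unit test without a Redis server. The sketch below follows the same steps as the Lua script; takeToken and the Bucket shape are names invented for this example.

```typescript
interface Bucket {
  tokens: number;
  ts: number; // seconds, same unit as nowSec
}

// Same arithmetic as the Lua script, as a pure function over an in-memory bucket.
function takeToken(
  prev: Bucket | undefined,
  capacity: number,
  refillPerSec: number,
  nowSec: number,
  cost: number,
): { allowed: boolean; next: Bucket } {
  // A missing bucket starts full, exactly like the `or capacity` fallback in Lua.
  const startTokens = prev?.tokens ?? capacity;
  const lastTs = prev?.ts ?? nowSec;

  // Refill for the elapsed time, never beyond capacity, never for negative time.
  const elapsed = Math.max(0, nowSec - lastTs);
  let tokens = Math.min(capacity, startTokens + elapsed * refillPerSec);

  const allowed = tokens >= cost;
  if (allowed) tokens -= cost;

  // The timestamp advances even on a denied request, as in the script.
  return { allowed, next: { tokens, ts: nowSec } };
}
```

With a capacity of 2 and a refill of 1 token per second, two immediate calls are allowed, a third is denied, and one second later a single call is allowed again.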
If you cache, do it correctly
Most cache bugs are not Redis bugs. They are caused by getting two things wrong, and Phil Karlton was right that they are the two hard problems in computer science: naming things and cache invalidation.
Naming. Pick a key scheme and write it down: entity:id:version, for example user:8821:v3. Bake a schema version into the key so a deploy that changes the shape of cached data does not serve garbage to the old code. Never build keys ad hoc at call sites; centralize them in one module so invalidation is greppable.
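A centralized key module can be very small. The sketch below is one possible layout; KEY_VERSIONS and cacheKeys are hypothetical names, and the entities are chosen only to match the example keys in this post.

```typescript
// Bump an entity's version whenever the shape of its cached value changes.
const KEY_VERSIONS = {
  user: 3,
  categoryStats: 2,
} as const;

// The only place in the codebase that builds cache keys.
const cacheKeys = {
  user: (id: number | string): string => `user:${id}:v${KEY_VERSIONS.user}`,
  categoryStats: (categoryId: number): string =>
    `category:${categoryId}:stats:v${KEY_VERSIONS.categoryStats}`,
};

// Usage: every read, write, and delete goes through the same builder.
const userKey = cacheKeys.user(8821);
```

Because call sites never assemble key strings by hand, finding every place a key is written or invalidated is a search for cacheKeys.user.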
Invalidation. The default strategy should be cache-aside with a TTL. The app checks the cache, falls back to the database on a miss, and writes the result back with an expiry. The TTL is your safety net: even if you forget to invalidate on a write, the data self-heals within a bounded window. Write-through and read-through are tempting but couple your cache and database write paths; for most apps, cache-aside plus a short TTL plus explicit deletion on known writes is the pragmatic choice.
The trap everyone hits at scale is the cache stampede, also called the thundering herd. A popular key expires, and in the same millisecond a thousand requests miss, all stampede the database with the identical expensive query, and your origin falls over. The fixes:
- Add jitter to TTLs so keys do not all expire at the same instant.
- Use a lock so only one request recomputes while the others wait or serve stale data.
- Serve stale-while-revalidate so readers never block on a miss.
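Here is a cache-aside helper that combines the first two. One caller wins a short lock and recomputes; the others wait briefly and retry the cache, and only fall back to computing the value themselves if it is still missing, so a slow load degrades to extra work rather than a hang.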
type Loader<T> = () => Promise<T>;

async function cached<T>(
  key: string,
  ttlSec: number,
  load: Loader<T>,
): Promise<T> {
  const raw = await redis.get(key);
  if (raw) return JSON.parse(raw) as T;

  // Miss: try to win the recompute lock.
  const lockKey = `lock:${key}`;
  const gotLock = await redis.set(lockKey, "1", "EX", 10, "NX");

  if (!gotLock) {
    // Someone else is recomputing. Briefly wait for them, then retry.
    await new Promise((r) => setTimeout(r, 50));
    const retry = await redis.get(key);
    if (retry) return JSON.parse(retry) as T;
    // Fall through and compute ourselves rather than hang.
  }

  try {
    const value = await load();
    // Jitter the TTL by +/-10% to desynchronize expiries.
    const jitter = Math.floor(ttlSec * (0.9 + Math.random() * 0.2));
    await redis.set(key, JSON.stringify(value), "EX", jitter);
    return value;
  } finally {
    // Releases by key only: if load() outlasts the 10s lock, this can
    // delete a lock that a later caller has since acquired.
    if (gotLock) await redis.del(lockKey);
  }
}

// Usage:
const stats = await cached("category:42:stats:v2", 60, () =>
  db.getCategoryStats(42),
);

For genuinely hot keys, go one step further: store the value with a longer hard TTL plus a separate "logical" expiry timestamp, and refresh in the background when you cross the logical expiry while still serving the existing value. That is the true stale-while-revalidate pattern, and it means that once a key is warm, no reader blocks on a recompute.
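A minimal sketch of that logical-expiry pattern follows. It is written against a tiny Store interface rather than a specific client, so the interface, swrGet, and the Envelope shape are all assumptions made for illustration; a thin wrapper around your Redis client's get and set-with-expiry would satisfy it.

```typescript
// Value plus its logical ("soft") expiry, stored together under one key.
interface Envelope<T> {
  value: T;
  softExpiresAt: number; // epoch milliseconds
}

// Hypothetical minimal store; hardTtlSec is the real expiry enforced by the cache.
interface Store {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, hardTtlSec: number): Promise<void>;
}

// Per-process guard so one stale key triggers one background refresh at a time.
const refreshing = new Set<string>();

async function swrGet<T>(
  store: Store,
  key: string,
  softTtlSec: number,
  hardTtlSec: number,
  load: () => Promise<T>,
  now: () => number = () => Date.now(),
): Promise<T> {
  const refresh = async (): Promise<T> => {
    const value = await load();
    const env: Envelope<T> = { value, softExpiresAt: now() + softTtlSec * 1000 };
    await store.set(key, JSON.stringify(env), hardTtlSec);
    return value;
  };

  const raw = await store.get(key);
  if (raw === null) return refresh(); // cold miss: nothing to serve, so this reader waits

  const env = JSON.parse(raw) as Envelope<T>;
  if (env.softExpiresAt <= now() && !refreshing.has(key)) {
    refreshing.add(key);
    // Fire and forget: the caller gets the stale value immediately.
    void refresh()
      .catch(() => { /* keep serving stale; log the failure in real code */ })
      .finally(() => refreshing.delete(key));
  }
  return env.value;
}
```

The hard TTL should be comfortably longer than the soft one, otherwise the key disappears before anyone gets a chance to refresh it. Cold misses still block and can still stampede, so this complements the lock-based helper rather than replacing it, and the refresh guard is per process, not cluster-wide.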
Managed versus self-hosted
If you have decided you need Redis, the next question is whether to run it yourself.
For most teams in 2026, a managed serverless cache like Upstash is the right call. You get a REST and TCP endpoint, pay per request rather than per uptime hour, and there is no failover, patching, or memory-pressure eviction tuning to own.
import { Redis } from "@upstash/redis";
const redis = Redis.fromEnv(); // UPSTASH_REDIS_REST_URL + TOKEN
await redis.set("user:8821:v3", JSON.stringify(user), { ex: 300 });

Self-host only when the economics flip: sustained high-throughput workloads where per-request pricing exceeds the cost of a dedicated instance, sub-millisecond latency requirements that need Redis colocated in your VPC, or compliance constraints that forbid a third party. At that point you own persistence config, eviction policy (maxmemory-policy allkeys-lru for a pure cache), replication, and on-call. A rough rule: if your cache traffic is bursty and unpredictable, managed wins on cost and operations; if it is steady and heavy, a self-hosted instance is cheaper per request but you pay in operational time.
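The point where the economics flip is simple arithmetic once you have two numbers: what the managed service charges per request and what a dedicated instance costs per month. The sketch below shows the calculation only. The function names are made up, and the prices in the usage lines are placeholders, not quotes from Upstash or any other provider.

```typescript
// Monthly request volume at which per-request pricing equals a flat instance cost.
function breakEvenRequestsPerMonth(
  pricePerMillionRequests: number,
  instanceCostPerMonth: number,
): number {
  return (instanceCostPerMonth / pricePerMillionRequests) * 1_000_000;
}

// Convert a monthly volume into the sustained request rate it implies (30-day month).
function sustainedRps(requestsPerMonth: number): number {
  const secondsPerMonth = 30 * 24 * 60 * 60;
  return requestsPerMonth / secondsPerMonth;
}

// Placeholder prices for illustration only: plug in your own quotes.
const breakEven = breakEvenRequestsPerMonth(2, 100);
const rps = sustainedRps(breakEven);
```

This compares infrastructure bills only. It leaves out the operational time of running Redis yourself, which for a small team is often the larger cost.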
The decision checklist
Walk this top to bottom. Stop at the first row that solves your problem.
- Is the query indexed? Run EXPLAIN ANALYZE. Fix the plan before adding anything. Most cache needs evaporate here.
- Is it an expensive aggregate that changes slowly? Use a materialized view with REFRESH MATERIALIZED VIEW CONCURRENTLY.
- Is it a tiny, near-static dataset? Cache it in process with a short TTL.
- Is it a read-heavy public endpoint? Cache at the CDN with s-maxage and stale-while-revalidate.
- Is your actual need rate limiting, sessions at scale, hot-key caching, queues, leaderboards, or distributed locks? Now Redis earns its place.
- If you cache in Redis: versioned key names, cache-aside with a jittered TTL, and stampede protection via lock or stale-while-revalidate. Centralize key construction so invalidation is one grep away.
- Managed or self-hosted? Bursty traffic, managed (Upstash). Steady heavy load or colocation needs, self-hosted with explicit eviction and persistence config.
Redis is not a status symbol and it is not a default. It is a specialized tool that is spectacular at a specific list of jobs and pure overhead everywhere else. The senior move is not adding it; it is being able to explain, in one sentence, exactly which row on that checklist it is solving. If you cannot, delete the box from the diagram and let Postgres do its job.