Redis Complete Guide 2025: Caching Strategies, Data Structures, Pub/Sub, and Redis Stack

Author: Youngju Kim (@fjvbn20031)
- Introduction
- 1. Redis Overview
- 2. Five Core Data Structures
- 3. Advanced Data Structures
- 4. Caching Patterns
- 5. Cache Invalidation
- 6. Pub/Sub and Streams
- 7. Redis Stack
- 8. Lua Scripting
- 9. Redis Cluster
- 10. Redis in Practice
- 11. Redis vs Memcached vs DragonflyDB
- 12. Client Library Code Examples
- 13. Interview Questions (15)
- Q1. How is Redis fast despite being single-threaded?
- Q2. What is the difference between Cache-Aside and Write-Through?
- Q3. Explain Redis persistence mechanisms (RDB, AOF).
- Q4. Why might MGET fail in Redis Cluster?
- Q5. What is the Thundering Herd problem?
- Q6. What is the difference between Redis Pub/Sub and Streams?
- Q7. Why are big keys problematic in Redis?
- Q8. Explain Redis eviction policies.
- Q9. Why use Lua scripting in Redis?
- Q10. What is the difference between Redis Sentinel and Redis Cluster?
- Q11. What is Redis pipelining?
- Q12. How does Redis handle key expiration?
- Q13. What is the Redlock algorithm?
- Q14. What is Redis slow log?
- Q15. What are the pros and cons of Redis as a session store?
- 14. Quiz
- References
Introduction
Redis is the most widely used in-memory data store in 2025. Beyond simple caching, it serves as a message broker, session store, real-time leaderboard, rate limiter, and distributed lock. From Redis 7.4 features and the emergence of Redis Stack to the Valkey fork controversy, this guide covers everything about Redis.
1. Redis Overview
What is Redis?
Redis (Remote Dictionary Server) is an in-memory key-value data store. Because all data lives in memory, operations typically complete in well under a millisecond (often microseconds on the server side).
Redis 7.x Key Features
- Redis Functions (7.0) — Evolution of Lua scripts, managed as server-side libraries
- ACL v2 (7.0) — Fine-grained access control with selectors
- Client-side caching (6.0) — Server-assisted client cache invalidation
- Multi-part AOF (7.0) — Persistence split into base and incremental files
The Valkey Fork Story
In March 2024, Redis changed its license to dual RSALv2 / SSPLv1, and the Linux Foundation forked Redis 7.2 as Valkey. AWS, Google, Oracle, and others back Valkey. The two projects currently maintain compatibility, but they may diverge long-term.
2. Five Core Data Structures
2.1 String
The most basic data type. Can store up to 512MB.
# Basic SET/GET
SET user:1:name "Alice"
GET user:1:name
# Atomic increment/decrement
SET page:views 0
INCR page:views # 1
INCRBY page:views 10 # 11
# TTL setting
SET session:abc123 "user_data" EX 3600 # Expires in 1 hour
TTL session:abc123 # Check remaining time
# SET options
SET lock:resource "owner1" NX EX 30 # NX: Set only if key doesn't exist
SET user:1:name "Bob" XX # XX: Update only if key exists
Use cases: Session tokens, counters, temporary data, distributed locks
2.2 List
Conceptually a doubly linked list (modern Redis implements it as a quicklist). O(1) push/pop from both ends.
# Basic operations
LPUSH queue:emails "email1" "email2" "email3"
RPOP queue:emails # "email1" (FIFO queue)
# Range query
LRANGE queue:emails 0 -1 # All elements
# Blocking pop (message queue pattern)
BRPOP queue:emails 30 # Wait 30 seconds then pop
# Trimming (keep recent N items)
LPUSH notifications:user1 "new_msg"
LTRIM notifications:user1 0 99 # Keep only recent 100
Use cases: Message queues, recent activity feeds, job queues
2.3 Set
Unordered collection of unique elements. Supports set operations (union, intersection, difference).
# Basic operations
SADD tags:post:1 "python" "redis" "backend"
SADD tags:post:2 "python" "django" "orm"
# Membership check
SISMEMBER tags:post:1 "python" # 1 (true)
# Set operations
SINTER tags:post:1 tags:post:2 # "python" (intersection)
SUNION tags:post:1 tags:post:2 # union
SDIFF tags:post:1 tags:post:2 # elements only in post:1
# Random sampling
SRANDMEMBER tags:post:1 2 # Random 2 elements
Use cases: Tag systems, unique visitor tracking, friend relationships, recommendation systems
2.4 Sorted Set (ZSet)
Collection of unique elements sorted by score. Ideal for leaderboards.
# Leaderboard implementation
ZADD leaderboard 1500 "player:alice"
ZADD leaderboard 2300 "player:bob"
ZADD leaderboard 1800 "player:charlie"
# Ranking query (descending by score)
ZREVRANGE leaderboard 0 2 WITHSCORES
# 1) "player:bob" 2) "2300"
# 3) "player:charlie" 4) "1800"
# 5) "player:alice" 6) "1500"
# Specific member rank (0-based)
ZREVRANK leaderboard "player:alice" # 2
# Score increment
ZINCRBY leaderboard 500 "player:alice" # 2000
# Range search
ZRANGEBYSCORE leaderboard 1500 2000 WITHSCORES
Use cases: Leaderboards, priority queues, time-ordered events, rate limiting
2.5 Hash
Map of field-value pairs. Ideal for representing objects.
# Store user profile
HSET user:1 name "Alice" email "alice@example.com" age "30" role "admin"
# Get individual field
HGET user:1 name # "Alice"
# Get all
HGETALL user:1
# Field increment
HINCRBY user:1 age 1 # 31
# Existence check
HEXISTS user:1 email # 1 (true)
# Multiple fields at once
HMGET user:1 name email role
Use cases: User profiles, configuration values, session data, shopping carts
3. Advanced Data Structures
3.1 HyperLogLog
Probabilistic data structure for estimating unique element counts. Uses only 12KB of memory to count up to 2^64 elements (0.81% error rate).
# Estimate unique visitors
PFADD visitors:2025-03-23 "user1" "user2" "user3"
PFADD visitors:2025-03-23 "user1" "user4" # user1 is duplicate
PFCOUNT visitors:2025-03-23 # 4
# Merge multiple days
PFMERGE visitors:week visitors:2025-03-23 visitors:2025-03-24
PFCOUNT visitors:week
3.2 Bitmap
Bit-level operations. Memory-efficient for boolean state tracking.
# Daily attendance check
SETBIT attendance:2025-03-23 1001 1 # User 1001 present
SETBIT attendance:2025-03-23 1002 1
SETBIT attendance:2025-03-23 1003 0 # Absent
# Check attendance
GETBIT attendance:2025-03-23 1001 # 1
# Count attendees
BITCOUNT attendance:2025-03-23 # 2
# Consecutive attendance (AND operation)
BITOP AND consecutive attendance:2025-03-22 attendance:2025-03-23
BITCOUNT consecutive
3.3 Geospatial
Location-based data. Supports radius search and distance calculation.
# Add locations (longitude, latitude)
GEOADD stores 126.9784 37.5665 "gangnam-store"
GEOADD stores 127.0276 37.4979 "samsung-store"
GEOADD stores 126.9316 37.5563 "hongdae-store"
# Distance calculation
GEODIST stores "gangnam-store" "hongdae-store" km # ~4.3 km
# Radius search (Redis 6.2+)
GEOSEARCH stores FROMLONLAT 126.9784 37.5665 BYRADIUS 10 km ASC COUNT 5
3.4 Redis Streams
Log-based message structure. Supports Consumer Groups similar to Kafka.
# Add messages
XADD events * type "order" user_id "123" amount "50000"
XADD events * type "payment" user_id "123" status "completed"
# Read
XRANGE events - + COUNT 10
# Create Consumer Group
XGROUP CREATE events analytics-group $ MKSTREAM
# Read as consumer
XREADGROUP GROUP analytics-group consumer-1 COUNT 5 BLOCK 2000 STREAMS events >
# ACK (processing complete)
XACK events analytics-group "1679000000000-0"
# Check pending messages
XPENDING events analytics-group
4. Caching Patterns
4.1 Cache-Aside (Lazy Loading)
The most common pattern. The application manages the cache directly.
import redis
import json

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_user(user_id: int) -> dict | None:
    cache_key = f"user:{user_id}"
    # 1. Check cache
    cached = r.get(cache_key)
    if cached:
        return json.loads(cached)
    # 2. Cache miss -> query DB
    user = db.query(User).filter(User.id == user_id).first()
    if not user:
        return None
    # 3. Store in cache
    r.setex(cache_key, 3600, json.dumps(user.to_dict()))
    return user.to_dict()

def update_user(user_id: int, data: dict):
    db.query(User).filter(User.id == user_id).update(data)
    db.commit()
    # Invalidate cache
    r.delete(f"user:{user_id}")
4.2 Write-Through
Updates cache and DB simultaneously on writes.
def save_user(user_id: int, data: dict):
    cache_key = f"user:{user_id}"
    # Update DB and cache together
    db.query(User).filter(User.id == user_id).update(data)
    db.commit()
    r.setex(cache_key, 3600, json.dumps(data))
4.3 Write-Behind (Write-Back)
Writes to cache first, then asynchronously persists to DB.
def save_user_async(user_id: int, data: dict):
    cache_key = f"user:{user_id}"
    # Write to cache first
    r.setex(cache_key, 3600, json.dumps(data))
    # Async DB persistence (Celery, etc.)
    sync_to_db_task.delay(user_id, data)
4.4 Caching Pattern Comparison
| Pattern | Read Perf | Write Perf | Consistency | Complexity |
|---|---|---|---|---|
| Cache-Aside | High | Medium | Eventual | Low |
| Write-Through | High | Low | Strong | Medium |
| Write-Behind | High | High | Eventual | High |
| Read-Through | High | Medium | Eventual | Medium |
5. Cache Invalidation
5.1 TTL-Based
The simplest approach. Auto-expires after a set time.
SET product:123 "data" EX 300 # Expires in 5 minutes
5.2 Event-Based Invalidation
Explicitly delete cache when data changes.
def update_product(product_id: int, data: dict):
    db.update(product_id, data)
    # Delete all related caches
    r.delete(f"product:{product_id}")
    r.delete(f"product_list:category:{data['category_id']}")
    r.delete("product_list:featured")
5.3 Versioned Keys
Include version in keys for bulk invalidation.
def get_product_list(category_id: int) -> list:
    # Default to "0" so the first INCR (which yields 1) actually changes the key
    version = r.get(f"product_version:{category_id}") or "0"
    cache_key = f"products:cat:{category_id}:v:{version}"
    cached = r.get(cache_key)
    if cached:
        return json.loads(cached)
    products = db.query(Product).filter(Product.category_id == category_id).all()
    result = [p.to_dict() for p in products]
    r.setex(cache_key, 3600, json.dumps(result))
    return result

def invalidate_category(category_id: int):
    # Incrementing the version makes all existing cache keys unreachable
    r.incr(f"product_version:{category_id}")
5.4 Thundering Herd Prevention
Prevents many requests hitting DB simultaneously when cache expires.
import random
import time

def get_with_jitter(key: str, ttl: int = 3600) -> dict:
    cached = r.get(key)
    if cached:
        return json.loads(cached)
    # Distributed lock — only one request queries the DB
    lock_key = f"lock:{key}"
    if r.set(lock_key, "1", nx=True, ex=10):
        try:
            data = fetch_from_db(key)
            # Add random jitter to the TTL so keys don't expire in unison
            jitter = random.randint(0, 300)
            r.setex(key, ttl + jitter, json.dumps(data))
            return data
        finally:
            r.delete(lock_key)
    else:
        # Another process is refreshing — wait briefly and retry
        time.sleep(0.1)
        return get_with_jitter(key, ttl)
6. Pub/Sub and Streams
6.1 Pub/Sub Basics
# Subscriber
SUBSCRIBE notifications:user:123
# Publisher
PUBLISH notifications:user:123 "You have a new message!"
# Pattern subscription
PSUBSCRIBE notifications:*
# Python subscriber
import redis
r = redis.Redis()
pubsub = r.pubsub()
pubsub.subscribe("notifications:user:123")
for message in pubsub.listen():
    if message["type"] == "message":
        print(f"Received: {message['data']}")
6.2 Redis Streams vs Kafka
| Feature | Redis Streams | Kafka |
|---|---|---|
| Message persistence | Memory + AOF | Disk |
| Throughput | ~10K/sec | ~100K+/sec |
| Consumer Groups | Supported | Supported |
| Message replay | Supported | Supported |
| Partitioning | Not supported | Supported |
| Operational complexity | Low | High |
| Suitable scale | Small-Medium | Large |
6.3 Event-Driven Pattern with Streams
import redis

r = redis.Redis(decode_responses=True)

# Publish event
def publish_event(stream: str, event_type: str, data: dict):
    r.xadd(stream, {"type": event_type, **data}, maxlen=10000)

# Consumer Group processing
def consume_events(stream: str, group: str, consumer: str):
    try:
        r.xgroup_create(stream, group, id="0", mkstream=True)
    except redis.ResponseError:
        pass  # group already exists
    while True:
        messages = r.xreadgroup(
            group, consumer,
            {stream: ">"},
            count=10,
            block=5000,
        )
        for stream_name, entries in messages:
            for msg_id, fields in entries:
                try:
                    process_event(fields)
                    r.xack(stream_name, group, msg_id)
                except Exception as e:
                    print(f"Error processing {msg_id}: {e}")

def process_event(fields: dict):
    event_type = fields.get("type")
    if event_type == "order_created":
        handle_order(fields)
    elif event_type == "payment_completed":
        handle_payment(fields)
7. Redis Stack
Redis Stack adds JSON, Search, TimeSeries, and Bloom Filter modules to Redis.
7.1 RedisJSON
# Store JSON document
JSON.SET user:1 $ '{"name":"Alice","age":30,"address":{"city":"Seoul","zip":"06000"},"tags":["python","redis"]}'
# Path-based query
JSON.GET user:1 $.name # "Alice"
JSON.GET user:1 $.address.city # "Seoul"
# Partial update
JSON.SET user:1 $.age 31
JSON.ARRAPPEND user:1 $.tags '"fastapi"'
# Numeric increment
JSON.NUMINCRBY user:1 $.age 1
7.2 RediSearch (Full-Text Search)
# Create index
FT.CREATE idx:products
ON JSON
PREFIX 1 product:
SCHEMA
$.name AS name TEXT WEIGHT 5.0
$.description AS description TEXT
$.price AS price NUMERIC SORTABLE
$.category AS category TAG
# Add documents
JSON.SET product:1 $ '{"name":"Redis in Action","description":"Complete guide to Redis","price":45000,"category":"book"}'
JSON.SET product:2 $ '{"name":"Python Cookbook","description":"Python recipes and patterns","price":38000,"category":"book"}'
# Search
FT.SEARCH idx:products "Redis guide"
FT.SEARCH idx:products "@category:{book} @price:[30000 50000]"
FT.SEARCH idx:products "@name:(Python)" SORTBY price ASC
7.3 RedisTimeSeries
# Create time series
TS.CREATE sensor:temperature:1 RETENTION 86400000 LABELS sensor_id 1 type temperature
# Add data
TS.ADD sensor:temperature:1 * 23.5
TS.ADD sensor:temperature:1 * 24.1
TS.ADD sensor:temperature:1 * 22.8
# Range query
TS.RANGE sensor:temperature:1 - + COUNT 10
# Aggregation (5-minute average)
TS.RANGE sensor:temperature:1 - + AGGREGATION avg 300000
# Downsampling rule (the destination key must be created with TS.CREATE first)
TS.CREATERULE sensor:temperature:1 sensor:temperature:1:avg AGGREGATION avg 300000
8. Lua Scripting
8.1 Basic Lua Script
# Atomic read-modify-write
EVAL "
local current = redis.call('GET', KEYS[1])
if current then
    local new_val = tonumber(current) + tonumber(ARGV[1])
    redis.call('SET', KEYS[1], new_val)
    return new_val
end
return nil
" 1 counter 5
8.2 Rate Limiter (Sliding Window)
import redis
import time

RATE_LIMIT_SCRIPT = """
local key = KEYS[1]
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

-- Remove old requests outside the window
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)

-- Check current request count
local count = redis.call('ZCARD', key)
if count < limit then
    -- Allow: add the new request
    redis.call('ZADD', key, now, now .. ':' .. math.random())
    -- window is in milliseconds, so PEXPIRE (not EXPIRE) must be used
    redis.call('PEXPIRE', key, window)
    return 1
else
    -- Deny
    return 0
end
"""

r = redis.Redis()
rate_limit_sha = r.script_load(RATE_LIMIT_SCRIPT)

def is_allowed(user_id: str, limit: int = 100, window: int = 60) -> bool:
    key = f"ratelimit:{user_id}"
    now = int(time.time() * 1000)
    result = r.evalsha(rate_limit_sha, 1, key, limit, window * 1000, now)
    return bool(result)
8.3 Distributed Lock
A single-instance lock with a token-checked release. The full Redlock algorithm (see Q13) extends this pattern by acquiring the same lock on a majority of independent Redis instances.
import redis
import uuid

class DistributedLock:
    def __init__(self, redis_client: redis.Redis, resource: str, ttl: int = 10):
        self.redis = redis_client
        self.resource = f"lock:{resource}"
        self.ttl = ttl
        self.token = str(uuid.uuid4())

    def acquire(self) -> bool:
        return bool(self.redis.set(
            self.resource, self.token,
            nx=True, ex=self.ttl,
        ))

    def release(self) -> bool:
        # Atomic check-and-delete with a Lua script
        script = """
        if redis.call('GET', KEYS[1]) == ARGV[1] then
            return redis.call('DEL', KEYS[1])
        end
        return 0
        """
        return bool(self.redis.eval(script, 1, self.resource, self.token))

    def __enter__(self):
        if not self.acquire():
            raise Exception("Could not acquire lock")
        return self

    def __exit__(self, *args):
        self.release()

# Usage
r = redis.Redis()
with DistributedLock(r, "order:process:123"):
    # Critical section — only one process executes
    process_order(123)
9. Redis Cluster
9.1 Hash Slots
Redis Cluster distributes data across 16384 hash slots. The CRC16 hash of a key modulo 16384 determines the slot number.
Node configuration example:
Node A: slots 0-5460
Node B: slots 5461-10922
Node C: slots 10923-16383
With one replica per node:
Node A -> Replica A'
Node B -> Replica B'
Node C -> Replica C'
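The slot mapping above can be sketched in pure Python. This is an illustrative reimplementation (the real computation happens in the server's C code), using the CRC16-CCITT (XMODEM) variant Redis Cluster specifies, plus the hash-tag rule covered in section 9.4:

```python
def crc16(data: bytes) -> int:
    # CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0, MSB-first
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # Hash-tag rule: if the key contains a non-empty {...} section,
    # only the content between the first '{' and the next '}' is hashed
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384
```

For example, `key_slot("{user:1}:profile")` and `key_slot("{user:1}:settings")` land in the same slot, which is exactly why hash tags make multi-key operations possible in a cluster.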
9.2 Cluster Setup
# Create cluster (6 nodes: 3 masters + 3 replicas)
redis-cli --cluster create \
127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 \
127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006 \
--cluster-replicas 1
# Check cluster status
redis-cli -c -p 7001 cluster info
redis-cli -c -p 7001 cluster nodes
9.3 Failover and Resharding
# Manual failover
redis-cli -c -p 7004 cluster failover
# Resharding (move slots)
redis-cli --cluster reshard 127.0.0.1:7001
# Add node
redis-cli --cluster add-node 127.0.0.1:7007 127.0.0.1:7001
# Remove node
redis-cli --cluster del-node 127.0.0.1:7001 <node-id>
9.4 Hash Tags
Place keys in the same slot to enable multi-key operations.
# Hash tags — slot determined by content in curly braces
SET {user:1}:profile "data"
SET {user:1}:settings "data"
SET {user:1}:sessions "data"
# These three keys are in the same slot -> multi-key operations possible
10. Redis in Practice
10.1 Session Store
import redis
import json
import time
import uuid

r = redis.Redis(decode_responses=True)

def create_session(user_id: int, data: dict) -> str:
    session_id = str(uuid.uuid4())
    session_data = {
        "user_id": str(user_id),
        "created_at": str(time.time()),
        **data,
    }
    r.hset(f"session:{session_id}", mapping=session_data)
    r.expire(f"session:{session_id}", 86400)  # 24 hours
    return session_id

def get_session(session_id: str) -> dict | None:
    data = r.hgetall(f"session:{session_id}")
    if not data:
        return None
    # Refresh TTL on access (sliding expiration)
    r.expire(f"session:{session_id}", 86400)
    return data

def destroy_session(session_id: str):
    r.delete(f"session:{session_id}")
10.2 Rate Limiter (Fixed Window)
def check_rate_limit(user_id: str, limit: int = 100, window: int = 60) -> bool:
    key = f"rate:{user_id}:{int(time.time()) // window}"
    pipe = r.pipeline()
    pipe.incr(key)
    pipe.expire(key, window)
    count, _ = pipe.execute()
    return count <= limit
10.3 Leaderboard
class Leaderboard:
    def __init__(self, name: str):
        self.key = f"leaderboard:{name}"

    def add_score(self, player_id: str, score: float):
        r.zadd(self.key, {player_id: score})

    def increment_score(self, player_id: str, delta: float):
        r.zincrby(self.key, delta, player_id)

    def get_rank(self, player_id: str) -> int | None:
        rank = r.zrevrank(self.key, player_id)
        return rank + 1 if rank is not None else None

    def get_top(self, count: int = 10) -> list[tuple[str, float]]:
        return r.zrevrange(self.key, 0, count - 1, withscores=True)

    def get_around(self, player_id: str, count: int = 5) -> list:
        rank = r.zrevrank(self.key, player_id)
        if rank is None:
            return []
        start = max(0, rank - count)
        end = rank + count
        return r.zrevrange(self.key, start, end, withscores=True)

# Usage
lb = Leaderboard("weekly")
lb.add_score("player:alice", 1500)
lb.increment_score("player:alice", 200)
print(lb.get_top(10))
print(lb.get_rank("player:alice"))
10.4 Simple Job Queue
import json
import time
import uuid

def enqueue_job(queue: str, job_data: dict):
    job = {
        "id": str(uuid.uuid4()),
        "data": job_data,
        "created_at": time.time(),
    }
    r.lpush(f"queue:{queue}", json.dumps(job))

def dequeue_job(queue: str, timeout: int = 30) -> dict | None:
    result = r.brpop(f"queue:{queue}", timeout=timeout)
    if result:
        _, job_json = result
        return json.loads(job_json)
    return None

def worker(queue: str):
    while True:
        job = dequeue_job(queue)
        if job:
            try:
                process_job(job["data"])
            except Exception:
                # On failure, push to a failed-job queue for later retry
                r.lpush(f"queue:{queue}:failed", json.dumps(job))
11. Redis vs Memcached vs DragonflyDB
| Feature | Redis | Memcached | DragonflyDB |
|---|---|---|---|
| Data structures | Rich (String, List, Set, etc.) | String only | Redis-compatible |
| Persistence | RDB + AOF | None | Snapshots |
| Clustering | Redis Cluster | Client-side | Single node (multithreaded) |
| Multithreading | Single thread (I/O threads) | Multithreaded | Multithreaded |
| Memory efficiency | Medium | High | High |
| Pub/Sub | Supported | Not supported | Supported |
| Lua scripting | Supported | Not supported | Supported |
| Throughput | ~100K ops/s | ~100K ops/s | ~400K ops/s |
| Max value size | 512MB | 1MB | 512MB |
| Best for | General purpose | Simple caching | High-perf single node |
12. Client Library Code Examples
12.1 Spring Data Redis (Java)
@Configuration
public class RedisConfig {
    @Bean
    public RedisTemplate<String, Object> redisTemplate(RedisConnectionFactory factory) {
        RedisTemplate<String, Object> template = new RedisTemplate<>();
        template.setConnectionFactory(factory);
        template.setKeySerializer(new StringRedisSerializer());
        template.setValueSerializer(new GenericJackson2JsonRedisSerializer());
        return template;
    }
}

@Service
public class UserCacheService {
    private final RedisTemplate<String, Object> redisTemplate;
    private static final Duration TTL = Duration.ofHours(1);

    public UserCacheService(RedisTemplate<String, Object> redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    public void cacheUser(String userId, UserDto user) {
        String key = "user:" + userId;
        redisTemplate.opsForValue().set(key, user, TTL);
    }

    public UserDto getCachedUser(String userId) {
        String key = "user:" + userId;
        return (UserDto) redisTemplate.opsForValue().get(key);
    }

    public void addScore(String leaderboard, String playerId, double score) {
        redisTemplate.opsForZSet().add("lb:" + leaderboard, playerId, score);
    }
}
12.2 ioredis (Node.js)
import Redis from 'ioredis';

const redis = new Redis({
  host: 'localhost',
  port: 6379,
  retryStrategy: (times) => Math.min(times * 50, 2000),
  maxRetriesPerRequest: 3,
});

// Basic caching
async function getUser(userId) {
  const cacheKey = `user:${userId}`;
  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);
  const user = await db.findUser(userId);
  if (user) {
    await redis.setex(cacheKey, 3600, JSON.stringify(user));
  }
  return user;
}

// Pipeline (batch processing)
async function getMultipleUsers(userIds) {
  const pipeline = redis.pipeline();
  userIds.forEach(id => pipeline.get(`user:${id}`));
  const results = await pipeline.exec();
  return results.map(([err, val]) => (val ? JSON.parse(val) : null));
}

// Pub/Sub — a connection in subscriber mode cannot issue other commands,
// so publishing uses the main connection and subscribing uses a dedicated one
const subscriber = new Redis();
subscriber.subscribe('notifications', (err, count) => {
  console.log(`Subscribed to ${count} channels`);
});
subscriber.on('message', (channel, message) => {
  console.log(`Received on ${channel}: ${message}`);
});

await redis.publish('notifications', JSON.stringify({ type: 'alert', message: 'Server update' }));
12.3 redis-py (Python)
import redis.asyncio as aioredis
import json
import functools

# Async Redis client with a connection pool
pool = aioredis.ConnectionPool.from_url(
    "redis://localhost:6379",
    max_connections=20,
    decode_responses=True,
)
r = aioredis.Redis(connection_pool=pool)

# Pipeline
async def batch_operations():
    async with r.pipeline(transaction=True) as pipe:
        pipe.set("key1", "value1")
        pipe.set("key2", "value2")
        pipe.get("key1")
        results = await pipe.execute()
    return results

# Cache decorator
def redis_cache(ttl: int = 3600):
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            cache_key = f"cache:{func.__name__}:{args}:{kwargs}"
            cached = await r.get(cache_key)
            if cached:
                return json.loads(cached)
            result = await func(*args, **kwargs)
            await r.setex(cache_key, ttl, json.dumps(result))
            return result
        return wrapper
    return decorator

@redis_cache(ttl=600)
async def get_product_list(category_id: int):
    return await db.get_products(category_id)
13. Interview Questions (15)
Q1. How is Redis fast despite being single-threaded?
Redis uses an event loop-based single thread with all data in memory, eliminating disk I/O. It uses epoll/kqueue-based I/O multiplexing to efficiently handle thousands of connections. Since Redis 6.0, I/O threads parallelize network processing.
Q2. What is the difference between Cache-Aside and Write-Through?
Cache-Aside has the application manage the cache directly (on read miss, query DB then cache). Write-Through updates both cache and DB simultaneously on writes. Cache-Aside is simpler to implement; Write-Through provides stronger data consistency.
Q3. Explain Redis persistence mechanisms (RDB, AOF).
RDB (Redis Database) saves point-in-time snapshots to disk. AOF (Append Only File) logs every write operation. RDB is suitable for fast recovery; AOF minimizes data loss. Using both is recommended.
Q4. Why might MGET fail in Redis Cluster?
In Redis Cluster, keys are distributed across hash slots. If the keys in one MGET map to different slots, the command is rejected with a CROSSSLOT error. Use hash tags (curly braces) to force related keys into the same slot, or split the request into per-slot batches on the client.
Q5. What is the Thundering Herd problem?
When a cache key expires, many requests simultaneously hit the DB. Solutions: distributed lock allowing only one request to access DB, random jitter added to TTL, proactive background cache refresh.
Q6. What is the difference between Redis Pub/Sub and Streams?
Pub/Sub is fire-and-forget; messages are lost if no subscriber is connected. Streams persist messages until explicitly trimmed (e.g. with MAXLEN), and Consumer Groups enable reliable processing with ACK, reprocessing, and history queries.
Q7. Why are big keys problematic in Redis?
Big keys block Redis during deletion, consume network bandwidth, and cause data skew in clusters. Use UNLINK for async deletion and split large hashes into smaller ones.
Q8. Explain Redis eviction policies.
When maxmemory is reached, keys are removed by eviction policy: noeviction (reject new writes), allkeys-lru (LRU), allkeys-lfu (LFU), volatile-lru (LRU among TTL keys), volatile-ttl (nearest expiration first).
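As a mental model, allkeys-lru behaves like a size-capped map that discards the least-recently-used entry on overflow. A toy Python sketch (note: real Redis approximates LRU by sampling a few candidate keys per eviction rather than maintaining an exact ordering):

```python
from collections import OrderedDict

class LRUCache:
    """Toy sketch of allkeys-lru: evict the least-recently-used key on overflow."""

    def __init__(self, maxsize: int):
        self.maxsize = maxsize
        self.data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.maxsize:
            self.data.popitem(last=False)  # evict the LRU entry
```

With allkeys-lfu the eviction criterion becomes access frequency instead of recency, and with volatile-* policies only keys that have a TTL are candidates.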
Q9. Why use Lua scripting in Redis?
For atomic execution of multiple commands. Reduces network round-trips and executes complex logic atomically server-side. Common use cases: rate limiters, distributed lock release, conditional updates.
Q10. What is the difference between Redis Sentinel and Redis Cluster?
Sentinel provides high availability for master-slave setups (automatic failover). Cluster provides horizontal scaling by distributing data across nodes. Sentinel for small scale; Cluster for large datasets.
Q11. What is Redis pipelining?
Sending multiple commands at once to the server and receiving all responses together. Dramatically reduces network round-trips. 100 individual commands require 100 round-trips; pipelining needs just 1.
Q12. How does Redis handle key expiration?
Two mechanisms combined: lazy expiration checks the TTL when a key is accessed; active expiration periodically (10 times per second by default) samples random keys that have a TTL and deletes the expired ones. The combination keeps expired-but-unaccessed keys from consuming excessive memory.
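The interplay of the two mechanisms can be illustrated with a toy model (purely illustrative, not Redis' actual implementation):

```python
import random
import time

class ExpiringStore:
    """Toy model of lazy + active key expiration."""

    def __init__(self):
        self.data = {}
        self.expires = {}  # key -> absolute monotonic deadline

    def set(self, key, value, ttl=None):
        self.data[key] = value
        if ttl is not None:
            self.expires[key] = time.monotonic() + ttl

    def _expired(self, key):
        deadline = self.expires.get(key)
        return deadline is not None and time.monotonic() >= deadline

    def get(self, key):
        # Lazy expiration: delete on access if the deadline has passed
        if self._expired(key):
            del self.data[key]
            del self.expires[key]
        return self.data.get(key)

    def active_expire_cycle(self, sample_size=20):
        # Active expiration: sample random keys that have a TTL,
        # delete the ones that are already expired
        candidates = random.sample(list(self.expires), min(sample_size, len(self.expires)))
        for key in candidates:
            if self._expired(key):
                del self.data[key]
                del self.expires[key]
```

Lazy expiration alone would leave never-accessed keys in memory forever; the periodic sampling cycle is what reclaims them.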
Q13. What is the Redlock algorithm?
A distributed lock algorithm proposed by Salvatore Sanfilippo (and critiqued by Martin Kleppmann). Lock is valid when successfully acquired on a majority (N/2+1) of N independent Redis instances. Safer than single-instance locks but not perfect.
Q14. What is Redis slow log?
Records commands exceeding a time threshold. Set threshold with slowlog-log-slower-than (microseconds). Use SLOWLOG GET to identify and optimize slow commands.
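A typical inspection workflow looks like this (the threshold and length values here are illustrative):

```
CONFIG SET slowlog-log-slower-than 10000 # log commands slower than 10 ms (in microseconds)
CONFIG SET slowlog-max-len 128 # keep the last 128 entries
SLOWLOG GET 10 # fetch the 10 most recent slow entries
SLOWLOG RESET # clear the log
```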
Q15. What are the pros and cons of Redis as a session store?
Pros: fast reads/writes, automatic TTL expiry, horizontal scaling, cross-server session sharing. Cons: memory cost, session loss risk on Redis failure, network dependency. Enable persistence (AOF) and replication to mitigate risks.
14. Quiz
Q1. What is the difference between Redis SET NX and XX options?
NX (Not eXists) sets the key only if it does not exist. Used for distributed lock acquisition. XX (eXists) updates only if the key already exists. Used for safely updating existing values.
Q2. What is the time complexity of ZADD and why?
ZADD has O(log N) time complexity. Sorted Set internally uses a Skip List to maintain sorted order, and Skip List insertion is O(log N).
Q3. Why should KEYS command not be used in production?
KEYS iterates all keys with O(N) complexity, blocking Redis for extended periods with many keys. Use SCAN instead, which iterates incrementally with cursor-based approach, preventing blocking.
Q4. What problem does WATCH/MULTI/EXEC solve in Redis?
It implements optimistic locking. WATCH monitors keys, MULTI starts a transaction, EXEC executes it. If a WATCHed key is modified by another client, the transaction fails. Efficient in low-contention scenarios.
Q5. HyperLogLog has ~0.81% error rate. Why use an inaccurate data structure?
Exact unique counts require O(N) memory (storing all elements). HyperLogLog uses fixed 12KB regardless of set size. Tracking 100 million unique visitors would need several GB with Set, but only 12KB with HyperLogLog. The 0.81% error is acceptable for most analytics use cases.
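The memory gap in that answer is easy to quantify. A back-of-envelope calculation, assuming 16-byte member IDs and ignoring Set per-entry overhead (which only widens the gap):

```python
n = 100_000_000                 # unique visitors to track
id_bytes = 16                   # assumed size of one member ID
exact_min_bytes = n * id_bytes  # lower bound for an exact Set: the raw IDs alone
hll_bytes = 12 * 1024           # HyperLogLog register size, fixed regardless of n

print(f"Exact Set (lower bound): {exact_min_bytes / 1e9:.1f} GB")
print(f"HyperLogLog:             {hll_bytes} bytes")
# At ~0.81% standard error, an estimate of 100M is typically off by well under 1M
```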
References
- Redis Official Documentation
- Redis University
- Redis Stack Documentation
- Valkey Project
- Redis in Action (Manning)
- ioredis GitHub
- redis-py Documentation
- Spring Data Redis
- Redis Best Practices
- Redlock Algorithm
- Martin Kleppmann's Redlock Analysis
- Redis Cluster Tutorial
- DragonflyDB
- Redis Streams Guide
- RediSearch Documentation