Redis Production Operations Complete Guide 2025: Cluster, Persistence, Memory Management, Redis 7+ Features

TL;DR

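Redis is more than a cache: eight core data structures (String, List, Hash, Set, Sorted Set, Stream, Bitmap, HyperLogLog), plus modules for time series, search, and JSON
Persistence: RDB (snapshots; compact and fast to load, but recent writes can be lost) vs AOF (write log; more durable, but larger and slower to replay). Production deployments usually enable both
Cluster vs Sentinel: Sentinel provides high availability only; Cluster provides sharding plus HA. As a rough rule of thumb, Sentinel is enough while a single dataset stays under about 16GB
Memory management: always set maxmemory and maxmemory-policy; they are the main guard against OOM
2024 license change: starting with Redis 7.4, Redis is licensed under RSALv2/SSPLv1, which led to the Valkey fork (Linux Foundation, BSD)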

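1. Why Redis Is Still the Standard

1.1 Core Strengths

Single-threaded command execution: each command runs atomically, without locks
In-memory with persistence: speed plus durability
Rich data structures: far more than Memcached offers
Pub/Sub and Streams: message-queue functionality
Lua scripts: complex atomic operations
Module system: RedisJSON, RediSearch, RedisTimeSeries, and others (RedisGraph has reached end-of-life)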

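1.2 The 2024 License Change and the Birth of Valkey

In March 2024, Redis Inc. changed the license from BSD to a dual RSALv2/SSPLv1 license. The intent was to restrict cloud providers (AWS, Google) from offering Redis as a managed service.

Result: the Linux Foundation forked Redis 7.2.4 to create Valkey, backed by AWS, Google, Oracle, Ericsson, and others.

	Redis (current)	Valkey
License	RSALv2/SSPLv1 (AGPLv3 added as an option in Redis 8)	BSD-3
Sponsor	Redis Inc.	Linux Foundation
Compatibility	Original	Compatible with Redis 7.2.4
Latest features	Redis Stack modules	Being added gradually

Selection guide:

Need the Redis Stack modules (RediSearch, RedisJSON, etc.): Redis
Pure open source: Valkey
AWS ElastiCache: offers Valkey as an engine alongside Redis OSS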

2. Mastering the Data Structures

2.1 String (the Most Basic Type)

SET user:1000:name "Alice"
SET counter 0
INCR counter
INCRBY counter 10
SETEX session:abc123 3600 "data"  # TTL of 1 hour

Use cases: caching, counters, sessions, distributed locks (SET key value NX EX 10)

2.2 Hash (Object Representation)

HSET user:1000 name "Alice" age 30 email "alice@example.com"
HGET user:1000 name
HGETALL user:1000
HINCRBY user:1000 age 1

Use cases: user profiles, settings, memory-efficient object storage

2.3 List (FIFO/LIFO)

LPUSH queue:tasks "task1"
RPUSH queue:tasks "task2"
LPOP queue:tasks
BLPOP queue:tasks 0  # blocking pop (message queue)

Use cases: message queues, job queues, recent-items lists

2.4 Set (Unordered, Unique)

SADD tags:post:1 "redis" "database" "cache"
SISMEMBER tags:post:1 "redis"
SINTER tags:post:1 tags:post:2  # intersection
SUNION tags:post:1 tags:post:2  # union

Use cases: tags, friend lists, deduplication

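2.5 Sorted Set (Ranking)

ZADD leaderboard 100 "Alice" 95 "Bob" 87 "Charlie"
ZRANGE leaderboard 0 9 REV WITHSCORES  # top 10 by score (REV requires Redis 6.2+; plain ZRANGE returns the lowest scores first)
ZRANGEBYSCORE leaderboard 80 100       # scores from 80 to 100
ZINCRBY leaderboard 5 "Alice"

Use cases: leaderboards, priority queues, time-ordered data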

2.6 Stream (Event Streaming)

XADD mystream * sensor temp 25.5
XREAD COUNT 10 STREAMS mystream 0
XGROUP CREATE mystream consumer-group $
XREADGROUP GROUP consumer-group consumer1 COUNT 10 STREAMS mystream >

Use cases: Kafka-style event streaming at small scale, IoT sensor data, logs

2.7 Bitmap

SETBIT user:active:2025-04-15 1000 1  # user 1000 was active today
BITCOUNT user:active:2025-04-15

Use cases: daily active users (DAU), attendance tracking, bit flags

2.8 HyperLogLog (Probabilistic Counting)

PFADD visitors:2025-04-15 "user1" "user2" "user3"
PFCOUNT visitors:2025-04-15  # approximate count, about 0.81% standard error

Use cases: unique-visitor counts (at most 12KB per key, even for billions of items)

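3. Persistence

3.1 RDB (Redis Database)

How it works: periodic snapshots are written to disk.

# redis.conf
save 900 1      # snapshot if at least 1 change occurred within 900 seconds
save 300 10
save 60 10000

dbfilename dump.rdb
dir /var/lib/redis

Pros	Cons
Fast startup	Writes since the last snapshot are lost
Small files	fork() is costly for large datasets
Backup-friendly	No real-time durability guarantee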

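3.2 AOF (Append-Only File)

How it works: every write command is appended to a log.

appendonly yes
appendfsync everysec  # everysec | always | no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

`appendfsync`	Description	Data loss
`always`	fsync on every write	Effectively none
`everysec`	fsync once per second (recommended)	Up to about 1 second
`no`	Left to the OS	Depends on the OS flush interval (often around 30 seconds on Linux)

Pros	Cons
Minimal data loss	Larger files
Command log is largely human-readable	Slower recovery
Compacted by AOF rewrite	Slight write-performance cost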

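3.3 RDB + AOF (Recommended)

Production best practice: enable both.

AOF minimizes data loss
RDB provides fast backup and restore

save 900 1
appendonly yes
appendfsync everysec

When AOF is enabled, Redis loads the AOF on restart rather than the RDB file, because the AOF is the more complete record.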

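4. Sentinel vs Cluster

4.1 Sentinel (High Availability Only)

          ┌──────────┐
          │ Sentinel │
          │ Sentinel │  ← 3+ nodes (odd count)
          │ Sentinel │
          └─────┬────┘
                │ monitoring
    ┌───────────┼───────────┐
    │           │           │
┌───▼────┐ ┌────▼────┐ ┌────▼────┐
│ Master │ │ Replica │ │ Replica │
└────────┘ └─────────┘ └─────────┘

How it works:

Sentinels monitor the master
When the master goes down, the Sentinels agree on the failure and promote a replica to be the new master
Clients ask Sentinel for the new master's address

Pros: simple. As a rough rule of thumb, sufficient while a single dataset stays at or under about 16GB.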

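4.2 Cluster (Sharding + HA)

[Slot 0-5460]    [Slot 5461-10922]   [Slot 10923-16383]
   Master 1          Master 2            Master 3
      │                  │                   │
   Replica           Replica             Replica

How it works:

The 16,384 hash slots are distributed across the masters
A key maps to slot CRC16(key) mod 16384 and is routed to the master that owns that slot
When a master fails, one of its replicas is promoted automatically

Pros: horizontal scaling, for when the dataset outgrows roughly 16GB or one node's throughput is not enough.

Constraints:

Multi-key commands (MGET, MSET) work only when all keys are in the same slot (use a {tag})
Transactions and Lua scripts are likewise limited to a single slot
Some modules are not supported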

4.3 Cluster Key Tagging

# Force keys into the same slot
SET {user1000}:name "Alice"
SET {user1000}:email "alice@example.com"
MGET {user1000}:name {user1000}:email  # works: both keys hash to the same slot

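5. Memory Management

5.1 Setting maxmemory (Required)

maxmemory 4gb
maxmemory-policy allkeys-lru

maxmemory-policy options:

Policy	Description	Use case
`noeviction`	Reject writes when memory is full (default)	Redis used as a database
`allkeys-lru`	Evict least recently used keys, across all keys	Cache (most common)
`allkeys-lfu`	Evict least frequently used keys, across all keys	Skewed access patterns with a hot set
`volatile-lru`	LRU eviction among keys with a TTL only	Mixed cache and persistent data
`allkeys-random`	Evict random keys	Uniform access patterns
`volatile-ttl`	Evict keys with the shortest remaining TTL first	Temporary data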

5.2 Memory Analysis

INFO memory
MEMORY USAGE mykey
MEMORY STATS

# Find big keys
redis-cli --bigkeys

# Sample keys and report the largest by memory usage
redis-cli --memkeys

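5.3 Memory-Saving Tips

Use Hashes: small objects are stored more compactly as one Hash than as separate Strings
Compact encodings: tune hash-max-listpack-entries and list-max-listpack-size (named hash-max-ziplist-entries and list-max-ziplist-size before Redis 7)
Set TTLs: always EXPIRE temporary data
HyperLogLog: trade a little accuracy for large memory savings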

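6. Performance Tuning

6.1 Client Tuning

# redis.conf: client connection settings
maxclients 10000
timeout 0
tcp-keepalive 60
tcp-backlog 511

Pipelining: send multiple commands in one batch to cut network round trips.

import redis

r = redis.Redis()  # assumes a Redis server on localhost:6379

# Without pipelining: 1000 round trips
for i in range(1000):
    r.set(f"key:{i}", i)

# With pipelining: commands are buffered client-side and sent as one batch
pipe = r.pipeline()
for i in range(1000):
    pipe.set(f"key:{i}", i)
pipe.execute()
# Often many times faster; the gain depends on network latency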

6.2 Slow Log

CONFIG SET slowlog-log-slower-than 10000  # log commands slower than 10ms (value is in microseconds)
SLOWLOG GET 10
SLOWLOG RESET

6.3 ACL (Redis 6+)

ACL SETUSER cacheuser on >mypassword ~cache:* +get +set +del
ACL LIST

Principle: a separate user per application, with least privilege.

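7. New Features in Redis 7+

7.1 Functions (Redis 7)

The successor to ad-hoc Lua scripts (EVAL). Functions are registered as libraries and, unlike scripts, are persisted and replicated with the data.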

FUNCTION LOAD "#!lua name=mylib
redis.register_function('hello', function() return 'world' end)
"
FCALL hello 0

7.2 Sharded Pub/Sub (Redis 7)

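With classic Pub/Sub in Cluster mode, every message is broadcast to all nodes, which is inefficient. With Sharded Pub/Sub, each channel maps to a hash slot, and messages are delivered only within the shard that owns that slot.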

SPUBLISH mychannel "message"
SSUBSCRIBE mychannel

7.3 Multi-Part AOF (Redis 7)

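The AOF is split into a base file (a snapshot in RDB or AOF format) plus one or more incremental files, tracked by a manifest, which makes rewrites cheaper.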

7.4 Client-Side Caching (Redis 6)

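The server sends invalidation messages so that clients can safely keep a local cache of values. Reads served from that cache avoid a network round trip, which cuts response time substantially.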

8. Monitoring and Debugging

8.1 Key Metrics

redis-cli INFO stats | grep instantaneous_ops_per_sec
redis-cli INFO clients | grep connected_clients
redis-cli INFO memory | grep used_memory_human
redis-cli INFO replication
redis-cli INFO persistence

Dashboard tools:

RedisInsight (official GUI)
Grafana + redis_exporter (Prometheus integration)
Datadog Redis Integration

8.2 Latency Measurement

redis-cli --latency
redis-cli --latency-history
redis-cli --latency-dist

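8.3 Common Issues

Symptom	Cause	Fix
Sudden slowdown	Big keys or O(N) commands (`KEYS *`, `HGETALL` on a large Hash)	Find big keys with `--bigkeys`; iterate with `SCAN` instead of `KEYS`
OOM	maxmemory not set	Set maxmemory and an eviction policy
Master fork failure	Not enough memory	Set vm.overcommit_memory=1
Replica lag	Network or disk	Enable repl-diskless-sync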

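9. Redis vs Alternatives

9.1 Comparison Table

	Redis	Memcached	Valkey	KeyDB	DragonflyDB
Data structures	8+	String	8+	8+	8+
Persistence	RDB/AOF	❌	RDB/AOF	RDB/AOF	RDB
Cluster	✅	Client-side	✅	✅	Single node
License	RSALv2/SSPLv1	BSD	BSD	BSD	BSL
Multi-threaded	❌ (I/O threads only, since 6.0)	✅	❌ (I/O threads only)	✅	✅
Memory efficiency	Baseline	Excellent	Baseline	Baseline	2-3x better (claimed)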

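9.2 DragonflyDB: A New Contender

Written from scratch in C++
Largely compatible with the Redis API
1M+ QPS on a single node (claimed)
2-3x memory efficiency (claimed)
Downside: designed primarily to scale up on a single machine rather than out across a cluster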

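Quiz

1. Should you choose RDB or AOF?

Answer: enable both. AOF minimizes data loss (at most about 1 second with everysec), while RDB gives fast backup and restore and compact files. When AOF is enabled, Redis loads the AOF at startup because it is the more complete record. With RDB alone you lose everything written since the last snapshot; with AOF alone recovery takes longer.

2. When should you choose Cluster over Sentinel?

Answer: (1) when the dataset exceeds a single machine's RAM (roughly 16GB or more as a rule of thumb), (2) when a single node's throughput (on the order of 100,000 QPS) is not enough, (3) when you need horizontal scaling. Otherwise Sentinel is simpler and more stable. Cluster adds complexity such as multi-key operation constraints (same slot) and module compatibility.

3. What is the difference between allkeys-lru and volatile-lru?

Answer: allkeys-lru evicts by LRU across all keys; volatile-lru evicts only keys that have a TTL. For a pure cache, allkeys-lru fits, since every key is cache data. For a mix of cache and persistent data, use volatile-lru, which protects keys without a TTL. The wrong choice either evicts data you meant to keep (allkeys-lru on mixed data) or leaves nothing evictable, so memory fills up and writes are rejected (volatile-lru when few keys have a TTL).

4. Why are big keys dangerous?

Answer: Redis executes commands on a single thread, so an operation on a big key blocks the whole server. HGETALL on a 100MB Hash or SMEMBERS on a million-member Set can take seconds, and every other command waits. Fix: find them with --bigkeys, iterate incrementally with SCAN/HSCAN/SSCAN, and split large structures across multiple keys.

5. What led to the creation of Valkey?

Answer: in March 2024, Redis Inc. changed the license from BSD to RSALv2/SSPLv1, intending to restrict cloud providers such as AWS and Google from offering managed Redis. As a result, the Linux Foundation forked Redis 7.2.4 and released Valkey (BSD-3). AWS ElastiCache now offers Valkey as an engine. Choose Valkey for pure open source, Redis if you need the Redis Stack modules.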

Redis Production Operations Complete Guide 2025: Cluster, Persistence, Memory, Redis 7+

TL;DR

Redis is not just a cache: 8 data structures (String, List, Hash, Set, Sorted Set, Stream, Bitmap, HyperLogLog) plus modules for time series, search, and JSON
Persistence choice: RDB (snapshots; compact and fast to load, but recent writes can be lost) vs AOF (write log; more durable, but larger and slower to replay). Production usually enables both
Cluster vs Sentinel: Sentinel for HA only, Cluster for sharding + HA. As a rough rule of thumb, Sentinel is enough while a single dataset stays under about 16GB
Memory management: always set maxmemory and maxmemory-policy; they are the main guard against OOM
2024 license change: starting with Redis 7.4, Redis is licensed under RSALv2/SSPLv1, which led to the Valkey fork (Linux Foundation, BSD)

1. Why Redis Is Still the Standard

1.1 Core Strengths

Single-threaded command execution: each command runs atomically, without locks
In-memory with persistence: speed plus durability
Rich data structures: far more than Memcached offers
Pub/Sub and Streams: message-queue functionality
Lua scripts: complex atomic operations
Module system: RedisJSON, RediSearch, RedisTimeSeries, and others (RedisGraph has reached end-of-life)

1.2 2024 License Change and Valkey

In March 2024, Redis Inc. changed the license from BSD to a dual RSALv2/SSPLv1 license, intending to restrict cloud providers (AWS, Google) from offering Redis as a managed service.

Result: the Linux Foundation forked Redis 7.2.4 to create Valkey, backed by AWS, Google, Oracle, Ericsson, and others.

	Redis (current)	Valkey
License	RSALv2/SSPLv1 (AGPLv3 added as an option in Redis 8)	BSD-3
Sponsor	Redis Inc.	Linux Foundation
Compatibility	Original	Compatible with Redis 7.2.4
Latest features	Redis Stack modules	Being added gradually

2. Master the Data Structures

2.1 String (Most Basic)

SET user:1000:name "Alice"
SET counter 0
INCR counter
SETEX session:abc123 3600 "data"  # TTL 1 hour

Use cases: Caching, counters, sessions, distributed locks (SET key value NX EX 10)
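
The distributed-lock use case can be sketched as follows. This is a minimal illustration, assuming a redis-py style client object that exposes `set(..., nx=True, ex=...)` and `eval(...)`; the function names `acquire_lock` and `release_lock` are hypothetical. Storing a random token lets the release step check ownership, so a client whose lock already expired does not delete a lock now held by someone else.

```python
import secrets

# Lua script: delete the key only if it still holds our token,
# so one client cannot release a lock that another client now owns.
RELEASE_SCRIPT = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""


def acquire_lock(client, name, ttl_seconds=10):
    """Try to take the lock; return a token on success, None otherwise."""
    token = secrets.token_hex(16)
    # Equivalent to: SET name token NX EX ttl_seconds
    if client.set(name, token, nx=True, ex=ttl_seconds):
        return token
    return None


def release_lock(client, name, token):
    """Release the lock only if we still own it."""
    return client.eval(RELEASE_SCRIPT, 1, name, token) == 1
```

A single-instance lock like this is only as reliable as that one Redis node, so treat it as a starting point rather than a complete locking scheme.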

2.2 Hash (Object Representation)

HSET user:1000 name "Alice" age 30 email "alice@example.com"
HGET user:1000 name
HGETALL user:1000
HINCRBY user:1000 age 1

Use cases: User profiles, settings, memory-efficient object storage

2.3 List (FIFO/LIFO)

LPUSH queue:tasks "task1"
RPUSH queue:tasks "task2"
LPOP queue:tasks
BLPOP queue:tasks 0  # Blocking (message queue)

2.4 Set (Unordered, Unique)

SADD tags:post:1 "redis" "database" "cache"
SISMEMBER tags:post:1 "redis"
SINTER tags:post:1 tags:post:2  # Intersection

2.5 Sorted Set (Ranking)

ZADD leaderboard 100 "Alice" 95 "Bob" 87 "Charlie"
ZRANGE leaderboard 0 9 REV WITHSCORES  # Top 10 by score (REV requires Redis 6.2+; plain ZRANGE returns the lowest scores first)
ZRANGEBYSCORE leaderboard 80 100
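
Index ranges on a sorted set run from the lowest score upward unless reversed, which matters for leaderboards. The toy model below uses plain Python and no Redis connection to show the ordering; `zrange` here is a hypothetical stand-in for the real command and supports only non-negative indexes.

```python
def zrange(scores, start, stop, rev=False):
    """Toy model of ZRANGE by index over a {member: score} dict."""
    # Order by score, then by member name to break ties; rev flips the whole order.
    ordered = sorted(scores.items(), key=lambda kv: (kv[1], kv[0]), reverse=rev)
    # Redis index ranges are inclusive, so add 1 to stop for the Python slice.
    return [member for member, _ in ordered[start:stop + 1]]


board = {"Alice": 100, "Bob": 95, "Charlie": 87}
lowest_two = zrange(board, 0, 1)             # ascending: lowest scores first
highest_two = zrange(board, 0, 1, rev=True)  # descending: leaderboard order
```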

2.6 Stream (Event Streaming)

XADD mystream * sensor temp 25.5
XGROUP CREATE mystream consumer-group $
XREADGROUP GROUP consumer-group consumer1 COUNT 10 STREAMS mystream >

2.7 Bitmap

SETBIT user:active:2025-04-15 1000 1
BITCOUNT user:active:2025-04-15
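
To see why bitmaps are compact, here is a small pure-Python model of SETBIT and BITCOUNT over a `bytearray`. It is a sketch of the idea, not Redis's implementation; the helper names are hypothetical. As in Redis, offset 0 is the most significant bit of the first byte, and the value grows only as far as the highest offset written.

```python
def setbit(bitmap: bytearray, offset: int, value: int) -> None:
    """Set or clear the bit at offset, growing the bitmap if needed."""
    byte_index, bit_index = divmod(offset, 8)
    if byte_index >= len(bitmap):
        bitmap.extend(b"\x00" * (byte_index + 1 - len(bitmap)))
    mask = 1 << (7 - bit_index)  # offset 0 is the most significant bit
    if value:
        bitmap[byte_index] |= mask
    else:
        bitmap[byte_index] &= ~mask & 0xFF


def bitcount(bitmap: bytearray) -> int:
    """Count the bits set to 1."""
    return sum(bin(byte).count("1") for byte in bitmap)


active = bytearray()
setbit(active, 1000, 1)   # user 1000 was active
setbit(active, 7, 1)      # user 7 was active
size_in_bytes = len(active)
```

With user IDs as offsets, one million users fit in about 125KB per day.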

2.8 HyperLogLog (Probabilistic Counting)

PFADD visitors:2025-04-15 "user1" "user2" "user3"
PFCOUNT visitors:2025-04-15  # ~0.81% error

Use cases: Unique visitor count (12KB memory for billions of items)
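
Both figures above follow from the structure's layout. Redis's HyperLogLog uses 16,384 registers of 6 bits each, and the standard error of the estimator is roughly 1.04 divided by the square root of the register count. The arithmetic, as a quick check:

```python
import math

REGISTERS = 2 ** 14          # 16,384 registers
BITS_PER_REGISTER = 6

# Standard error of the HyperLogLog estimator: 1.04 / sqrt(m)
standard_error = 1.04 / math.sqrt(REGISTERS)

# Dense representation size: 16,384 * 6 bits = 12,288 bytes = 12KB
dense_size_bytes = REGISTERS * BITS_PER_REGISTER // 8

print(f"standard error: {standard_error:.4%}")
print(f"dense size: {dense_size_bytes} bytes")
```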

3. Persistence

3.1 RDB (Redis Database)

How: Periodic snapshots to disk.

save 900 1
save 300 10
save 60 10000
dbfilename dump.rdb

Pros	Cons
Fast startup	Data loss since last snapshot
Small files	Fork() cost for large datasets
Backup-friendly	No real-time guarantee
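
Each `save <seconds> <changes>` line is an independent rule, and a snapshot starts when any one of them is satisfied. A simplified model of that check, assuming the three save points shown above (the function name `should_snapshot` is hypothetical, and exact boundary behavior in Redis may differ):

```python
# (seconds, changes) pairs, mirroring the save lines in redis.conf
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save,
                    save_points=SAVE_POINTS):
    """Return True if any save point's time and change thresholds are both met."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

A busy instance therefore snapshots about once a minute, while a nearly idle one snapshots at most every 15 minutes.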

3.2 AOF (Append-Only File)

How: Logs every write command.

appendonly yes
appendfsync everysec
auto-aof-rewrite-percentage 100

`appendfsync`	Description	Data Loss
`always`	fsync on every write	Effectively none
`everysec`	fsync once per second (recommended)	Up to about 1 sec
`no`	Left to the OS	Depends on the OS flush interval (often around 30 sec on Linux)

3.3 RDB + AOF (Recommended)

Production best practice: Enable both.

AOF minimizes data loss
RDB provides fast backup/restore

When AOF is enabled, Redis loads the AOF on restart rather than the RDB file, because the AOF is the more complete record.

4. Sentinel vs Cluster

4.1 Sentinel (HA Only)

How it works:

Sentinel monitors Master
Master down → Sentinels reach consensus to promote a Replica
Clients query Sentinel for new Master address
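
The consensus step has two thresholds: the configured quorum is what it takes to agree that the master is down, and a majority of all Sentinels is what it takes to authorize the failover itself. A small sketch of that rule, with a hypothetical function name and simplified to counting reachable Sentinels:

```python
def can_fail_over(total_sentinels, reachable_sentinels, quorum):
    """Return True if enough Sentinels are reachable to detect failure and elect a leader."""
    majority = total_sentinels // 2 + 1
    # Failure detection needs `quorum` agreeing Sentinels;
    # the failover itself needs votes from a majority of all Sentinels.
    return reachable_sentinels >= max(quorum, majority)
```

This is why an odd number of Sentinels is recommended: with 4 Sentinels the majority is still 3, so a fourth node adds no extra failure tolerance over 3.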

Pros: Simple. As a rough rule of thumb, sufficient while a single dataset stays under about 16GB.

4.2 Cluster (Sharding + HA)

How it works:

16,384 hash slots distributed among masters
Key → CRC16(key) mod 16384 → routed to slot's master
Master failure → Replica auto-promoted

Pros: Horizontal scaling, for when the dataset outgrows roughly 16GB or one node's throughput is not enough.

Constraints:

Multi-key commands (MGET, MSET) only work on same-slot keys (use {tag})
Transactions and Lua scripts also single-slot
Some modules unsupported

4.3 Cluster Key Tagging

SET {user1000}:name "Alice"
SET {user1000}:email "alice@example.com"
MGET {user1000}:name {user1000}:email  # Works! (same slot)
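
The slot calculation, including the hash-tag rule, can be reproduced in a few lines. This sketch relies on `binascii.crc_hqx` with an initial value of 0, which computes the same CRC16 variant (XMODEM) that Redis Cluster uses; the function name `key_hash_slot` is chosen here for illustration.

```python
import binascii


def key_hash_slot(key: str) -> int:
    """Compute the Redis Cluster hash slot for a key, honoring {hash tags}."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag counts; "{}" or a missing "}" hashes the whole key.
        if end > start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % 16384


name_slot = key_hash_slot("{user1000}:name")
email_slot = key_hash_slot("{user1000}:email")
```

Because only the text inside the braces is hashed, both keys land in the same slot as the bare string `user1000`.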

5. Memory Management

5.1 maxmemory Settings (Mandatory!)

maxmemory 4gb
maxmemory-policy allkeys-lru

maxmemory-policy options:

Policy	Description	Use Case
`noeviction`	Reject writes when full (default)	Used as DB
`allkeys-lru`	LRU evict from all keys	Cache (most common)
`allkeys-lfu`	LFU evict from all keys	Frequent access patterns
`volatile-lru`	LRU for TTL keys only	Mixed use
`allkeys-random`	Random eviction	Simple
`volatile-ttl`	Shortest TTL first	Temporary data
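
The policies differ in two ways: which keys are candidates (all keys, or only keys with a TTL) and how a victim is ranked. The toy model below makes that explicit. It is an illustration only: real Redis approximates LRU by sampling a few keys rather than scanning all of them, and `pick_victim` is a hypothetical helper.

```python
def pick_victim(keys, policy):
    """Pick the key to evict.

    keys maps name -> (last_access_time, ttl_seconds or None).
    """
    if policy == "noeviction":
        return None  # nothing is evicted; writes fail instead
    if policy.startswith("volatile-"):
        # Only keys that have a TTL are eligible.
        candidates = {k: v for k, v in keys.items() if v[1] is not None}
    else:
        candidates = keys
    if not candidates:
        return None
    if policy.endswith("-lru"):
        return min(candidates, key=lambda k: candidates[k][0])  # oldest access
    if policy == "volatile-ttl":
        return min(candidates, key=lambda k: candidates[k][1])  # shortest TTL
    raise ValueError(f"policy not modeled: {policy}")


store = {
    "session:1": (100, 60),    # accessed at t=100, TTL 60s
    "config": (10, None),      # accessed at t=10, no TTL
    "cache:a": (50, 300),      # accessed at t=50, TTL 300s
}
```

Note how `volatile-lru` never touches `config`, while `allkeys-lru` evicts it first because it was accessed least recently.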

5.2 Memory Analysis

INFO memory
MEMORY USAGE mykey
redis-cli --bigkeys

5.3 Memory-Saving Tips

Use Hash — Small objects more efficient as Hash than String
Compact encodings — tune hash-max-listpack-entries (named hash-max-ziplist-entries before Redis 7)
Set TTL — Always EXPIRE temporary data
HyperLogLog — Accept accuracy loss for memory savings

6. Performance Tuning

6.1 Pipelining

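import redis

r = redis.Redis()  # assumes a Redis server on localhost:6379

# Without pipelining: 1000 round trips
for i in range(1000):
    r.set(f"key:{i}", i)

# With pipelining: commands are buffered client-side and sent as one batch
pipe = r.pipeline()
for i in range(1000):
    pipe.set(f"key:{i}", i)
pipe.execute()
# Often many times faster; the gain depends on network latency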
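
For very large batches it can help to flush the pipeline in chunks, so that neither the client nor the server has to buffer everything at once. A minimal sketch, assuming a redis-py style client whose `pipeline()` accepts `transaction=False`; the helper name `set_many` and the batch size of 500 are arbitrary choices for illustration.

```python
def set_many(client, items, batch_size=500):
    """SET many key/value pairs, flushing the pipeline every batch_size commands."""
    # transaction=False sends plain pipelined commands without MULTI/EXEC.
    pipe = client.pipeline(transaction=False)
    pending = 0
    sent = 0
    for key, value in items:
        pipe.set(key, value)
        pending += 1
        if pending == batch_size:
            pipe.execute()   # one round trip for the whole batch
            sent += pending
            pending = 0
    if pending:
        pipe.execute()       # flush the final partial batch
        sent += pending
    return sent
```

Usage would look like `set_many(r, ((f"key:{i}", i) for i in range(100_000)))`, where `r` is an existing client.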

6.2 Slow Log

CONFIG SET slowlog-log-slower-than 10000  # log commands slower than 10ms (value is in microseconds)
SLOWLOG GET 10

6.3 ACL (Redis 6+)

ACL SETUSER cacheuser on >mypassword ~cache:* +get +set +del

Principle: Separate user per app, least privilege.

7. Redis 7+ New Features

7.1 Functions (Redis 7)

The successor to ad-hoc Lua scripts (EVAL). Functions are registered as libraries and, unlike scripts, are persisted and replicated with the data.

FUNCTION LOAD "#!lua name=mylib
redis.register_function('hello', function() return 'world' end)
"
FCALL hello 0

7.2 Sharded Pub/Sub (Redis 7)

With classic Pub/Sub in Cluster mode, every message is broadcast to all nodes, which is inefficient. With Sharded Pub/Sub, each channel maps to a hash slot, and messages are delivered only within the shard that owns that slot.

7.3 Multi-Part AOF (Redis 7)

The AOF is split into a base file (a snapshot in RDB or AOF format) plus one or more incremental files, tracked by a manifest, which makes rewrites cheaper.

7.4 Client-Side Caching (Redis 6)

The server sends invalidation messages so that clients can safely keep a local cache of values. Reads served from that cache avoid a network round trip, which cuts response time substantially.
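
The idea can be modeled without a server. In the sketch below, `fetch` stands in for a GET round trip and `invalidate` stands in for handling an invalidation message pushed by the server (in real Redis this is driven by the CLIENT TRACKING mechanism). The class name `LocalCache` is hypothetical.

```python
class LocalCache:
    """Toy client-side cache that relies on explicit invalidation."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable key -> value; stands in for a network GET
        self._data = {}

    def get(self, key):
        # Serve from local memory when possible; otherwise go to the server.
        if key not in self._data:
            self._data[key] = self._fetch(key)
        return self._data[key]

    def invalidate(self, key):
        # Called when the server reports that the key changed.
        self._data.pop(key, None)
```

Without the invalidation step the cache would keep serving stale values after a write, which is the problem server-assisted tracking solves.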

8. Monitoring and Debugging

8.1 Key Metrics

redis-cli INFO stats | grep instantaneous_ops_per_sec
redis-cli INFO clients | grep connected_clients
redis-cli INFO memory | grep used_memory_human
redis-cli INFO replication
redis-cli INFO persistence
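
When collecting these metrics from a script rather than with grep, the raw INFO text is easy to parse: section headers start with `#` and every other non-empty line is a `field:value` pair. A small sketch, with a hypothetical function name, that keeps values as strings:

```python
def parse_info(text):
    """Parse raw INFO output into a flat {field: value} dict of strings."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# Section" headers
        key, _, value = line.partition(":")
        info[key] = value
    return info
```

Client libraries such as redis-py already return INFO as a dictionary, so this is mainly useful for output captured from `redis-cli`.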

Dashboards:

RedisInsight (official GUI)
Grafana + redis_exporter
Datadog Redis Integration

8.2 Latency Measurement

redis-cli --latency
redis-cli --latency-history
redis-cli --latency-dist

8.3 Common Issues

Symptom	Cause	Solution
Sudden slowdown	Big keys or O(N) commands (`KEYS *`, `HGETALL` on a large Hash)	Find big keys with `--bigkeys`; iterate with `SCAN` instead of `KEYS`
OOM	maxmemory not set	Set maxmemory and an eviction policy
Master fork failure	Insufficient memory	Set vm.overcommit_memory=1
Replica lag	Network/disk	Enable repl-diskless-sync
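
A scripted alternative to `--bigkeys` is to walk the keyspace incrementally and ask for each key's size. The sketch below assumes a redis-py style client exposing `scan_iter` and `memory_usage`; the function name `find_big_keys` and the 1MB threshold are illustrative choices. Because it uses SCAN under the hood, it does not block the server the way `KEYS *` does.

```python
def find_big_keys(client, pattern="*", min_bytes=1_000_000, count=1000):
    """Return (key, bytes) pairs at or above min_bytes, largest first."""
    big = []
    # scan_iter issues SCAN repeatedly, so the keyspace is walked in small steps.
    for key in client.scan_iter(match=pattern, count=count):
        size = client.memory_usage(key)  # MEMORY USAGE key
        if size is not None and size >= min_bytes:
            big.append((key, size))
    return sorted(big, key=lambda pair: pair[1], reverse=True)
```

On a large production instance this still issues one MEMORY USAGE call per key, so it is best run against a replica or during quiet hours.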

9. Redis vs Alternatives

9.1 Comparison

	Redis	Memcached	Valkey	KeyDB	DragonflyDB
Data structures	8+	String	8+	8+	8+
Persistence	RDB/AOF	❌	RDB/AOF	RDB/AOF	RDB
Cluster	✅	Client-side	✅	✅	Single-node
License	RSALv2/SSPLv1	BSD	BSD	BSD	BSL
Multi-threaded	❌ (I/O threads only, since 6.0)	✅	❌ (I/O threads only)	✅	✅
Memory efficiency	Baseline	Excellent	Baseline	Baseline	2-3x better (claimed)

9.2 DragonflyDB — New Contender

Written from scratch in C++
Largely compatible with the Redis API
1M+ QPS on a single node (claimed)
2-3x memory efficiency (claimed)
Downside: designed primarily to scale up on a single machine rather than out across a cluster

Quiz

1. RDB or AOF — which to choose?

Answer: Enable both. AOF minimizes data loss (at most about 1 second with everysec), while RDB gives fast backup and restore and compact files. When AOF is enabled, Redis loads the AOF at startup because it is the more complete record. With RDB alone you lose everything written since the last snapshot; with AOF alone recovery takes longer.

2. When to choose Cluster over Sentinel?

Answer: (1) The dataset exceeds a single machine's RAM (roughly 16GB or more as a rule of thumb), (2) a single node's throughput (on the order of 100K QPS) is not enough, (3) horizontal scaling is needed. Otherwise Sentinel is simpler and more stable. Cluster adds complexity: multi-key constraints (same slot) and module compatibility issues.

3. Difference between allkeys-lru and volatile-lru?

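Answer: allkeys-lru evicts by LRU across all keys; volatile-lru evicts only keys that have a TTL. For a pure cache, use allkeys-lru, since every key is cache data. For a mix of cache and persistent data, use volatile-lru, which protects keys without a TTL. The wrong choice either evicts data you meant to keep (allkeys-lru on mixed data) or leaves nothing evictable, so memory fills up and writes are rejected (volatile-lru when few keys have a TTL).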

4. Why are big keys dangerous?

Answer: Redis executes commands on a single thread, so an operation on a big key blocks the whole server. HGETALL on a 100MB Hash or SMEMBERS on a million-member Set can take seconds, and every other command waits. Solution: find them with --bigkeys, iterate incrementally with SCAN/HSCAN/SSCAN, and split large structures across multiple keys.

5. Background of Valkey emergence?

Answer: In March 2024, Redis Inc. changed the license from BSD to RSALv2/SSPLv1, intending to restrict cloud providers (AWS/Google) from offering managed Redis. As a result, the Linux Foundation forked Redis 7.2.4 and released Valkey (BSD-3). AWS ElastiCache now offers Valkey as an engine. Choose Valkey for pure open source, Redis if you need the Redis Stack modules.