Redis Cluster Architecture and High Availability Operations Guide: Sentinel, Cluster Mode, Memory Optimization, and Disaster Recovery

Introduction

Redis is an in-memory data store used for caching, session storage, message brokering, and more. While starting with a single instance is straightforward, achieving high availability (HA) and horizontal scaling in production requires a thorough understanding of either Sentinel or Cluster mode. And because Redis is memory-based, comprehensive knowledge of memory management, eviction policies, persistence strategies, and disaster recovery procedures is essential.

This guide starts by comparing Redis's three deployment modes (Standalone, Sentinel, Cluster) and then covers Sentinel's quorum mechanism, Cluster hash slot distribution, replication protocol (PSYNC), memory optimization strategies, persistence (RDB/AOF), slow log analysis, and common failure scenarios (Split-brain, OOM) with recovery procedures.

Redis Deployment Modes: Standalone vs Sentinel vs Cluster

Deployment Mode Comparison

| Aspect | Standalone | Sentinel | Cluster |
|---|---|---|---|
| Node Count | 1 (+ optional replicas) | Min 3 Sentinels + 1 Master + N Replicas | Min 6 (3 Masters + 3 Replicas) |
| Data Distribution | None | None (single master) | Automatic via hash slots |
| Automatic Failover | No | Yes (quorum-based) | Yes (majority vote) |
| Write Scaling | No | No | Yes (multi-master) |
| Read Scaling | Replica READONLY | Replica READONLY | Replica READONLY |
| Client Complexity | Low | Medium (Sentinel-aware) | High (MOVED/ASK redirection) |
| Best For | Dev/small-scale | HA for a single dataset | Large-scale data + HA |

Decision Criteria

# Decision flow
# 1. Does the data fit in a single node's memory?
#    - YES → Sentinel (if HA needed) or Standalone
#    - NO  → Cluster required
#
# 2. Do you need write performance scaling?
#    - YES → Cluster (multi-master)
#    - NO  → Sentinel is sufficient
#
# 3. Can you handle the operational complexity?
#    - Small team → Consider Sentinel first
#    - Dedicated infra team → Cluster is viable
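
The decision flow above can be encoded as a small helper. This is a minimal sketch; the function name `choose_deployment` and its boolean inputs are hypothetical, chosen to mirror the three questions in the flow.

```python
def choose_deployment(fits_in_one_node: bool, needs_ha: bool,
                      needs_write_scaling: bool) -> str:
    """Hypothetical helper mapping the decision flow above to a mode."""
    if not fits_in_one_node or needs_write_scaling:
        # Data too large for one node, or multi-master writes required
        return "cluster"
    if needs_ha:
        return "sentinel"
    return "standalone"
```

For example, a dataset that fits in memory but needs automatic failover lands on Sentinel, while any need for write scaling forces Cluster regardless of data size.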

Sentinel Architecture and Quorum Mechanism

Role of Sentinel

Redis Sentinel is a distributed monitoring system that watches Redis instances and automatically promotes a replica to master when a primary failure is detected. Sentinel processes themselves are distributed to prevent single points of failure (SPOF).

# Sentinel configuration file (sentinel.conf)
port 26379
sentinel monitor mymaster 192.168.1.10 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1

# sentinel monitor <master-name> <ip> <port> <quorum>
# quorum = minimum number of Sentinel agreements for failure detection

Quorum vs Majority

Quorum and majority are distinct concepts. Quorum is the minimum number of agreements needed to detect a failure, while majority is the requirement for actually executing failover.

# 3 Sentinel deployment example
# quorum = 2 (2 agreements trigger ODOWN)
# majority = 2 (2 out of 3 = majority)

# 5 Sentinel deployment example
# quorum = 3 (3 agreements trigger ODOWN)
# majority = 3 (3 out of 5 = majority)

# Check Sentinel status
redis-cli -p 26379 SENTINEL masters
redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster
redis-cli -p 26379 SENTINEL replicas mymaster
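
The quorum/majority arithmetic above can be sketched as follows. These function names are illustrative, not part of any Redis API; the point is that quorum gates ODOWN detection while leader election always needs a majority of all configured Sentinels.

```python
def sentinel_majority(n_sentinels: int) -> int:
    """Votes required for a Sentinel to win leader election."""
    return n_sentinels // 2 + 1

def failover_possible(n_sentinels: int, reachable: int, quorum: int) -> bool:
    """ODOWN needs `quorum` agreements, but executing the failover still
    requires a leader elected by a majority of ALL configured Sentinels."""
    return reachable >= quorum and reachable >= sentinel_majority(n_sentinels)
```

This is why setting quorum=1 in a 3-Sentinel deployment does not let a single surviving Sentinel fail over: it can declare ODOWN, but it cannot win the election.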

Failover Sequence

# 1. SDOWN (Subjective Down) - Individual Sentinel detects master unresponsive
#    Triggered after down-after-milliseconds elapsed

# 2. ODOWN (Objective Down) - Quorum or more Sentinels agree on SDOWN
#    Failover process begins at this point

# 3. Leader Election - One Sentinel elected as leader
#    Requires majority votes (Raft-like algorithm)

# 4. Replica Selection - Leader Sentinel selects the best replica
#    Priority: replica-priority → replication offset → runid

# 5. Failover Execution
#    REPLICAOF NO ONE (formerly SLAVEOF NO ONE) sent to the selected replica
#    Other replicas are reconfigured to replicate from the new master

# Monitor failover progress (watch for the failover-in-progress flag)
redis-cli -p 26379 SENTINEL master mymaster

Cluster Hash Slots and Resharding

Hash Slot Distribution

Redis Cluster divides the entire keyspace into 16384 hash slots. Each master node is responsible for a subset of slots, and a key's slot is determined by computing CRC16(key) mod 16384.

# Create cluster (minimum 6 nodes: 3 Master + 3 Replica)
redis-cli --cluster create \
  192.168.1.10:7000 192.168.1.11:7001 192.168.1.12:7002 \
  192.168.1.10:7003 192.168.1.11:7004 192.168.1.12:7005 \
  --cluster-replicas 1

# Check slot distribution
redis-cli -c -h 192.168.1.10 -p 7000 CLUSTER SLOTS

# Check which slot a key belongs to
redis-cli -c CLUSTER KEYSLOT mykey
# (integer) 14687

# Use hash tags to place keys in the same slot
# Only the string inside curly braces determines the slot
redis-cli -c SET "user:1001:profile" "data1"
redis-cli -c SET "user:1001:session" "data2"
# Without hash tags, these two keys may end up in different slots

# With hash tags, they are co-located (both hash on "user:1001")
redis-cli -c SET "{user:1001}:profile" "data1"
redis-cli -c SET "{user:1001}:session" "data2"

# Verify with CLUSTER KEYSLOT - both return the same slot number
redis-cli CLUSTER KEYSLOT "{user:1001}:profile"
redis-cli CLUSTER KEYSLOT "{user:1001}:session"
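
The slot computation is easy to reproduce. Below is a pure-Python sketch of CRC16-CCITT (XMODEM), the variant Redis Cluster uses, plus the hash-tag rule: if the key contains a non-empty `{...}` section, only that substring is hashed. The helper names are our own, not a Redis API.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM), poly 0x1021, init 0 - as used by Redis Cluster."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Mimic CLUSTER KEYSLOT: hash only the hash-tag substring if present."""
    start = key.find('{')
    if start != -1:
        end = key.find('}', start + 1)
        if end != -1 and end != start + 1:  # only a non-empty tag counts
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384
```

With this, `key_slot("{user:1001}:profile")` and `key_slot("{user:1001}:session")` return the same slot, which is exactly why hash tags make multi-key operations possible in Cluster mode.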

Resharding

# Online resharding - move slots without service interruption
redis-cli --cluster reshard 192.168.1.10:7000 \
  --cluster-from <source-node-id> \
  --cluster-to <target-node-id> \
  --cluster-slots 1000 \
  --cluster-yes

# Check resharding progress
redis-cli -c -h 192.168.1.10 -p 7000 CLUSTER INFO

# Cluster health check
redis-cli --cluster check 192.168.1.10:7000

# Slot rebalancing (automatic even distribution)
redis-cli --cluster rebalance 192.168.1.10:7000 \
  --cluster-threshold 2

MOVED and ASK Redirection

# Python redis-py (v4.1+) cluster client example
import redis

# redis.RedisCluster handles MOVED/ASK redirection automatically
rc = redis.RedisCluster(
    host='192.168.1.10',
    port=7000,
    decode_responses=True,
    require_full_coverage=False  # redis-py's equivalent of the legacy skip_full_coverage_check
)

# Normal usage - redirection handled automatically
rc.set('session:abc123', 'user_data')
value = rc.get('session:abc123')

# When using pipelines, only keys in the same slot can be batched together
# Use hash tags to place related keys in the same slot
pipe = rc.pipeline()
pipe.set('{order:1001}:status', 'pending')
pipe.set('{order:1001}:total', '50000')
pipe.execute()

# MOVED: key lives on a different node (permanent move)
# ASK: key temporarily on a different node during resharding

Replication and PSYNC

Replication Setup and PSYNC Protocol

# Replica configuration
# redis.conf (replica node)
replicaof 192.168.1.10 6379
masterauth your_password

# PSYNC protocol behavior:
# 1. Full Sync
#    - When replica connects for the first time or backlog is insufficient
#    - Master generates and sends RDB snapshot
#    - Changes during transfer are buffered and sent afterward

# 2. Partial Sync (PSYNC2)
#    - When replica reconnects after a brief disconnection
#    - Only delta from replication backlog is transmitted
#    - Much faster and more efficient

# backlog size (default 1MB, increase for production)
repl-backlog-size 256mb
repl-backlog-ttl 3600

# Check replication status
redis-cli INFO replication
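
The master's full-vs-partial decision can be sketched as below. This is a simplification (real PSYNC2 also accepts the master's secondary replication ID after a failover, which this ignores), and the function name is ours:

```python
def psync_response(master_replid: str, master_offset: int,
                   backlog_size: int,
                   replica_replid: str, replica_offset: int) -> str:
    """Simplified master-side PSYNC decision: a partial resync is possible
    only if the replica followed the same history (replid matches) and
    every byte it is missing is still in the replication backlog."""
    missing = master_offset - replica_offset
    if replica_replid == master_replid and 0 <= missing <= backlog_size:
        return "CONTINUE"      # partial resync: send only the delta
    return "FULLRESYNC"        # full resync: RDB snapshot + buffered changes
```

This is the motivation for raising repl-backlog-size in production: the larger the backlog, the longer a replica can be disconnected and still avoid an expensive full sync.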

Replication Monitoring

# Check replication status on master
redis-cli INFO replication
# role:master
# connected_slaves:2
# slave0:ip=192.168.1.11,port=6379,state=online,offset=1234567,lag=0
# slave1:ip=192.168.1.12,port=6379,state=online,offset=1234567,lag=1
# master_replid:abc123def456...
# master_repl_offset:1234567
# repl_backlog_active:1
# repl_backlog_size:268435456

# Replication lag monitoring - investigate if lag is persistently high
# lag > 10 : Check network or replica performance
# lag > 60 : Urgent action needed (full sync may occur)

# Verify read-only mode on replica
redis-cli -h 192.168.1.11 CONFIG GET replica-read-only
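
For monitoring scripts, the byte-level gap between master and replica offsets is often more useful than the coarse `lag` field. A minimal parser over the `INFO replication` text (our own helper, assuming the field format shown above):

```python
def replication_delta(info_text: str) -> dict:
    """Parse `INFO replication` output and return each replica's
    offset delta (bytes behind master_repl_offset)."""
    fields = {}
    replicas = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        key, _, value = line.partition(':')
        if key.startswith('slave') and 'offset=' in value:
            # e.g. "ip=192.168.1.11,port=6379,state=online,offset=1234567,lag=0"
            attrs = dict(item.split('=') for item in value.split(','))
            replicas[key] = int(attrs['offset'])
        else:
            fields[key] = value
    master_offset = int(fields['master_repl_offset'])
    return {name: master_offset - offset for name, offset in replicas.items()}
```

A steadily growing delta on one replica points at that replica (or its network path) rather than the master.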

Memory Management and Eviction Policies

maxmemory Configuration

# maxmemory setting (redis.conf)
maxmemory 8gb

# Runtime change
redis-cli CONFIG SET maxmemory 8589934592

# Check current memory usage
redis-cli INFO memory
# used_memory:4294967296
# used_memory_human:4.00G
# used_memory_rss:4831838208
# used_memory_rss_human:4.50G
# mem_fragmentation_ratio:1.12
# maxmemory:8589934592
# maxmemory_human:8.00G
# maxmemory_policy:allkeys-lru

Eviction Policy Comparison

| Policy | Target Keys | Algorithm | Best For |
|---|---|---|---|
| noeviction | None (rejects writes) | - | No data loss tolerated |
| allkeys-lru | All keys | LRU | General caching (most common) |
| allkeys-lfu | All keys | LFU | Popularity-based caching |
| allkeys-random | All keys | Random | Uniform access patterns |
| volatile-lru | TTL-set keys only | LRU | Mixed cache + persistent data |
| volatile-lfu | TTL-set keys only | LFU | Frequency-based among TTL keys |
| volatile-ttl | TTL-set keys only | Shortest remaining TTL | Expire-soon keys first |
| volatile-random | TTL-set keys only | Random | Random TTL key removal |

# Set eviction policy
redis-cli CONFIG SET maxmemory-policy allkeys-lfu

# LFU counter configuration (Redis 4.0+)
# lfu-log-factor: counter increment speed (default 10, higher = slower)
# lfu-decay-time: counter decay interval in minutes (default 1)
redis-cli CONFIG SET lfu-log-factor 10
redis-cli CONFIG SET lfu-decay-time 1

# Check LFU frequency of a specific key
redis-cli OBJECT FREQ mykey
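
The effect of lfu-log-factor is easiest to see by simulating the documented probabilistic increment: each hit bumps the 8-bit counter with probability 1/(baseval * factor + 1), so higher factors make the counter grow more slowly. A sketch with a seeded RNG for determinism (decay is ignored here for simplicity):

```python
import random

LFU_INIT_VAL = 5  # Redis initializes new keys' LFU counter to 5

def lfu_incr(counter: int, lfu_log_factor: int, rng: random.Random) -> int:
    """Probabilistic LFU counter increment as described in the Redis docs."""
    if counter >= 255:
        return 255
    baseval = max(counter - LFU_INIT_VAL, 0)
    p = 1.0 / (baseval * lfu_log_factor + 1)
    return counter + 1 if rng.random() < p else counter

def simulate_hits(hits: int, lfu_log_factor: int, seed: int = 42) -> int:
    """Counter value after `hits` accesses to a single key."""
    rng = random.Random(seed)
    counter = LFU_INIT_VAL
    for _ in range(hits):
        counter = lfu_incr(counter, lfu_log_factor, rng)
    return counter
```

Running this with 100K hits shows a much lower saturation point at factor 100 than at factor 10, which is the trade-off the lfu-log-factor setting controls: resolution at low frequencies versus headroom at high ones.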

# Check eviction statistics
redis-cli INFO stats | grep evicted
# evicted_keys:12345

Memory Optimization Techniques

# 1. Data structure optimization - use ziplist/listpack encodings
# Small hashes stored as ziplist (listpack in Redis 7.0+) can yield up to 10x memory savings
redis-cli CONFIG SET hash-max-ziplist-entries 128
redis-cli CONFIG SET hash-max-ziplist-value 64

# Use listpack for small lists (Redis 7.0+)
redis-cli CONFIG SET list-max-listpack-size -2

# Use ziplist for small Sorted Sets
redis-cli CONFIG SET zset-max-ziplist-entries 128
redis-cli CONFIG SET zset-max-ziplist-value 64

# 2. Key naming optimization - use short key names
# Bad:  user:session:authentication:token:1001
# Good: u:s:t:1001

# 3. Memory usage analysis
redis-cli MEMORY USAGE mykey
redis-cli MEMORY DOCTOR

# 4. Big key detection (element-count scan and memory-based scan are separate modes)
redis-cli --bigkeys
redis-cli --memkeys

# 5. Lazy Free configuration (background deletion to avoid blocking)
redis-cli CONFIG SET lazyfree-lazy-eviction yes
redis-cli CONFIG SET lazyfree-lazy-expire yes
redis-cli CONFIG SET lazyfree-lazy-server-del yes

Persistence: RDB and AOF

RDB vs AOF Comparison

| Aspect | RDB (Snapshot) | AOF (Append Only File) |
|---|---|---|
| Mechanism | Point-in-time full snapshot | Sequential recording of all write commands |
| Data Loss | Possible since last snapshot | Minimized, depending on fsync settings |
| File Size | Small (binary, compressed) | Large (command text; reduced by rewrite) |
| Recovery Speed | Fast | Slow (command replay) |
| Performance Impact | Momentary latency during fork() | Continuous, depending on fsync frequency |
| Recommended Use | Backup | Data durability |

# RDB configuration (redis.conf)
save 900 1      # Snapshot if 1+ changes in 900 seconds (15 min)
save 300 10     # Snapshot if 10+ changes in 300 seconds (5 min)
save 60 10000   # Snapshot if 10000+ changes in 60 seconds (1 min)
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis

# AOF configuration
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec  # Recommended: fsync every second (balance of safety and performance)
# appendfsync always  # fsync on every command (safest, slowest)
# appendfsync no      # Delegate to OS (fastest, risk of data loss)

# AOF Rewrite configuration
auto-aof-rewrite-percentage 100  # Rewrite when AOF grows 100% since last rewrite
auto-aof-rewrite-min-size 64mb   # Only rewrite when at least 64MB

# Production recommendation: use both RDB + AOF
# RDB: fast recovery + backup
# AOF: minimize data loss

AOF Rewrite and Management

# Manual AOF Rewrite
redis-cli BGREWRITEAOF

# Check AOF status
redis-cli INFO persistence
# aof_enabled:1
# aof_rewrite_in_progress:0
# aof_last_rewrite_time_sec:2
# aof_current_size:134217728
# aof_base_size:67108864

# AOF integrity check
redis-check-aof --fix appendonly.aof

# RDB integrity check
redis-check-rdb dump.rdb

# Manual backup (using BGSAVE)
redis-cli BGSAVE
# Creates RDB snapshot in background
# Copy dump.rdb to safe storage

Slow Log Analysis

# Slow log configuration
redis-cli CONFIG SET slowlog-log-slower-than 10000  # Log queries taking 10ms+
redis-cli CONFIG SET slowlog-max-len 128            # Keep up to 128 entries

# View slow log
redis-cli SLOWLOG GET 10
# 1) 1) (integer) 14           # Log ID
#    2) (integer) 1710230400   # Unix timestamp
#    3) (integer) 15230        # Execution time (microseconds)
#    4) 1) "KEYS"              # Command and arguments
#       2) "*session*"
#    5) "192.168.1.50:54321"   # Client address
#    6) ""                     # Client name (if set)

# Slow log statistics
redis-cli SLOWLOG LEN
redis-cli SLOWLOG RESET

# O(N) commands to avoid in production:
# KEYS *        → Replace with SCAN
# SMEMBERS      → Replace with SSCAN
# HGETALL       → Replace with HSCAN
# LRANGE 0 -1   → Apply pagination

Safe Key Scanning with SCAN

# Use SCAN instead of KEYS (non-blocking)
redis-cli SCAN 0 MATCH "session:*" COUNT 100
# 1) "17920"    # Next cursor
# 2) 1) "session:abc123"
#    2) "session:def456"
#    ...

# Iterate until cursor returns 0
redis-cli SCAN 17920 MATCH "session:*" COUNT 100
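
The iterate-until-cursor-zero loop looks like this in client code. To keep the sketch runnable without a server, `make_fake_scan` below is a stand-in that mimics redis-py's `scan` return shape of (next_cursor, keys); with a real client you would pass `r.scan` or simply use `r.scan_iter(match=...)`:

```python
import fnmatch

def make_fake_scan(keys, page_size=2):
    """Stand-in for a Redis client's `scan`: returns (next_cursor, page)."""
    def scan(cursor=0, match='*', count=None):
        page = [k for k in keys[cursor:cursor + page_size]
                if fnmatch.fnmatch(k, match)]
        next_cursor = cursor + page_size
        return (0 if next_cursor >= len(keys) else next_cursor, page)
    return scan

def scan_all(scan, match='*'):
    """The SCAN loop: keep calling until the server returns cursor 0."""
    cursor, found = 0, []
    while True:
        cursor, page = scan(cursor, match=match)
        found.extend(page)
        if cursor == 0:
            break
    return found
```

Note that SCAN only guarantees each key present for the whole iteration is returned at least once; deduplicate on the client if exact counts matter.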

Memory Fragmentation Handling

# Check fragmentation ratio
redis-cli INFO memory | grep frag
# mem_fragmentation_ratio:1.45
# mem_fragmentation_bytes:536870912

# Interpreting mem_fragmentation_ratio:
# 1.0 ~ 1.5 : Normal range
# Above 1.5 : Severe fragmentation, action needed
# Below 1.0 : Swapping in use (very dangerous)

# Enable Active Defragmentation (Redis 4.0+, requires a jemalloc build)
redis-cli CONFIG SET activedefrag yes

# Defragmentation threshold settings
redis-cli CONFIG SET active-defrag-ignore-bytes 100mb
redis-cli CONFIG SET active-defrag-threshold-lower 10   # Start at 10%
redis-cli CONFIG SET active-defrag-threshold-upper 100  # Max effort at 100%
redis-cli CONFIG SET active-defrag-cycle-min 1           # Min 1% CPU
redis-cli CONFIG SET active-defrag-cycle-max 25          # Max 25% CPU

# Check jemalloc statistics
redis-cli MEMORY MALLOC-STATS
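
For alerting, the interpretation table above reduces to a tiny classifier (the function name and labels are our own):

```python
def fragmentation_status(ratio: float) -> str:
    """Classify mem_fragmentation_ratio per the thresholds above."""
    if ratio < 1.0:
        return "swapping"     # RSS < used_memory: the OS is paging Redis out
    if ratio <= 1.5:
        return "normal"
    return "fragmented"       # consider activedefrag or a restart window
```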

Failure Scenarios and Recovery Procedures

Scenario 1: Split-brain

When a network partition occurs, the Sentinel cluster can split, resulting in two simultaneous masters. This is the most dangerous failure as it causes data inconsistency.

# Split-brain prevention settings
# Master rejects writes if fewer than N replicas are connected
redis-cli CONFIG SET min-replicas-to-write 1
redis-cli CONFIG SET min-replicas-max-lag 10

# Recovery procedure when split-brain occurs:
# 1. Check status of all Redis instances
redis-cli -h 192.168.1.10 -p 6379 INFO replication
redis-cli -h 192.168.1.11 -p 6379 INFO replication

# 2. Determine which master has the latest data
redis-cli -h 192.168.1.10 -p 6379 INFO replication | grep master_repl_offset
redis-cli -h 192.168.1.11 -p 6379 INFO replication | grep master_repl_offset

# 3. Demote the stale master to replica
redis-cli -h 192.168.1.11 -p 6379 REPLICAOF 192.168.1.10 6379

# 4. Reset Sentinel state
redis-cli -p 26379 SENTINEL RESET mymaster

# 5. Verify data consistency
redis-cli -h 192.168.1.10 DBSIZE
redis-cli -h 192.168.1.11 DBSIZE
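
The logic behind min-replicas-to-write is worth spelling out, since it is the main split-brain damage limiter. A simplified sketch of the master-side write gate (our own function name; real Redis evaluates this continuously from replica ACKs):

```python
def master_accepts_writes(replica_lags: list, min_replicas_to_write: int,
                          min_replicas_max_lag: int) -> bool:
    """A master keeps accepting writes only while at least N replicas are
    connected with lag <= max-lag seconds. A master stranded on the minority
    side of a partition loses its replicas and starts rejecting writes,
    bounding how much data the eventual demotion can discard."""
    healthy = sum(1 for lag in replica_lags if lag <= min_replicas_max_lag)
    return healthy >= min_replicas_to_write
```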

Scenario 2: OOM (Out of Memory)

# OOM prevention monitoring
redis-cli INFO memory
# used_memory_peak:8589934592
# used_memory_peak_human:8.00G

# Check if Redis was killed by OOM Killer
# Check system logs
# Look for OOM-related entries in dmesg or /var/log/syslog

# Emergency recovery procedure:
# 1. Check and adjust maxmemory
redis-cli CONFIG SET maxmemory 12gb

# 2. Change eviction policy if set to noeviction
redis-cli CONFIG GET maxmemory-policy
redis-cli CONFIG SET maxmemory-policy allkeys-lru

# 3. Identify and clean up large keys
redis-cli --bigkeys
redis-cli --memkeys --memkeys-samples 100

# 4. Set TTL on unnecessary keys
redis-cli SCAN 0 MATCH "temp:*" COUNT 1000
# Set TTL on identified keys
redis-cli EXPIRE "temp:old-data" 3600

# 5. Set up alerts for future prevention (warn at 80% of maxmemory)
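
The 80% warning in step 5 is a one-liner worth wiring into whatever monitoring you use; a hedged sketch (helper name ours):

```python
def memory_alert(used_memory: int, maxmemory: int,
                 warn_ratio: float = 0.80) -> bool:
    """Warn when used_memory crosses warn_ratio of maxmemory."""
    if maxmemory == 0:
        return False  # maxmemory 0 means unlimited; nothing to compare against
    return used_memory / maxmemory >= warn_ratio
```

Feed it the `used_memory` and `maxmemory` values from `INFO memory`; note that a maxmemory of 0 (unlimited) silently disables this check, which is itself a configuration smell in production.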

Scenario 3: Cluster Node Failure

# Check cluster status
redis-cli -c CLUSTER INFO
# cluster_state:ok
# cluster_slots_assigned:16384
# cluster_slots_ok:16384
# cluster_slots_pfail:0
# cluster_slots_fail:0

# Identify failed node
redis-cli -c CLUSTER NODES
# Failed node will show "fail" flag

# Node replacement procedure:
# 1. Start new Redis instance
redis-server /etc/redis/7006.conf

# 2. Add new node to cluster
redis-cli --cluster add-node 192.168.1.13:7006 192.168.1.10:7000

# 3. Assign new node as replica of a specific master
redis-cli --cluster add-node 192.168.1.13:7006 192.168.1.10:7000 \
  --cluster-slave --cluster-master-id <master-node-id>

# 4. Remove failed node
redis-cli --cluster del-node 192.168.1.10:7000 <failed-node-id>

# 5. Reassign slots (if master failed)
redis-cli --cluster fix 192.168.1.10:7000

Operations Monitoring and Key Metrics

# Comprehensive status check
# (hit rate is not an INFO field - compute it from keyspace_hits/keyspace_misses)
redis-cli INFO all | grep -E "used_memory_human|mem_fragmentation_ratio|connected_clients|blocked_clients|instantaneous_ops_per_sec|evicted_keys|keyspace_hits|keyspace_misses"

# Key monitoring metrics:
# 1. Memory: used_memory / maxmemory ratio
# 2. Fragmentation: mem_fragmentation_ratio
# 3. Cache hit rate: keyspace_hits / (keyspace_hits + keyspace_misses)
# 4. Connections: connected_clients
# 5. Throughput: instantaneous_ops_per_sec
# 6. Evictions: evicted_keys (alert on spikes)
# 7. Replication lag: master_repl_offset vs slave_repl_offset
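
Metric 3 above has to be derived, since INFO only exposes the raw counters. The computation, with the zero-traffic edge case handled:

```python
def cache_hit_rate(keyspace_hits: int, keyspace_misses: int) -> float:
    """Hit rate = keyspace_hits / (keyspace_hits + keyspace_misses)."""
    total = keyspace_hits + keyspace_misses
    return keyspace_hits / total if total else 0.0
```

Because keyspace_hits and keyspace_misses are cumulative since startup, compute the rate over deltas between scrapes rather than over the lifetime totals.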

# Latency monitoring
redis-cli --latency
redis-cli --latency-history

# Latency event monitoring (Redis 2.8.13+)
redis-cli CONFIG SET latency-monitor-threshold 100
redis-cli LATENCY LATEST
redis-cli LATENCY HISTORY event-name

Operational Notes

  1. Memory Overcommit: On Linux, set vm.overcommit_memory=1. Without this, fork()-based RDB/AOF rewrite may fail due to insufficient memory.

  2. Disable Transparent Huge Pages (THP): THP negatively affects Redis fork performance and must be disabled.

  3. maxmemory Margin: Set to 70-80% of physical memory to leave headroom for replication buffers, AOF rewrite, and OS cache.

  4. MULTI/EXEC in Cluster Mode: All keys in a transaction must reside in the same slot. Use hash tags to ensure co-location.

  5. Sentinel Placement: Deploy Sentinel instances on separate physical servers or availability zones from Redis nodes to prevent simultaneous failures.

  6. Ban KEYS Command: KEYS causes O(N) blocking in production. Replace with SCAN and disable via rename-command.

# Production kernel tuning
# /etc/sysctl.conf
# vm.overcommit_memory = 1
# net.core.somaxconn = 65535
# net.ipv4.tcp_max_syn_backlog = 65535

# Disable THP
# echo never > /sys/kernel/mm/transparent_hugepage/enabled

# Disable dangerous commands (redis.conf)
# rename-command KEYS ""
# rename-command FLUSHDB ""
# rename-command FLUSHALL ""
# rename-command DEBUG ""

Conclusion

Operating Redis for high availability goes beyond simply configuring Sentinel or Cluster. Stable production operations require memory management, eviction policies, persistence strategies, replication monitoring, and preparation for failure scenarios.

The three essentials are: First, choose the right deployment mode for your workload. Second, always configure memory limits and eviction policies. Third, document recovery procedures for each failure scenario and practice them regularly. With these three pillars in place, Redis can reliably deliver its strengths as an ultra-fast in-memory data store.
