etcd Watch and Lease Mechanism Analysis

etcd's Watch and Lease provide critical functionality for distributed systems. Watch enables real-time monitoring of key changes, and Lease enables automatic key expiration. This post analyzes the internal workings of both mechanisms in detail.


1. Watch Internal Structure

1.1 Watchable Store Architecture

The Watchable Store wraps the MVCC Store to add Watch functionality:

// watchableStore structure (simplified)
type watchableStore struct {
    *store                        // MVCC store
    synced   watcherGroup         // watchers synced to current revision
    unsynced watcherGroup         // watchers not yet caught up
    victims  []watcherBatch       // batches whose delivery blocked on a full watch channel; retried later
}

1.2 Watcher Lifecycle

  1. Client sends Watch request
  2. Watcher created and start revision determined
  3. If start revision is older than current, placed in unsynced group
  4. Events replayed from history to catch up
  5. Moved to synced group once caught up to current revision
  6. Receives new events in real-time thereafter
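
The lifecycle above can be sketched with a small in-memory model. This is a toy written for illustration, not etcd's implementation: the `store`, `watcher`, and `syncWatchers` names are assumptions, history is kept as a plain slice, and there is no locking or compaction.

```go
package main

import "fmt"

// event is a simplified stand-in for an MVCC event.
type event struct {
	rev int64
	key string
}

// watcher tracks the next revision it still needs to see.
type watcher struct {
	key     string
	nextRev int64
	got     []event
}

// store is a toy model of a watchable store (hypothetical, single-threaded).
type store struct {
	currentRev int64
	history    []event // every event so far, in revision order
	synced     map[*watcher]bool
	unsynced   map[*watcher]bool
}

func newStore() *store {
	return &store{synced: map[*watcher]bool{}, unsynced: map[*watcher]bool{}}
}

// put records a new revision and delivers it to matching synced watchers.
func (s *store) put(key string) {
	s.currentRev++
	ev := event{rev: s.currentRev, key: key}
	s.history = append(s.history, ev)
	for w := range s.synced {
		if w.key == key && ev.rev >= w.nextRev {
			w.got = append(w.got, ev)
			w.nextRev = ev.rev + 1
		}
	}
}

// watch registers a watcher; startRev 0 means "from the next revision on".
func (s *store) watch(key string, startRev int64) *watcher {
	w := &watcher{key: key, nextRev: startRev}
	if startRev == 0 {
		w.nextRev = s.currentRev + 1
	}
	if w.nextRev > s.currentRev {
		s.synced[w] = true // nothing to replay
	} else {
		s.unsynced[w] = true // must catch up from history first
	}
	return w
}

// syncWatchers replays history to unsynced watchers, then promotes them.
func (s *store) syncWatchers() {
	for w := range s.unsynced {
		for _, ev := range s.history {
			if ev.rev >= w.nextRev && ev.key == w.key {
				w.got = append(w.got, ev)
			}
		}
		w.nextRev = s.currentRev + 1
		delete(s.unsynced, w)
		s.synced[w] = true
	}
}

func main() {
	s := newStore()
	s.put("a") // revision 1
	s.put("a") // revision 2
	w := s.watch("a", 1)
	fmt.Println("unsynced:", s.unsynced[w])
	s.syncWatchers()
	s.put("a") // revision 3, delivered directly
	fmt.Println("synced:", s.synced[w], "events:", len(w.got))
}
```

A watcher that asks for revision 1 while the store is at revision 2 starts in the unsynced group, receives the two historical events during the sync pass, and then gets revision 3 directly as a synced watcher.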

2. Event Generation and Delivery

2.1 Event Generation Process

When a write occurs in the MVCC Store, events are generated at transaction commit time, delivered to matching watchers, and sent to clients via gRPC streams.

2.2 Event Filtering and Matching

Each Watcher defines its watch target:

  • Single key Watch: Watches exactly one key
  • Range Watch: Watches a key range (including prefix)
  • Filters: Can exclude PUT events or DELETE events (NOPUT / NODELETE), so only the other type is delivered
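
The matching rules above can be sketched as follows. The `watchTarget` type and its field names are illustrative assumptions; the range convention (half-open `[key, end)`, a prefix expressed as a range whose end is the prefix with its last byte incremented, and `"\x00"` meaning "no upper bound") follows etcd's documented range semantics.

```go
package main

import "fmt"

type eventType int

const (
	put eventType = iota
	del
)

// watchTarget describes what a watcher is interested in (illustrative type).
type watchTarget struct {
	key      string
	end      string // "": single key; "\x00": all keys >= key; else [key, end)
	noPut    bool   // drop PUT events
	noDelete bool   // drop DELETE events
}

// prefixEnd returns the smallest key greater than every key that has the
// given prefix, by incrementing the last byte that is not 0xff.
func prefixEnd(prefix string) string {
	b := []byte(prefix)
	for i := len(b) - 1; i >= 0; i-- {
		if b[i] < 0xff {
			b[i]++
			return string(b[:i+1])
		}
	}
	return "\x00" // empty or all-0xff prefix: no upper bound
}

// matches reports whether an event of the given type on key is delivered.
func (t watchTarget) matches(typ eventType, key string) bool {
	if (typ == put && t.noPut) || (typ == del && t.noDelete) {
		return false
	}
	switch t.end {
	case "":
		return key == t.key
	case "\x00":
		return key >= t.key
	default:
		return key >= t.key && key < t.end
	}
}

func main() {
	w := watchTarget{key: "/config/", end: prefixEnd("/config/"), noDelete: true}
	fmt.Println(w.matches(put, "/config/db")) // inside the prefix
	fmt.Println(w.matches(del, "/config/db")) // filtered out by noDelete
	fmt.Println(w.matches(put, "/confih"))    // outside the prefix
}
```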

2.3 Event Order Guarantee

etcd Watch guarantees event ordering:

  • Events for the same key are delivered in revision order
  • Events are delivered in order per Watch ID
  • Clients can track the revision of the last event they processed and resume a Watch from the next revision without missing events

3. gRPC Watch Streaming

3.1 Watch Stream Structure

Watch between client and server uses bidirectional gRPC streaming. Multiple Watches can be multiplexed on a single gRPC stream; each Watch has a unique Watch ID, and every response carries the ID of the Watch it belongs to.
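
The multiplexing idea can be illustrated with a small demultiplexer that routes responses from one shared stream to per-watch channels by Watch ID. This is a simplified, single-goroutine sketch; the `demux` and `response` types are hypothetical, and a Go channel stands in for the gRPC stream.

```go
package main

import "fmt"

// response is a simplified watch response tagged with its watch ID.
type response struct {
	watchID int64
	payload string
}

// demux routes responses from one shared stream to per-watch channels.
// It is not safe for concurrent use; a real client would add locking.
type demux struct {
	subs map[int64]chan response
	next int64
}

func newDemux() *demux { return &demux{subs: map[int64]chan response{}} }

// add registers a new watch and returns its ID and receive channel.
func (d *demux) add() (int64, <-chan response) {
	id := d.next
	d.next++
	ch := make(chan response, 16)
	d.subs[id] = ch
	return id, ch
}

// run reads the shared stream until it closes, then closes every substream.
func (d *demux) run(stream <-chan response) {
	for r := range stream {
		if ch, ok := d.subs[r.watchID]; ok {
			ch <- r
		}
	}
	for _, ch := range d.subs {
		close(ch)
	}
}

func main() {
	d := newDemux()
	idA, a := d.add()
	idB, b := d.add()

	stream := make(chan response, 3)
	stream <- response{idA, "PUT /a"}
	stream <- response{idB, "PUT /b"}
	stream <- response{idA, "DELETE /a"}
	close(stream)
	d.run(stream)

	for r := range a {
		fmt.Println("watch", r.watchID, r.payload)
	}
	for r := range b {
		fmt.Println("watch", r.watchID, r.payload)
	}
}
```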

3.2 Compacted Revision Handling

When a Watch's start revision has been compacted:

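  1. Server cancels the Watch and returns a response carrying compactRevision (surfaced by clientv3 as ErrCompacted)
  2. Client reloads current data (Range request)
  3. After reload, starts new Watch from the revision following the one returned by the Range request

// Client-side compaction handling example.
// reloadData is a placeholder that re-reads the data with a Get (Range)
// request and returns the revision from the response header.
rev := oldRev
for ctx.Err() == nil {
    wch := client.Watch(ctx, "key", clientv3.WithRev(rev))
    for resp := range wch {
        if resp.CompactRevision != 0 {
            // Start revision was compacted: reload, then resume after the snapshot.
            rev = reloadData() + 1
            break
        }
        for _, ev := range resp.Events {
            rev = ev.Kv.ModRevision + 1 // resume point if the Watch is re-created
        }
    }
}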

4. Lease Mechanism

4.1 Lease Overview

Lease assigns TTL (Time-To-Live) to keys for automatic expiration:

  • Creating a Lease assigns a unique ID and TTL
  • Keys attached to a Lease are deleted when the Lease expires
  • Multiple keys can be attached to a single Lease

4.2 Lease Grant and Usage

# Create Lease (TTL 300 seconds)
etcdctl lease grant 300

# Attach key to Lease (use the Lease ID printed by the grant command)
etcdctl put --lease=694d71ddafb1e01a mykey myvalue

4.3 TTL Implementation and Expiry

Lease expiration processing:

  1. Lessor on the leader periodically checks for expired Leases (every 500ms)
  2. When expired Lease found, proposes Revoke request to Raft
  3. After Raft consensus, all keys attached to the Lease are deleted
  4. Lease itself is removed

4.4 Lease KeepAlive

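resp, err := client.Grant(ctx, 30) // 30-second TTL
if err != nil {
    log.Fatal(err)
}
if _, err = client.Put(ctx, "key", "value", clientv3.WithLease(resp.ID)); err != nil {
    log.Fatal(err)
}
ch, err := client.KeepAlive(ctx, resp.ID) // renews in the background until ctx is done
if err != nil {
    log.Fatal(err)
}
for ka := range ch { // channel closes when keep-alives stop (ctx cancelled or stream lost)
    fmt.Println("TTL renewed:", ka.TTL)
}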

KeepAlive characteristics:

  • Client library automatically sends KeepAlive requests at TTL/3 intervals
  • Server resets TTL to original value
  • Efficient renewal via gRPC streaming

5. Lease in Kubernetes

5.1 Leader Election

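Leader election built directly on etcd Leases works as follows. Kubernetes components follow the same acquire-and-renew idea, but through Lease objects in the coordination.k8s.io API, which are stored in etcd as ordinary keys rather than as etcd Leases: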

  1. Candidate writes its ID with Lease to a specific key (IfNotExists condition)
  2. Success means becoming leader
  3. Leader maintains Lease via KeepAlive
  4. On leader failure, KeepAlive stops and Lease expires
  5. After expiry, another candidate is elected as new leader
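
The create-if-absent step can be modeled in memory as shown below. This is a sketch, not a client for a real cluster: `kv`, `putIfAbsent`, and `campaign` are hypothetical names, and the election key is an arbitrary example. With clientv3, the same condition is typically expressed as a transaction that compares the key's CreateRevision with 0 before writing the key with a Lease.

```go
package main

import "fmt"

// kv is a toy key-value store where each key is bound to a lease ID.
type kv struct {
	data   map[string]string // key -> value
	leases map[string]int64  // key -> lease ID
}

func newKV() *kv {
	return &kv{data: map[string]string{}, leases: map[string]int64{}}
}

// putIfAbsent writes the key only if it does not exist yet.
func (s *kv) putIfAbsent(key, val string, leaseID int64) bool {
	if _, exists := s.data[key]; exists {
		return false
	}
	s.data[key] = val
	s.leases[key] = leaseID
	return true
}

// revoke deletes every key attached to leaseID, as lease expiry would.
func (s *kv) revoke(leaseID int64) {
	for k, id := range s.leases {
		if id == leaseID {
			delete(s.data, k)
			delete(s.leases, k)
		}
	}
}

// campaign tries to become leader by claiming the election key.
func campaign(s *kv, candidate string, leaseID int64) bool {
	return s.putIfAbsent("/election/leader", candidate, leaseID)
}

func main() {
	s := newKV()
	fmt.Println("a elected:", campaign(s, "a", 1))
	fmt.Println("b elected:", campaign(s, "b", 2))
	s.revoke(1) // a stops sending KeepAlive and lease 1 expires
	fmt.Println("b elected after expiry:", campaign(s, "b", 2))
	fmt.Println("leader:", s.data["/election/leader"])
}
```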

5.2 Node Heartbeat

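kubelet reports node liveness by periodically renewing a Lease object in the kube-node-lease namespace. These are Kubernetes API objects (coordination.k8s.io) stored in etcd as ordinary keys, not etcd Leases. If renewals stop for longer than the grace period, the node controller marks the node's Ready condition as Unknown.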

5.3 API Server Lease Usage

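kube-apiserver is where etcd Leases are used directly: its etcd storage layer attaches a Lease to objects written with a TTL, such as Events, and the lease-based endpoint reconciler registers each API server instance under a key with a TTL.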


6. Combined Watch and Lease Patterns

6.1 Distributed Configuration Management

  1. Store config values in etcd
  2. Clients Watch config keys
  3. Receive events on changes and apply immediately
  4. Lease for automatic cleanup of temporary configs

6.2 Service Registration/Discovery

  1. Service instances register with Lease
  2. Other services Watch registration keys to detect changes
  3. On instance failure, Lease expiry auto-deregisters
  4. Other services detect immediately via Watch
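
The registration and discovery flow can be sketched end to end with an in-memory registry. Everything here is a stand-in chosen for illustration: `registry`, `register`, and `expireLease` are hypothetical, callbacks replace Watch streams, and lease expiry is triggered by hand instead of by a TTL timer.

```go
package main

import (
	"fmt"
	"sort"
)

type event struct {
	typ string // "PUT" or "DELETE"
	key string
	val string
}

// registry is an in-memory stand-in for etcd keys, leases, and watchers.
type registry struct {
	data     map[string]string
	leaseOf  map[string]int64
	watchers []func(event)
}

func newRegistry() *registry {
	return &registry{data: map[string]string{}, leaseOf: map[string]int64{}}
}

// watch subscribes a callback to every change.
func (r *registry) watch(fn func(event)) { r.watchers = append(r.watchers, fn) }

func (r *registry) notify(ev event) {
	for _, fn := range r.watchers {
		fn(ev)
	}
}

// register stores an instance address under a key bound to a lease.
func (r *registry) register(key, addr string, leaseID int64) {
	r.data[key] = addr
	r.leaseOf[key] = leaseID
	r.notify(event{typ: "PUT", key: key, val: addr})
}

// expireLease deletes all keys of a lease and emits DELETE events.
func (r *registry) expireLease(leaseID int64) {
	var keys []string
	for k, id := range r.leaseOf {
		if id == leaseID {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys) // deterministic event order
	for _, k := range keys {
		delete(r.data, k)
		delete(r.leaseOf, k)
		r.notify(event{typ: "DELETE", key: k})
	}
}

func main() {
	r := newRegistry()
	endpoints := map[string]string{} // a consumer's local view
	r.watch(func(ev event) {
		if ev.typ == "PUT" {
			endpoints[ev.key] = ev.val
		} else {
			delete(endpoints, ev.key)
		}
	})

	r.register("/services/api/1", "10.0.0.1:8080", 1)
	r.register("/services/api/2", "10.0.0.2:8080", 2)
	fmt.Println("endpoints:", len(endpoints))

	r.expireLease(1) // instance 1 fails and its lease expires
	fmt.Println("endpoints after expiry:", len(endpoints))
}
```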

7. Summary

etcd's Watch and Lease provide core primitives for distributed systems. Watch enables real-time change detection, Lease enables TTL-based automatic expiration, and combining both implements patterns like leader election and service discovery. The next post analyzes the integration between etcd and the Kubernetes API Server.