[Prometheus] Alerting Pipeline: From Rule Evaluation to Alertmanager Delivery

1. Overview

The Prometheus alerting pipeline detects anomalies based on metric data and delivers notifications to appropriate receivers. The pipeline consists of two major components:

  1. Prometheus Server: Rule evaluation and Alert generation
  2. Alertmanager: Alert routing, grouping, inhibition, silencing, and notification delivery

This post analyzes the Rule Manager evaluation loop, Alert state machine, and Alertmanager internals at the source code level.

2. Rule Manager

2.1 Overall Structure

prometheus.yml rule_files
        |
        v
+-------------------+
|   Rule Manager    |
|                   |
|  +-- Rule Group 1 (evaluation_interval: 15s)
|  |     +-- recording rule A
|  |     +-- alerting rule B
|  |     +-- alerting rule C
|  |
|  +-- Rule Group 2 (evaluation_interval: 30s)
|  |     +-- recording rule D
|  |     +-- alerting rule E
|  |
+-------------------+
        |
   Alert delivery
        |
        v
+-------------------+
|  Alertmanager     |
+-------------------+

2.2 Rule Group

A Rule Group is a collection of related rules that are evaluated sequentially at the same interval:

groups:
  - name: node_alerts
    interval: 15s # defaults to global.evaluation_interval
    rules:
      - record: instance:node_cpu:rate5m
        expr: 1 - avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[5m]))

      - alert: HighCPU
        expr: instance:node_cpu:rate5m > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: 'CPU usage exceeds 90%'

2.3 Evaluation Loop

Rule Group evaluation loop:

1. evaluation_interval timer fires
        |
        v
2. Evaluate rules in group sequentially
   (in the order they are listed in the group)
        |
        v
3. Execute each rule's PromQL expression against TSDB
        |
        v
4. Recording rule: store result as new time series in TSDB
   Alerting rule: pass result to Alert state machine
        |
        v
5. Send active Alerts to Alertmanager
        |
        v
6. Wait until next evaluation cycle

Rules are evaluated sequentially within a group, so recording rule results can be referenced by alerting rules in the same group.
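The loop above can be sketched in Python. This is a simplified model, not Prometheus's actual code: `query`, `store`, and `update_alert_state` are hypothetical stand-ins for the PromQL engine, the TSDB appender, and the alert state machine.

```python
def evaluate_group(rules, query, store, update_alert_state, now):
    """One evaluation cycle for a rule group (simplified sketch)."""
    fired = []
    for rule in rules:  # rules run sequentially, in listed order
        result = query(rule["expr"], now)  # step 3: run PromQL against TSDB
        if rule.get("record"):
            # Recording rule: write the result back as a new series, so
            # later rules in the same group can read it this cycle.
            store(rule["record"], result, now)
        else:
            # Alerting rule: feed the result to the alert state machine.
            fired.extend(update_alert_state(rule, result, now))
    return fired  # step 5: active alerts are then sent to Alertmanager
```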

2.4 Evaluation Timing

Evaluation timing management:

- Each Rule Group runs in an independent goroutine
- Evaluation start times are aligned to evaluation_interval
  (e.g., 15s interval: :00, :15, :30...)
- If evaluation takes longer than interval, next evaluation is skipped
- Skipped evaluations are recorded as metrics:
  prometheus_rule_group_iterations_missed_total
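The alignment step can be sketched as follows. This is a simplified model: the real implementation also offsets each group by a hash of its file and name so that groups sharing an interval do not all fire at once; `offset_s` stands in for that per-group offset.

```python
def next_aligned_eval(now_s, interval_s, offset_s=0.0):
    """Next evaluation timestamp aligned to the interval (sketch).

    Aligns to multiples of the interval since the Unix epoch, then
    adds the per-group offset.
    """
    base = (now_s // interval_s + 1) * interval_s
    return base + offset_s

# With a 15s interval, evaluations land on :00, :15, :30, :45.
```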

3. Alert State Machine

3.1 State Transitions

Alert state machine:

  +----------+
  | inactive |
  +----+-----+
       |
       | Expression result exists (matched)
       v
  +---------+
  | pending |  (waiting for 'for' duration)
  +----+----+
       |
       | 'for' duration elapsed
       v
  +---------+
  | firing  |  (sent to Alertmanager)
  +----+----+
       |
       | Expression result absent (not matched)
       v
  +----------+
  | resolved |  (resolved, sent to Alertmanager)
  +----+-----+
       |
       | Next evaluation cycle
       v
  +----------+
  | inactive |
  +----------+

Note: If expression result is absent while in pending, transitions directly to inactive

3.2 for Duration

The for field specifies how long a condition must persist before the Alert transitions to firing:

for duration behavior:

Time  ExprResult  State
0s    true        inactive -> pending (ActiveAt = 0s)
15s   true        pending (elapsed: 15s)
30s   true        pending (elapsed: 30s)
...
5m    true        pending -> firing (for: 5m satisfied)
5m15s true        firing (continues sending)
5m30s false       firing -> resolved
5m45s -           resolved -> inactive

Without for (for: 0s):
0s    true        inactive -> firing (immediately)
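The transitions above can be condensed into a single step function. This is a sketch: resolution is collapsed into a direct return to inactive, whereas real Prometheus keeps a resolved alert around long enough to notify Alertmanager.

```python
def step(state, active_at, matched, now, hold):
    """Advance the alert state machine by one evaluation cycle.

    `hold` is the rule's `for:` duration in seconds.
    Returns the new (state, active_at) pair.
    """
    if not matched:
        # pending drops straight back to inactive, as noted above;
        # firing -> resolved -> inactive is collapsed here.
        return "inactive", None
    if state == "inactive":
        state, active_at = "pending", now  # ActiveAt is recorded
    if state == "pending" and now - active_at >= hold:
        state = "firing"  # for: duration satisfied
    return state, active_at
```

With `hold=0` the pending check passes immediately, matching the "without for" row above.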

3.3 Alert Identity

Each Alert instance is uniquely identified by its label set:

Alert identity:

alertname + expression result labels + additional labels field
= Alert's unique fingerprint

Example:
  alert: HighCPU
  expr: instance:node_cpu:rate5m > 0.9
  labels:
    severity: warning

If result label is instance="node-1":
  fingerprint = hash(alertname=HighCPU, instance=node-1, severity=warning)

Same rule with different instance values produces separate Alerts
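The identity rule can be illustrated with a small hash over the sorted label set. Prometheus itself uses an FNV-64a fingerprint; sha256 here is a stand-in to show the idea: identical labels produce the same alert, any differing label value produces a distinct one.

```python
import hashlib

def fingerprint(labels):
    """Hash a label set into a stable alert identity (illustrative)."""
    encoded = "\x00".join(f"{k}\x1f{v}" for k, v in sorted(labels.items()))
    return hashlib.sha256(encoded.encode()).hexdigest()[:16]

a1 = fingerprint({"alertname": "HighCPU", "instance": "node-1", "severity": "warning"})
a2 = fingerprint({"alertname": "HighCPU", "instance": "node-2", "severity": "warning"})
# a1 != a2: each instance value yields a separate alert
```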

4. Delivery to Alertmanager

4.1 Delivery Mechanism

Alert delivery flow:

1. Collect active Alerts from evaluation loop
   (firing + resolved)
        |
        v
2. Serialize Alerts to API format
   POST /api/v2/alerts
        |
        v
3. Send to all configured Alertmanager instances
   (alerting.alertmanagers configuration)
        |
        v
4. On delivery failure, resend in next evaluation cycle
   (Alerts are resent every evaluation cycle)

4.2 Alert Data Format

Alert data structure:

- labels:           Alert identification label map
- annotations:      Additional info (summary, description, etc.)
- startsAt:         Alert start time
- endsAt:           Alert end time (resolved or expected end)
- generatorURL:     Prometheus expression link

Firing state:
  endsAt = current_time + 4 * evaluation_interval
  (to prevent expiry before next send)

Resolved state:
  endsAt = resolution time
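Putting the format together, a sketch of building one `/api/v2/alerts` entry follows. The field names match the API; the 4x-interval `endsAt` padding mirrors the rule above, so a firing alert cannot expire before the next resend. The dict keys on the input `alert` are hypothetical internal names.

```python
import json
from datetime import datetime, timedelta, timezone

def to_payload(alert, now, interval_s, resolved):
    """Serialize one alert for POST /api/v2/alerts (sketch)."""
    ends = now if resolved else now + timedelta(seconds=4 * interval_s)
    return {
        "labels": alert["labels"],
        "annotations": alert.get("annotations", {}),
        "startsAt": alert["starts_at"].isoformat(),
        "endsAt": ends.isoformat(),
        "generatorURL": alert.get("generator_url", ""),
    }

now = datetime(2026, 3, 20, 10, 0, tzinfo=timezone.utc)
alert = {"labels": {"alertname": "HighCPU"}, "starts_at": now}
body = json.dumps([to_payload(alert, now, 15, resolved=False)])
# POST body for /api/v2/alerts is a JSON array of such objects
```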

4.3 Alertmanager Discovery

Alertmanager discovery:

1. Static configuration:
   alerting:
     alertmanagers:
       - static_configs:
           - targets: ['alertmanager-1:9093', 'alertmanager-2:9093']

2. Service discovery:
   alerting:
     alertmanagers:
       - kubernetes_sd_configs:
           - role: pod
         relabel_configs:
           - source_labels: [__meta_kubernetes_pod_label_app]
             regex: alertmanager
             action: keep

Prometheus sends Alerts to all discovered Alertmanagers.

5. Alertmanager Internals

5.1 Processing Pipeline

Alertmanager processing pipeline:

Alert received (API)
    |
    v
Dispatcher
    |-- Routing tree matching
    |-- Alert grouping
    |
    v
Notification Pipeline
    |-- Inhibit (inhibition check)
    |-- Silence (silencing check)
    |-- Wait (group wait)
    |-- Dedup (deduplication)
    |-- Retry (delivery with retries)
    |-- Notify (record delivery result)
    |
    v
Receiver
    |-- email
    |-- Slack
    |-- PagerDuty
    |-- Webhook
    |-- ...

5.2 Routing Tree

The routing tree is a hierarchical structure that matches Alerts to appropriate receivers:

# alertmanager.yml
route:
  receiver: 'default-receiver'
  group_by: ['alertname', 'cluster']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    - match:
        severity: critical
      receiver: 'pagerduty-critical'
      group_wait: 10s
      routes:
        - match:
            service: database
          receiver: 'dba-pagerduty'
    - match:
        severity: warning
      receiver: 'slack-warnings'
      group_by: ['alertname', 'service']

Routing tree matching:

root (default-receiver)
  |
  +-- severity=critical -> pagerduty-critical
  |     |
  |     +-- service=database -> dba-pagerduty
  |
  +-- severity=warning -> slack-warnings

Matching order:
1. Traverse child routes top to bottom
2. Select first matching route (continue: false by default)
3. If continue: true, continue checking next routes
4. If no child matches, use current node's receiver
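The matching order above can be sketched as a recursive walk. This is a simplified model (only equality `match` and the `continue` flag; real routes also support regex and list-valued matchers).

```python
def match_route(route, labels):
    """Depth-first walk of the routing tree: the first matching child
    wins unless it sets `continue: true`; with no matching child the
    current node's receiver applies."""
    receivers = []
    for child in route.get("routes", []):
        if all(labels.get(k) == v for k, v in child.get("match", {}).items()):
            receivers += match_route(child, labels)
            if not child.get("continue", False):
                break
    return receivers or [route["receiver"]]

# The routing tree from the alertmanager.yml example above:
root = {
    "receiver": "default-receiver",
    "routes": [
        {"match": {"severity": "critical"}, "receiver": "pagerduty-critical",
         "routes": [{"match": {"service": "database"},
                     "receiver": "dba-pagerduty"}]},
        {"match": {"severity": "warning"}, "receiver": "slack-warnings"},
    ],
}
```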

6. Alert Grouping

6.1 Grouping Mechanism

Grouping behavior:

group_by: ['alertname', 'cluster']

Alert 1: alertname=HighCPU, cluster=prod, instance=node-1
Alert 2: alertname=HighCPU, cluster=prod, instance=node-2
Alert 3: alertname=HighCPU, cluster=staging, instance=node-3
Alert 4: alertname=DiskFull, cluster=prod, instance=node-1

Group result:
  Group 1: (alertname=HighCPU, cluster=prod)     -> Alert 1, 2
  Group 2: (alertname=HighCPU, cluster=staging)  -> Alert 3
  Group 3: (alertname=DiskFull, cluster=prod)     -> Alert 4

Each group is sent as a single notification.
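The partitioning above amounts to keying each alert by its `group_by` label values, which can be sketched as:

```python
from collections import defaultdict

def group_alerts(alerts, group_by):
    """Partition alerts by the values of the group_by labels;
    each resulting group becomes one notification."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple((k, alert.get(k, "")) for k in group_by)
        groups[key].append(alert)
    return dict(groups)

alerts = [
    {"alertname": "HighCPU", "cluster": "prod", "instance": "node-1"},
    {"alertname": "HighCPU", "cluster": "prod", "instance": "node-2"},
    {"alertname": "HighCPU", "cluster": "staging", "instance": "node-3"},
    {"alertname": "DiskFull", "cluster": "prod", "instance": "node-1"},
]
groups = group_alerts(alerts, ["alertname", "cluster"])
# 3 groups, matching the example above
```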

6.2 Grouping Timing

Grouping timing parameters:

group_wait: 30s
  - Wait time before sending first notification for a new group
  - Collects Alerts added to the same group during this window
  - Prevents duplicate notifications during initial Alert storms

group_interval: 5m
  - Interval for resending notifications when new Alerts join a group
  - Not applied if only existing Alerts are present

repeat_interval: 4h
  - Interval for resending notifications for unchanged groups
  - Periodic reminder so receivers don't miss Alerts
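The three parameters combine into one decision per flush, sketched below. This is a model of the behavior, not Alertmanager's code (which drives flushes with per-group timers); the `group` bookkeeping dict and its keys are hypothetical.

```python
def should_flush(group, now, group_wait, group_interval, repeat_interval):
    """Decide whether this group's notification should go out now (sketch)."""
    if group["last_sent"] is None:
        # New group: hold back for group_wait to batch the initial storm.
        return now - group["created_at"] >= group_wait
    if group["changed_since_last_send"]:
        # New alerts joined the group: resend after group_interval.
        return now - group["last_sent"] >= group_interval
    # Unchanged group: periodic reminder after repeat_interval.
    return now - group["last_sent"] >= repeat_interval
```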

7. Inhibition

7.1 Inhibition Rules

Inhibition suppresses notifications for certain Alerts when other specific Alerts are active:

inhibit_rules:
  - source_match:
      severity: critical
    target_match:
      severity: warning
    equal: ['alertname', 'cluster']

Inhibition behavior:

Source Alert (exists and firing):
  alertname=HighCPU, cluster=prod, severity=critical

Target Alert (to be inhibited):
  alertname=HighCPU, cluster=prod, severity=warning

Since equal fields match, target Alert notifications are inhibited.
When critical is firing for the same issue, warning is not notified.

7.2 Inhibition Processing

Inhibition processing flow:

1. Inhibit stage runs in Notification Pipeline
2. Check active Alert list for source_match
3. If matching source Alert exists
4. Verify equal field values match
5. If equal, suppress target Alert notification
6. Inhibited Alerts still appear in UI (state is maintained)
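The check in steps 2-5 can be sketched as a single predicate. Simplifications: only equality matchers, and no guard against a source alert inhibiting itself (real Alertmanager excludes that case).

```python
def is_inhibited(target, active_alerts, rule):
    """True if a firing source alert mutes the target: the source and
    target matchers apply and all `equal` labels line up (sketch)."""
    def matches(labels, matcher):
        return all(labels.get(k) == v for k, v in matcher.items())
    if not matches(target, rule["target_match"]):
        return False
    for source in active_alerts:
        if matches(source, rule["source_match"]) and all(
            source.get(k) == target.get(k) for k in rule["equal"]
        ):
            return True
    return False

# The inhibit rule from the example above:
rule = {"source_match": {"severity": "critical"},
        "target_match": {"severity": "warning"},
        "equal": ["alertname", "cluster"]}
```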

8. Silencing

8.1 Creating Silences

A silence temporarily suppresses notifications for Alerts matching specific conditions:

Silence configuration:

- matchers:     Label matchers (regex supported)
- startsAt:     Silence start time
- endsAt:       Silence end time
- createdBy:    Creator
- comment:      Reason

Example:
  matchers:
    alertname = HighCPU
    cluster = prod
  startsAt: 2026-03-20T10:00:00Z
  endsAt:   2026-03-20T14:00:00Z
  comment: "Planned maintenance"

8.2 Silence Processing

Silence matching:

1. Runs at the Silence stage in Notification Pipeline
2. Iterates over active silence list
3. Evaluates each silence's matchers against Alert labels
4. If all matchers match, the Alert is silenced
5. Silenced Alerts are displayed in Alertmanager UI
6. Notifications automatically resume after silence expires
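The matching in steps 2-4 can be sketched as below. Matchers are modeled as `(name, pattern, is_regex)` triples, a simplification of the real API objects; times are plain comparable values.

```python
import re

def is_silenced(labels, silences, now):
    """True if some active silence has all of its matchers
    matching the alert's labels (sketch)."""
    for s in silences:
        if not (s["startsAt"] <= now < s["endsAt"]):
            continue  # silence not active right now
        ok = True
        for name, pattern, is_regex in s["matchers"]:
            value = labels.get(name, "")
            if is_regex:
                ok = ok and re.fullmatch(pattern, value) is not None
            else:
                ok = ok and value == pattern
        if ok:
            return True
    return False
```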

9. Deduplication

9.1 Deduplication Mechanism

Deduplication scenarios:

1. Repeated receipt of the same Alert:
   - Prometheus resends firing Alerts every evaluation cycle
   - Alertmanager prevents duplicate notifications for the same Alert
   - Delivery records stored in Notification Log

2. HA cluster deduplication:
   - Multiple Prometheus instances send the same Alert
   - Alertmanager cluster shares delivery state via gossip
   - Only one Alertmanager sends the actual notification

9.2 Notification Log

Notification Log:

- Stores notification delivery records for each Alert group
- Key: group fingerprint + receiver
- Value: last delivery time, delivered Alert fingerprint list
- Compared with repeat_interval to decide resend
- Synchronized via gossip protocol in HA cluster
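The resend decision against the notification log can be sketched as follows. The log is modeled as a plain dict keyed by (group key, receiver); field names are illustrative, not the real wire format.

```python
def needs_notify(nflog, group_key, receiver, firing_fps, now, repeat_interval):
    """Consult the notification log (sketch): send if this
    (group, receiver) pair was never notified, if the set of firing
    alert fingerprints changed, or if repeat_interval has elapsed."""
    entry = nflog.get((group_key, receiver))
    if entry is None:
        return True
    if entry["firing"] != firing_fps:
        return True
    return now - entry["sent_at"] >= repeat_interval
```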

10. Alertmanager HA Cluster

10.1 Cluster Architecture

Alertmanager HA architecture:

Prometheus 1 --+--> [Alertmanager 1] <--+
Prometheus 2 --+--> [Alertmanager 2] <--+-- gossip protocol
               +--> [Alertmanager 3] <--+    (Memberlist)
                          |
                          v
              Slack / PagerDuty / Webhook ...

10.2 Gossip Protocol

Alertmanager uses a Memberlist (HashiCorp)-based gossip protocol:

Data synchronized via gossip:

1. Notification Log:
   - Which Alert groups have been notified
   - Prevents duplicate delivery if another instance already sent

2. Silence state:
   - Created/modified/deleted silence information
   - Same silences applied across all instances

Synchronization mechanism:
  - Periodically propagates state to random peers
  - Full state sync when new instance joins
  - Eventual consistency model

10.3 HA Configuration

# Starting Alertmanager cluster
alertmanager --config.file=alertmanager.yml \
  --cluster.listen-address=0.0.0.0:9094 \
  --cluster.peer=alertmanager-1:9094 \
  --cluster.peer=alertmanager-2:9094

HA behavior:

1. All instances receive Alerts
2. All instances perform routing/grouping independently
3. Dedup stage runs in Notification Pipeline
4. Check Notification Log via gossip
5. Only send actual notification if not yet delivered
6. Record delivery result in Notification Log and propagate via gossip

11. Notification Delivery

11.1 Receiver Types

Built-in receivers:
  - email:      SMTP email
  - slack:      Slack webhook
  - pagerduty:  PagerDuty Events API
  - opsgenie:   OpsGenie API
  - victorops:  VictorOps API
  - webhook:    Generic HTTP webhook
  - wechat:     WeChat
  - pushover:   Pushover
  - sns:        AWS SNS
  - telegram:   Telegram Bot API
  - webex:      Webex Teams
  - msteams:    Microsoft Teams

11.2 Template System

Notification templates:

Alertmanager uses Go templates to compose notification content.

Available data:
  .Status:       firing or resolved
  .Alerts:       Alert list
  .GroupLabels:  Labels used for grouping
  .CommonLabels: Labels common to all Alerts
  .ExternalURL:  Alertmanager external URL

Data available per Alert:
  .Labels:       Alert labels
  .Annotations:  Alert annotations
  .StartsAt:     Start time
  .EndsAt:       End time
  .GeneratorURL: Prometheus link
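As an illustration, a minimal custom template using the data above (the template name `slack.custom.text` and the layout are hypothetical; `toUpper` is one of Alertmanager's built-in template functions and `len`/`range` are standard Go template actions):

```
{{ define "slack.custom.text" }}
{{ .Status | toUpper }}: {{ .GroupLabels.alertname }} ({{ len .Alerts }} alerts)
{{ range .Alerts }}
- {{ .Labels.instance }}: {{ .Annotations.summary }} (since {{ .StartsAt }})
{{ end }}
{{ end }}
```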

11.3 Retry Mechanism

Notification delivery retry:

1. On notification delivery failure
2. Retry with exponential backoff
   Initial interval: 1s, max interval: 5m
3. After max retries exhausted, log the failure
4. Try again at next repeat_interval
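The backoff schedule described above can be sketched as follows (the real client also applies jitter, which is omitted here):

```python
def backoff_schedule(max_retries, initial=1.0, cap=300.0):
    """Exponential backoff intervals in seconds: start at `initial`,
    double each attempt, cap at `cap` (5 minutes)."""
    delay, out = initial, []
    for _ in range(max_retries):
        out.append(delay)
        delay = min(delay * 2, cap)
    return out

# backoff_schedule(10)
# == [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0, 300.0]
```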

12. Monitoring and Debugging

12.1 Prometheus Server Metrics

Rule evaluation metrics:
  prometheus_rule_evaluations_total:        Total rule evaluation count
  prometheus_rule_evaluation_failures_total: Rule evaluation failure count
  prometheus_rule_group_duration_seconds:    Group evaluation duration
  prometheus_rule_group_iterations_missed_total: Skipped evaluation count

Alert metrics:
  prometheus_alerts:                         Current active Alerts (by state)
  prometheus_notifications_total:             Alertmanager delivery count
  prometheus_notifications_errors_total:      Delivery failure count
  prometheus_notifications_dropped_total:     Dropped notification count
  prometheus_notifications_queue_length:      Delivery queue length

12.2 Alertmanager Metrics

Alertmanager metrics:
  alertmanager_alerts:                       Current active Alerts
  alertmanager_alerts_received_total:         Received Alert count
  alertmanager_alerts_invalid_total:          Invalid Alert count
  alertmanager_notifications_total:           Sent notification count (by receiver)
  alertmanager_notifications_failed_total:    Delivery failure count
  alertmanager_silences:                     Active silence count
  alertmanager_cluster_members:              Cluster member count
  alertmanager_cluster_messages_received_total: Gossip message count

12.3 Common Troubleshooting

1. Alert not firing:
   - Manually execute PromQL expression to verify results
   - Verify for duration has elapsed sufficiently
   - Check evaluation_interval setting

2. Notifications not received:
   - Check Alertmanager connectivity (Prometheus /targets)
   - Verify routing tree matching (amtool config routes test)
   - Check silences (Alertmanager UI)
   - Review inhibition rules

3. Duplicate notifications:
   - Verify group_by settings
   - Check repeat_interval
   - Verify HA cluster gossip health

13. Summary

The Prometheus alerting pipeline consists of the Rule Manager's periodic evaluation, the Alert state machine's precise state management, and Alertmanager's multi-stage processing pipeline. Grouping mitigates notification storms, inhibition and silencing suppress unnecessary notifications, and gossip-based HA clustering ensures high availability.