Author: Youngju Kim (@fjvbn20031)
- 1. Overview
- 2. Main Server Components
- 3. Goroutine Model and Lifecycle Management
- 4. Configuration Reload Mechanism
- 5. Target Discovery Pipeline
- 6. HTTP Client Configuration and TLS
- 7. Internal Metrics
- 8. Prometheus Operator Integration
- 9. Performance Optimization Tips
- 10. Summary
1. Overview
Prometheus is a CNCF graduated project and the de facto standard monitoring system for cloud-native environments. This post analyzes the Prometheus server's internal architecture at the source code level, examining how each component interacts and how lifecycles are managed.
The Prometheus main server is written in Go, with cmd/prometheus/main.go as the entry point. The server consists of multiple independent components, each running as a separate goroutine group.
2. Main Server Components
2.1 Overall Architecture
The Prometheus server consists of these core components:
             +---------------+
             |  Web UI/API   |
             +-------+-------+
                     |
      +--------------+--------------+
      |              |              |
+-----v------+ +-----v------+ +-----v------+
|   Scrape   | |    Rule    | |  Notifier  |
|   Manager  | |   Manager  | | (Alertmgr) |
+-----+------+ +-----+------+ +------------+
      |              |
      +------+-------+
             |
     +-------v-------+
     |     TSDB      |
     +---------------+

     +---------------+
     |   Discovery   |----> delivers targets to the Scrape Manager
     |    Manager    |
     +---------------+
2.2 Component Roles
Scrape Manager: The core engine that collects metrics from targets. It receives target lists from the Discovery Manager and manages scraping loops for each target.
TSDB (Time Series Database): Handles storage and querying of time series data. A hierarchical storage system consisting of WAL, Head Block, and Persistent Blocks.
Rule Manager: Periodically evaluates Recording Rules and Alerting Rules. Evaluation results are written to TSDB or forwarded to the Notifier.
Notifier: Delivers active alerts to Alertmanager.
Web UI/API: HTTP server providing PromQL query endpoints, management APIs, and the built-in UI.
Discovery Manager: Discovers targets from various service discovery sources and delivers them to the Scrape Manager.
3. Goroutine Model and Lifecycle Management
3.1 Actor/Run Group Pattern
Prometheus uses the Run Group pattern from the oklog/run library to manage component lifecycles. Each component is registered with two functions:
// Each component registers as an (execute, interrupt) pair
g.Add(
    func() error {
        // execute: run the component's main logic
        return component.Run(ctx)
    },
    func(err error) {
        // interrupt: cleanup on shutdown
        component.Stop()
    },
)
The core behavior of the Run Group:
- All component execute functions start simultaneously
- When any execute function returns an error or exits, all other components' interrupt functions are called
- It waits until all execute functions have returned
This pattern naturally implements graceful shutdown. For example, when SIGTERM is received, the signal handler's execute returns, causing all other components to shut down sequentially.
3.2 Main Goroutine Groups
main goroutine
|
+-- Signal Handler goroutine
+-- Scrape Discovery Manager goroutine
+-- Notify Discovery Manager goroutine
+-- Scrape Manager goroutine
+-- Rule Manager goroutine
+-- TSDB goroutine
+-- Web Handler goroutine
+-- Notifier goroutine
+-- Remote Storage goroutine
Each goroutine runs independently, communicating through channels or synchronization primitives.
3.3 Component Initialization Order
Initialization follows the dependency order:
- TSDB initialization: Storage must be ready first
- Discovery Manager start: Target discovery begins
- Scrape Manager start: Receives targets from Discovery Manager and begins scraping
- Rule Manager start: Begins rule evaluation
- Notifier start: Prepares for alert delivery
- Web Handler start: HTTP server starts last to accept requests
4. Configuration Reload Mechanism
4.1 Reload Triggers
Prometheus supports two methods for configuration reload:
SIGHUP signal: Reload via OS signal
kill -HUP $(pidof prometheus)
HTTP API: Available when --web.enable-lifecycle flag is enabled
curl -X POST http://localhost:9090/-/reload
4.2 Reload Process
Configuration reload is handled in the reloadConfig function. The complete flow:
1. Receive SIGHUP or /-/reload call
2. Parse prometheus.yml file
3. Validate configuration
4. Propagate new configuration to each component:
a. Update Remote Storage configuration
b. Update Notifier configuration
c. Update Discovery Manager configuration
d. Update Scrape Manager configuration
e. Update Rule Manager configuration
f. Update Web Handler configuration
5. Update reload success/failure metrics
4.3 Per-Component Reload Behavior
Scrape Manager: Computes the diff between existing scrape pools and the new configuration. Only changed jobs are recreated while unchanged jobs are maintained, minimizing unnecessary disruption.
Rule Manager: Recreates all rule groups. Restores the state of existing rules (alert states, etc.) to the new rules.
Discovery Manager: Applies new service discovery configuration. Reuses unchanged providers from the existing configuration.
Notifier: Updates Alertmanager endpoint configuration.
4.4 Reload Failure Handling
If configuration file parsing or validation fails, the reload is aborted and the existing configuration is preserved. Reload failures can be monitored via the prometheus_config_last_reload_successful metric:
prometheus_config_last_reload_successful == 0
5. Target Discovery Pipeline
5.1 Complete Flow
The full pipeline from target discovery to scraping:
Service Discovery Provider
|
v
Discovery Manager (target groups)
|
v
Scrape Manager (apply relabel_configs)
|
v
Scrape Pool (per job)
|
v
Scrape Loop (per target)
|
v
HTTP GET /metrics
|
v
TSDB Appender
5.2 Discovery Manager
The Discovery Manager unifies multiple service discovery providers:
kubernetes_sd_config --|
consul_sd_config --|--> Discovery Manager --> Target Groups Channel
file_sd_config --|
static_config --|
Each provider runs in a separate goroutine, delivering target group changes through channels. The Discovery Manager collects these and forwards them to the Scrape Manager.
Provider updates are buffered for a configurable duration (default 5 seconds) before being delivered in batch. This prevents frequent changes from overloading the Scrape Manager.
5.3 Scrape Manager
When the Scrape Manager receives target groups from the Discovery Manager:
- Finds the corresponding Scrape Pool for each scrape job
- Applies relabel_configs to new targets
- Keeps or drops targets based on relabeling results
- Creates Scrape Loops for new targets
- Terminates Scrape Loops for disappeared targets
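The diff in the steps above boils down to a set comparison. This is a simplification of what the real scrape pool sync does; `syncTargets` is a hypothetical name:

```go
package main

import "fmt"

// syncTargets computes which scrape loops to start and which to stop
// when a new target list arrives from discovery.
func syncTargets(active, discovered []string) (start, stop []string) {
	oldSet := map[string]bool{}
	for _, t := range active {
		oldSet[t] = true
	}
	newSet := map[string]bool{}
	for _, t := range discovered {
		newSet[t] = true
		if !oldSet[t] {
			start = append(start, t) // new target: create a scrape loop
		}
	}
	for _, t := range active {
		if !newSet[t] {
			stop = append(stop, t) // target disappeared: stop its loop
		}
	}
	return start, stop
}

func main() {
	start, stop := syncTargets(
		[]string{"10.0.0.1:9100", "10.0.0.2:9100"},
		[]string{"10.0.0.2:9100", "10.0.0.3:9100"},
	)
	fmt.Println("start:", start, "stop:", stop)
	// start: [10.0.0.3:9100] stop: [10.0.0.1:9100]
}
```

Unchanged targets appear in neither list, which is what keeps a reload or discovery update from disturbing running scrape loops.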
5.4 Scrape Pool
A Scrape Pool is a group of targets belonging to the same scrape job. Each Pool manages:
- Active target list
- Dropped target list (for debugging)
- Shared scraping configuration (interval, timeout, metrics path, etc.)
5.5 Scrape Loop
One Scrape Loop runs as a goroutine for each target:
Scrape Loop Cycle:
1. Wait for scrape_interval
2. Send HTTP GET request (within scrape_timeout)
3. Parse response (Prometheus Exposition Format)
4. Apply metric_relabel_configs
5. Write samples to TSDB
6. Update up metric (1=success, 0=failure)
7. Update internal metrics (scrape_duration_seconds, etc.)
8. Repeat from step 1
6. HTTP Client Configuration and TLS
6.1 HTTP Client Construction
The HTTP client for each scrape target is configured according to the scrape job settings:
scrape_configs:
  - job_name: 'secure-targets'
    scheme: https
    tls_config:
      ca_file: /path/to/ca.pem
      cert_file: /path/to/cert.pem
      key_file: /path/to/key.pem
      insecure_skip_verify: false
    basic_auth:
      username: admin
      password_file: /path/to/password
    authorization:
      type: Bearer
      credentials_file: /path/to/token
Note that basic_auth and authorization are mutually exclusive within a single scrape job; a real configuration would use one or the other, not both as shown here.
6.2 TLS Configuration
Prometheus supports various TLS configurations:
- CA Certificate: For server certificate validation
- Client Certificate: For mTLS (mutual TLS) authentication
- Server Name Verification: SNI configuration via the server_name field
- Automatic Certificate Renewal: File-based credentials are periodically re-read
6.3 Authentication Mechanisms
Prometheus supports multiple authentication methods:
- Basic Auth: Username/password based
- Bearer Token: OAuth2 tokens, etc.
- OAuth2 Client: Client credentials flow support
- AWS SigV4: For AWS service access
Credential files (password_file, credentials_file, etc.) are re-read on each scrape cycle, so credentials can be updated without restarting Prometheus.
7. Internal Metrics
7.1 Key Self-Monitoring Metrics
Prometheus exposes various internal metrics for self-monitoring:
Scraping related:
- prometheus_target_scrape_pool_targets: Number of targets in each scrape pool
- prometheus_target_scrapes_exceeded_sample_limit_total: Sample limit exceeded count
- scrape_duration_seconds: Scrape duration
- scrape_samples_scraped: Number of samples scraped
TSDB related:
- prometheus_tsdb_head_series: Number of series in the Head Block
- prometheus_tsdb_head_samples_appended_total: Number of samples appended
- prometheus_tsdb_compactions_total: Number of compactions performed
- prometheus_tsdb_wal_corruptions_total: Number of WAL corruptions
Rule evaluation related:
- prometheus_rule_evaluation_duration_seconds: Rule evaluation duration
- prometheus_rule_group_last_duration_seconds: Last rule group evaluation time
Configuration related:
- prometheus_config_last_reload_successful: Whether the last reload succeeded
- prometheus_config_last_reload_success_timestamp_seconds: Timestamp of the last successful reload
7.2 Performance Tuning Indicators
Key indicators to monitor in production:
1. Scrape lag: scrape_duration_seconds > scrape_interval
- If scraping takes longer than the configured interval, data gaps may occur
2. Series cardinality: prometheus_tsdb_head_series
- Rapid increase signals cardinality explosion
3. Rule evaluation lag: prometheus_rule_group_last_duration_seconds > evaluation_interval
- Rule evaluation exceeding the interval causes alert delays
4. WAL segment growth: prometheus_tsdb_wal_segment_current
- This tracks the index of the WAL segment currently being written; if it keeps climbing while old segments are not being truncated at checkpoints, head compaction or checkpointing may be stalled
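As a starting point, the indicators above can be turned into alerting rules. The thresholds below are illustrative, not recommendations:

```yaml
groups:
  - name: prometheus-self-monitoring
    rules:
      - alert: PrometheusConfigReloadFailed
        expr: prometheus_config_last_reload_successful == 0
        for: 5m
        annotations:
          summary: "Prometheus configuration reload has failed"
      - alert: PrometheusHighSeriesCardinality
        expr: prometheus_tsdb_head_series > 2e6   # threshold is illustrative
        for: 15m
        annotations:
          summary: "Head series count is unusually high"
```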
8. Prometheus Operator Integration
8.1 Prometheus Operator Architecture
In Kubernetes environments, Prometheus is typically deployed via Prometheus Operator:
Prometheus Operator
|
+-- watches Prometheus CRD
| |
| +-- generates prometheus.yml
| +-- manages StatefulSet
|
+-- watches ServiceMonitor CRD
| |
| +-- converts to scrape_configs
|
+-- watches PodMonitor CRD
| |
| +-- converts to scrape_configs
|
+-- watches PrometheusRule CRD
|
+-- converts to rule files
8.2 Config Reloader Sidecar
Prometheus Operator uses a config-reloader sidecar container to automatically detect configuration changes and send reload requests to Prometheus:
1. Operator updates ConfigMap/Secret
2. Config Reloader detects mounted file changes
3. Sends /-/reload POST request
4. Prometheus applies new configuration
This sidecar monitors file changes using inotify (Linux), with periodic polling as a fallback.
9. Performance Optimization Tips
9.1 Scraping Optimization
- scrape_interval tuning: Set intervals matching metric change velocity. Too short increases load, too long reduces precision
- sample_limit configuration: Limit maximum samples per target to prevent cardinality explosion
- metric_relabel_configs: Drop unnecessary metrics immediately after scraping
9.2 TSDB Optimization
- --storage.tsdb.min-block-duration: Minimum span of data kept in the Head Block before it is compacted into a persistent block (default 2h)
- --storage.tsdb.max-block-duration: Maximum time range a compacted block may cover (default: 10% of the retention period, capped at 31 days)
- --storage.tsdb.wal-compression: Enable WAL compression to reduce disk I/O
9.3 Query Optimization
- --query.max-concurrency: Limit concurrent queries (default 20)
- --query.timeout: Set query timeout (default 2 minutes)
- --query.max-samples: Limit maximum samples per query (default 50 million)
10. Summary
The Prometheus server's internal architecture demonstrates a clean design leveraging Go's concurrency model. The core design patterns are lifecycle management through Run Groups, channel-based inter-component communication, and pipeline-style data flow.
In the next post, we will analyze the TSDB internals in greater depth, examining WAL segment structure, chunk encoding, and block compaction algorithms at the source code level.