[Prometheus] Architecture Internals: Source Code Level Deep Dive

1. Overview
2. Main Server Components
- 2.1 Overall Architecture
- 2.2 Component Roles
3. Goroutine Model and Lifecycle Management
4. Configuration Reload Mechanism
5. Target Discovery Pipeline
6. HTTP Client Configuration and TLS
7. Internal Metrics
- 7.1 Key Self-Monitoring Metrics
- 7.2 Performance Tuning Indicators
8. Prometheus Operator Integration
- 8.1 Prometheus Operator Architecture
- 8.2 Config Reloader Sidecar
9. Performance Optimization Tips
10. Summary

1. Overview

Prometheus is a CNCF graduated project and the de facto standard monitoring system for cloud-native environments. This post analyzes the Prometheus server's internal architecture at the source code level, examining how each component interacts and how lifecycles are managed.

The Prometheus main server is written in Go, with cmd/prometheus/main.go as the entry point. The server consists of multiple independent components, each running as a separate goroutine group.

2. Main Server Components

2.1 Overall Architecture

The Prometheus server consists of these core components:

                    +-----------------+
                    |   Web UI/API    |
                    +--------+--------+
                             |
              +--------------+--------------+
              |              |              |
     +--------v---+  +------v------+  +----v-------+
     | Scrape     |  | Rule        |  | Notifier   |
     | Manager    |  | Manager     |  | (Alertmgr) |
     +--------+---+  +------+------+  +----+-------+
              |              |              |
              +--------------+--------------+
                             |
                    +--------v--------+
                    |     TSDB        |
                    +-----------------+
                             |
                    +--------v--------+
                    | Discovery       |
                    | Manager         |
                    +-----------------+

2.2 Component Roles

Scrape Manager: The core engine that collects metrics from targets. It receives target lists from the Discovery Manager and manages scraping loops for each target.

TSDB (Time Series Database): Handles storage and querying of time series data. A hierarchical storage system consisting of WAL, Head Block, and Persistent Blocks.

Rule Manager: Periodically evaluates Recording Rules and Alerting Rules. Evaluation results are written to TSDB or forwarded to the Notifier.

Notifier: Delivers active alerts to Alertmanager.

Web UI/API: HTTP server providing PromQL query endpoints, management APIs, and the built-in UI.

Discovery Manager: Discovers targets from various service discovery sources and delivers them to the Scrape Manager.

3. Goroutine Model and Lifecycle Management

3.1 Actor/Run Group Pattern

Prometheus uses the Run Group pattern from the oklog/run library to manage component lifecycles. Each component is registered with two functions:

// Each component registers as an (execute, interrupt) pair
g.Add(
    func() error {
        // execute: run the component's main logic
        return component.Run(ctx)
    },
    func(err error) {
        // interrupt: cleanup on shutdown
        component.Stop()
    },
)

The core behavior of the Run Group:

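All execute functions are started concurrently, each in its own goroutine
When the first execute function returns (with or without an error), the interrupt function of every actor is called, including the one that already returned
Run waits until all execute functions have returned, then returns the error from the first one that exited

This pattern gives graceful shutdown almost for free. For example, when SIGTERM is received, the signal handler's execute function returns; the group then invokes each interrupt function in registration order and waits for the remaining components to finish.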
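
The run group semantics described above can be sketched with the standard library alone. This is a simplified stand-in written for illustration, not the oklog/run source; the `group`, `actor`, and `runDemo` names are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// actor pairs a blocking execute function with an interrupt function whose
// job is to make execute return.
type actor struct {
	execute   func() error
	interrupt func(error)
}

// group is a simplified stand-in for a run group (hypothetical type).
type group struct{ actors []actor }

func (g *group) Add(execute func() error, interrupt func(error)) {
	g.actors = append(g.actors, actor{execute, interrupt})
}

// Run starts every actor, waits for the first to return, interrupts all of
// them, waits for the rest, and returns the first actor's error.
func (g *group) Run() error {
	if len(g.actors) == 0 {
		return nil
	}
	errs := make(chan error, len(g.actors))
	for _, a := range g.actors {
		go func(a actor) { errs <- a.execute() }(a)
	}
	first := <-errs
	for _, a := range g.actors {
		a.interrupt(first)
	}
	for i := 1; i < len(g.actors); i++ {
		<-errs
	}
	return first
}

// runDemo registers a "signal handler" that returns at once and a "worker"
// that blocks until it is interrupted.
func runDemo() (workerStopped bool, err error) {
	var g group
	g.Add(
		func() error { return errors.New("received SIGTERM") },
		func(error) {},
	)
	stop := make(chan struct{})
	g.Add(
		func() error {
			<-stop
			workerStopped = true
			return nil
		},
		func(error) { close(stop) },
	)
	err = g.Run()
	return workerStopped, err
}

func main() {
	stopped, err := runDemo()
	fmt.Println(stopped, err) // true received SIGTERM
}
```

The buffered error channel is what lets Run collect every actor's result without any actor blocking on send after the group has stopped listening.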

3.2 Main Goroutine Groups

main goroutine
  |
  +-- Signal Handler goroutine
  +-- Scrape Discovery Manager goroutine
  +-- Notify Discovery Manager goroutine
  +-- Scrape Manager goroutine
  +-- Rule Manager goroutine
  +-- TSDB goroutine
  +-- Web Handler goroutine
  +-- Notifier goroutine
  +-- Remote Storage goroutine

Each goroutine runs independently, communicating through channels or synchronization primitives.

3.3 Component Initialization Order

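All actors are started together by the run group, so the dependency order is enforced by readiness gates rather than by start order:

TSDB initialization: Storage must be opened first; the initial configuration is loaded only after the database is ready
Discovery Manager start: Target discovery begins
Scrape Manager start: Waits for the initial configuration, then receives targets from the Discovery Manager and begins scraping
Rule Manager start: Begins rule evaluation once the initial configuration is loaded
Notifier start: Prepares for alert delivery
Web Handler start: The HTTP listener comes up early but answers readiness checks with 503 until storage and configuration are ready; only then does it serve normal requests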
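
One way to picture how an ordering like this can hold even when goroutines start concurrently is a one-shot readiness gate: dependants block on a channel that is closed once storage is ready. The sketch below is an illustration under that assumption; `gate` and `startupOrder` are hypothetical names, not identifiers from the Prometheus source.

```go
package main

import (
	"fmt"
	"sync"
)

// gate is a one-shot barrier: Wait blocks until Open has been called.
type gate struct {
	once sync.Once
	ch   chan struct{}
}

func newGate() *gate { return &gate{ch: make(chan struct{})} }

// Open releases all waiters; calling it more than once is harmless.
func (g *gate) Open() { g.once.Do(func() { close(g.ch) }) }

// Wait blocks until the gate has been opened.
func (g *gate) Wait() { <-g.ch }

// startupOrder starts "storage" and two dependants concurrently and returns
// the order in which they began their real work.
func startupOrder() []string {
	var (
		mu    sync.Mutex
		order []string
		wg    sync.WaitGroup
	)
	record := func(name string) {
		mu.Lock()
		order = append(order, name)
		mu.Unlock()
	}
	ready := newGate()

	for _, name := range []string{"scrape", "rules"} {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			ready.Wait() // blocked until storage is ready
			record(name)
		}(name)
	}
	wg.Add(1)
	go func() {
		defer wg.Done()
		record("storage") // stands in for opening the database
		ready.Open()
	}()
	wg.Wait()
	return order
}

func main() {
	order := startupOrder()
	fmt.Println(order[0]) // storage
}
```

Only the first entry is deterministic; the two dependants may begin in either order once the gate opens.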

4. Configuration Reload Mechanism

4.1 Reload Triggers

Prometheus supports two methods for configuration reload:

SIGHUP signal: Reload via OS signal

kill -HUP $(pidof prometheus)

HTTP API: Available when --web.enable-lifecycle flag is enabled

curl -X POST http://localhost:9090/-/reload

4.2 Reload Process

Configuration reload is handled in the reloadConfig function. The complete flow:

1. Receive SIGHUP or /-/reload call
2. Parse prometheus.yml file
3. Validate configuration
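4. Propagate new configuration to each component (order as in recent releases; it may differ between versions):
   a. Update Remote Storage configuration
   b. Update Web Handler configuration
   c. Update Scrape Manager configuration
   d. Update Discovery Manager configuration (after the Scrape Manager, so that new target lists are processed with the latest scrape configuration)
   e. Update Notifier configuration
   f. Update Rule Manager configuration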
5. Update reload success/failure metrics
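
The flow above can be sketched as a small function: parse first, touch no component if parsing fails, otherwise try every component and record the overall outcome. The types below (`config`, `reloader`, `reloadStatus`) are hypothetical reductions for illustration, and the decision to keep applying after one component fails is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// config is a hypothetical, heavily reduced stand-in for the parsed file.
type config struct{ ScrapeJobs []string }

// reloader names one component and the function that applies a config to it.
type reloader struct {
	name  string
	apply func(*config) error
}

// reloadStatus mirrors the idea behind the "last reload successful" gauge.
type reloadStatus struct{ lastSuccessful bool }

// reload parses first; on a parse error no component is touched. Otherwise
// every reloader is attempted in order and failures are collected.
func reload(parse func() (*config, error), rls []reloader, st *reloadStatus) error {
	cfg, err := parse()
	if err != nil {
		st.lastSuccessful = false
		return fmt.Errorf("load config: %w", err)
	}
	var failed []string
	for _, rl := range rls {
		if err := rl.apply(cfg); err != nil {
			failed = append(failed, rl.name)
		}
	}
	st.lastSuccessful = len(failed) == 0
	if len(failed) > 0 {
		return fmt.Errorf("failed to apply configuration to: %v", failed)
	}
	return nil
}

// demoReload runs one failing and one successful reload and reports how many
// times the single component was asked to apply a configuration.
func demoReload() (applied int, afterBad, afterGood bool) {
	rls := []reloader{{name: "scrape", apply: func(*config) error { applied++; return nil }}}
	st := &reloadStatus{lastSuccessful: true}

	_ = reload(func() (*config, error) { return nil, errors.New("bad yaml") }, rls, st)
	afterBad = st.lastSuccessful

	_ = reload(func() (*config, error) { return &config{}, nil }, rls, st)
	afterGood = st.lastSuccessful
	return applied, afterBad, afterGood
}

func main() {
	fmt.Println(demoReload()) // 1 false true
}
```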

4.3 Per-Component Reload Behavior

Scrape Manager: Computes the diff between existing scrape pools and the new configuration. Only changed jobs are recreated while unchanged jobs are maintained, minimizing unnecessary disruption.

Rule Manager: Reloads the rule files. Groups whose definition is unchanged keep running; changed groups are replaced by new ones that copy state (active alerts, etc.) from their predecessors.

Discovery Manager: Applies new service discovery configuration. Reuses unchanged providers from the existing configuration.

Notifier: Updates Alertmanager endpoint configuration.

4.4 Reload Failure Handling

If configuration file parsing or validation fails, the reload is aborted and the existing configuration is preserved. Reload failures can be monitored via the prometheus_config_last_reload_successful metric:

prometheus_config_last_reload_successful == 0

5. Target Discovery Pipeline

5.1 Complete Flow

The full pipeline from target discovery to scraping:

Service Discovery Provider
        |
        v
Discovery Manager (target groups)
        |
        v
Scrape Manager (apply relabel_configs)
        |
        v
Scrape Pool (per job)
        |
        v
Scrape Loop (per target)
        |
        v
HTTP GET /metrics
        |
        v
TSDB Appender

5.2 Discovery Manager

The Discovery Manager unifies multiple service discovery providers:

kubernetes_sd_configs --|
consul_sd_configs     --|-->  Discovery Manager  -->  Target Groups Channel
file_sd_configs       --|
static_configs        --|

Each provider runs in a separate goroutine, delivering target group changes through channels. The Discovery Manager collects these and forwards them to the Scrape Manager.

Provider updates are buffered for a configurable duration (default 5 seconds) before being delivered in batch. This prevents frequent changes from overloading the Scrape Manager.
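
The buffering idea can be sketched without any timers: keep only the latest target list per provider, mark the state dirty, and let a periodic caller flush it. The `batcher` type below is a hypothetical illustration of that technique, not the Discovery Manager's actual data structure, and it is not safe for concurrent use as written.

```go
package main

import (
	"fmt"
	"sort"
)

// batcher holds the latest target list per provider plus a dirty flag.
type batcher struct {
	latest map[string][]string // provider name -> most recent targets
	dirty  bool
}

func newBatcher() *batcher { return &batcher{latest: map[string][]string{}} }

// Update overwrites any earlier list from the same provider.
func (b *batcher) Update(provider string, targets []string) {
	b.latest[provider] = targets
	b.dirty = true
}

// Flush is meant to be called from a ticker (every 5 seconds, say). It
// returns the merged target list only if something changed since last time.
func (b *batcher) Flush() ([]string, bool) {
	if !b.dirty {
		return nil, false
	}
	b.dirty = false
	var all []string
	for _, ts := range b.latest {
		all = append(all, ts...)
	}
	sort.Strings(all)
	return all, true
}

// demoBatch sends three rapid updates and then flushes twice.
func demoBatch() (first []string, sentFirst, sentSecond bool) {
	b := newBatcher()
	b.Update("file", []string{"10.0.0.1:9100"})
	b.Update("file", []string{"10.0.0.1:9100", "10.0.0.2:9100"})
	b.Update("static", []string{"localhost:9090"})
	first, sentFirst = b.Flush()
	_, sentSecond = b.Flush()
	return first, sentFirst, sentSecond
}

func main() {
	targets, a, b := demoBatch()
	fmt.Println(targets, a, b) // [10.0.0.1:9100 10.0.0.2:9100 localhost:9090] true false
}
```

Three provider updates collapse into a single downstream delivery, and an idle interval produces none.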

5.3 Scrape Manager

When the Scrape Manager receives target groups from the Discovery Manager:

Finds the corresponding Scrape Pool for each scrape job
Applies relabel_configs to new targets
Keeps or drops targets based on relabeling results
Creates Scrape Loops for new targets
Terminates Scrape Loops for disappeared targets
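
The steps above amount to a set difference between the targets a pool is already scraping and the targets that survive relabeling. A minimal sketch, assuming targets are identified by their `__address__` label and that relabeling is reduced to a single keep predicate (both simplifications made for this example):

```go
package main

import (
	"fmt"
	"sort"
)

type labels map[string]string

// syncTargets updates active in place and reports which targets would get a
// new scrape loop and which loops would be stopped.
func syncTargets(active map[string]struct{}, discovered []labels, keep func(labels) bool) (started, stopped []string) {
	wanted := map[string]struct{}{}
	for _, l := range discovered {
		if keep(l) { // stands in for applying relabel_configs
			wanted[l["__address__"]] = struct{}{}
		}
	}
	for addr := range wanted {
		if _, ok := active[addr]; !ok {
			active[addr] = struct{}{} // a real pool would start a loop here
			started = append(started, addr)
		}
	}
	for addr := range active {
		if _, ok := wanted[addr]; !ok {
			delete(active, addr) // a real pool would stop the loop here
			stopped = append(stopped, addr)
		}
	}
	sort.Strings(started)
	sort.Strings(stopped)
	return started, stopped
}

// demoSync applies two successive discovery updates to one pool.
func demoSync() (started1, stopped1, started2, stopped2 []string) {
	active := map[string]struct{}{"a:9100": {}}
	keep := func(l labels) bool { return l["env"] != "dev" }

	started1, stopped1 = syncTargets(active, []labels{
		{"__address__": "a:9100", "env": "prod"},
		{"__address__": "b:9100", "env": "prod"},
		{"__address__": "c:9100", "env": "dev"}, // dropped by the predicate
	}, keep)
	started2, stopped2 = syncTargets(active, []labels{
		{"__address__": "b:9100", "env": "prod"},
	}, keep)
	return started1, stopped1, started2, stopped2
}

func main() {
	s1, x1, s2, x2 := demoSync()
	fmt.Println(s1, x1, s2, x2) // [b:9100] [] [] [a:9100]
}
```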

5.4 Scrape Pool

A Scrape Pool is a group of targets belonging to the same scrape job. Each Pool manages:

Active target list
Dropped target list (for debugging)
Shared scraping configuration (interval, timeout, metrics path, etc.)

5.5 Scrape Loop

One Scrape Loop runs as a goroutine for each target:

Scrape Loop Cycle:
1. Wait for scrape_interval
2. Send HTTP GET request (within scrape_timeout)
3. Parse response (Prometheus Exposition Format)
4. Apply metric_relabel_configs
5. Write samples to TSDB
6. Update up metric (1=success, 0=failure)
7. Update internal metrics (scrape_duration_seconds, etc.)
8. Repeat from step 1
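
A single iteration of this cycle (steps 2, 3, and 6) can be sketched as follows. The parser is deliberately naive: it only understands unlabelled `name value` lines, which is an assumption of this example and far short of the real exposition format. `scrapeOnce` and `demoScrape` are hypothetical names.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"strings"
	"time"
)

// scrapeOnce fetches url within timeout and returns the parsed samples plus
// the value the synthetic "up" series would take (1 on success, 0 on failure).
func scrapeOnce(client *http.Client, url string, timeout time.Duration) (map[string]float64, int) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, 0
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, 0
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, 0
	}
	samples := map[string]float64{}
	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			return nil, 0 // a malformed line fails the whole scrape
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return nil, 0
		}
		samples[fields[0]] = v
	}
	if sc.Err() != nil {
		return nil, 0
	}
	return samples, 1
}

// demoScrape scrapes a test target once while it is up and once after it
// has been shut down.
func demoScrape() (samples map[string]float64, upOK, upDown int) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "# TYPE http_requests_total counter\nhttp_requests_total 42\nprocess_open_fds 7\n")
	}))
	client := srv.Client()
	samples, upOK = scrapeOnce(client, srv.URL, time.Second)
	srv.Close() // the target disappears
	_, upDown = scrapeOnce(client, srv.URL, time.Second)
	return samples, upOK, upDown
}

func main() {
	samples, upOK, upDown := demoScrape()
	fmt.Println(samples["http_requests_total"], upOK, upDown) // 42 1 0
}
```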

6. HTTP Client Configuration and TLS

6.1 HTTP Client Construction

The HTTP client for each scrape target is configured according to the scrape job settings:

scrape_configs:
  - job_name: 'secure-targets'
    scheme: https
    tls_config:
      ca_file: /path/to/ca.pem
      cert_file: /path/to/cert.pem
      key_file: /path/to/key.pem
      insecure_skip_verify: false
    # basic_auth and authorization are mutually exclusive; configure only one per job.
    basic_auth:
      username: admin
      password_file: /path/to/password
    # authorization:
    #   type: Bearer
    #   credentials_file: /path/to/token

6.2 TLS Configuration

Prometheus supports various TLS configurations:

CA Certificate: For server certificate validation
Client Certificate: For mTLS (mutual TLS) authentication
Server Name Verification: SNI configuration via the server_name field
Certificate Rotation: Certificate and key files are re-read from disk when they are needed, so rotated files are picked up without a restart

6.3 Authentication Mechanisms

Prometheus supports multiple authentication methods:

Basic Auth: Username/password based
Bearer Token: OAuth2 tokens, etc.
OAuth2 Client: Client credentials flow support
AWS SigV4: For AWS service access (available for remote write and Alertmanager connections rather than for scrape jobs)

Credential files (password_file, credentials_file, etc.) are re-read as requests are made rather than cached at startup, so credentials can be rotated by updating the file, without restarting Prometheus.
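
File-backed credentials of this kind are typically implemented as an http.RoundTripper that reads the file on every request. The sketch below shows the technique in isolation; `fileTokenRoundTripper` and `demoTokenRotation` are hypothetical names, and the code is an illustration rather than the client library Prometheus uses.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"strings"
	"sync"
)

// fileTokenRoundTripper reads the bearer token from disk for every request,
// so a rotated file takes effect on the next request.
type fileTokenRoundTripper struct {
	path string
	next http.RoundTripper
}

func (rt *fileTokenRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
	b, err := os.ReadFile(rt.path)
	if err != nil {
		return nil, fmt.Errorf("read credentials file: %w", err)
	}
	req = req.Clone(req.Context()) // never mutate the caller's request
	req.Header.Set("Authorization", "Bearer "+strings.TrimSpace(string(b)))
	return rt.next.RoundTrip(req)
}

// demoTokenRotation rewrites the token file between two requests and returns
// the Authorization headers the server observed.
func demoTokenRotation() ([]string, error) {
	dir, err := os.MkdirTemp("", "token")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "token")

	var (
		mu   sync.Mutex
		seen []string
	)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		seen = append(seen, r.Header.Get("Authorization"))
		mu.Unlock()
	}))
	defer srv.Close()

	client := &http.Client{Transport: &fileTokenRoundTripper{path: path, next: http.DefaultTransport}}
	for _, tok := range []string{"old-token", "new-token"} {
		if err := os.WriteFile(path, []byte(tok+"\n"), 0o600); err != nil {
			return nil, err
		}
		resp, err := client.Get(srv.URL)
		if err != nil {
			return nil, err
		}
		resp.Body.Close()
	}
	mu.Lock()
	defer mu.Unlock()
	return append([]string(nil), seen...), nil
}

func main() {
	seen, err := demoTokenRotation()
	fmt.Println(seen, err) // [Bearer old-token Bearer new-token] <nil>
}
```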

7. Internal Metrics

7.1 Key Self-Monitoring Metrics

Prometheus exposes various internal metrics for self-monitoring:

Scraping related:

prometheus_target_scrape_pool_targets: Number of targets in each scrape pool
prometheus_target_scrapes_exceeded_sample_limit_total: Sample limit exceeded count
scrape_duration_seconds: Scrape duration
scrape_samples_scraped: Number of samples scraped

TSDB related:

prometheus_tsdb_head_series: Number of series in the Head Block
prometheus_tsdb_head_samples_appended_total: Number of samples appended
prometheus_tsdb_compactions_total: Number of compactions performed
prometheus_tsdb_wal_corruptions_total: Number of WAL corruptions

Rule evaluation related:

prometheus_rule_evaluation_duration_seconds: Rule evaluation duration
prometheus_rule_group_last_duration_seconds: Last rule group evaluation time

Configuration related:

prometheus_config_last_reload_successful: Whether last reload succeeded
prometheus_config_last_reload_success_timestamp_seconds: Last successful reload timestamp

7.2 Performance Tuning Indicators

Key indicators to monitor in production:

1. Scrape lag: scrape_duration_seconds approaching the job's scrape_timeout
   - A scrape that exceeds its timeout is aborted and the target is reported as up == 0, leaving a gap in the data

2. Series cardinality: prometheus_tsdb_head_series
   - Rapid increase signals cardinality explosion

3. Rule evaluation lag: prometheus_rule_group_last_duration_seconds > prometheus_rule_group_interval_seconds
   - A group that takes longer than its interval misses evaluations, which delays alerts

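4. WAL growth: prometheus_tsdb_wal_segment_current
   - This is the index of the segment currently being written, not a byte size; if it keeps climbing while old segments are not removed, Head truncation or compaction is falling behind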

8. Prometheus Operator Integration

8.1 Prometheus Operator Architecture

In Kubernetes environments, Prometheus is typically deployed via Prometheus Operator:

Prometheus Operator
  |
  +-- watches Prometheus CRD
  |     |
  |     +-- generates prometheus.yml
  |     +-- manages StatefulSet
  |
  +-- watches ServiceMonitor CRD
  |     |
  |     +-- converts to scrape_configs
  |
  +-- watches PodMonitor CRD
  |     |
  |     +-- converts to scrape_configs
  |
  +-- watches PrometheusRule CRD
        |
        +-- converts to rule files

8.2 Config Reloader Sidecar

Prometheus Operator uses a config-reloader sidecar container to automatically detect configuration changes and send reload requests to Prometheus:

1. Operator updates ConfigMap/Secret
2. Config Reloader detects mounted file changes
3. Sends /-/reload POST request
4. Prometheus applies new configuration

This sidecar monitors file changes using inotify (Linux), with periodic polling as a fallback.
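
The polling fallback can be sketched as a content-hash comparison followed by a POST to the reload endpoint. This is an illustration of the technique only, not the config-reloader's code; `configWatcher` and `demoReloader` are hypothetical names, and the demo talks to a local test server instead of a real Prometheus.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"sync/atomic"
)

// configWatcher remembers the digest of the last configuration it reloaded.
type configWatcher struct {
	path      string
	reloadURL string
	last      [sha256.Size]byte
}

// check re-reads the file and POSTs to reloadURL only if the content changed.
func (w *configWatcher) check(client *http.Client) (bool, error) {
	b, err := os.ReadFile(w.path)
	if err != nil {
		return false, err
	}
	sum := sha256.Sum256(b)
	if sum == w.last {
		return false, nil
	}
	resp, err := client.Post(w.reloadURL, "", nil)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("reload failed: %s", resp.Status)
	}
	w.last = sum // remember the digest only after a successful reload
	return true, nil
}

// demoReloader writes v1, v1 again, then v2, checking after each write, and
// returns how many reload requests the fake server received.
func demoReloader() (int32, error) {
	dir, err := os.MkdirTemp("", "reloader")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "prometheus.yml")

	var reloads int32
	srv := httptest.NewServer(http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodPost && r.URL.Path == "/-/reload" {
			atomic.AddInt32(&reloads, 1)
			return
		}
		http.NotFound(rw, r)
	}))
	defer srv.Close()

	w := &configWatcher{path: path, reloadURL: srv.URL + "/-/reload"}
	for _, content := range []string{"v1", "v1", "v2"} {
		if err := os.WriteFile(path, []byte(content), 0o600); err != nil {
			return 0, err
		}
		if _, err := w.check(srv.Client()); err != nil {
			return 0, err
		}
	}
	return atomic.LoadInt32(&reloads), nil
}

func main() {
	n, err := demoReloader()
	fmt.Println(n, err) // 2 <nil>
}
```

Comparing digests rather than modification times means a rewrite with identical content does not trigger a reload.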

9. Performance Optimization Tips

9.1 Scraping Optimization

scrape_interval tuning: Set intervals matching metric change velocity. Too short increases load, too long reduces precision
sample_limit configuration: Limit maximum samples per target to prevent cardinality explosion
metric_relabel_configs: Drop unnecessary metrics immediately after scraping

9.2 TSDB Optimization

--storage.tsdb.min-block-duration: Minimum time span the Head Block covers before it is persisted (default 2h)
--storage.tsdb.max-block-duration: Maximum time span of a compacted block (default 10% of the retention time, capped at 31 days)
--storage.tsdb.wal-compression: Compresses the WAL to reduce disk usage and I/O (already enabled by default in recent releases)

9.3 Query Optimization

--query.max-concurrency: Limit concurrent queries (default 20)
--query.timeout: Set query timeout (default 2 minutes)
--query.max-samples: Limit maximum samples per query (default 50 million)

10. Summary

The Prometheus server's internal architecture demonstrates a clean design leveraging Go's concurrency model. The core design patterns are lifecycle management through Run Groups, channel-based inter-component communication, and pipeline-style data flow.

In the next post, we will analyze the TSDB internals in greater depth, examining WAL segment structure, chunk encoding, and block compaction algorithms at the source code level.