
Cilium Hubble Observability Platform Internal Analysis

Overview

Hubble is a network observability platform built into Cilium that collects and analyzes all network events from the eBPF datapath in real time. It provides deep network visibility without installing additional agents on the infrastructure.

1. Hubble Architecture

1.1 Component Layout

    +-----------------------------------------------------------+
    |                    Hubble Architecture                    |
    +-----------------------------------------------------------+
    |                                                           |
    |  Node 1            Node 2            Node 3               |
    |  +-----------+     +-----------+     +-----------+        |
    |  | Cilium    |     | Cilium    |     | Cilium    |        |
    |  | Agent     |     | Agent     |     | Agent     |        |
    |  | +------+  |     | +------+  |     | +------+  |        |
    |  | |Hubble|  |     | |Hubble|  |     | |Hubble|  |        |
    |  | |Server|  |     | |Server|  |     | |Server|  |        |
    |  | +------+  |     | +------+  |     | +------+  |        |
    |  +-----------+     +-----------+     +-----------+        |
    |        |                 |                 |              |
    |        +--------+--------+--------+--------+              |
    |                 |                 |                       |
    |           +-----v-----+     +-----v-----+                 |
    |           | Hubble    |     | Hubble UI |                 |
    |           | Relay     |     |           |                 |
    |           +-----------+     +-----------+                 |
    |                 |                                         |
    |           +-----v-----+                                   |
    |           | Prometheus|                                   |
    |           | Grafana   |                                   |
    |           +-----------+                                   |
    +-----------------------------------------------------------+

1.2 Component Roles

| Component     | Role                         | Deployment                |
| ------------- | ---------------------------- | ------------------------- |
| Hubble Server | Collect flows on each node   | Embedded in Cilium Agent  |
| Hubble Relay  | Aggregate flows cluster-wide | Deployment (1-2 replicas) |
| Hubble UI     | Topology visualization       | Deployment                |
| Hubble CLI    | Command-line flow queries    | Local binary              |

2. Hubble Server: Flow Collection Engine

2.1 Flow Collection Mechanism

The Hubble Server runs inside the Cilium Agent, collecting events from the eBPF datapath:

    eBPF datapath events:

    [Packet processing] -> [Policy verdict] -> [Conntrack events]
             |                    |                    |
             v                    v                    v
                      [Perf Event Ring Buffer]
                                  |
                                  v
                    [Hubble Server: Event parser]
                                  |
                                  v
                    [Flow ring buffer (in-memory)]
                                  |
                                  v
                [gRPC server: Stream flows to clients]

2.2 Ring Buffer

Hubble stores flows in a fixed-size ring buffer:

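    # Ring buffer size configuration (Cilium agent flag)
    # Default: 4095 flows
    # The capacity must be one less than a power of two (for example 4095, 8191, 16383)
    --hubble-event-buffer-capacity=16383

    # Check current ring buffer status
    hubble status

    # Example output:
    #   Nodes:
    #     node-1: Connected, Flows: 4095/4095 (100.00%), ...
    #     node-2: Connected, Flows: 3821/4095 (93.31%), ...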
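To make the overwrite behavior concrete, here is a minimal Python sketch of a fixed-capacity flow buffer. It is an illustration only, not Hubble's implementation (which is written in Go); the class name `FlowRingBuffer` and its methods are hypothetical, and the check that the capacity is one less than a power of two is an assumption modeled on the example values 4095 and 16383.

```python
from collections import deque


class FlowRingBuffer:
    """Fixed-capacity buffer that overwrites the oldest flow once full."""

    def __init__(self, capacity: int = 4095) -> None:
        # Assumption: valid capacities are 2^n - 1 (1, 3, ..., 4095, 16383, ...).
        if capacity < 1 or (capacity + 1) & capacity != 0:
            raise ValueError("capacity must be one less than a power of two")
        self.capacity = capacity
        self._flows = deque(maxlen=capacity)

    def write(self, flow: dict) -> None:
        # deque(maxlen=...) discards the oldest entry when the buffer is full.
        self._flows.append(flow)

    def last(self, n: int) -> list:
        """Return up to n of the most recent flows, oldest first."""
        return list(self._flows)[-n:] if n > 0 else []

    def status(self) -> str:
        used = len(self._flows)
        return f"Flows: {used}/{self.capacity} ({used / self.capacity:.2%})"


buf = FlowRingBuffer(capacity=7)
for seq in range(10):          # write 10 flows into a 7-slot buffer
    buf.write({"seq": seq})

print(buf.status())                      # Flows: 7/7 (100.00%)
print([f["seq"] for f in buf.last(3)])   # [7, 8, 9]
```

A buffer that reports 100% is therefore normal on a busy node: it only means that older flows are being overwritten, so a query such as `hubble observe --last 100` can reach back no further than the buffer capacity.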

2.3 Flow Data Structure

Key fields in a flow record:

    - time: Event timestamp
    - source:
        identity: Source Identity
        namespace: Source namespace
        labels: Source labels
        pod_name: Source Pod name
    - destination:
        identity: Destination Identity
        namespace: Destination namespace
        labels: Destination labels
        pod_name: Destination Pod name
    - IP:
        source: Source IP
        destination: Destination IP
    - l4:
        TCP/UDP:
          source_port: Source port
          destination_port: Destination port
    - l7:
        type: HTTP/DNS/Kafka
        http:
          method: GET/POST/...
          url: Request URL
          code: Response code
    - verdict: FORWARDED/DROPPED/AUDIT/REDIRECTED
    - drop_reason: Drop reason (when verdict is DROPPED)
    - Type: L3_L4/L7/SOCK (flow type)
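The field list above maps naturally onto a small typed model. The following Python sketch parses one JSON record shaped like that list; the `Endpoint` and `Flow` classes, the `parse_flow` helper, and the sample record (including its identity numbers) are hypothetical, and real `hubble observe -o json` output may wrap or name fields differently.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Endpoint:
    identity: int
    namespace: str
    pod_name: str
    labels: list


@dataclass
class Flow:
    time: str
    verdict: str
    source: Endpoint
    destination: Endpoint
    source_ip: str
    destination_ip: str
    destination_port: Optional[int]
    drop_reason: Optional[str]


def parse_flow(line: str) -> Flow:
    """Parse one JSON object that follows the field list above."""
    raw = json.loads(line)
    l4 = raw.get("l4", {})
    # A flow carries a TCP or a UDP section, or neither (for example ICMP).
    transport = l4.get("TCP") or l4.get("UDP") or {}

    def endpoint(key: str) -> Endpoint:
        ep = raw.get(key, {})
        return Endpoint(ep.get("identity", 0), ep.get("namespace", ""),
                        ep.get("pod_name", ""), ep.get("labels", []))

    return Flow(
        time=raw["time"],
        verdict=raw["verdict"],
        source=endpoint("source"),
        destination=endpoint("destination"),
        source_ip=raw.get("IP", {}).get("source", ""),
        destination_ip=raw.get("IP", {}).get("destination", ""),
        destination_port=transport.get("destination_port"),
        drop_reason=raw.get("drop_reason"),
    )


# Hypothetical sample record; values are illustrative only.
SAMPLE = json.dumps({
    "time": "2026-03-20T10:15:34.012Z",
    "verdict": "DROPPED",
    "drop_reason": "POLICY_DENIED",
    "source": {"identity": 1001, "namespace": "default",
               "pod_name": "untrusted-app", "labels": ["app=untrusted"]},
    "destination": {"identity": 1002, "namespace": "default",
                    "pod_name": "backend-xyz", "labels": ["app=backend"]},
    "IP": {"source": "10.244.3.5", "destination": "10.244.2.10"},
    "l4": {"TCP": {"source_port": 45678, "destination_port": 8080}},
})

flow = parse_flow(SAMPLE)
print(f"{flow.source.pod_name} -> {flow.destination.pod_name}:"
      f"{flow.destination_port} {flow.verdict} ({flow.drop_reason})")
```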

3. Hubble Relay: Cluster-Wide Observability

3.1 Relay Operation

Hubble Relay connects to all Hubble Servers in the cluster to aggregate flows:

Relay connection management:

1. Discover all Cilium Agents (Hubble Servers) in the cluster

2. Connect to each node's Hubble gRPC service

3. Distribute flow requests to all nodes

4. Aggregate responses and deliver to clients

5. Automatic reconnection on node add/remove
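Steps 3 and 4 amount to a fan-out followed by a time-ordered merge. The sketch below shows only the merge half in Python, assuming each node already returns its flows sorted by timestamp; `merge_node_flows` and the sample data are hypothetical and do not reflect Relay's actual Go implementation.

```python
import heapq


def merge_node_flows(per_node: dict) -> list:
    """Merge per-node flow lists (each sorted by time) into one ordered list."""
    streams = [
        # Tag every flow with the node it came from before merging.
        [dict(flow, node_name=node) for flow in flows]
        for node, flows in per_node.items()
    ]
    # heapq.merge performs a lazy k-way merge of already-sorted inputs.
    # ISO 8601 timestamps in one fixed format sort correctly as strings.
    return list(heapq.merge(*streams, key=lambda f: f["time"]))


per_node = {
    "node-1": [
        {"time": "2026-03-20T10:15:32.123Z", "verdict": "FORWARDED"},
        {"time": "2026-03-20T10:15:34.012Z", "verdict": "DROPPED"},
    ],
    "node-2": [
        {"time": "2026-03-20T10:15:32.456Z", "verdict": "FORWARDED"},
    ],
}

merged = merge_node_flows(per_node)
for f in merged:
    print(f["time"], f["node_name"], f["verdict"])
```

A k-way merge keeps memory proportional to the number of nodes rather than the number of flows, which matters because Relay load grows with node count.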

3.2 Relay Deployment

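    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hubble-relay
      namespace: kube-system
    spec:
      replicas: 1
      # apps/v1 requires a selector that matches the template labels
      selector:
        matchLabels:
          k8s-app: hubble-relay
      template:
        metadata:
          labels:
            k8s-app: hubble-relay
        spec:
          containers:
            - name: hubble-relay
              image: quay.io/cilium/hubble-relay:v1.16.0
              ports:
                - containerPort: 4245
                  name: grpc
              args:
                - serve
                # Relay runs in its own Pod, so it reaches the agents through the
                # peer Service rather than the agent's local unix socket
                - --peer-service=hubble-peer.kube-system.svc.cluster.local:443
                - --listen-address=:4245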

3.3 Relay Port Forwarding

    # Local port forwarding to Hubble Relay
    cilium hubble port-forward &

    # Then access via hubble CLI
    hubble observe
    hubble status

4. Hubble UI: Topology Visualization

4.1 UI Features

Hubble UI is a web-based interface providing:

Key features:

1. Service Map

- Visualize service-to-service communication as topology graph

- Real-time traffic flow display

- Color-coded normal/abnormal connections

2. Flow Table

- Real-time network flow listing

- Filtering (namespace, labels, verdict)

- Detailed information for each flow

3. Policy Visualization

- Display allow/deny status from policies

- Highlight dropped traffic

5. Flow Types: L3/L4/L7

5.1 L3/L4 Flows

Collected by default for all network packets:

    # Observe L3/L4 flows (reported as trace events)
    hubble observe --type trace

    # Example output:
    #   Mar 20 10:15:32.123 default/frontend-abc -> default/backend-xyz
    #   TCP SYN 10.244.1.5:34567 -> 10.244.2.10:8080
    #   verdict: FORWARDED

    # Track TCP connections
    hubble observe --protocol tcp --to-port 8080

    # UDP traffic
    hubble observe --protocol udp --to-port 53

5.2 L7 Flows

Collected for traffic with L7 policies applied:

    # Observe HTTP flows
    hubble observe --type l7 --protocol http

    # Example output:
    #   Mar 20 10:15:32.456 default/frontend-abc -> default/api-server-xyz
    #   HTTP GET /api/v1/users
    #   Response: 200 OK (23ms)

    # DNS flows
    hubble observe --type l7 --protocol dns

    # Kafka flows
    hubble observe --type l7 --protocol kafka

5.3 Drop Flows

Packets dropped due to policies or errors:

    # Observe only dropped flows
    hubble observe --verdict DROPPED

    # Example output:
    #   Mar 20 10:15:34.012 default/untrusted-app -> default/backend-xyz
    #   TCP 10.244.3.5:45678 -> 10.244.2.10:8080
    #   verdict: DROPPED (Policy denied)

    # Filter by specific drop reason
    hubble observe --verdict DROPPED --drop-reason-desc POLICY_DENIED

6. Hubble CLI Usage

6.1 Basic Commands

    # Observe all flows (real-time streaming)
    hubble observe -f

    # Last 100 flows
    hubble observe --last 100

    # Specific time range
    hubble observe --since 5m
    hubble observe --since "2026-03-20T10:00:00Z" --until "2026-03-20T10:30:00Z"

6.2 Filtering

    # Namespace filtering
    hubble observe --namespace production

    # Pod name filtering
    hubble observe --from-pod production/frontend-abc
    hubble observe --to-pod production/backend-xyz

    # Label filtering
    hubble observe --from-label "app=frontend"
    hubble observe --to-label "app=backend"

    # IP filtering
    hubble observe --from-ip 10.244.1.5
    hubble observe --to-ip 10.244.2.10

    # Port filtering
    hubble observe --to-port 8080

    # Verdict filtering
    hubble observe --verdict FORWARDED
    hubble observe --verdict DROPPED

    # Combined filters
    hubble observe \
      --namespace production \
      --from-label "app=frontend" \
      --to-label "app=backend" \
      --to-port 8080 \
      --verdict FORWARDED
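When these filters are driven from a script, it helps to build the argument list programmatically instead of concatenating strings. The Python sketch below is one way to do that; the `build_observe_args` helper and its keyword names are hypothetical, and it maps only the flags shown in this section.

```python
import shlex

# Keyword name -> CLI flag; limited to the filters listed in this section.
_FLAG_FOR = {
    "namespace": "--namespace",
    "from_pod": "--from-pod",
    "to_pod": "--to-pod",
    "from_label": "--from-label",
    "to_label": "--to-label",
    "from_ip": "--from-ip",
    "to_ip": "--to-ip",
    "to_port": "--to-port",
    "verdict": "--verdict",
}


def build_observe_args(**filters) -> list:
    """Return an argv list for `hubble observe` with the given filters."""
    args = ["hubble", "observe"]
    for name, value in filters.items():
        if name not in _FLAG_FOR:
            raise ValueError(f"unsupported filter: {name}")
        args += [_FLAG_FOR[name], str(value)]
    return args


args = build_observe_args(
    namespace="production",
    from_label="app=frontend",
    to_label="app=backend",
    to_port=8080,
    verdict="FORWARDED",
)
print(shlex.join(args))
# hubble observe --namespace production --from-label app=frontend --to-label app=backend --to-port 8080 --verdict FORWARDED
```

Passing the resulting list to `subprocess.run(args)` avoids shell quoting problems with label selectors, because no shell is involved.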

6.3 Output Formats

    # Default output (human-readable)
    hubble observe

    # JSON output
    hubble observe -o json

    # Compact output
    hubble observe -o compact

    # Dictionary output
    hubble observe -o dict

    # jsonpb (Protocol Buffers JSON format)
    hubble observe -o jsonpb

6.4 Status Check

    # Overall Hubble status
    hubble status

    # Example output:
    #   Healthcheck (via localhost:4245): Ok
    #   Current/Max Flows: 16383/16383 (100.00%)
    #   Flows/s: 245.32
    #   Connected Nodes: 3/3

    # Per-node status
    hubble list nodes

7. Hubble Prometheus Metrics

7.1 Key Metrics

| Metric Name                          | Description                   |
| ------------------------------------ | ----------------------------- |
| hubble_flows_processed_total         | Total flows processed         |
| hubble_drop_total                    | Dropped packets by reason     |
| hubble_tcp_flags_total               | Packets by TCP flag           |
| hubble_dns_queries_total             | DNS query count               |
| hubble_dns_responses_total           | DNS response count            |
| hubble_http_requests_total           | HTTP requests by method/path  |
| hubble_http_responses_total          | HTTP responses by status code |
| hubble_http_request_duration_seconds | HTTP request latency          |
| hubble_icmp_total                    | ICMP packet count             |

7.2 Alert Rule Examples

    groups:
      - name: hubble-alerts
        rules:
          - alert: HighDropRate
            expr: rate(hubble_drop_total[5m]) > 100
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: 'High packet drop rate detected'
          - alert: HTTPErrorRate
            # Aggregate both sides with sum(); without it each 5xx series is
            # divided by itself and the ratio is always 1.
            expr: |
              sum(rate(hubble_http_responses_total{status=~"5.."}[5m]))
                / sum(rate(hubble_http_responses_total[5m])) > 0.05
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: 'HTTP 5xx error rate exceeds 5%'
          - alert: DNSLatencyHigh
            # Assumes a DNS response-time histogram is exported under this name;
            # confirm that the metric exists in your Cilium version first.
            expr: |
              histogram_quantile(0.99, rate(hubble_dns_response_time_seconds_bucket[5m]))
                > 0.5
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: 'DNS response latency p99 exceeds 500ms'
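The 5xx ratio that the HTTPErrorRate rule is meant to capture can be checked by hand. The Python sketch below reproduces the arithmetic from two samples of each counter taken 5 minutes apart; the helper names and the counter values are made up for illustration.

```python
def counter_rate(first: float, last: float, window_s: float) -> float:
    """Per-second increase of a monotonically increasing counter."""
    return (last - first) / window_s


def error_ratio(samples: dict, window_s: float = 300.0) -> float:
    """samples maps HTTP status code -> (counter at window start, at window end)."""
    rates = {
        status: counter_rate(first, last, window_s)
        for status, (first, last) in samples.items()
    }
    total = sum(rates.values())
    # Sum all 5xx series before dividing by the sum of all series.
    errors = sum(r for status, r in rates.items() if status.startswith("5"))
    return errors / total if total else 0.0


# Hypothetical counter readings over a 5-minute window.
samples = {
    "200": (10_000, 12_850),  # 9.5 responses/s
    "404": (40, 70),          # 0.1 responses/s
    "503": (5, 125),          # 0.4 responses/s
}

ratio = error_ratio(samples)
print(f"5xx ratio: {ratio:.2%}, alert fires: {ratio > 0.05}")
# 5xx ratio: 4.00%, alert fires: False
```

The ratio is computed over summed rates, which is why aggregation on both sides of the division matters in the PromQL expression as well.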

8. Hubble gRPC API

8.1 API Overview

Hubble provides programmatic access to flow data through its gRPC API:

    // Hubble Observer API (simplified)
    service Observer {
      // Stream flows
      rpc GetFlows(GetFlowsRequest) returns (stream GetFlowsResponse);

      // Hubble server status
      rpc ServerStatus(ServerStatusRequest) returns (ServerStatusResponse);

      // Node list
      rpc GetNodes(GetNodesRequest) returns (GetNodesResponse);
    }

8.2 API Usage Example

    # Direct API call with gRPCurl
    grpcurl -plaintext localhost:4245 observer.Observer/ServerStatus

    # Stream flows
    grpcurl -plaintext -d '{}' localhost:4245 observer.Observer/GetFlows

8.3 Custom Integration Use Cases

Hubble gRPC API use cases:

1. Custom Dashboards

- Collect specific business metrics

- Custom visualizations

2. Automated Security Analysis

- Detect abnormal traffic patterns

- Automatic policy violation alerts

3. Audit Logging

- Record network activity for compliance

- Export to external systems for long-term storage

4. Service Mesh Integration

- Monitor service-to-service latency

- Error tracking and analysis
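As a small example of the "Automated Security Analysis" use case, the Python sketch below counts DROPPED flows per source Pod and reports sources at or above a threshold. It assumes flows have already been decoded into dictionaries with the `verdict` and `source` fields described in section 2.3; the `noisy_droppers` function, the threshold, and the sample flows are hypothetical.

```python
from collections import Counter


def noisy_droppers(flows: list, threshold: int) -> dict:
    """Return {namespace/pod: drop count} for sources with >= threshold drops."""
    drops = Counter(
        f"{f['source']['namespace']}/{f['source']['pod_name']}"
        for f in flows
        if f.get("verdict") == "DROPPED"
    )
    return {src: n for src, n in drops.items() if n >= threshold}


def _flow(pod: str, verdict: str) -> dict:
    # Helper that builds a minimal flow record for the example.
    return {"verdict": verdict,
            "source": {"namespace": "default", "pod_name": pod}}


flows = [
    _flow("untrusted-app", "DROPPED"),
    _flow("frontend-abc", "FORWARDED"),
    _flow("untrusted-app", "DROPPED"),
    _flow("frontend-abc", "DROPPED"),
    _flow("untrusted-app", "DROPPED"),
]

print(noisy_droppers(flows, threshold=3))
# {'default/untrusted-app': 3}
```

In a real integration the flow list would come from a `GetFlows` stream, and the count would be kept over a sliding time window rather than a fixed batch.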

9. Performance Impact and Tuning

9.1 Performance Overhead

Overhead from enabling Hubble:

CPU:

- Basic L3/L4 observation: Minimal (~1-2%)

- L7 observation (HTTP, etc.): Depends on Envoy proxy overhead

- High-traffic environments: Flow parsing and ring buffer management become the main additional cost

Memory:

- Proportional to ring buffer size

- Default 4095 flows x ~500 bytes per flow = ~2MB

- 16383 flows: ~8MB

Network:

- gRPC streaming traffic to Relay

- Proportional to number of observing clients
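The memory figures above follow from a single multiplication. The short Python sketch below repeats it for both buffer sizes, taking the ~500 bytes per flow estimate from the text as given; actual flow sizes vary with labels and L7 payload, so treat the results as rough.

```python
def buffer_memory_bytes(capacity: int, bytes_per_flow: int = 500) -> int:
    """Rough per-node memory held by the flow ring buffer."""
    return capacity * bytes_per_flow


for capacity in (4095, 16383):
    size = buffer_memory_bytes(capacity)
    print(f"{capacity} flows x 500 B = {size} B (~{size / 1e6:.1f} MB)")
# 4095 flows x 500 B = 2047500 B (~2.0 MB)
# 16383 flows x 500 B = 8191500 B (~8.2 MB)
```

Because every agent keeps its own buffer, the cluster-wide total is this value multiplied by the number of nodes.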

9.2 Tuning Parameters

    # Ring buffer size adjustment (must be one less than a power of two)
    --hubble-event-buffer-capacity=16383

    # Event queue size
    --hubble-event-queue-size=0   # 0 = auto

    # Enable/disable specific metrics
    --hubble-metrics=dns,drop,tcp,flow

    # Monitor specific events (list of event types to observe)
    --hubble-monitor-events=drop,trace,l7

9.3 Large-Scale Optimization

Considerations for large clusters:

1. Relay resource limits

- Set appropriate CPU/memory requests/limits

- Relay load increases with node count

2. Metric cardinality management

- Minimize label contexts to control metric count

- Disable unnecessary metrics

3. Flow retention strategy

- Balance ring buffer size with traffic volume

- Export to external systems for long-term retention

4. Network bandwidth

- Consider gRPC traffic between Relay and Agents

- Limit impact of large observation queries

10. Hubble Usage Scenarios

10.1 Troubleshooting

    # Diagnose Pod-to-Pod connectivity issues
    hubble observe --from-pod default/app-a --to-pod default/app-b

    # Analyze dropped traffic causes
    hubble observe --verdict DROPPED --from-pod default/app-a

    # Check DNS resolution issues
    hubble observe --type l7 --protocol dns --from-pod default/app-a

    # Check traffic to specific service
    hubble observe --to-label "app=database" --to-port 5432

10.2 Security Audit

    # Monitor policy violation traffic
    hubble observe --verdict DROPPED --drop-reason-desc POLICY_DENIED

    # Track egress traffic to external destinations
    # (2 is the numeric reserved identity for "world")
    hubble observe --to-identity 2

    # All ingress traffic to sensitive namespace
    hubble observe --namespace sensitive-ns --traffic-direction ingress

Summary

Cilium Hubble provides network observability through these core capabilities:

- **Zero Instrumentation**: Automatic flow collection from eBPF without application modifications

- **Multi-Layer Visibility**: Observation across all layers from L3/L4 network to L7 application

- **Real-Time Streaming**: Real-time flow streaming via ring buffer and gRPC

- **Cluster-Wide Observation**: Cluster-wide data aggregation through Hubble Relay

- **Visualization**: Service topology maps and flow visualization through Hubble UI

- **Metrics Integration**: Time-series metrics and alerting via Prometheus/Grafana

- **Programmatic Access**: Custom tool integration through gRPC API
