- 1. Introduction to Fluent Bit
- 2. Core Architecture: Pipeline Structure
- 3. Installation Methods
- 4. Configuration File Structure
- 5. Input Plugin Details
- 6. Parser Configuration
- 7. Filter Plugin Details
- 7.1 kubernetes: Pod Metadata Enrichment
- 7.2 modify: Add/Remove/Rename Fields
- 7.3 grep: Regex-Based Filtering
- 7.4 record_modifier: Record Modification
- 7.5 nest: Nested Structure Conversion
- 7.6 lua: Custom Transformation with Lua Scripts
- 7.7 rewrite_tag: Tag-Based Routing Changes
- 7.8 throttle: Log Rate Limiting
- 7.9 multiline: Multiline Log Merging
- 8. Output Plugin Details
- 9. Buffer and Backpressure Management
- 10. Kubernetes Integration Complete Guide
- 11. Practical Pipeline Examples
- 12. Performance Tuning
- 13. Monitoring and Observability
- 14. Troubleshooting Guide
- 15. Operational Best Practices
- 16. References
1. Introduction to Fluent Bit
1.1 What Is Fluent Bit?
Fluent Bit is an ultra-lightweight telemetry agent written in C that collects logs, metrics, and traces from various sources, processes them, and delivers them to the desired destinations. With a binary of roughly 450KB and memory usage under 1MB, it is applicable everywhere from embedded systems to large-scale Kubernetes clusters.
Fluent Bit is a CNCF (Cloud Native Computing Foundation) Graduated project, recognized at the same maturity level as Kubernetes, Prometheus, and Envoy. As a sub-project under the Fluentd umbrella, it shares the Fluentd project's graduated status, achieved in 2019. As of 2024, it has been downloaded over 13 billion times on Docker Hub, establishing itself as the de facto standard for cloud-native log collection.
In March 2024, Fluent Bit v3 was announced at KubeCon + CloudNativeCon EU. In December of the same year, v3.2 was released, followed by v4.0 in March 2025, continuously evolving with YAML standard configuration, Processor support, SIMD-based JSON encoding (2.5x performance improvement), and enhanced OpenTelemetry support.
1.2 Key Features
- Ultra-lightweight: C-based, ~450KB binary, under 1MB memory usage
- High performance: Asynchronous I/O, multi-threaded pipeline, SIMD optimization
- Plugin architecture: Over 100 Input/Filter/Output plugins
- Unified telemetry: Logs, Metrics, Traces handled by a single agent
- YAML native: YAML is the standard configuration format from v3.2
- Hot Reload: Configuration reload without service interruption (SIGHUP / HTTP API)
- Cross-platform: Supports Linux, macOS, Windows, BSD, and embedded Linux
1.3 Fluent Bit vs Fluentd Comparison
| Category | Fluent Bit | Fluentd |
|---|---|---|
| Language | C | Ruby + C |
| Binary Size | ~450KB | ~40MB |
| Memory Usage | ~1MB | ~30-40MB |
| Plugin Count | Over 100 (built-in) | Over 1,000 (including gems) |
| Performance | Very high | High |
| CPU Usage | Low | 4x compared to Fluent Bit |
| Config Format | INI / YAML | Ruby DSL |
| Primary Use | Edge/node-level collection | Centralized log aggregation |
| Kubernetes | DaemonSet per node | Central Aggregator deployment |
| CNCF Status | Graduated (under Fluentd) | Graduated |
| Best For | IoT, containers, edge | Large-scale aggregation, complex routing |
Recommended architecture: The most widely used pattern is a hybrid approach where Fluent Bit is deployed as a DaemonSet on each node to collect logs, forwarding to a central Fluentd Aggregator when needed. However, as Fluent Bit capabilities continue to strengthen, cases where the entire pipeline is built with Fluent Bit alone without Fluentd are rapidly increasing.
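A minimal sketch of the node-level half of that hybrid pattern, assuming a Fluentd aggregator reachable as an in-cluster service (the host name below is hypothetical):

```yaml
# Node-level Fluent Bit: collect locally, forward everything to Fluentd.
pipeline:
  inputs:
    - name: tail
      tag: node.logs
      path: /var/log/containers/*.log
  outputs:
    - name: forward
      match: '*'
      host: fluentd-aggregator.logging.svc   # hypothetical aggregator service
      port: 24224
```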
1.4 Architecture Overview
+------------------------------------------------------------------+
| Fluent Bit Engine |
| |
| +--------+ +--------+ +--------+ +--------+ +--------+ |
| | Input |-->| Parser |-->| Filter |-->| Buffer |-->| Output | |
| +--------+ +--------+ +--------+ +--------+ +--------+ |
| | tail | | json | | k8s | | memory | | es | |
| | systemd| | regex | | grep | | filesys| | loki | |
| | forward| | logfmt | | modify | | | | s3 | |
| | http | | cri | | lua | | | | kafka | |
| | tcp | | docker | | nest | | | | stdout | |
| +--------+ +--------+ +--------+ +--------+ +--------+ |
| |
| [Scheduler] [Router / Tag Matching] [HTTP Server / Monitoring]|
+------------------------------------------------------------------+
Fluent Bit data processing follows a Pipeline structure where each stage is clearly separated. This structure enables independent extension and replacement on a per-module basis.
2. Core Architecture: Pipeline Structure
2.1 Full Pipeline Flow
The Fluent Bit data processing pipeline consists of the following stages:
[Data Source]
|
v
+---------+ +---------+ +---------+ +---------+ +---------+
| INPUT | --> | PARSER | --> | FILTER | --> | BUFFER | --> | OUTPUT |
| | | | | | | | | |
| Data | | Unstruc | | Data | | Memory/ | | Final |
| Collect | | -> Struc| | Process | | Disk | | Deliver |
+---------+ +---------+ +---------+ +---------+ +---------+
| | | | |
Tag assign Structuring Enrichment Reliability Destination
conversion Filtering guarantee (Tag match)
2.2 Role of Each Stage
Input
The entry point for data. It collects data from various sources such as files, system journals, network sockets, and Kubernetes events. Every input data item is assigned a Tag, which is used for subsequent routing.
Parser
Converts unstructured text data into structured data. It provides various parsers including JSON, Regex, Logfmt, Docker, and CRI, applied at the Input plugin stage.
Filter
The stage for processing collected data. It performs field addition/deletion, Kubernetes metadata enrichment, regex-based filtering, and Lua script transformations. Multiple Filters can be chained to compose complex transformation logic.
Buffer
Data that passes through Filters is stored in a buffer before being delivered to Output. Two buffer modes are supported: memory buffer and filesystem buffer. Using the filesystem buffer prevents data loss even during failures.
Router
Routes data to the appropriate Output based on Tag matching rules. It supports wildcard (*) matching and can simultaneously deliver a single input to multiple Outputs (fan-out).
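Fan-out requires no special syntax: every Output whose match pattern matches a record's Tag receives its own copy. A minimal sketch with two destinations:

```yaml
pipeline:
  inputs:
    - name: tail
      tag: app.logs
      path: /var/log/app/*.log
  outputs:
    - name: stdout        # both outputs match app.* -> each gets a copy
      match: 'app.*'
    - name: forward
      match: 'app.*'
      host: 127.0.0.1
      port: 24224
```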
Output
Transmits data to the final destination. It supports various backends including Elasticsearch, Loki, S3, Kafka, CloudWatch, and Prometheus. An automatic retry mechanism activates upon transmission failure.
2.3 Multi-Pipeline Structure
Fluent Bit can operate multiple independent pipelines simultaneously within a single instance. Each pipeline has its own unique combination of Input, Filter, and Output, with Tag-based routing separating different data flows.
Pipeline A: [tail: app-*.log] --tag:app--> [filter:k8s] --> [output:elasticsearch]
Pipeline B: [tail: sys-*.log] --tag:sys--> [filter:grep] --> [output:loki]
Pipeline C: [forward:24224] --tag:fwd--> [filter:lua] --> [output:s3]
This structure provides the following benefits:
- Isolation: Independent operation between pipelines prevents failure propagation
- Flexibility: Different processing logic and destinations per use case
- Efficiency: A single agent handles multiple data flows
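In YAML, these independent flows live side by side in a single pipeline section and are kept separate purely by Tags (paths and hosts are illustrative):

```yaml
pipeline:
  inputs:
    - name: tail
      tag: app
      path: /var/log/app-*.log
    - name: tail
      tag: sys
      path: /var/log/sys-*.log
  outputs:
    - name: es            # receives only tag "app"
      match: app
      host: elasticsearch
    - name: loki          # receives only tag "sys"
      match: sys
      host: loki
```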
3. Installation Methods
3.1 Linux (apt/yum)
Ubuntu/Debian (apt)
# Add GPG key and repository
curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh
# Or manual installation
wget -qO - https://packages.fluentbit.io/fluentbit.key | sudo apt-key add -
echo "deb https://packages.fluentbit.io/ubuntu/$(lsb_release -cs) $(lsb_release -cs) main" | \
sudo tee /etc/apt/sources.list.d/fluent-bit.list
sudo apt-get update
sudo apt-get install -y fluent-bit
# Start service
sudo systemctl start fluent-bit
sudo systemctl enable fluent-bit
CentOS/RHEL (yum)
cat > /etc/yum.repos.d/fluent-bit.repo << 'EOF'
[fluent-bit]
name=Fluent Bit
baseurl=https://packages.fluentbit.io/centos/$releasever/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1
EOF
sudo yum install -y fluent-bit
sudo systemctl start fluent-bit
sudo systemctl enable fluent-bit
3.2 Docker
# Run latest version
docker run -ti cr.fluentbit.io/fluent/fluent-bit:latest
# Mount configuration file
docker run -ti \
-v /path/to/fluent-bit.yaml:/fluent-bit/etc/fluent-bit.yaml \
-v /var/log:/var/log \
cr.fluentbit.io/fluent/fluent-bit:latest \
/fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.yaml
# Docker Compose example
cat > docker-compose.yaml << 'EOF'
version: '3.8'
services:
fluent-bit:
image: cr.fluentbit.io/fluent/fluent-bit:latest
volumes:
- ./fluent-bit.yaml:/fluent-bit/etc/fluent-bit.yaml
- /var/log:/var/log:ro
ports:
- "2020:2020" # HTTP monitoring
- "24224:24224" # Forward protocol
EOF
3.3 macOS (Homebrew)
brew install fluent-bit
# Run
fluent-bit -c /opt/homebrew/etc/fluent-bit/fluent-bit.conf
# Or run with YAML configuration file
fluent-bit -c /path/to/fluent-bit.yaml
3.4 Kubernetes (Helm Chart)
# Add Helm repository
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
# Basic installation
helm install fluent-bit fluent/fluent-bit \
--namespace logging \
--create-namespace
# Install with custom values.yaml
helm install fluent-bit fluent/fluent-bit \
--namespace logging \
--create-namespace \
-f custom-values.yaml
# Upgrade
helm upgrade fluent-bit fluent/fluent-bit \
--namespace logging \
-f custom-values.yaml
3.5 Direct Binary Installation
# Download binary from GitHub Releases
FLUENT_BIT_VERSION=3.2.2
wget https://github.com/fluent/fluent-bit/releases/download/v${FLUENT_BIT_VERSION}/fluent-bit-${FLUENT_BIT_VERSION}-linux-x86_64.tar.gz
tar xzf fluent-bit-${FLUENT_BIT_VERSION}-linux-x86_64.tar.gz
cd fluent-bit-${FLUENT_BIT_VERSION}-linux-x86_64/
# Run
./bin/fluent-bit -c conf/fluent-bit.yaml
# Check version
./bin/fluent-bit --version
4. Configuration File Structure
Fluent Bit supports two configuration formats: Classic mode (INI format) and YAML mode. YAML has been the standard configuration format since v3.2, and Classic mode is scheduled for deprecation by the end of 2025.
4.1 Classic Mode (fluent-bit.conf)
# fluent-bit.conf - Classic INI format
[SERVICE]
Flush 5
Daemon Off
Log_Level info
Parsers_File parsers.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
Hot_Reload On
[INPUT]
Name tail
Path /var/log/containers/*.log
Parser cri
Tag kube.*
Mem_Buf_Limit 5MB
Skip_Long_Lines On
Refresh_Interval 10
DB /var/log/flb_kube.db
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc:443
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Merge_Log On
K8S-Logging.Parser On
K8S-Logging.Exclude On
[OUTPUT]
Name es
Match kube.*
Host elasticsearch.logging.svc.cluster.local
Port 9200
Logstash_Format On
Logstash_Prefix kube
Retry_Limit False
4.2 YAML Mode (fluent-bit.yaml)
# fluent-bit.yaml - YAML format (v3.2+ standard)
service:
flush: 5
daemon: off
log_level: info
parsers_file: parsers.conf
http_server: on
http_listen: 0.0.0.0
http_port: 2020
hot_reload: on
pipeline:
inputs:
- name: tail
path: /var/log/containers/*.log
parser: cri
tag: kube.*
mem_buf_limit: 5MB
skip_long_lines: on
refresh_interval: 10
db: /var/log/flb_kube.db
filters:
- name: kubernetes
match: kube.*
kube_url: https://kubernetes.default.svc:443
kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
merge_log: on
k8s-logging.parser: on
k8s-logging.exclude: on
outputs:
- name: es
match: kube.*
host: elasticsearch.logging.svc.cluster.local
port: 9200
logstash_format: on
logstash_prefix: kube
retry_limit: false
4.3 Comparing the Two Formats
| Category | Classic (INI) | YAML |
|---|---|---|
| File Extension | .conf | .yaml / .yml |
| Status | Deprecation planned (end 2025) | Standard (v3.2+) |
| Processor Support | Not supported | Supported |
| Readability | Moderate | Excellent |
| Nested Structures | Limited | Fully supported |
| Arrays/Lists | Not supported | Supported |
| Comments | # | # |
4.4 Using Environment Variables
Both formats can reference environment variables using ${ENV_VAR} syntax.
# YAML environment variable example
pipeline:
outputs:
- name: es
match: '*'
host: ${ELASTICSEARCH_HOST}
port: ${ELASTICSEARCH_PORT}
http_user: ${ES_USER}
http_passwd: ${ES_PASSWORD}
tls: ${ES_TLS_ENABLED}
# Classic environment variable example
[OUTPUT]
Name es
Match *
Host ${ELASTICSEARCH_HOST}
Port ${ELASTICSEARCH_PORT}
HTTP_User ${ES_USER}
HTTP_Passwd ${ES_PASSWORD}
From v4.0, the file:// prefix can be used to securely reference secret values from the filesystem.
pipeline:
outputs:
- name: es
http_passwd: file:///run/secrets/es-password
4.5 @INCLUDE Directive
Configuration files can be modularized for management.
# fluent-bit.conf (Classic)
[SERVICE]
Flush 5
@INCLUDE inputs.conf
@INCLUDE filters.conf
@INCLUDE outputs.conf
In YAML, use the includes section.
# fluent-bit.yaml
includes:
- inputs.yaml
- filters.yaml
- outputs.yaml
service:
flush: 5
5. Input Plugin Details
Input plugins are the starting point for data collection. Each Input is assigned a unique Tag that serves as the matching criterion for subsequent Filters and Outputs.
5.1 tail: File Log Collection
The most commonly used Input plugin, reading new lines in real time from the end of a file like tail -f.
pipeline:
inputs:
- name: tail
tag: app.logs
path: /var/log/app/*.log
path_key: filename # Include file path in records
exclude_path: /var/log/app/debug.log # Exclude specific files
parser: json # Default parser
db: /var/log/flb_app.db # Offset storage DB (resume after restart)
db.sync: normal # DB sync mode
refresh_interval: 10 # File list refresh interval (seconds)
read_from_head: false # true: read from beginning of file
skip_long_lines: on # Skip very long lines
mem_buf_limit: 5MB # Memory buffer limit
rotate_wait: 5 # Rotation wait time (seconds)
multiline.parser: docker, cri # Multiline parser
Key Configuration Options
| Setting | Default | Description |
|---|---|---|
| Path | (required) | Log file path (supports wildcards) |
| Path_Key | - | Add file path as a key to records |
| Exclude_Path | - | File paths to exclude |
| DB | - | SQLite DB path for file offset storage |
| Refresh_Interval | 60 | File list refresh interval (seconds) |
| Read_from_Head | false | Whether to read from file beginning |
| Skip_Long_Lines | Off | Skip lines exceeding Buffer_Max_Size |
| Mem_Buf_Limit | - | Memory buffer limit |
| Rotate_Wait | 5 | Wait time after log rotation |
5.2 systemd: systemd Journal Collection
pipeline:
inputs:
- name: systemd
tag: host.systemd
systemd_filter:
  - _SYSTEMD_UNIT=docker.service
  - _SYSTEMD_UNIT=kubelet.service
read_from_tail: on
strip_underscores: on
db: /var/log/flb_systemd.db
5.3 forward: Fluentd Protocol Reception
pipeline:
inputs:
- name: forward
tag: forward.incoming
listen: 0.0.0.0
port: 24224
buffer_chunk_size: 1M
buffer_max_size: 6M
5.4 http / tcp / udp: Network Reception
pipeline:
inputs:
# HTTP reception
- name: http
tag: http.logs
listen: 0.0.0.0
port: 9880
successful_response_code: 201
# TCP reception
- name: tcp
tag: tcp.logs
listen: 0.0.0.0
port: 5170
format: json
# UDP reception
- name: udp
tag: udp.logs
listen: 0.0.0.0
port: 5170
format: json
5.5 kubernetes_events: K8s Event Collection
pipeline:
inputs:
- name: kubernetes_events
tag: kube.events
kube_url: https://kubernetes.default.svc:443
kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
interval_sec: 1
retention_time: 1h
5.6 node_exporter_metrics / prometheus_scrape
pipeline:
inputs:
# Node metrics collection
- name: node_exporter_metrics
tag: node.metrics
scrape_interval: 30
# Prometheus endpoint scraping
- name: prometheus_scrape
tag: prom.metrics
host: 127.0.0.1
port: 9090
metrics_path: /metrics
scrape_interval: 10s
5.7 fluentbit_metrics: Internal Metrics
pipeline:
inputs:
- name: fluentbit_metrics
tag: fb.metrics
scrape_interval: 30
scrape_on_start: true
6. Parser Configuration
Parsers are core components that convert unstructured text logs into structured data. They are defined in a separate parsers.conf or YAML file.
6.1 Built-in Parsers
Fluent Bit provides built-in parsers for commonly used log formats.
| Parser | Format | Use Case |
|---|---|---|
| json | JSON | JSON format logs |
| docker | JSON (Docker specific) | Docker container logs |
| cri | Regex | CRI (containerd) logs |
| syslog-rfc5424 | Regex | RFC 5424 Syslog |
| syslog-rfc3164 | Regex | RFC 3164 Syslog |
| apache | Regex | Apache access logs |
| nginx | Regex | Nginx access logs |
| logfmt | Logfmt | key=value pair format |
6.2 Writing Custom Regex Parsers
# parsers.conf
# Nginx error log parser
[PARSER]
Name nginx_error
Format regex
Regex ^(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>\w+)\] (?<pid>\d+)#(?<tid>\d+): (?<message>.*)$
Time_Key time
Time_Format %Y/%m/%d %H:%M:%S
# Spring Boot log parser
[PARSER]
Name spring_boot
Format regex
Regex ^(?<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)\s+(?<level>\w+)\s+(?<pid>\d+)\s+---\s+\[(?<thread>[^\]]+)\]\s+(?<logger>\S+)\s+:\s+(?<message>.*)$
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
# Apache Combined log parser
[PARSER]
Name apache_combined
Format regex
Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z
The same can be defined in YAML format.
# parsers.yaml
parsers:
- name: nginx_error
format: regex
regex: '^(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>\w+)\] (?<pid>\d+)#(?<tid>\d+): (?<message>.*)$'
time_key: time
time_format: '%Y/%m/%d %H:%M:%S'
- name: spring_boot
format: regex
regex: '^(?<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)\s+(?<level>\w+)\s+(?<pid>\d+)\s+---\s+\[(?<thread>[^\]]+)\]\s+(?<logger>\S+)\s+:\s+(?<message>.*)$'
time_key: time
time_format: '%Y-%m-%dT%H:%M:%S.%L'
6.3 Multiline Parsers
Merges logs spanning multiple lines, such as Java Stack Traces, into a single record.
# Multiline parser definition
[MULTILINE_PARSER]
Name java_stacktrace
Type regex
Flush_Timeout 1000
# First line pattern: starts with timestamp
Rule "start_state" "/^\d{4}-\d{2}-\d{2}/" "cont"
# Continuation line pattern: starts with whitespace or Caused by
Rule "cont" "/^\s+|^Caused by:/" "cont"
[MULTILINE_PARSER]
Name python_traceback
Type regex
Flush_Timeout 1000
Rule "start_state" "/^Traceback/" "python_tb"
Rule "python_tb" "/^\s+/" "python_tb"
Rule "python_tb" "/^\w+Error/" "end"
How to apply a multiline parser in Input:
pipeline:
inputs:
- name: tail
tag: app.java
path: /var/log/app/application.log
multiline.parser: java_stacktrace
read_from_head: true
6.4 Time_Key and Time_Format
Extracts the timestamp from log messages and uses it as the record time.
[PARSER]
Name custom_time
Format regex
Regex ^(?<time>[^ ]+) (?<message>.*)$
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%LZ
Time_Keep On # Keep the time field after conversion
Time_Offset +0900 # KST timezone
6.5 Parser Testing and Debugging
The easiest way to verify parser behavior is to use the stdout Output.
# Configuration for parser testing
service:
flush: 1
log_level: debug
pipeline:
inputs:
- name: tail
tag: test
path: /tmp/test.log
parser: my_custom_parser
outputs:
- name: stdout
match: test
format: json_lines
# Generate test log and verify
echo '2026-03-01T12:00:00.000Z ERROR [main] App - Connection failed' >> /tmp/test.log
fluent-bit -c test.yaml
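Parser regexes can also be checked offline before touching the agent. Fluent Bit uses Oniguruma-style named groups `(?<name>...)` while Python's re module wants `(?P<name>...)`, so a quick translation lets you unit-test a pattern. A sketch using the nginx error format (with the pid#tid separator):

```python
import re

# Fluent Bit parser regex (Oniguruma-style named groups)
flb_regex = (r"^(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
             r"\[(?<level>\w+)\] (?<pid>\d+)#(?<tid>\d+): (?<message>.*)$")

# Translate named-group syntax for Python's re module
py_regex = flb_regex.replace("(?<", "(?P<")

line = "2026/03/01 12:00:00 [error] 1234#5678: upstream timed out"
m = re.match(py_regex, line)
print(m.groupdict())
```

If the match comes back None or a group captures the wrong span, the pattern needs fixing before it goes into parsers.conf.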
7. Filter Plugin Details
Filter plugins are the intermediate stage for processing collected data. Multiple Filters can be chained in order to compose complex transformation pipelines.
7.1 kubernetes: Pod Metadata Enrichment
The most important filter in Kubernetes environments, automatically adding metadata such as Pod name, Namespace, Labels, and Annotations to container logs.
pipeline:
filters:
- name: kubernetes
match: kube.*
kube_url: https://kubernetes.default.svc:443
kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
merge_log: on # Merge JSON logs to top level
merge_log_key: log_parsed # Merge key name
keep_log: off # Remove original log field
k8s-logging.parser: on # Use parser settings from Pod annotations
k8s-logging.exclude: on # Exclude logs via Pod annotations
labels: on # Include Pod Labels
annotations: off # Whether to include Pod Annotations
buffer_size: 0 # API response buffer (0=unlimited)
kube_meta_cache_ttl: 300 # Metadata cache TTL (seconds)
use_kubelet: false # Whether to use kubelet API
Example record structure after enrichment:
{
"log": "Connection established to database",
"kubernetes": {
"pod_name": "api-server-7d4b8c6f5-x2k9j",
"namespace_name": "production",
"pod_id": "abc-123-def",
"container_name": "api-server",
"container_image": "myregistry/api-server:v2.1.0",
"labels": {
"app": "api-server",
"version": "v2.1.0",
"team": "backend"
},
"host": "node-01"
}
}
7.2 modify: Add/Remove/Rename Fields
pipeline:
filters:
- name: modify
  match: "*"
  # Add fields (applied only when the key does not already exist)
  add:
    - environment production
    - cluster_name main-cluster
  # Rename fields
  rename:
    - log message
    - stream source
  # Remove fields
  remove: unwanted_field
  # Set fields (created, or overwritten if the key already exists)
  set:
    - default_level INFO
  # Hard copy
  copy:
    - source source_backup
7.3 grep: Regex-Based Filtering
Passes only records matching specific patterns or excludes them.
pipeline:
filters:
# Pass only ERROR or WARN levels
- name: grep
match: app.*
regex: level (ERROR|WARN)
# Exclude healthcheck paths
- name: grep
match: access.*
exclude: path /health
# Pass only specific namespaces
- name: grep
match: kube.*
regex: $kubernetes['namespace_name'] ^(production|staging)$
# Combine multiple conditions (AND)
- name: grep
  match: app.*
  regex:
    - level ERROR
    - message .*timeout.*
7.4 record_modifier: Record Modification
pipeline:
filters:
- name: record_modifier
match: "*"
record:
  - hostname ${HOSTNAME}
  - service_name my-application
remove_key: unnecessary_field
# allowlist_key keeps only the listed keys and drops all others
allowlist_key:
  - timestamp
  - level
  - message
7.5 nest: Nested Structure Conversion
Converts flat structures to nested structures or vice versa.
pipeline:
filters:
# Nest: flat -> nested
- name: nest
match: '*'
operation: nest
wildcard: 'app_*'
nest_under: application
# app_name, app_version -> application: { name, version }
# Lift: nested -> flat
- name: nest
match: '*'
operation: lift
nested_under: kubernetes
# kubernetes: { pod_name, namespace } -> pod_name, namespace
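For intuition, here is what the nest and lift operations do to a record, sketched in plain Python (illustrative only, not Fluent Bit internals). Note that nest keeps the full original key names unless remove_prefix is also configured:

```python
def nest(record, prefix, nest_under):
    """Move keys matching prefix* under a new nested map (like operation: nest)."""
    nested = {k: v for k, v in record.items() if k.startswith(prefix)}
    rest = {k: v for k, v in record.items() if not k.startswith(prefix)}
    rest[nest_under] = nested
    return rest

def lift(record, nested_under):
    """Flatten one nested map back to the top level (like operation: lift)."""
    out = {k: v for k, v in record.items() if k != nested_under}
    out.update(record.get(nested_under, {}))
    return out

r = {"app_name": "api", "app_version": "2.1", "level": "INFO"}
nested = nest(r, "app_", "application")
print(nested)
print(lift(nested, "application"))
```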
7.6 lua: Custom Transformation with Lua Scripts
The most flexible transformation method, capable of implementing complex logic with Lua scripts.
pipeline:
filters:
- name: lua
match: app.*
script: /fluent-bit/scripts/transform.lua
call: process_log
-- /fluent-bit/scripts/transform.lua
function process_log(tag, timestamp, record)
-- Normalize log level
if record["level"] then
record["level"] = string.upper(record["level"])
end
-- Mask sensitive information
if record["message"] then
record["message"] = string.gsub(
record["message"],
"%d%d%d%d%-%d%d%d%d%-%d%d%d%d%-%d%d%d%d",
"****-****-****-****"
)
end
-- Add grade based on response time
if record["response_time"] then
local rt = tonumber(record["response_time"])
if rt > 5000 then
record["performance"] = "critical"
elseif rt > 1000 then
record["performance"] = "slow"
else
record["performance"] = "normal"
end
end
-- Add timestamp field
record["processed_at"] = os.date("!%Y-%m-%dT%H:%M:%SZ")
-- 2 = MODIFIED
return 2, timestamp, record
end
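The Lua pattern `%d%d%d%d%-%d%d%d%d%-%d%d%d%d%-%d%d%d%d` in the masking step corresponds to the regex `\d{4}-\d{4}-\d{4}-\d{4}`, so the substitution logic can be sanity-checked outside Fluent Bit (Python is used here purely for illustration):

```python
import re

def mask_card_numbers(message: str) -> str:
    # Same effect as the string.gsub call in transform.lua
    return re.sub(r"\d{4}-\d{4}-\d{4}-\d{4}", "****-****-****-****", message)

masked = mask_card_numbers("payment with 1234-5678-9012-3456 accepted")
print(masked)  # payment with ****-****-****-**** accepted
```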
Lua callback function return values:
| Code | Meaning |
|---|---|
| -1 | Drop record |
| 0 | Keep original |
| 1 | Timestamp only modified |
| 2 | Record modified |
7.7 rewrite_tag: Tag-Based Routing Changes
Dynamically changes Tags based on record content to route to different Outputs.
pipeline:
filters:
- name: rewrite_tag
match: kube.*
rule:
  - $kubernetes['namespace_name'] ^(production)$ prod.$TAG false
  - $kubernetes['namespace_name'] ^(staging)$ stg.$TAG false
  - $level ^(ERROR)$ alert.$TAG false
Rule syntax: rule: $KEY REGEX NEW_TAG KEEP_ORIGINAL
- $KEY: Field to match
- REGEX: Regex pattern
- NEW_TAG: New Tag
- KEEP_ORIGINAL: Whether to keep the original (true/false)
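Downstream, the rewritten Tags are consumed with ordinary match patterns (the hosts and the Slack webhook here are illustrative):

```yaml
pipeline:
  outputs:
    - name: es
      match: 'prod.*'
      host: es-production.logging.svc
    - name: es
      match: 'stg.*'
      host: es-staging.logging.svc
    - name: slack
      match: 'alert.*'
      webhook: ${SLACK_WEBHOOK_URL}
```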
7.8 throttle: Log Rate Limiting
Protects the system by limiting excessive log generation.
pipeline:
filters:
- name: throttle
match: app.*
rate: 1000 # Average records allowed per interval
window: 5 # Window size (seconds)
interval: 1s # Evaluation interval
print_status: true # Print status
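The throttle window is a sliding average: rate is the number of records allowed per interval, averaged over window intervals, so the configuration above budgets roughly:

```python
rate = 1000      # records allowed per interval
window = 5       # intervals in the sliding window
interval_s = 1   # interval length in seconds
budget = rate * window
print(f"~{budget} records per {window * interval_s}s window")
```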
7.9 multiline: Multiline Log Merging
Merges multiline logs at the Filter stage (separate from Input multiline.parser).
pipeline:
filters:
- name: multiline
match: app.*
multiline.parser: java_stacktrace
multiline.key_content: log
8. Output Plugin Details
Output plugins transmit processed data to final destinations. A single Fluent Bit instance can use multiple Outputs simultaneously.
8.1 elasticsearch / opensearch
pipeline:
outputs:
# Elasticsearch
- name: es
match: kube.*
host: ${ES_HOST}
port: 9200
index: logs
type: _doc
http_user: ${ES_USER}
http_passwd: ${ES_PASSWORD}
logstash_format: on
logstash_prefix: kube-logs
logstash_dateformat: %Y.%m.%d
time_key: '@timestamp'
include_tag_key: true
tag_key: fluentbit_tag
generate_id: on # Generate deduplication ID
buffer_size: 512KB
tls: on
tls.verify: on
tls.ca_file: /certs/ca.pem
retry_limit: 5
workers: 2
suppress_type_name: on # ES 8.x compatibility
# OpenSearch
- name: opensearch
match: app.*
host: opensearch.logging.svc
port: 9200
index: app-logs
http_user: admin
http_passwd: ${OPENSEARCH_PASSWORD}
tls: on
suppress_type_name: on
trace_output: off
Key Configuration Options
| Setting | Default | Description |
|---|---|---|
| Host | 127.0.0.1 | Elasticsearch host |
| Port | 9200 | Port number |
| Index | fluent-bit | Index name |
| Logstash_Format | Off | Use date-based indexing |
| Logstash_Prefix | logstash | Index prefix |
| HTTP_User | - | Basic Auth username |
| HTTP_Passwd | - | Basic Auth password |
| TLS | Off | Enable TLS |
| Generate_ID | Off | Auto-generate document ID |
| Workers | 0 | Number of parallel Workers |
| Retry_Limit | 1 | Retry count (False=unlimited) |
| Suppress_Type_Name | Off | Disable ES 8.x _type |
8.2 loki: Grafana Loki Integration
pipeline:
outputs:
- name: loki
match: kube.*
host: loki-gateway.logging.svc
port: 3100
uri: /loki/api/v1/push
tenant_id: my-tenant
labels: job=fluent-bit
label_keys: $kubernetes['namespace_name'],$kubernetes['pod_name'],$kubernetes['container_name']
label_map_path: /fluent-bit/etc/loki-labelmap.json
remove_keys: kubernetes,stream
auto_kubernetes_labels: on
line_format: json
drop_single_key: on
http_user: ${LOKI_USER}
http_passwd: ${LOKI_PASSWORD}
tls: on
tls.verify: on
workers: 2
Loki Label Map File Example
{
"kubernetes": {
"namespace_name": "namespace",
"pod_name": "pod",
"container_name": "container",
"labels": {
"app": "app"
}
},
"stream": "stream"
}
8.3 s3: AWS S3 Storage
pipeline:
outputs:
- name: s3
match: archive.*
region: ap-northeast-2
bucket: my-log-bucket
s3_key_format: /logs/$TAG/%Y/%m/%d/%H/$UUID.gz
s3_key_format_tag_delimiters: .
total_file_size: 50M # Upload when this size is reached
upload_timeout: 10m # Upload after this time even if size not met
use_put_object: on
compression: gzip
content_type: application/gzip
store_dir: /tmp/fluent-bit-s3 # Local buffer directory
store_dir_limit_size: 512M
retry_limit: 5
# When using IAM roles (IRSA / EKS Pod Identity)
role_arn: arn:aws:iam::123456789012:role/fluent-bit-s3
# Optional custom endpoints (S3-compatible storage / STS)
# endpoint:
# sts_endpoint:
8.4 kafka: Kafka Integration
pipeline:
outputs:
- name: kafka
match: app.*
brokers: kafka-0:9092,kafka-1:9092,kafka-2:9092
topics: application-logs
timestamp_key: '@timestamp'
timestamp_format: iso8601
format: json
message_key: log
queue_full_retries: 10
rdkafka.request.required.acks: 1
rdkafka.log.connection.close: false
rdkafka.compression.codec: snappy
8.5 cloudwatch_logs: AWS CloudWatch
pipeline:
outputs:
- name: cloudwatch_logs
match: kube.*
region: ap-northeast-2
log_group_name: /eks/my-cluster/containers
log_group_template: /eks/my-cluster/$kubernetes['namespace_name']
log_stream_prefix: from-fluent-bit-
log_stream_template: $kubernetes['pod_name']
auto_create_group: on
extra_user_agent: fluent-bit
retry_limit: 5
workers: 1
8.6 stdout: Standard Output for Debugging
pipeline:
outputs:
- name: stdout
match: '*'
format: json_lines # json_lines | msgpack
8.7 forward: Forward to Fluentd
pipeline:
outputs:
- name: forward
match: '*'
host: fluentd-aggregator.logging.svc
port: 24224
time_as_integer: off
send_options: true
require_ack_response: true
# TLS
tls: on
tls.verify: on
tls.ca_file: /certs/ca.pem
# Shared key authentication
shared_key: my-shared-secret
self_hostname: fluent-bit-node01
8.8 prometheus_exporter: Metrics Exposure
pipeline:
outputs:
- name: prometheus_exporter
match: metrics.*
host: 0.0.0.0
port: 2021
add_label: app fluent-bit
9. Buffer and Backpressure Management
In large-scale environments, log generation rates may exceed transmission rates. Properly configuring Fluent Bit Buffer and Backpressure management ensures stable operation without data loss.
9.1 Memory Buffer vs Filesystem Buffer
| Category | Memory Buffer | Filesystem Buffer |
|---|---|---|
| Storage Location | RAM | Disk + RAM (hybrid) |
| Speed | Very fast | Relatively slower |
| Data Safety | Lost on process exit | Preserved on disk |
| Capacity Limit | Limited by RAM size | Expandable to disk size |
| Config Complexity | Simple | Additional config required |
| Best For | Development/testing | Production |
9.2 Service-Level Configuration
service:
flush: 1
log_level: info
# Enable filesystem buffer
storage.path: /var/log/fluent-bit/buffer/
storage.sync: normal # normal | full
storage.checksum: off # Data integrity verification
storage.max_chunks_up: 128 # Max chunks to keep in memory
storage.backlog.mem_limit: 5M # Backlog memory limit
storage.metrics: on # Enable storage metrics
9.3 Input-Level Buffer Configuration
pipeline:
inputs:
# Memory buffer (default)
- name: tail
tag: app.mem
path: /var/log/app/*.log
mem_buf_limit: 10MB # Memory buffer limit
# Filesystem buffer
- name: tail
tag: app.fs
path: /var/log/app/*.log
storage.type: filesystem # filesystem | memory
storage.pause_on_chunks_overlimit: off # Continue writing to disk when limit exceeded
9.4 Output-Level Buffer Configuration
pipeline:
outputs:
- name: es
match: '*'
host: elasticsearch
port: 9200
storage.total_limit_size: 1G # Per-Output filesystem buffer total limit
retry_limit: 10 # Retry count
# retry_limit: false # Unlimited retries
9.5 Backpressure Mechanism
Backpressure occurs when the Output cannot transmit data fast enough. Fluent Bit handles Backpressure in the following stages:
[Input: Data Collection]
|
v
[Store in Memory Buffer]
|
Memory limit reached?
/ \
No Yes
| |
v v
[Continue] storage.type=filesystem?
/ \
No Yes
| |
v v
[Pause Input] [Write to Disk]
(Risk of data loss) |
Disk limit reached?
/ \
No Yes
| |
v v
[Continue] [Pause Input]
Recommended Backpressure Configuration (Production)
service:
storage.path: /var/log/fluent-bit/buffer/
storage.sync: normal
storage.max_chunks_up: 128
storage.backlog.mem_limit: 10M
pipeline:
inputs:
- name: tail
tag: kube.*
path: /var/log/containers/*.log
storage.type: filesystem
storage.pause_on_chunks_overlimit: off
outputs:
- name: es
match: kube.*
host: elasticsearch
storage.total_limit_size: 5G
retry_limit: false
net.keepalive: on
net.keepalive_idle_timeout: 30
9.6 Mem_Buf_Limit vs storage.max_chunks_up
| Setting | Scope | Buffer Mode | Behavior |
|---|---|---|---|
| Mem_Buf_Limit | Per Input | memory only | Pauses Input when limit exceeded |
| storage.max_chunks_up | Global (SERVICE) | filesystem | Limits chunks in memory |
| storage.total_limit_size | Per Output | filesystem | Limits total disk buffer |
Note: When using storage.type: filesystem, Mem_Buf_Limit has no effect. Instead, storage.max_chunks_up controls the number of chunks in memory.
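Chunks are roughly 2MB each by default, so storage.max_chunks_up translates into an approximate memory ceiling for buffered data (the 2MB chunk size here is an assumption based on the documented default):

```python
max_chunks_up = 128   # from the SERVICE section above
chunk_size_mb = 2     # approximate default chunk size
ceiling_mb = max_chunks_up * chunk_size_mb
print(f"~{ceiling_mb} MB of chunk data held in memory at most")
```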
10. Kubernetes Integration Complete Guide
10.1 DaemonSet Deployment Architecture
+---------------------------------------------------------------------+
| Kubernetes Cluster |
| |
| +-------------------+ +-------------------+ +-------------------+|
| | Node 1 | | Node 2 | | Node 3 ||
| | | | | | ||
| | +------+ +------+ | | +------+ +------+ | | +------+ +------+ ||
| | |Pod A | |Pod B | | | |Pod C | |Pod D | | | |Pod E | |Pod F | ||
| | +--+---+ +--+---+ | | +--+---+ +--+---+ | | +--+---+ +--+---+ ||
| | | | | | | | | | | | ||
| | v v | | v v | | v v ||
| | /var/log/containers| | /var/log/containers| | /var/log/containers||
| | | | | | | | | ||
| | +----+-----+ | | +----+-----+ | | +----+-----+ ||
| | |Fluent Bit| | | |Fluent Bit| | | |Fluent Bit| ||
| | |(DaemonSet)| | | |(DaemonSet)| | | |(DaemonSet)| ||
| | +----+-----+ | | +----+-----+ | | +----+-----+ ||
| +---------|----------+ +---------|----------+ +---------|----------+|
| | | | |
| +----------+------------+----------+------------+ |
| | | |
| +-------v--------+ +--------v--------+ |
| | Elasticsearch | | Grafana Loki | |
| +----------------+ +-----------------+ |
+---------------------------------------------------------------------+
10.2 Helm Chart Installation
# Add repository
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
# Basic installation
helm install fluent-bit fluent/fluent-bit \
--namespace logging \
--create-namespace
# Verify installation
kubectl get pods -n logging -l app.kubernetes.io/name=fluent-bit
kubectl get ds -n logging
10.3 values.yaml Customization
# custom-values.yaml
kind: DaemonSet
image:
repository: cr.fluentbit.io/fluent/fluent-bit
tag: '3.2.2'
pullPolicy: IfNotPresent
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 256Mi
tolerations:
- operator: Exists # Deploy on all nodes (including master)
serviceAccount:
create: true
annotations:
# IRSA (EKS)
eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/fluent-bit
# Volume mounts
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
- name: etcmachineid
mountPath: /etc/machine-id
readOnly: true
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
- name: etcmachineid
hostPath:
path: /etc/machine-id
# Environment variables
env:
- name: ELASTICSEARCH_HOST
value: 'elasticsearch-master.logging.svc.cluster.local'
- name: ELASTICSEARCH_PORT
value: '9200'
# Fluent Bit configuration
config:
service: |
[SERVICE]
Flush 5
Log_Level info
Daemon off
Parsers_File /fluent-bit/etc/parsers.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
storage.path /var/log/fluent-bit/buffer/
storage.sync normal
storage.max_chunks_up 128
Hot_Reload On
inputs: |
[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/*.log
multiline.parser docker, cri
DB /var/log/flb_kube.db
Mem_Buf_Limit 5MB
Skip_Long_Lines On
Refresh_Interval 10
storage.type filesystem
filters: |
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc:443
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Kube_Tag_Prefix kube.var.log.containers.
Merge_Log On
Keep_Log Off
K8S-Logging.Parser On
K8S-Logging.Exclude On
Labels On
Annotations Off
outputs: |
[OUTPUT]
Name es
Match kube.*
Host ${ELASTICSEARCH_HOST}
Port ${ELASTICSEARCH_PORT}
Logstash_Format On
Logstash_Prefix kube
Retry_Limit False
Suppress_Type_Name On
# ServiceMonitor for Prometheus metrics collection
serviceMonitor:
enabled: true
interval: 30s
scrapeTimeout: 10s
# Liveness/Readiness Probe
livenessProbe:
httpGet:
path: /
port: http
readinessProbe:
httpGet:
path: /api/v1/health
port: http
10.4 Container Log Paths and CRI Parsing
In Kubernetes, container logs are stored at the following paths:
/var/log/containers/<pod-name>_<namespace>_<container-name>-<container-id>.log
-> symlink -> /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container-name>/0.log
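The filename itself encodes the pod, namespace, and container (underscores are safe separators because Kubernetes object names cannot contain them). A quick shell illustration with a hypothetical, shortened filename — real container IDs are 64-character hashes:

```shell
# Hypothetical container log filename following the
# <pod-name>_<namespace>_<container-name>-<container-id>.log convention.
f='myapp-7d9c_production_web-4f1a2b3c.log'
base=${f%.log}                 # strip the .log suffix
pod=${base%%_*}                # everything before the first '_'
rest=${base#*_}
ns=${rest%%_*}                 # everything before the next '_'
cont_id=${rest#*_}
cont=${cont_id%-*}             # container name (before the last '-')
cid=${cont_id##*-}             # container id (after the last '-')
echo "pod=$pod ns=$ns container=$cont id=$cid"
```

This is the same decomposition the kubernetes Filter performs internally (via Kube_Tag_Prefix) to look up pod metadata.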
CRI Log Format (containerd / CRI-O)
2026-03-01T12:00:00.123456789Z stdout F This is a log message
2026-03-01T12:00:00.123456789Z stderr P This is a partial log line
| Field | Description |
|---|---|
| 2026-03-01T12:00:00.123456789Z | RFC 3339 nanosecond timestamp |
| stdout / stderr | Stream type |
| F / P | Full (complete line) / Partial (line continues) flag |
| Remainder | Log message body |
Fluent Bit automatically detects and parses Docker format and CRI format when multiline.parser: docker, cri is specified.
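To make the four CRI fields concrete (this is purely illustrative; Fluent Bit performs this parsing for you), the sample line splits on its first three spaces:

```shell
# Split a CRI-format log line into timestamp, stream, flag, and message.
line='2026-03-01T12:00:00.123456789Z stdout F This is a log message'
ts=${line%% *};     rest=${line#* }     # field 1: RFC 3339 timestamp
stream=${rest%% *}; rest=${rest#* }     # field 2: stdout | stderr
flag=${rest%% *};   msg=${rest#* }      # field 3: F (full) | P (partial)
echo "time=$ts stream=$stream flag=$flag msg=$msg"
```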
10.5 Per-Namespace Log Routing
# Route to different Outputs based on Namespace
pipeline:
filters:
- name: kubernetes
match: kube.*
merge_log: on
labels: on
# Change tags for production namespace
- name: rewrite_tag
match: kube.*
rule: $kubernetes['namespace_name'] ^(production)$ prod.$TAG false
rule: $kubernetes['namespace_name'] ^(staging)$ stg.$TAG false
rule: $kubernetes['namespace_name'] ^(monitoring)$ mon.$TAG false
outputs:
# Production logs -> Elasticsearch
- name: es
match: prod.*
host: es-production.logging.svc
port: 9200
logstash_format: on
logstash_prefix: prod-logs
# Staging logs -> Loki
- name: loki
match: stg.*
host: loki.logging.svc
port: 3100
labels:
env=staging
# Monitoring logs -> S3 archiving
- name: s3
match: mon.*
bucket: monitoring-logs-archive
region: ap-northeast-2
total_file_size: 100M
upload_timeout: 10m
compression: gzip
# Other logs -> Default Elasticsearch
- name: es
match: kube.*
host: es-default.logging.svc
port: 9200
logstash_format: on
logstash_prefix: default-logs
10.6 Multi-Tenant Log Separation
pipeline:
filters:
- name: kubernetes
match: kube.*
merge_log: on
labels: on
# Tag rewriting based on team Label
- name: rewrite_tag
match: kube.*
rule: $kubernetes['labels']['team'] ^(backend)$ team.backend.$TAG false
rule: $kubernetes['labels']['team'] ^(frontend)$ team.frontend.$TAG false
rule: $kubernetes['labels']['team'] ^(data)$ team.data.$TAG false
outputs:
# Backend team -> Dedicated Elasticsearch index
- name: es
match: team.backend.*
host: elasticsearch.logging.svc
logstash_format: on
logstash_prefix: team-backend
# Frontend team -> Dedicated Loki tenant
- name: loki
match: team.frontend.*
host: loki.logging.svc
tenant_id: frontend-team
labels:
team=frontend
# Data team -> S3 + Elasticsearch dual
- name: es
match: team.data.*
host: elasticsearch.logging.svc
logstash_format: on
logstash_prefix: team-data
- name: s3
match: team.data.*
bucket: data-team-logs
region: ap-northeast-2
compression: gzip
10.7 ServiceAccount and RBAC Configuration
# fluent-bit-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: fluent-bit
namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: fluent-bit
rules:
- apiGroups: ['']
resources:
- namespaces
- pods
- pods/logs
- nodes
- nodes/proxy
verbs: ['get', 'list', 'watch']
- apiGroups: ['']
resources:
- events
verbs: ['get', 'list', 'watch']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: fluent-bit
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: fluent-bit
subjects:
- kind: ServiceAccount
name: fluent-bit
namespace: logging
11. Practical Pipeline Examples
11.1 Example 1: K8s Logs to Elasticsearch + Kibana
The most common EFK (Elasticsearch + Fluent Bit + Kibana) stack configuration.
# fluent-bit-efk.yaml
service:
flush: 5
log_level: info
parsers_file: /fluent-bit/etc/parsers.conf
http_server: on
http_listen: 0.0.0.0
http_port: 2020
storage.path: /var/log/fluent-bit/buffer/
storage.sync: normal
storage.max_chunks_up: 128
pipeline:
inputs:
- name: tail
tag: kube.*
path: /var/log/containers/*.log
multiline.parser: docker, cri
db: /var/log/flb_kube.db
mem_buf_limit: 5MB
skip_long_lines: on
refresh_interval: 10
storage.type: filesystem
filters:
- name: kubernetes
match: kube.*
kube_url: https://kubernetes.default.svc:443
kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
merge_log: on
keep_log: off
k8s-logging.parser: on
k8s-logging.exclude: on
labels: on
# Exclude kube-system and logging namespace logs
- name: grep
match: kube.*
exclude: $kubernetes['namespace_name'] ^(kube-system|logging)$
# Add environment info
- name: modify
match: kube.*
add: cluster_name my-eks-cluster
add: environment production
outputs:
- name: es
match: kube.*
host: elasticsearch-master.logging.svc.cluster.local
port: 9200
http_user: elastic
http_passwd: ${ES_PASSWORD}
logstash_format: on
logstash_prefix: kube-logs
logstash_dateformat: "%Y.%m.%d"
time_key: "@timestamp"
include_tag_key: true
tag_key: fluentbit_tag
suppress_type_name: on
generate_id: on
retry_limit: false
workers: 2
tls: on
tls.verify: on
tls.ca_file: /certs/es-ca.pem
11.2 Example 2: K8s Logs to Grafana Loki + Grafana
# fluent-bit-loki.yaml
service:
flush: 5
log_level: info
http_server: on
http_listen: 0.0.0.0
http_port: 2020
pipeline:
inputs:
- name: tail
tag: kube.*
path: /var/log/containers/*.log
multiline.parser: docker, cri
db: /var/log/flb_kube.db
mem_buf_limit: 5MB
skip_long_lines: on
filters:
- name: kubernetes
match: kube.*
merge_log: on
keep_log: off
labels: on
annotations: off
outputs:
- name: loki
match: kube.*
host: loki-gateway.logging.svc.cluster.local
port: 3100
uri: /loki/api/v1/push
labels: job=fluent-bit,cluster=my-cluster
label_keys: $kubernetes['namespace_name'],$kubernetes['pod_name'],$kubernetes['container_name']
auto_kubernetes_labels: on
line_format: json
workers: 2
retry_limit: false
11.3 Example 3: S3 Archiving + Elasticsearch Dual Output
A pattern that simultaneously sends a single input to two Outputs.
# fluent-bit-dual.yaml
service:
flush: 5
log_level: info
storage.path: /var/log/fluent-bit/buffer/
storage.sync: normal
storage.max_chunks_up: 128
pipeline:
inputs:
- name: tail
tag: kube.*
path: /var/log/containers/*.log
multiline.parser: docker, cri
db: /var/log/flb_kube.db
storage.type: filesystem
filters:
- name: kubernetes
match: kube.*
merge_log: on
keep_log: off
labels: on
outputs:
# Elasticsearch for real-time search/analysis
- name: es
match: kube.*
host: elasticsearch.logging.svc
port: 9200
logstash_format: on
logstash_prefix: kube-logs
suppress_type_name: on
retry_limit: false
workers: 2
# S3 archiving for long-term retention
- name: s3
match: kube.*
region: ap-northeast-2
bucket: log-archive-bucket
s3_key_format: /kubernetes/$TAG/%Y/%m/%d/%H/$UUID.gz
s3_key_format_tag_delimiters: .
total_file_size: 100M
upload_timeout: 10m
compression: gzip
content_type: application/gzip
store_dir: /tmp/fluent-bit-s3
store_dir_limit_size: 1G
use_put_object: on
retry_limit: 5
workers: 1
11.4 Example 4: Per-Namespace Filtering and Multi-Destination Routing
# fluent-bit-routing.yaml
service:
flush: 5
log_level: info
storage.path: /var/log/fluent-bit/buffer/
storage.sync: normal
pipeline:
inputs:
- name: tail
tag: kube.*
path: /var/log/containers/*.log
multiline.parser: docker, cri
db: /var/log/flb_kube.db
storage.type: filesystem
filters:
- name: kubernetes
match: kube.*
merge_log: on
labels: on
# Per-Namespace tag rewriting
- name: rewrite_tag
match: kube.*
rule: $kubernetes['namespace_name'] ^(production)$ route.prod.$TAG false
rule: $kubernetes['namespace_name'] ^(staging)$ route.stg.$TAG false
rule: $kubernetes['namespace_name'] ^(kube-system)$ route.sys.$TAG false
# Extract only ERRORs from production
- name: rewrite_tag
match: route.prod.*
rule: $log ^.*ERROR.*$ alert.prod.$TAG true
outputs:
# All production logs -> Elasticsearch
- name: es
match: route.prod.*
host: es-prod.logging.svc
port: 9200
logstash_format: on
logstash_prefix: prod-all
suppress_type_name: on
workers: 2
# Production ERROR alerts -> Separate index
- name: es
match: alert.prod.*
host: es-prod.logging.svc
port: 9200
logstash_format: on
logstash_prefix: prod-alerts
suppress_type_name: on
# Staging logs -> Loki (cost-efficient)
- name: loki
match: route.stg.*
host: loki.logging.svc
port: 3100
labels:
env=staging
auto_kubernetes_labels: on
line_format: json
# System logs -> S3 archiving (long-term retention)
- name: s3
match: route.sys.*
bucket: system-logs-archive
region: ap-northeast-2
total_file_size: 50M
upload_timeout: 15m
compression: gzip
s3_key_format: /kube-system/%Y/%m/%d/$UUID.gz
# Unclassified logs -> Default Elasticsearch
- name: es
match: kube.*
host: es-default.logging.svc
port: 9200
logstash_format: on
logstash_prefix: default-logs
suppress_type_name: on
12. Performance Tuning
12.1 Workers Configuration
Setting the Workers parameter on Output plugins enables parallel transmission. Each Worker operates as an independent thread.
pipeline:
outputs:
- name: es
match: '*'
host: elasticsearch
port: 9200
workers: 4 # 4 parallel Workers
net.keepalive: on
net.keepalive_idle_timeout: 30
Workers Configuration Guidelines
| Scenario | Recommended Workers | Notes |
|---|---|---|
| Low throughput | 0-1 | Default is sufficient |
| Medium throughput | 2-4 | Most production environments |
| High throughput | 4-8 | Consider CPU core count |
| Very high throughput | 8 or more | Check network/destination bottleneck |
12.2 Flush Interval Optimization
The Flush value in seconds determines how often data is transmitted from the buffer to the Output.
service:
flush: 1 # Flush every 1 second (prioritize real-time)
# flush: 5 # Flush every 5 seconds (prioritize throughput)
| Flush Value | Characteristics |
|---|---|
| 1 second | Low latency, higher CPU usage |
| 5 seconds | Balanced setting (default recommended) |
| 10+ seconds | Higher throughput, larger batch size |
12.3 Buffer Size Adjustment
pipeline:
inputs:
- name: tail
path: /var/log/containers/*.log
# High throughput environment
buffer_chunk_size: 512KB # Chunk unit size (default 32KB)
buffer_max_size: 5MB # Maximum buffer size (default 32KB)
mem_buf_limit: 50MB # Memory buffer limit
outputs:
- name: es
match: '*'
buffer_size: 512KB # HTTP buffer size
12.4 Pipeline Parallelization
In environments requiring high throughput, Input and Output can be separated to configure parallel pipelines.
pipeline:
inputs:
# Application log pipeline
- name: tail
tag: app.*
path: /var/log/containers/app-*.log
multiline.parser: docker, cri
threaded: on # Run in a separate thread
# System log pipeline
- name: tail
tag: sys.*
path: /var/log/containers/kube-*.log
multiline.parser: docker, cri
threaded: on
outputs:
- name: es
match: app.*
host: elasticsearch
workers: 4
- name: es
match: sys.*
host: elasticsearch
workers: 2
12.5 Hot Reload
The Hot Reload feature, supported since v2.1, allows configuration reloading without service interruption.
# Enable Hot Reload
service:
hot_reload: on
There are three ways to trigger a reload:
# Method 1: SIGHUP signal
kill -SIGHUP $(pidof fluent-bit)
# Method 2: HTTP API (requires hot_reload: on)
curl -X POST http://localhost:2020/api/v2/reload
# Method 3: Command line option
fluent-bit -c fluent-bit.yaml -Y # --enable-hot-reload
Hot Reload Considerations
- Data in the buffer is preserved
- For filesystem buffers, data on disk is also maintained
- If the new configuration has syntax errors, the reload fails and the existing configuration is retained
- SIGHUP is not supported on Windows
12.6 Memory Usage Monitoring
# Check Fluent Bit process memory
ps aux | grep fluent-bit
# Check internal metrics via HTTP API
curl -s http://localhost:2020/api/v1/storage | jq .
# Check via Prometheus metrics
curl -s http://localhost:2020/api/v2/metrics/prometheus | grep fluentbit_input
12.7 Performance Benchmark Reference
The following are reference performance figures for Fluent Bit in typical environments (results vary depending on hardware, network, and configuration).
| Scenario | Throughput | CPU Usage | Memory |
|---|---|---|---|
| Simple forwarding (tail input, stdout output) | ~100K events/s | 5-10% (1 core) | 5-10MB |
| K8s Filter + ES output | ~40-60K events/s | 15-25% (1 core) | 30-50MB |
| K8s Filter + Lua transform + ES output | ~20-40K events/s | 25-40% (1 core) | 40-80MB |
| Complex pipeline (multiple Filters + dual output) | ~15-30K events/s | 30-50% (1 core) | 50-100MB |
v3.2 SIMD optimization: JSON encoding is processed with SIMD (Single Instruction, Multiple Data) instructions, improving JSON conversion performance by up to 2.5x.
13. Monitoring and Observability
13.1 Built-in HTTP Monitoring Endpoints
Fluent Bit can monitor its own status through a built-in HTTP server.
service:
http_server: on
http_listen: 0.0.0.0
http_port: 2020
health_check: on
hc_errors_count: 5 # Unhealthy when this many errors occur
hc_retry_failure_count: 5 # Unhealthy when this many retry failures occur
hc_period: 60 # Health check interval (seconds)
Available Endpoints
| Endpoint | Description |
|---|---|
| / | Fluent Bit build information |
| /api/v1/health | Health check (200=OK, 500=Unhealthy) |
| /api/v1/metrics | JSON format metrics |
| /api/v1/metrics/prometheus | Prometheus format metrics |
| /api/v2/metrics | v2 metrics endpoint |
| /api/v2/metrics/prometheus | v2 Prometheus metrics |
| /api/v1/storage | Storage/buffer status |
| /api/v2/reload | Hot Reload trigger (POST) |
| /api/v1/uptime | Uptime |
# Health check
curl -s http://localhost:2020/api/v1/health
# Response: ok (HTTP 200) or error (HTTP 500)
# Check storage status
curl -s http://localhost:2020/api/v1/storage | jq .
# {
# "storage_layer": {
# "chunks": {
# "total_chunks": 15,
# "mem_chunks": 10,
# "fs_chunks": 5,
# "fs_chunks_up": 3,
# "fs_chunks_down": 2
# }
# }
# }
# Check Prometheus metrics
curl -s http://localhost:2020/api/v2/metrics/prometheus
13.2 Prometheus Metrics Collection Configuration
A configuration for collecting Fluent Bit internal metrics with Prometheus and visualizing them in Grafana.
# Expose Fluent Bit's own metrics to Prometheus
pipeline:
inputs:
- name: fluentbit_metrics
tag: fb.metrics
scrape_interval: 30
outputs:
- name: prometheus_exporter
match: fb.metrics
host: 0.0.0.0
port: 2021
Kubernetes ServiceMonitor Configuration
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: fluent-bit
namespace: logging
labels:
release: prometheus
spec:
selector:
matchLabels:
app.kubernetes.io/name: fluent-bit
endpoints:
- port: http
path: /api/v2/metrics/prometheus
interval: 30s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- logging
Key Prometheus Metrics
| Metric | Type | Description |
|---|---|---|
| fluentbit_input_records_total | Counter | Records collected per Input |
| fluentbit_input_bytes_total | Counter | Bytes collected per Input |
| fluentbit_output_proc_records_total | Counter | Records processed per Output |
| fluentbit_output_proc_bytes_total | Counter | Bytes processed per Output |
| fluentbit_output_errors_total | Counter | Errors per Output |
| fluentbit_output_retries_total | Counter | Retries per Output |
| fluentbit_output_retries_failed_total | Counter | Failed retries per Output |
| fluentbit_filter_records_total | Counter | Records processed per Filter |
| fluentbit_uptime | Gauge | Uptime (seconds) |
| fluentbit_storage_chunks | Gauge | Number of storage chunks |
13.3 Grafana Dashboards
Key panels to include when building a Fluent Bit monitoring dashboard in Grafana:
Dashboard Panel Configuration
| Panel | PromQL Example | Purpose |
|---|---|---|
| Input Throughput | rate(fluentbit_input_records_total[5m]) | Records collected per second |
| Output Throughput | rate(fluentbit_output_proc_records_total[5m]) | Records transmitted per second |
| Output Error Rate | rate(fluentbit_output_errors_total[5m]) | Errors per second |
| Retry Rate | rate(fluentbit_output_retries_total[5m]) | Retry trends |
| Buffer Usage | fluentbit_storage_chunks | Current buffer chunks |
| Uptime | fluentbit_uptime | Process stability |
| Input/Output Delta | rate(input[5m]) - rate(output[5m]) | Backpressure detection |
For a quick start, import the Grafana Labs community dashboard (ID: 7752), which provides a basic Fluent Bit monitoring layout out of the box.
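The Input/Output delta in the panel table can also drive alerting. A sketch using the Prometheus Operator's PrometheusRule CRD — the rule names, thresholds, and severity labels here are illustrative, not prescribed values:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: fluent-bit-alerts        # illustrative name
  namespace: logging
spec:
  groups:
    - name: fluent-bit
      rules:
        # Output has fallen behind Input for 10 minutes -> backpressure
        - alert: FluentBitBackpressure
          expr: |
            rate(fluentbit_input_records_total[5m])
              - rate(fluentbit_output_proc_records_total[5m]) > 100
          for: 10m
          labels:
            severity: warning
        # Any sustained Output errors
        - alert: FluentBitOutputErrors
          expr: rate(fluentbit_output_errors_total[5m]) > 0
          for: 5m
          labels:
            severity: critical
```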
14. Troubleshooting Guide
14.1 When Logs Are Not Being Collected
Checklist
# 1. Check Fluent Bit process status
kubectl get pods -n logging -l app.kubernetes.io/name=fluent-bit
kubectl logs -n logging <fluent-bit-pod> --tail=50
# 2. Validate configuration file syntax
fluent-bit -c fluent-bit.yaml --dry-run
# 3. Verify log file paths
kubectl exec -n logging <fluent-bit-pod> -- ls -la /var/log/containers/
# 4. Check DB file (offset)
kubectl exec -n logging <fluent-bit-pod> -- ls -la /var/log/flb_kube.db
# 5. Check permissions
kubectl exec -n logging <fluent-bit-pod> -- cat /var/log/containers/<target-log>
Common Causes and Solutions
| Cause | Symptom | Solution |
|---|---|---|
| Wrong file path | Record count 0 at Input | Verify Path, check wildcard patterns |
| Insufficient perms | Permission denied error | Check SecurityContext, hostPath permissions |
| Corrupt DB file | Offset points to end of file | Delete DB file and restart |
| Parser mismatch | Parsing failure, empty records | Verify original with stdout Output |
| Tag match failure | Not reaching Filter/Output | Verify Match pattern matches Tag |
14.2 Memory Leak / OOM
# Memory limit configuration
pipeline:
inputs:
- name: tail
path: /var/log/containers/*.log
mem_buf_limit: 5MB # Input memory limit
skip_long_lines: on # Skip long lines
buffer_chunk_size: 32KB # Chunk size limit
buffer_max_size: 32KB # Maximum buffer
service:
storage.path: /var/log/fluent-bit/buffer/
storage.max_chunks_up: 64 # Limit memory chunk count (default 128)
OOM Prevention Checklist
- Verify Mem_Buf_Limit or storage.max_chunks_up is configured
- Enable Skip_Long_Lines: On
- Remove unnecessary Filters (especially when Lua creates large tables)
- Check the Kubernetes Filter Buffer_Size (0 = unlimited)
- Set appropriate Kubernetes Resource Limits
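For the Kubernetes Filter point above, a minimal sketch of capping the buffer used for Kubernetes API responses (512k is an illustrative value; the point is to avoid 0, which disables the limit):

```yaml
pipeline:
  filters:
    - name: kubernetes
      match: kube.*
      # Cap the buffer for Kubernetes API metadata responses.
      # 0 means unlimited; 512k here is an illustrative cap.
      buffer_size: 512k
```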
14.3 Log Loss Due to Backpressure
# Check for Backpressure
curl -s http://localhost:2020/api/v1/storage | jq .
# Check via metrics
curl -s http://localhost:2020/api/v2/metrics/prometheus | grep -E "retries|errors|dropped"
Response Steps When Backpressure Occurs
- Increase the Output Workers count for parallel transmission
- Reduce the Flush interval for more frequent transmission (e.g., 5 to 1)
- Switch to storage.type: filesystem to utilize the disk buffer
- Set storage.total_limit_size sufficiently high
- Check and scale up destination (Elasticsearch, etc.) processing capacity
# Backpressure response configuration example
service:
flush: 1
storage.path: /var/log/fluent-bit/buffer/
storage.sync: normal
storage.max_chunks_up: 256
pipeline:
inputs:
- name: tail
tag: kube.*
path: /var/log/containers/*.log
storage.type: filesystem
storage.pause_on_chunks_overlimit: off
outputs:
- name: es
match: kube.*
host: elasticsearch
port: 9200
workers: 4
storage.total_limit_size: 10G
retry_limit: false
net.keepalive: on
net.keepalive_idle_timeout: 15
14.4 TLS/Authentication Errors
# TLS connection test
kubectl exec -n logging <fluent-bit-pod> -- \
curl -v --cacert /certs/ca.pem https://elasticsearch:9200
# Check certificate expiration
kubectl exec -n logging <fluent-bit-pod> -- \
openssl x509 -in /certs/ca.pem -noout -enddate
Common TLS Errors and Solutions
| Error | Cause | Solution |
|---|---|---|
| SSL_ERROR_SYSCALL | Certificate path error | Verify tls.ca_file path |
| certificate verify failed | CA certificate mismatch | Use correct CA certificate |
| certificate has expired | Certificate expired | Renew certificate |
| connection refused | Port/host error | Verify Host, Port, TLS port |
| 401 Unauthorized | Authentication failure | Verify http_user, http_passwd |
14.5 Debugging Methods
Step 1: Set Log Level to debug
service:
log_level: debug # error, warn, info, debug, trace
Step 2: Add stdout Output
pipeline:
outputs:
# For debugging: all records to standard output
- name: stdout
match: '*'
format: json_lines
# Actual Output
- name: es
match: '*'
host: elasticsearch
Step 3: Stage-by-Stage Pipeline Verification
# Test Input only
fluent-bit -i tail -p path=/var/log/test.log -o stdout
# Test Parser
fluent-bit -i tail -p path=/var/log/test.log -p parser=json -o stdout
# Test full configuration (dry-run)
fluent-bit -c fluent-bit.yaml --dry-run
Step 4: Real-Time Log Viewing in Kubernetes
# Stream Fluent Bit logs
kubectl logs -n logging -l app.kubernetes.io/name=fluent-bit -f --tail=100
# Specific Pod logs
kubectl logs -n logging fluent-bit-xxxxx -f
# Previous container logs (on crash)
kubectl logs -n logging fluent-bit-xxxxx --previous
15. Operational Best Practices
15.1 Production Checklist
| Item | Recommended Setting | Reason |
|---|---|---|
| storage.type | filesystem | Prevent data loss |
| storage.path | Separate volume mount | Isolate disk I/O |
| storage.total_limit_size | 50-70% of free disk space | Prevent disk full |
| Retry_Limit | false (unlimited) or sufficient | Recover from transient failures |
| Workers | 2-4 | Parallel transmission performance |
| Hot_Reload | on | Zero-downtime config changes |
| HTTP_Server | on | Enable monitoring |
| health_check | on | Kubernetes Probe integration |
| Skip_Long_Lines | on | Prevent failures from abnormal logs |
| Resource Limits | Appropriate CPU/Memory | Prevent OOM |
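Collected into one place, the service-level and pipeline-level rows of the checklist look roughly like the sketch below (host, path, and size values are illustrative):

```yaml
service:
  hot_reload: on
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2020
  health_check: on
  storage.path: /var/log/fluent-bit/buffer/   # mount as a separate volume
  storage.sync: normal
pipeline:
  inputs:
    - name: tail
      path: /var/log/containers/*.log
      skip_long_lines: on
      storage.type: filesystem
  outputs:
    - name: es
      match: '*'
      host: elasticsearch            # illustrative destination
      workers: 2
      storage.total_limit_size: 5G   # keep well under free disk space
      retry_limit: false             # unlimited retries
```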
15.2 Security Recommendations
# Kubernetes Pod Security
securityContext:
runAsNonRoot: true
runAsUser: 1000
readOnlyRootFilesystem: true
capabilities:
drop: ['ALL']
# Mount only required volumes
volumes:
- name: varlog
hostPath:
path: /var/log
type: ''
- name: buffer
emptyDir:
sizeLimit: 2Gi
# Enforce TLS (Output)
pipeline:
outputs:
- name: es
tls: on
tls.verify: on
tls.ca_file: /certs/ca.pem
15.3 Log Rotation Handling
Fluent Bit's tail Input automatically detects log rotation. However, the following settings should be verified:
pipeline:
inputs:
- name: tail
path: /var/log/app/*.log
db: /var/log/flb_app.db # Required: offset tracking
rotate_wait: 5 # Wait time after rotation (seconds)
refresh_interval: 10 # File list refresh interval
16. References
Official Documentation and Repositories
- Fluent Bit Official Documentation - Complete configuration reference and guides
- Fluent Bit GitHub - Source code and issue tracker
- Fluent Bit Helm Charts - Helm Charts for Kubernetes deployment
- Fluent Bit Performance Tools - Benchmark tools
CNCF Resources
- CNCF Fluent Bit v3 Announcement
- CNCF Fluent Bit v3.2 Announcement
- CNCF Fluentd to Fluent Bit Migration Guide
- CNCF Parsing 101 with Fluent Bit