Split View: Fluent Bit 완벽 가이드: 경량 로그 프로세서의 아키텍처, 설정, Kubernetes 연동까지 총정리

Fluent Bit 완벽 가이드: 경량 로그 프로세서의 아키텍처, 설정, Kubernetes 연동까지 총정리

1. Fluent Bit 소개
2. 핵심 아키텍처: 파이프라인 구조
3. 설치 방법
4. 설정 파일 구조
5. Input 플러그인 상세
6. Parser 설정
7. Filter 플러그인 상세
8. Output 플러그인 상세
9. Buffer와 Backpressure 관리
10. Kubernetes 연동 완벽 가이드
11. 실전 파이프라인 예제
12. 성능 튜닝
13. 모니터링 및 Observability
14. 트러블슈팅 가이드
15. 운영 Best Practices
16. 참고 자료

1. Fluent Bit 소개

1.1 Fluent Bit이란?

Fluent Bit은 C 언어로 작성된 초경량 텔레메트리 에이전트로, 다양한 소스에서 로그(Logs), 메트릭(Metrics), **트레이스(Traces)**를 수집하고, 처리한 뒤 원하는 목적지로 전달하는 역할을 한다. 바이너리 크기 약 450KB, 메모리 사용량 1MB 미만이라는 극도로 가벼운 풋프린트를 자랑하며, 임베디드 시스템부터 대규모 Kubernetes 클러스터까지 폭넓게 활용된다.

Fluent Bit은 CNCF(Cloud Native Computing Foundation) Graduated 프로젝트로, Kubernetes, Prometheus, Envoy 등과 같은 레벨의 성숙도를 인정받았다. Fluentd 프로젝트의 산하에서 2019년에 졸업 지위를 획득했으며, 2024년 기준 DockerHub에서 130억 회 이상 다운로드되며 클라우드 네이티브 로그 수집의 사실상 표준으로 자리잡았다.

2024년 3월 KubeCon + CloudNativeCon EU에서 Fluent Bit v3가 발표되었으며, 같은 해 12월 v3.2, 2025년 3월에는 v4.0이 출시되어 YAML 표준 설정, Processor 지원, SIMD 기반 JSON 인코딩(2.5배 성능 향상), OpenTelemetry 강화 등 지속적으로 진화하고 있다.

1.2 핵심 특징

초경량: C 기반, ~450KB 바이너리, 1MB 미만 메모리 사용
고성능: 비동기 I/O, 멀티스레드 파이프라인, SIMD 최적화
플러그인 아키텍처: 100개 이상의 Input/Filter/Output 플러그인
통합 텔레메트리: Logs, Metrics, Traces 단일 에이전트로 처리
YAML 네이티브: v3.2부터 YAML이 표준 설정 포맷
Hot Reload: 서비스 중단 없이 설정 재로드 (SIGHUP / HTTP API)
크로스 플랫폼: Linux, macOS, Windows, BSD, 임베디드 Linux 지원

1.3 Fluent Bit vs Fluentd 비교

항목	Fluent Bit	Fluentd
개발 언어	C	Ruby + C
바이너리 크기	~450KB	~40MB
메모리 사용량	~1MB	~30-40MB
플러그인 수	100개 이상 (내장)	1,000개 이상 (gem 포함)
성능(처리량)	매우 높음	높음
CPU 사용량	낮음	Fluent Bit 대비 4배
설정 형식	INI / YAML	Ruby DSL
주요 용도	에지/노드 레벨 수집 및 전달	중앙 집중식 로그 집계
Kubernetes	DaemonSet으로 노드별 배포	Aggregator로 중앙 배포
CNCF 상태	Graduated (Fluentd 산하)	Graduated
적합 환경	IoT, 컨테이너, 에지	대규모 집계, 복잡한 라우팅

일반적 권장 아키텍처: Fluent Bit을 각 노드에 DaemonSet으로 배포하여 로그를 수집하고, 필요시 중앙의 Fluentd Aggregator로 전달하는 하이브리드 패턴이 가장 널리 사용된다. 다만 Fluent Bit의 기능이 꾸준히 강화되면서 Fluentd 없이 Fluent Bit만으로 전체 파이프라인을 구성하는 사례도 빠르게 늘고 있다.

1.4 아키텍처 개요

+------------------------------------------------------------------+
|                        Fluent Bit Engine                         |
|                                                                  |
|  +--------+   +--------+   +--------+   +--------+   +--------+ |
|  | Input  |-->| Parser |-->| Filter |-->| Buffer |-->| Output | |
|  +--------+   +--------+   +--------+   +--------+   +--------+ |
|  | tail   |   | json   |   | k8s    |   | memory |   | es     | |
|  | systemd|   | regex  |   | grep   |   | filesys|   | loki   | |
|  | forward|   | logfmt |   | modify |   |        |   | s3     | |
|  | http   |   | cri    |   | lua    |   |        |   | kafka  | |
|  | tcp    |   | docker |   | nest   |   |        |   | stdout | |
|  +--------+   +--------+   +--------+   +--------+   +--------+ |
|                                                                  |
|  [Scheduler]  [Router / Tag Matching]  [HTTP Server / Monitoring]|
+------------------------------------------------------------------+

Fluent Bit의 데이터 처리는 파이프라인(Pipeline) 구조를 따르며, 각 단계가 명확하게 분리되어 있다. 이 구조 덕분에 모듈별 독립적 확장과 교체가 가능하다.

2. 핵심 아키텍처: 파이프라인 구조

2.1 파이프라인 전체 흐름

Fluent Bit의 데이터 처리 파이프라인은 다음과 같은 단계로 구성된다.

  [Data Source]
       |
       v
  +---------+     +---------+     +---------+     +---------+     +---------+
  |  INPUT  | --> | PARSER  | --> | FILTER  | --> | BUFFER  | --> | OUTPUT  |
  |         |     |         |     |         |     |         |     |         |
  | 데이터   |     | 비정형   |     | 데이터   |     | 메모리/  |     | 최종    |
  | 수집     |     | -> 정형  |     | 가공     |     | 디스크   |     | 전달    |
  +---------+     +---------+     +---------+     +---------+     +---------+
       |               |               |               |               |
    Tag 부여       구조화 변환      enrichment      안정성 보장    목적지 전송
                                   필터링                          (Tag 매칭)

2.2 각 단계별 역할

Input (입력)

데이터의 진입점이다. 파일, 시스템 저널, 네트워크 소켓, Kubernetes 이벤트 등 다양한 소스에서 데이터를 수집한다. 모든 입력 데이터에는 Tag가 부여되며, 이 Tag는 이후 라우팅에 사용된다.

Parser (파서)

비정형(unstructured) 텍스트 데이터를 정형(structured) 데이터로 변환한다. JSON, 정규식(Regex), Logfmt, Docker, CRI 등 다양한 파서를 제공하며, Input 플러그인 단계에서 적용된다.

Filter (필터)

수집된 데이터를 가공하는 단계이다. 필드 추가/삭제, Kubernetes 메타데이터 enrichment, 정규식 기반 필터링, Lua 스크립트 변환 등을 수행한다. 여러 Filter를 체이닝하여 복잡한 변환 로직을 구성할 수 있다.

Buffer (버퍼)

Filter를 통과한 데이터는 Output으로 전달되기 전에 버퍼에 저장된다. 메모리 버퍼와 파일시스템 버퍼 두 가지 방식을 지원하며, 파일시스템 버퍼를 사용하면 장애 시에도 데이터 손실을 방지할 수 있다.

Router (라우터)

Tag 매칭 규칙에 따라 데이터를 적절한 Output으로 라우팅한다. 와일드카드(*) 매칭을 지원하며, 하나의 입력을 여러 Output으로 동시에 전달(fan-out)할 수 있다.

Output (출력)

최종 목적지로 데이터를 전송한다. Elasticsearch, Loki, S3, Kafka, CloudWatch, Prometheus 등 다양한 백엔드를 지원한다. 전송 실패 시 자동 재시도(Retry) 메커니즘이 작동한다.

2.3 멀티 파이프라인 구조

Fluent Bit은 단일 인스턴스에서 여러 개의 독립적인 파이프라인을 동시에 운영할 수 있다. 각 파이프라인은 고유한 Input, Filter, Output 조합을 가지며, Tag 기반 라우팅으로 서로 다른 데이터 흐름을 분리한다.

Pipeline A: [tail: app-*.log] --tag:app--> [filter:k8s] --> [output:elasticsearch]
Pipeline B: [tail: sys-*.log] --tag:sys--> [filter:grep] --> [output:loki]
Pipeline C: [forward:24224]   --tag:fwd--> [filter:lua]  --> [output:s3]

이 구조는 다음과 같은 이점을 제공한다.

격리성: 파이프라인 간 독립적 동작으로 장애 전파 방지
유연성: 용도별로 다른 처리 로직과 목적지 지정
효율성: 단일 에이전트로 다양한 데이터 흐름 처리

3. 설치 방법

3.1 Linux (apt/yum)

Ubuntu/Debian (apt)

# GPG 키 및 저장소 추가
curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

# 또는 수동 설치
wget -qO - https://packages.fluentbit.io/fluentbit.key | sudo apt-key add -
echo "deb https://packages.fluentbit.io/ubuntu/$(lsb_release -cs) $(lsb_release -cs) main" | \
  sudo tee /etc/apt/sources.list.d/fluent-bit.list

sudo apt-get update
sudo apt-get install -y fluent-bit

# 서비스 시작
sudo systemctl start fluent-bit
sudo systemctl enable fluent-bit

CentOS/RHEL (yum)

cat > /etc/yum.repos.d/fluent-bit.repo << 'EOF'
[fluent-bit]
name=Fluent Bit
baseurl=https://packages.fluentbit.io/centos/$releasever/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1
EOF

sudo yum install -y fluent-bit
sudo systemctl start fluent-bit
sudo systemctl enable fluent-bit

3.2 Docker

# 최신 버전 실행
docker run -ti cr.fluentbit.io/fluent/fluent-bit:latest

# 설정 파일 마운트
docker run -ti \
  -v /path/to/fluent-bit.yaml:/fluent-bit/etc/fluent-bit.yaml \
  -v /var/log:/var/log \
  cr.fluentbit.io/fluent/fluent-bit:latest \
  /fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.yaml

# Docker Compose 예시
cat > docker-compose.yaml << 'EOF'
version: '3.8'
services:
  fluent-bit:
    image: cr.fluentbit.io/fluent/fluent-bit:latest
    volumes:
      - ./fluent-bit.yaml:/fluent-bit/etc/fluent-bit.yaml
      - /var/log:/var/log:ro
    ports:
      - "2020:2020"   # HTTP 모니터링
      - "24224:24224"  # Forward 프로토콜
EOF

3.3 macOS (Homebrew)

brew install fluent-bit

# 실행
fluent-bit -c /opt/homebrew/etc/fluent-bit/fluent-bit.conf

# 또는 YAML 설정 파일로 실행
fluent-bit -c /path/to/fluent-bit.yaml

3.4 Kubernetes (Helm Chart)

# Helm 저장소 추가
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update

# 기본 설치
helm install fluent-bit fluent/fluent-bit \
  --namespace logging \
  --create-namespace

# values.yaml 커스터마이징으로 설치
helm install fluent-bit fluent/fluent-bit \
  --namespace logging \
  --create-namespace \
  -f custom-values.yaml

# 업그레이드
helm upgrade fluent-bit fluent/fluent-bit \
  --namespace logging \
  -f custom-values.yaml

3.5 바이너리 직접 설치

# GitHub Releases에서 바이너리 다운로드
FLUENT_BIT_VERSION=3.2.2
wget https://github.com/fluent/fluent-bit/releases/download/v${FLUENT_BIT_VERSION}/fluent-bit-${FLUENT_BIT_VERSION}-linux-x86_64.tar.gz

tar xzf fluent-bit-${FLUENT_BIT_VERSION}-linux-x86_64.tar.gz
cd fluent-bit-${FLUENT_BIT_VERSION}-linux-x86_64/

# 실행
./bin/fluent-bit -c conf/fluent-bit.yaml

# 버전 확인
./bin/fluent-bit --version

4. 설정 파일 구조

Fluent Bit은 **Classic 모드(INI 형식)**와 YAML 모드 두 가지 설정 형식을 지원한다. v3.2부터 YAML이 표준 설정 포맷이며, Classic 모드는 2025년 말 deprecated 예정이다.

4.1 Classic 모드 (fluent-bit.conf)

# fluent-bit.conf - Classic INI 형식

[SERVICE]
    Flush         5
    Daemon        Off
    Log_Level     info
    Parsers_File  parsers.conf
    HTTP_Server   On
    HTTP_Listen   0.0.0.0
    HTTP_Port     2020
    Hot_Reload    On

[INPUT]
    Name          tail
    Path          /var/log/containers/*.log
    Parser        cri
    Tag           kube.*
    Mem_Buf_Limit 5MB
    Skip_Long_Lines On
    Refresh_Interval 10
    DB            /var/log/flb_kube.db

[FILTER]
    Name          kubernetes
    Match         kube.*
    Kube_URL      https://kubernetes.default.svc:443
    Kube_CA_File  /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_Log     On
    K8S-Logging.Parser On
    K8S-Logging.Exclude On

[OUTPUT]
    Name          es
    Match         kube.*
    Host          elasticsearch.logging.svc.cluster.local
    Port          9200
    Logstash_Format On
    Logstash_Prefix kube
    Retry_Limit   False

4.2 YAML 모드 (fluent-bit.yaml)

# fluent-bit.yaml - YAML 형식 (v3.2+ 표준)
service:
  flush: 5
  daemon: off
  log_level: info
  parsers_file: parsers.conf
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2020
  hot_reload: on

pipeline:
  inputs:
    - name: tail
      path: /var/log/containers/*.log
      parser: cri
      tag: kube.*
      mem_buf_limit: 5MB
      skip_long_lines: on
      refresh_interval: 10
      db: /var/log/flb_kube.db

  filters:
    - name: kubernetes
      match: kube.*
      kube_url: https://kubernetes.default.svc:443
      kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      merge_log: on
      k8s-logging.parser: on
      k8s-logging.exclude: on

  outputs:
    - name: es
      match: kube.*
      host: elasticsearch.logging.svc.cluster.local
      port: 9200
      logstash_format: on
      logstash_prefix: kube
      retry_limit: false

4.3 두 형식 비교

항목	Classic (INI)	YAML
파일 확장자	.conf	.yaml / .yml
상태	Deprecated 예정 (2025 말)	표준 (v3.2+)
Processor 지원	미지원	지원
가독성	보통	우수
중첩 구조	제한적	완전 지원
배열/리스트	미지원	지원
주석	#	#

4.4 환경변수 사용

두 형식 모두 ${ENV_VAR} 문법으로 환경변수를 참조할 수 있다.

# YAML 환경변수 예시
pipeline:
  outputs:
    - name: es
      match: '*'
      host: ${ELASTICSEARCH_HOST}
      port: ${ELASTICSEARCH_PORT}
      http_user: ${ES_USER}
      http_passwd: ${ES_PASSWORD}
      tls: ${ES_TLS_ENABLED}

# Classic 환경변수 예시
[OUTPUT]
    Name          es
    Match         *
    Host          ${ELASTICSEARCH_HOST}
    Port          ${ELASTICSEARCH_PORT}
    HTTP_User     ${ES_USER}
    HTTP_Passwd   ${ES_PASSWORD}

v4.0부터는 file:// 프리픽스를 사용하여 파일시스템에서 시크릿 값을 안전하게 참조할 수 있다.

pipeline:
  outputs:
    - name: es
      http_passwd: file:///run/secrets/es-password

4.5 @INCLUDE 디렉티브

설정 파일을 모듈화하여 관리할 수 있다.

# fluent-bit.conf (Classic)
[SERVICE]
    Flush 5

@INCLUDE inputs.conf
@INCLUDE filters.conf
@INCLUDE outputs.conf

YAML에서는 includes 섹션을 활용한다.

# fluent-bit.yaml
includes:
  - inputs.yaml
  - filters.yaml
  - outputs.yaml

service:
  flush: 5

5. Input 플러그인 상세

Input 플러그인은 데이터 수집의 시작점이다. 각 Input에는 고유한 Tag가 부여되어 이후 Filter와 Output의 매칭 기준이 된다.

5.1 tail: 파일 로그 수집

가장 많이 사용되는 Input 플러그인으로, tail -f처럼 파일의 끝부터 새로운 라인을 실시간으로 읽는다.

pipeline:
  inputs:
    - name: tail
      tag: app.logs
      path: /var/log/app/*.log
      path_key: filename # 파일 경로를 레코드에 포함
      exclude_path: /var/log/app/debug.log # 특정 파일 제외
      parser: json # 기본 파서
      db: /var/log/flb_app.db # offset 저장 DB (재시작 시 이어서 수집)
      db.sync: normal # DB 동기화 모드
      refresh_interval: 10 # 파일 목록 갱신 주기 (초)
      read_from_head: false # true: 파일 처음부터 읽기
      skip_long_lines: on # 매우 긴 라인 스킵
      mem_buf_limit: 5MB # 메모리 버퍼 한도
      rotate_wait: 5 # 로테이션 대기 시간 (초)
      multiline.parser: docker, cri # 멀티라인 파서

주요 설정 항목 설명

설정	기본값	설명
`Path`	(필수)	로그 파일 경로 (와일드카드 지원)
`Path_Key`	-	파일 경로를 레코드에 키로 추가
`Exclude_Path`	-	제외할 파일 경로
`DB`	-	파일 offset 저장 SQLite DB 경로
`Refresh_Interval`	60	파일 목록 갱신 주기 (초)
`Read_from_Head`	false	파일 처음부터 읽을지 여부
`Skip_Long_Lines`	Off	Buffer_Max_Size 초과 라인 스킵
`Mem_Buf_Limit`	-	메모리 버퍼 한도
`Rotate_Wait`	5	로그 로테이션 후 대기 시간

5.2 systemd: systemd journal 수집

pipeline:
  inputs:
    - name: systemd
      tag: host.systemd
      systemd_filter: _SYSTEMD_UNIT=docker.service
      systemd_filter: _SYSTEMD_UNIT=kubelet.service
      read_from_tail: on
      strip_underscores: on
      db: /var/log/flb_systemd.db

5.3 forward: Fluentd 프로토콜 수신

pipeline:
  inputs:
    - name: forward
      tag: forward.incoming
      listen: 0.0.0.0
      port: 24224
      buffer_chunk_size: 1M
      buffer_max_size: 6M

5.4 http / tcp / udp: 네트워크 수신

pipeline:
  inputs:
    # HTTP 수신
    - name: http
      tag: http.logs
      listen: 0.0.0.0
      port: 9880
      successful_response_code: 201

    # TCP 수신
    - name: tcp
      tag: tcp.logs
      listen: 0.0.0.0
      port: 5170
      format: json

    # UDP 수신
    - name: udp
      tag: udp.logs
      listen: 0.0.0.0
      port: 5170
      format: json

5.5 kubernetes_events: K8s 이벤트 수집

pipeline:
  inputs:
    - name: kubernetes_events
      tag: kube.events
      kube_url: https://kubernetes.default.svc:443
      kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      interval_sec: 1
      retention_time: 1h

5.6 node_exporter_metrics / prometheus_scrape

pipeline:
  inputs:
    # 노드 메트릭 수집
    - name: node_exporter_metrics
      tag: node.metrics
      scrape_interval: 30

    # Prometheus 엔드포인트 스크레이핑
    - name: prometheus_scrape
      tag: prom.metrics
      host: 127.0.0.1
      port: 9090
      metrics_path: /metrics
      scrape_interval: 10s

5.7 fluentbit_metrics: 내부 메트릭

pipeline:
  inputs:
    - name: fluentbit_metrics
      tag: fb.metrics
      scrape_interval: 30
      scrape_on_start: true

6. Parser 설정

Parser는 비정형 텍스트 로그를 정형 데이터로 변환하는 핵심 컴포넌트이다. 별도의 parsers.conf 또는 YAML 파일에 정의한다.

6.1 내장 파서

Fluent Bit은 자주 사용되는 로그 형식에 대한 내장 파서를 제공한다.

파서	형식	용도
`json`	JSON	JSON 형식 로그
`docker`	JSON (Docker 특화)	Docker 컨테이너 로그
`cri`	정규식	CRI (containerd) 로그
`syslog-rfc5424`	정규식	RFC 5424 Syslog
`syslog-rfc3164`	정규식	RFC 3164 Syslog
`apache`	정규식	Apache 액세스 로그
`nginx`	정규식	Nginx 액세스 로그
`logfmt`	Logfmt	key=value 쌍 형식

6.2 커스텀 정규식 파서 작성

# parsers.conf

# Nginx 에러 로그 파서
[PARSER]
    Name        nginx_error
    Format      regex
    Regex       ^(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>\w+)\] (?<pid>\d+).(?<tid>\d+): (?<message>.*)$
    Time_Key    time
    Time_Format %Y/%m/%d %H:%M:%S

# Spring Boot 로그 파서
[PARSER]
    Name        spring_boot
    Format      regex
    Regex       ^(?<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)\s+(?<level>\w+)\s+(?<pid>\d+)\s+---\s+\[(?<thread>[^\]]+)\]\s+(?<logger>\S+)\s+:\s+(?<message>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L

# Apache Combined 로그 파서
[PARSER]
    Name        apache_combined
    Format      regex
    Regex       ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z

YAML 형식으로도 동일하게 정의할 수 있다.

# parsers.yaml
parsers:
  - name: nginx_error
    format: regex
    regex: '^(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>\w+)\] (?<pid>\d+).(?<tid>\d+): (?<message>.*)$'
    time_key: time
    time_format: '%Y/%m/%d %H:%M:%S'

  - name: spring_boot
    format: regex
    regex: '^(?<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)\s+(?<level>\w+)\s+(?<pid>\d+)\s+---\s+\[(?<thread>[^\]]+)\]\s+(?<logger>\S+)\s+:\s+(?<message>.*)$'
    time_key: time
    time_format: '%Y-%m-%dT%H:%M:%S.%L'

6.3 멀티라인 파서

Java Stack Trace와 같이 여러 라인에 걸친 로그를 하나의 레코드로 병합한다.

# 멀티라인 파서 정의
[MULTILINE_PARSER]
    Name          java_stacktrace
    Type          regex
    Flush_Timeout 1000
    # 첫 줄 패턴: 타임스탬프로 시작
    Rule          "start_state"  "/^\d{4}-\d{2}-\d{2}/"  "cont"
    # 연속 줄 패턴: 공백이나 Caused by로 시작
    Rule          "cont"         "/^\s+|^Caused by:/"      "cont"

[MULTILINE_PARSER]
    Name          python_traceback
    Type          regex
    Flush_Timeout 1000
    Rule          "start_state"  "/^Traceback/"            "python_tb"
    Rule          "python_tb"    "/^\s+/"                   "python_tb"
    Rule          "python_tb"    "/^\w+Error/"              "end"

Input에서 멀티라인 파서를 적용하는 방법은 다음과 같다.

pipeline:
  inputs:
    - name: tail
      tag: app.java
      path: /var/log/app/application.log
      multiline.parser: java_stacktrace
      read_from_head: true

6.4 Time_Key와 Time_Format

로그 메시지에서 타임스탬프를 추출하여 레코드의 시간으로 사용한다.

[PARSER]
    Name        custom_time
    Format      regex
    Regex       ^(?<time>[^ ]+) (?<message>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%LZ
    Time_Keep   On    # 변환 후에도 time 필드 유지
    Time_Offset +0900  # KST 타임존

6.5 파서 테스트 및 디버깅

파서 동작을 검증하는 가장 쉬운 방법은 stdout Output을 사용하는 것이다.

# 파서 테스트용 설정
service:
  flush: 1
  log_level: debug

pipeline:
  inputs:
    - name: tail
      tag: test
      path: /tmp/test.log
      parser: my_custom_parser

  outputs:
    - name: stdout
      match: test
      format: json_lines

# 테스트 로그 생성 및 확인
echo '2026-03-01T12:00:00.000Z ERROR [main] App - Connection failed' >> /tmp/test.log
fluent-bit -c test.yaml

7. Filter 플러그인 상세

Filter 플러그인은 수집된 데이터를 가공하는 중간 단계이다. 여러 Filter를 순서대로 체이닝하여 복잡한 변환 파이프라인을 구성할 수 있다.

7.1 kubernetes: Pod 메타데이터 enrichment

Kubernetes 환경에서 가장 중요한 필터로, 컨테이너 로그에 Pod 이름, Namespace, Labels, Annotations 등의 메타데이터를 자동으로 추가한다.

pipeline:
  filters:
    - name: kubernetes
      match: kube.*
      kube_url: https://kubernetes.default.svc:443
      kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      merge_log: on # JSON 로그를 최상위로 병합
      merge_log_key: log_parsed # 병합 키 이름
      keep_log: off # 원본 log 필드 제거
      k8s-logging.parser: on # Pod 어노테이션의 파서 설정 사용
      k8s-logging.exclude: on # Pod 어노테이션으로 로그 제외
      labels: on # Pod Labels 포함
      annotations: off # Pod Annotations 포함 여부
      buffer_size: 0 # API 응답 버퍼 (0=무제한)
      kube_meta_cache_ttl: 300 # 메타데이터 캐시 TTL (초)
      use_kubelet: false # kubelet API 사용 여부

enrichment 후 레코드 구조 예시는 다음과 같다.

{
  "log": "Connection established to database",
  "kubernetes": {
    "pod_name": "api-server-7d4b8c6f5-x2k9j",
    "namespace_name": "production",
    "pod_id": "abc-123-def",
    "container_name": "api-server",
    "container_image": "myregistry/api-server:v2.1.0",
    "labels": {
      "app": "api-server",
      "version": "v2.1.0",
      "team": "backend"
    },
    "host": "node-01"
  }
}

7.2 modify: 필드 추가/삭제/이름 변경

pipeline:
  filters:
    - name: modify
      match: "*"
      # 필드 추가
      add: environment production
      add: cluster_name main-cluster
      # 필드 이름 변경
      rename: log message
      rename: stream source
      # 필드 삭제
      remove: unwanted_field
      # 조건부 필드 추가 (키가 없을 때만)
      set: default_level INFO
      # 하드카피
      copy: source source_backup

7.3 grep: 정규식 기반 필터링

특정 패턴에 매칭되는 레코드만 통과시키거나 제외할 수 있다.

pipeline:
  filters:
    # ERROR 또는 WARN 레벨만 통과
    - name: grep
      match: app.*
      regex: level (ERROR|WARN)

    # healthcheck 경로 제외
    - name: grep
      match: access.*
      exclude: path /health

    # 특정 네임스페이스만 통과
    - name: grep
      match: kube.*
      regex: $kubernetes['namespace_name'] ^(production|staging)$

    # 여러 조건 조합 (AND)
    - name: grep
      match: app.*
      regex: level ERROR
      regex: message .*timeout.*

7.4 record_modifier: 레코드 변경

pipeline:
  filters:
    - name: record_modifier
      match: "*"
      record: hostname ${HOSTNAME}
      record: service_name my-application
      remove_key: unnecessary_field
      allowlist_key: timestamp
      allowlist_key: level
      allowlist_key: message

7.5 nest: 중첩 구조 변환

플랫 구조를 중첩 구조로 변환하거나 그 반대로 변환한다.

pipeline:
  filters:
    # 중첩(Nest): 플랫 -> 중첩
    - name: nest
      match: '*'
      operation: nest
      wildcard: 'app_*'
      nest_under: application
      # app_name, app_version -> application: { name, version }

    # 해제(Lift): 중첩 -> 플랫
    - name: nest
      match: '*'
      operation: lift
      nested_under: kubernetes
      # kubernetes: { pod_name, namespace } -> pod_name, namespace

7.6 lua: Lua 스크립트 기반 커스텀 변환

가장 유연한 변환 방법으로, Lua 스크립트로 복잡한 로직을 구현할 수 있다.

pipeline:
  filters:
    - name: lua
      match: app.*
      script: /fluent-bit/scripts/transform.lua
      call: process_log

-- /fluent-bit/scripts/transform.lua

function process_log(tag, timestamp, record)
    -- 로그 레벨 정규화
    if record["level"] then
        record["level"] = string.upper(record["level"])
    end

    -- 민감 정보 마스킹
    if record["message"] then
        record["message"] = string.gsub(
            record["message"],
            "%d%d%d%d%-%d%d%d%d%-%d%d%d%d%-%d%d%d%d",
            "****-****-****-****"
        )
    end

    -- 응답 시간에 따른 등급 추가
    if record["response_time"] then
        local rt = tonumber(record["response_time"])
        if rt > 5000 then
            record["performance"] = "critical"
        elseif rt > 1000 then
            record["performance"] = "slow"
        else
            record["performance"] = "normal"
        end
    end

    -- 타임스탬프 필드 추가
    record["processed_at"] = os.date("!%Y-%m-%dT%H:%M:%SZ")

    -- 2 = MODIFIED
    return 2, timestamp, record
end

Lua 콜백 함수의 반환값은 다음과 같다.

코드	의미
-1	레코드 삭제 (Drop)
0	원본 유지 (Keep)
1	타임스탬프만 변경
2	레코드 변경 (Modified)

7.7 rewrite_tag: 태그 기반 라우팅 변경

레코드의 내용에 따라 Tag를 동적으로 변경하여 다른 Output으로 라우팅할 수 있다.

pipeline:
  filters:
    - name: rewrite_tag
      match: kube.*
      rule: $kubernetes['namespace_name'] ^(production)$ prod.$TAG false
      rule: $kubernetes['namespace_name'] ^(staging)$    stg.$TAG  false
      rule: $level                        ^(ERROR)$      alert.$TAG false

Rule 문법: rule: $KEY REGEX NEW_TAG KEEP_ORIGINAL

$KEY: 매칭할 필드
REGEX: 정규식 패턴
NEW_TAG: 새로운 Tag
KEEP_ORIGINAL: 원본도 유지할지 여부 (true/false)

7.8 throttle: 로그 속도 제한

과도한 로그 생성을 제한하여 시스템을 보호한다.

pipeline:
  filters:
    - name: throttle
      match: app.*
      rate: 1000 # 윈도우당 허용량
      window: 5 # 윈도우 크기 (초)
      interval: 1s # 평가 주기
      print_status: true # 상태 출력

7.9 multiline: 멀티라인 로그 병합

Filter 단계에서 멀티라인 로그를 병합하는 방식이다 (Input의 multiline.parser와는 별도).

pipeline:
  filters:
    - name: multiline
      match: app.*
      multiline.parser: java_stacktrace
      multiline.key_content: log

8. Output 플러그인 상세

Output 플러그인은 처리된 데이터를 최종 목적지로 전송한다. 하나의 Fluent Bit 인스턴스에서 여러 Output을 동시에 사용할 수 있다.

8.1 elasticsearch / opensearch

pipeline:
  outputs:
    # Elasticsearch
    - name: es
      match: kube.*
      host: ${ES_HOST}
      port: 9200
      index: logs
      type: _doc
      http_user: ${ES_USER}
      http_passwd: ${ES_PASSWORD}
      logstash_format: on
      logstash_prefix: kube-logs
      logstash_dateformat: %Y.%m.%d
      time_key: '@timestamp'
      include_tag_key: true
      tag_key: fluentbit_tag
      generate_id: on # 중복 방지 ID 생성
      buffer_size: 512KB
      tls: on
      tls.verify: on
      tls.ca_file: /certs/ca.pem
      retry_limit: 5
      workers: 2
      suppress_type_name: on # ES 8.x 호환

    # OpenSearch
    - name: opensearch
      match: app.*
      host: opensearch.logging.svc
      port: 9200
      index: app-logs
      http_user: admin
      http_passwd: ${OPENSEARCH_PASSWORD}
      tls: on
      suppress_type_name: on
      trace_output: off

주요 설정 항목

설정	기본값	설명
`Host`	127.0.0.1	Elasticsearch 호스트
`Port`	9200	포트 번호
`Index`	fluent-bit	인덱스 이름
`Logstash_Format`	Off	날짜 기반 인덱스 사용
`Logstash_Prefix`	logstash	인덱스 접두어
`HTTP_User`	-	Basic Auth 사용자
`HTTP_Passwd`	-	Basic Auth 비밀번호
`TLS`	Off	TLS 활성화
`Generate_ID`	Off	문서 ID 자동 생성
`Workers`	0	병렬 Worker 수
`Retry_Limit`	1	재시도 횟수 (False=무한)
`Suppress_Type_Name`	Off	ES 8.x _type 비활성화

8.2 loki: Grafana Loki 연동

pipeline:
  outputs:
    - name: loki
      match: kube.*
      host: loki-gateway.logging.svc
      port: 3100
      uri: /loki/api/v1/push
      tenant_id: my-tenant
      labels: job=fluent-bit
      label_keys: $kubernetes['namespace_name'],$kubernetes['pod_name'],$kubernetes['container_name']
      label_map_path: /fluent-bit/etc/loki-labelmap.json
      remove_keys: kubernetes,stream
      auto_kubernetes_labels: on
      line_format: json
      drop_single_key: on
      http_user: ${LOKI_USER}
      http_passwd: ${LOKI_PASSWORD}
      tls: on
      tls.verify: on
      workers: 2

Loki Label Map 파일 예시

{
  "kubernetes": {
    "namespace_name": "namespace",
    "pod_name": "pod",
    "container_name": "container",
    "labels": {
      "app": "app"
    }
  },
  "stream": "stream"
}

8.3 s3: AWS S3 저장

pipeline:
  outputs:
    - name: s3
      match: archive.*
      region: ap-northeast-2
      bucket: my-log-bucket
      s3_key_format: /logs/$TAG/%Y/%m/%d/%H/$UUID.gz
      s3_key_format_tag_delimiters: .
      total_file_size: 50M # 이 크기에 도달하면 업로드
      upload_timeout: 10m # 크기 미달이라도 이 시간 후 업로드
      use_put_object: on
      compression: gzip
      content_type: application/gzip
      store_dir: /tmp/fluent-bit-s3 # 로컬 버퍼 디렉터리
      store_dir_limit_size: 512M
      retry_limit: 5
      # IAM 역할 사용 시 (IRSA / EKS Pod Identity)
      role_arn: arn:aws:iam::123456789012:role/fluent-bit-s3
      # 정적 자격증명 (비권장)
      # endpoint:
      # sts_endpoint:

8.4 kafka: Kafka 연동

pipeline:
  outputs:
    - name: kafka
      match: app.*
      brokers: kafka-0:9092,kafka-1:9092,kafka-2:9092
      topics: application-logs
      timestamp_key: '@timestamp'
      timestamp_format: iso8601
      format: json
      message_key: log
      queue_full_retries: 10
      rdkafka.request.required.acks: 1
      rdkafka.log.connection.close: false
      rdkafka.compression.codec: snappy

8.5 cloudwatch_logs: AWS CloudWatch

pipeline:
  outputs:
    - name: cloudwatch_logs
      match: kube.*
      region: ap-northeast-2
      log_group_name: /eks/my-cluster/containers
      log_group_template: /eks/my-cluster/$kubernetes['namespace_name']
      log_stream_prefix: from-fluent-bit-
      log_stream_template: $kubernetes['pod_name']
      auto_create_group: on
      extra_user_agent: fluent-bit
      retry_limit: 5
      workers: 1

8.6 stdout: 디버깅용 표준 출력

pipeline:
  outputs:
    - name: stdout
      match: '*'
      format: json_lines # json_lines | msgpack

8.7 forward: Fluentd로 전달

pipeline:
  outputs:
    - name: forward
      match: '*'
      host: fluentd-aggregator.logging.svc
      port: 24224
      time_as_integer: off
      send_options: true
      require_ack_response: true
      # TLS
      tls: on
      tls.verify: on
      tls.ca_file: /certs/ca.pem
      # 공유 키 인증
      shared_key: my-shared-secret
      self_hostname: fluent-bit-node01

8.8 prometheus_exporter: 메트릭 노출

pipeline:
  outputs:
    - name: prometheus_exporter
      match: metrics.*
      host: 0.0.0.0
      port: 2021
      add_label: app fluent-bit

9. Buffer와 Backpressure 관리

대규모 환경에서는 로그 생성 속도가 전송 속도를 초과할 수 있다. Fluent Bit의 Buffer와 Backpressure 관리를 올바르게 설정하면 데이터 손실 없이 안정적으로 운영할 수 있다.

9.1 메모리 버퍼 vs 파일시스템 버퍼

항목	메모리 버퍼	파일시스템 버퍼
저장 위치	RAM	디스크 + RAM (하이브리드)
속도	매우 빠름	상대적으로 느림
데이터 안전성	프로세스 종료 시 손실	디스크에 보존
용량 한계	RAM 크기 제한	디스크 크기까지 확장
설정 복잡도	단순	추가 설정 필요
적합 환경	개발/테스트	프로덕션

9.2 Service 레벨 설정

service:
  flush: 1
  log_level: info
  # 파일시스템 버퍼 활성화
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal # normal | full
  storage.checksum: off # 데이터 무결성 검증
  storage.max_chunks_up: 128 # 메모리에 올릴 최대 청크 수
  storage.backlog.mem_limit: 5M # 백로그 메모리 한도
  storage.metrics: on # 스토리지 메트릭 활성화

9.3 Input 레벨 버퍼 설정

pipeline:
  inputs:
    # 메모리 버퍼 사용 (기본)
    - name: tail
      tag: app.mem
      path: /var/log/app/*.log
      mem_buf_limit: 10MB # 메모리 버퍼 한도

    # 파일시스템 버퍼 사용
    - name: tail
      tag: app.fs
      path: /var/log/app/*.log
      storage.type: filesystem # filesystem | memory
      storage.pause_on_chunks_overlimit: off # 한도 초과 시 디스크에 계속 기록

9.4 Output 레벨 버퍼 설정

pipeline:
  outputs:
    - name: es
      match: '*'
      host: elasticsearch
      port: 9200
      storage.total_limit_size: 1G # Output별 파일시스템 버퍼 총량 제한
      retry_limit: 10 # 재시도 횟수
      # retry_limit: false          # 무한 재시도

9.5 Backpressure 메커니즘

Backpressure는 Output이 데이터를 충분히 빠르게 전송하지 못할 때 발생한다. Fluent Bit은 다음과 같은 단계로 Backpressure를 처리한다.

  [Input: 데이터 수집]
         |
         v
  [메모리 버퍼에 저장]
         |
    메모리 한도 도달?
    /              \
   No              Yes
   |                |
   v                v
 [계속 수집]   storage.type=filesystem?
                /              \
              No               Yes
              |                 |
              v                 v
       [Input 일시 정지]   [디스크에 기록]
       (데이터 손실 위험)        |
                          디스크 한도 도달?
                          /              \
                        No              Yes
                        |                |
                        v                v
                   [계속 기록]     [Input 일시 정지]

권장 Backpressure 설정 (프로덕션)

service:
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal
  storage.max_chunks_up: 128
  storage.backlog.mem_limit: 10M

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      storage.type: filesystem
      storage.pause_on_chunks_overlimit: off

  outputs:
    - name: es
      match: kube.*
      host: elasticsearch
      storage.total_limit_size: 5G
      retry_limit: false
      net.keepalive: on
      net.keepalive_idle_timeout: 30

9.6 Mem_Buf_Limit vs storage.max_chunks_up

설정	적용 범위	버퍼 모드	동작
`Mem_Buf_Limit`	Input별	memory만	한도 초과 시 Input 일시 정지
`storage.max_chunks_up`	전역 (SERVICE)	filesystem	메모리 내 청크 수 제한
`storage.total_limit_size`	Output별	filesystem	디스크 버퍼 총량 제한

주의: storage.type: filesystem을 사용하면 Mem_Buf_Limit은 효과가 없다. 대신 storage.max_chunks_up이 메모리 내 청크 수를 제어한다.

10. Kubernetes 연동 완벽 가이드

10.1 DaemonSet 배포 아키텍처

+---------------------------------------------------------------------+
|                      Kubernetes Cluster                             |
|                                                                     |
|  +-------------------+  +-------------------+  +-------------------+|
|  |     Node 1        |  |     Node 2        |  |     Node 3        ||
|  |                    |  |                    |  |                    ||
|  | +------+ +------+ |  | +------+ +------+ |  | +------+ +------+ ||
|  | |Pod A | |Pod B | |  | |Pod C | |Pod D | |  | |Pod E | |Pod F | ||
|  | +--+---+ +--+---+ |  | +--+---+ +--+---+ |  | +--+---+ +--+---+ ||
|  |    |        |      |  |    |        |      |  |    |        |      ||
|  |    v        v      |  |    v        v      |  |    v        v      ||
|  | /var/log/containers|  | /var/log/containers|  | /var/log/containers||
|  |         |          |  |         |          |  |         |          ||
|  |    +----+-----+    |  |    +----+-----+    |  |    +----+-----+    ||
|  |    |Fluent Bit|    |  |    |Fluent Bit|    |  |    |Fluent Bit|    ||
|  |    |(DaemonSet)|   |  |    |(DaemonSet)|   |  |    |(DaemonSet)|   ||
|  |    +----+-----+    |  |    +----+-----+    |  |    +----+-----+    ||
|  +---------|----------+  +---------|----------+  +---------|----------+|
|            |                       |                       |           |
|            +----------+------------+----------+------------+           |
|                       |                       |                        |
|               +-------v--------+     +--------v--------+              |
|               | Elasticsearch  |     |  Grafana Loki   |              |
|               +----------------+     +-----------------+              |
+---------------------------------------------------------------------+

10.2 Helm Chart 설치

# 저장소 추가
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update

# 기본 설치
helm install fluent-bit fluent/fluent-bit \
  --namespace logging \
  --create-namespace

# 설치 확인
kubectl get pods -n logging -l app.kubernetes.io/name=fluent-bit
kubectl get ds -n logging

10.3 values.yaml 커스터마이징

# custom-values.yaml
kind: DaemonSet

image:
  repository: cr.fluentbit.io/fluent/fluent-bit
  tag: '3.2.2'
  pullPolicy: IfNotPresent

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi

tolerations:
  - operator: Exists # 모든 노드에 배포 (master 포함)

serviceAccount:
  create: true
  annotations:
    # IRSA (EKS)
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/fluent-bit

# 볼륨 마운트
volumeMounts:
  - name: varlog
    mountPath: /var/log
  - name: varlibdockercontainers
    mountPath: /var/lib/docker/containers
    readOnly: true
  - name: etcmachineid
    mountPath: /etc/machine-id
    readOnly: true

volumes:
  - name: varlog
    hostPath:
      path: /var/log
  - name: varlibdockercontainers
    hostPath:
      path: /var/lib/docker/containers
  - name: etcmachineid
    hostPath:
      path: /etc/machine-id

# 환경변수
env:
  - name: ELASTICSEARCH_HOST
    value: 'elasticsearch-master.logging.svc.cluster.local'
  - name: ELASTICSEARCH_PORT
    value: '9200'

# Fluent Bit 설정
config:
  service: |
    [SERVICE]
        Flush         5
        Log_Level     info
        Daemon        off
        Parsers_File  /fluent-bit/etc/parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020
        storage.path  /var/log/fluent-bit/buffer/
        storage.sync  normal
        storage.max_chunks_up 128
        Hot_Reload    On

  inputs: |
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        multiline.parser  docker, cri
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10
        storage.type      filesystem

  filters: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log           On
        Keep_Log            Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
        Labels              On
        Annotations         Off

  outputs: |
    [OUTPUT]
        Name            es
        Match           kube.*
        Host            ${ELASTICSEARCH_HOST}
        Port            ${ELASTICSEARCH_PORT}
        Logstash_Format On
        Logstash_Prefix kube
        Retry_Limit     False
        Suppress_Type_Name On

# Prometheus 메트릭 수집을 위한 ServiceMonitor
serviceMonitor:
  enabled: true
  interval: 30s
  scrapeTimeout: 10s

# Liveness/Readiness Probe
livenessProbe:
  httpGet:
    path: /
    port: http
readinessProbe:
  httpGet:
    path: /api/v1/health
    port: http

10.4 컨테이너 로그 경로와 CRI 파싱

Kubernetes에서 컨테이너 로그는 다음 경로에 저장된다.

/var/log/containers/<pod-name>_<namespace>_<container-name>-<container-id>.log
  -> symlink -> /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container-name>/0.log

CRI 로그 포맷 (containerd / CRI-O)

2026-03-01T12:00:00.123456789Z stdout F This is a log message
2026-03-01T12:00:00.123456789Z stderr P This is a partial log line

필드	설명
`2026-03-01T12:00:00.123456789Z`	RFC 3339 나노초 타임스탬프
`stdout` / `stderr`	스트림 유형
`F` / `P`	Full(완전) / Partial(부분) 로그 플래그
나머지	로그 메시지 본문

Fluent Bit은 multiline.parser: docker, cri를 지정하면 Docker 형식과 CRI 형식을 자동으로 감지하여 파싱한다.

10.5 Namespace별 로그 라우팅

# Namespace에 따라 다른 Output으로 라우팅
pipeline:
  filters:
    - name: kubernetes
      match: kube.*
      merge_log: on
      labels: on

    # 프로덕션 네임스페이스 태그 변경
    - name: rewrite_tag
      match: kube.*
      rule: $kubernetes['namespace_name'] ^(production)$ prod.$TAG false
      rule: $kubernetes['namespace_name'] ^(staging)$    stg.$TAG  false
      rule: $kubernetes['namespace_name'] ^(monitoring)$ mon.$TAG  false

  outputs:
    # 프로덕션 로그 -> Elasticsearch
    - name: es
      match: prod.*
      host: es-production.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: prod-logs

    # 스테이징 로그 -> Loki
    - name: loki
      match: stg.*
      host: loki.logging.svc
      port: 3100
      labels:
        env=staging

    # 모니터링 로그 -> S3 아카이빙
    - name: s3
      match: mon.*
      bucket: monitoring-logs-archive
      region: ap-northeast-2
      total_file_size: 100M
      upload_timeout: 10m
      compression: gzip

    # 기타 로그 -> 기본 Elasticsearch
    - name: es
      match: kube.*
      host: es-default.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: default-logs

10.6 멀티테넌트 로그 분리

pipeline:
  filters:
    - name: kubernetes
      match: kube.*
      merge_log: on
      labels: on

    # 팀 Label 기반 태그 재작성
    - name: rewrite_tag
      match: kube.*
      rule: $kubernetes['labels']['team'] ^(backend)$  team.backend.$TAG  false
      rule: $kubernetes['labels']['team'] ^(frontend)$ team.frontend.$TAG false
      rule: $kubernetes['labels']['team'] ^(data)$     team.data.$TAG     false

  outputs:
    # 백엔드 팀 -> 전용 Elasticsearch 인덱스
    - name: es
      match: team.backend.*
      host: elasticsearch.logging.svc
      logstash_format: on
      logstash_prefix: team-backend

    # 프론트엔드 팀 -> 전용 Loki tenant
    - name: loki
      match: team.frontend.*
      host: loki.logging.svc
      tenant_id: frontend-team
      labels:
        team=frontend

    # 데이터 팀 -> S3 + Elasticsearch 듀얼
    - name: es
      match: team.data.*
      host: elasticsearch.logging.svc
      logstash_format: on
      logstash_prefix: team-data

    - name: s3
      match: team.data.*
      bucket: data-team-logs
      region: ap-northeast-2
      compression: gzip

10.7 ServiceAccount 및 RBAC 설정

# fluent-bit-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit
rules:
  - apiGroups: ['']
    resources:
      - namespaces
      - pods
      - pods/logs
      - nodes
      - nodes/proxy
    verbs: ['get', 'list', 'watch']
  - apiGroups: ['']
    resources:
      - events
    verbs: ['get', 'list', 'watch']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit
subjects:
  - kind: ServiceAccount
    name: fluent-bit
    namespace: logging

11. 실전 파이프라인 예제

11.1 예제 1: K8s 로그를 Elasticsearch + Kibana로

가장 일반적인 EFK(Elasticsearch + Fluent Bit + Kibana) 스택 구성이다.

# fluent-bit-efk.yaml
service:
  flush: 5
  log_level: info
  parsers_file: /fluent-bit/etc/parsers.conf
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2020
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal
  storage.max_chunks_up: 128

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      multiline.parser: docker, cri
      db: /var/log/flb_kube.db
      mem_buf_limit: 5MB
      skip_long_lines: on
      refresh_interval: 10
      storage.type: filesystem

  filters:
    - name: kubernetes
      match: kube.*
      kube_url: https://kubernetes.default.svc:443
      kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      merge_log: on
      keep_log: off
      k8s-logging.parser: on
      k8s-logging.exclude: on
      labels: on

    # kube-system, logging 네임스페이스 로그 제외
    - name: grep
      match: kube.*
      exclude: $kubernetes['namespace_name'] ^(kube-system|logging)$

    # 환경 정보 추가
    - name: modify
      match: kube.*
      add: cluster_name my-eks-cluster
      add: environment production

  outputs:
    - name: es
      match: kube.*
      host: elasticsearch-master.logging.svc.cluster.local
      port: 9200
      http_user: elastic
      http_passwd: ${ES_PASSWORD}
      logstash_format: on
      logstash_prefix: kube-logs
      logstash_dateformat: "%Y.%m.%d"
      time_key: "@timestamp"
      include_tag_key: true
      tag_key: fluentbit_tag
      suppress_type_name: on
      generate_id: on
      retry_limit: false
      workers: 2
      tls: on
      tls.verify: on
      tls.ca_file: /certs/es-ca.pem

11.2 예제 2: K8s 로그를 Grafana Loki + Grafana로

# fluent-bit-loki.yaml
service:
  flush: 5
  log_level: info
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2020

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      multiline.parser: docker, cri
      db: /var/log/flb_kube.db
      mem_buf_limit: 5MB
      skip_long_lines: on

  filters:
    - name: kubernetes
      match: kube.*
      merge_log: on
      keep_log: off
      labels: on
      annotations: off

  outputs:
    - name: loki
      match: kube.*
      host: loki-gateway.logging.svc.cluster.local
      port: 3100
      uri: /loki/api/v1/push
      labels: job=fluent-bit,cluster=my-cluster
      label_keys: $kubernetes['namespace_name'],$kubernetes['pod_name'],$kubernetes['container_name']
      auto_kubernetes_labels: on
      line_format: json
      workers: 2
      retry_limit: false

11.3 예제 3: S3 아카이빙 + Elasticsearch 듀얼 출력

하나의 입력을 두 개의 Output으로 동시에 전송하는 패턴이다.

# fluent-bit-dual.yaml
service:
  flush: 5
  log_level: info
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal
  storage.max_chunks_up: 128

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      multiline.parser: docker, cri
      db: /var/log/flb_kube.db
      storage.type: filesystem

  filters:
    - name: kubernetes
      match: kube.*
      merge_log: on
      keep_log: off
      labels: on

  outputs:
    # 실시간 검색/분석용 Elasticsearch
    - name: es
      match: kube.*
      host: elasticsearch.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: kube-logs
      suppress_type_name: on
      retry_limit: false
      workers: 2

    # 장기 보관용 S3 아카이빙
    - name: s3
      match: kube.*
      region: ap-northeast-2
      bucket: log-archive-bucket
      s3_key_format: /kubernetes/$TAG/%Y/%m/%d/%H/$UUID.gz
      s3_key_format_tag_delimiters: .
      total_file_size: 100M
      upload_timeout: 10m
      compression: gzip
      content_type: application/gzip
      store_dir: /tmp/fluent-bit-s3
      store_dir_limit_size: 1G
      use_put_object: on
      retry_limit: 5
      workers: 1

11.4 예제 4: 네임스페이스별 필터링 및 다중 목적지 라우팅

# fluent-bit-routing.yaml
service:
  flush: 5
  log_level: info
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      multiline.parser: docker, cri
      db: /var/log/flb_kube.db
      storage.type: filesystem

  filters:
    - name: kubernetes
      match: kube.*
      merge_log: on
      labels: on

    # Namespace별 태그 재작성
    - name: rewrite_tag
      match: kube.*
      rule: $kubernetes['namespace_name'] ^(production)$  route.prod.$TAG  false
      rule: $kubernetes['namespace_name'] ^(staging)$     route.stg.$TAG   false
      rule: $kubernetes['namespace_name'] ^(kube-system)$ route.sys.$TAG   false

    # 프로덕션에서 ERROR만 추출
    - name: rewrite_tag
      match: route.prod.*
      rule: $log ^.*ERROR.*$ alert.prod.$TAG true

  outputs:
    # 프로덕션 전체 로그 -> Elasticsearch
    - name: es
      match: route.prod.*
      host: es-prod.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: prod-all
      suppress_type_name: on
      workers: 2

    # 프로덕션 ERROR 알림 -> 별도 인덱스
    - name: es
      match: alert.prod.*
      host: es-prod.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: prod-alerts
      suppress_type_name: on

    # 스테이징 로그 -> Loki (비용 효율)
    - name: loki
      match: route.stg.*
      host: loki.logging.svc
      port: 3100
      labels:
        env=staging
      auto_kubernetes_labels: on
      line_format: json

    # 시스템 로그 -> S3 아카이빙 (장기 보관)
    - name: s3
      match: route.sys.*
      bucket: system-logs-archive
      region: ap-northeast-2
      total_file_size: 50M
      upload_timeout: 15m
      compression: gzip
      s3_key_format: /kube-system/%Y/%m/%d/$UUID.gz

    # 미분류 로그 -> 기본 Elasticsearch
    - name: es
      match: kube.*
      host: es-default.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: default-logs
      suppress_type_name: on

12. 성능 튜닝

12.1 Workers 설정

Output 플러그인에 Workers 파라미터를 설정하면 병렬 전송이 가능하다. 각 Worker는 독립적인 스레드로 동작한다.

pipeline:
  outputs:
    - name: es
      match: '*'
      host: elasticsearch
      port: 9200
      workers: 4 # 4개 병렬 Worker
      net.keepalive: on
      net.keepalive_idle_timeout: 30

Workers 설정 가이드라인

상황	권장 Workers	비고
낮은 처리량	0-1	기본값으로 충분
중간 처리량	2-4	대부분의 프로덕션 환경
높은 처리량	4-8	CPU 코어 수 고려
매우 높은 처리량	8 이상	네트워크/목적지 병목 확인 필요

12.2 Flush Interval 최적화

Flush 값은 초 단위로, 버퍼에서 Output으로 데이터를 전송하는 주기를 결정한다.

service:
  flush: 1 # 1초마다 플러시 (실시간성 중시)
  # flush: 5  # 5초마다 플러시 (처리량 중시)

Flush 값	특성
1초	낮은 지연시간, 높은 CPU 사용
5초	균형잡힌 설정 (기본 권장)
10초 이상	높은 처리량, 큰 배치 크기

12.3 Buffer 크기 조정

pipeline:
  inputs:
    - name: tail
      path: /var/log/containers/*.log
      # 높은 처리량 환경
      buffer_chunk_size: 512KB # 청크 단위 크기 (기본 32KB)
      buffer_max_size: 5MB # 최대 버퍼 크기 (기본 32KB)
      mem_buf_limit: 50MB # 메모리 버퍼 한도

  outputs:
    - name: es
      match: '*'
      buffer_size: 512KB # HTTP 버퍼 크기

12.4 파이프라인 병렬화

높은 처리량이 필요한 환경에서는 Input과 Output을 분리하여 병렬 파이프라인을 구성할 수 있다.

pipeline:
  inputs:
    # 애플리케이션 로그 파이프라인
    - name: tail
      tag: app.*
      path: /var/log/containers/app-*.log
      multiline.parser: docker, cri
      threaded: on # 별도 스레드에서 실행

    # 시스템 로그 파이프라인
    - name: tail
      tag: sys.*
      path: /var/log/containers/kube-*.log
      multiline.parser: docker, cri
      threaded: on

  outputs:
    - name: es
      match: app.*
      host: elasticsearch
      workers: 4

    - name: es
      match: sys.*
      host: elasticsearch
      workers: 2

12.5 Hot Reload

v2.1부터 지원되는 Hot Reload 기능으로, 서비스 중단 없이 설정을 다시 로드할 수 있다.

# Hot Reload 활성화
service:
  hot_reload: on

리로드를 트리거하는 방법은 세 가지이다.

# 방법 1: SIGHUP 시그널
kill -SIGHUP $(pidof fluent-bit)

# 방법 2: HTTP API (v4.0+)
curl -X POST http://localhost:2020/api/v2/reload

# 방법 3: 커맨드라인 옵션
fluent-bit -c fluent-bit.yaml -Y   # --enable-hot-reload

Hot Reload 시 주의사항

버퍼에 있는 데이터는 보존된다
파일시스템 버퍼의 경우 디스크의 데이터도 유지된다
새 설정에 문법 오류가 있으면 리로드가 실패하고 기존 설정이 유지된다
SIGHUP은 Windows에서 지원되지 않는다

12.6 메모리 사용량 모니터링

# Fluent Bit 프로세스 메모리 확인
ps aux | grep fluent-bit

# HTTP API로 내부 메트릭 확인
curl -s http://localhost:2020/api/v1/storage | jq .

# Prometheus 메트릭으로 확인
curl -s http://localhost:2020/api/v2/metrics/prometheus | grep fluentbit_input

12.7 성능 벤치마크 참고

다음은 일반적인 환경에서의 Fluent Bit 성능 참고치이다 (하드웨어, 네트워크, 설정에 따라 차이가 있다).

시나리오	처리량	CPU 사용률	메모리
단순 전달 (tail 입력, stdout 출력)	약 100K events/s	5-10% (1 core)	5-10MB
K8s Filter + ES 출력	약 40-60K events/s	15-25% (1 core)	30-50MB
K8s Filter + Lua 변환 + ES 출력	약 20-40K events/s	25-40% (1 core)	40-80MB
복합 파이프라인 (Filter 다수 + 듀얼 출력)	약 15-30K events/s	30-50% (1 core)	50-100MB

v4.1의 SIMD 최적화: JSON 인코딩을 SIMD(Single Instruction, Multiple Data) 명령어로 처리하여 JSON 변환 성능이 2.5배 향상되었다.

13. 모니터링 및 Observability

13.1 내장 HTTP 모니터링 엔드포인트

Fluent Bit은 내장 HTTP 서버를 통해 자체 상태를 모니터링할 수 있다.

service:
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2020
  health_check: on
  hc_errors_count: 5 # 이 수 이상 에러 발생 시 unhealthy
  hc_retry_failure_count: 5 # 이 수 이상 재시도 실패 시 unhealthy
  hc_period: 60 # 헬스체크 주기 (초)

사용 가능한 엔드포인트

엔드포인트	설명
`/`	Fluent Bit 빌드 정보
`/api/v1/health`	헬스 체크 (200=OK, 500=Unhealthy)
`/api/v1/metrics`	JSON 형식 메트릭
`/api/v1/metrics/prometheus`	Prometheus 형식 메트릭
`/api/v2/metrics`	v2 메트릭 엔드포인트
`/api/v2/metrics/prometheus`	v2 Prometheus 메트릭
`/api/v1/storage`	스토리지/버퍼 상태
`/api/v2/reload`	Hot Reload 트리거 (POST)
`/api/v1/uptime`	가동 시간

# 헬스 체크
curl -s http://localhost:2020/api/v1/health
# 응답: ok (HTTP 200) 또는 error (HTTP 500)

# 스토리지 상태 확인
curl -s http://localhost:2020/api/v1/storage | jq .
# {
#   "storage_layer": {
#     "chunks": {
#       "total_chunks": 15,
#       "mem_chunks": 10,
#       "fs_chunks": 5,
#       "fs_chunks_up": 3,
#       "fs_chunks_down": 2
#     }
#   }
# }

# Prometheus 메트릭 확인
curl -s http://localhost:2020/api/v2/metrics/prometheus

13.2 Prometheus 메트릭 수집 구성

Fluent Bit의 내부 메트릭을 Prometheus로 수집하고 Grafana에서 시각화하는 구성이다.

# Fluent Bit 자체 메트릭을 Prometheus로 노출
pipeline:
  inputs:
    - name: fluentbit_metrics
      tag: fb.metrics
      scrape_interval: 30

  outputs:
    - name: prometheus_exporter
      match: fb.metrics
      host: 0.0.0.0
      port: 2021

Kubernetes ServiceMonitor 설정

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: fluent-bit
  namespace: logging
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: fluent-bit
  endpoints:
    - port: http
      path: /api/v2/metrics/prometheus
      interval: 30s
      scrapeTimeout: 10s
  namespaceSelector:
    matchNames:
      - logging

주요 Prometheus 메트릭

메트릭	타입	설명
`fluentbit_input_records_total`	Counter	Input별 수집 레코드 수
`fluentbit_input_bytes_total`	Counter	Input별 수집 바이트 수
`fluentbit_output_proc_records_total`	Counter	Output별 처리 레코드 수
`fluentbit_output_proc_bytes_total`	Counter	Output별 처리 바이트 수
`fluentbit_output_errors_total`	Counter	Output별 에러 수
`fluentbit_output_retries_total`	Counter	Output별 재시도 수
`fluentbit_output_retries_failed_total`	Counter	Output별 재시도 실패 수
`fluentbit_filter_records_total`	Counter	Filter별 처리 레코드 수
`fluentbit_uptime`	Gauge	가동 시간 (초)
`fluentbit_storage_chunks`	Gauge	스토리지 청크 수

13.3 Grafana 대시보드

Grafana에서 Fluent Bit 모니터링 대시보드를 구성할 때 포함해야 할 핵심 패널은 다음과 같다.

대시보드 패널 구성

패널	PromQL 예시	용도
Input 처리량	`rate(fluentbit_input_records_total[5m])`	초당 수집 레코드 수
Output 처리량	`rate(fluentbit_output_proc_records_total[5m])`	초당 전송 레코드 수
Output 에러율	`rate(fluentbit_output_errors_total[5m])`	초당 에러 발생 수
재시도 비율	`rate(fluentbit_output_retries_total[5m])`	재시도 추세
버퍼 사용량	`fluentbit_storage_chunks`	현재 버퍼 청크 수
가동 시간	`fluentbit_uptime`	프로세스 안정성
Input/Output 차이	`rate(input[5m]) - rate(output[5m])`	Backpressure 감지

Grafana Labs 커뮤니티 대시보드(ID: 7752)를 Import하면 기본적인 Fluent Bit 모니터링 대시보드를 빠르게 구성할 수 있다.

14. 트러블슈팅 가이드

14.1 로그가 수집되지 않는 경우

체크리스트

# 1. Fluent Bit 프로세스 상태 확인
kubectl get pods -n logging -l app.kubernetes.io/name=fluent-bit
kubectl logs -n logging <fluent-bit-pod> --tail=50

# 2. 설정 파일 문법 검증
fluent-bit -c fluent-bit.yaml --dry-run

# 3. 로그 파일 경로 확인
kubectl exec -n logging <fluent-bit-pod> -- ls -la /var/log/containers/

# 4. DB 파일(offset) 확인
kubectl exec -n logging <fluent-bit-pod> -- ls -la /var/log/flb_kube.db

# 5. 권한 확인
kubectl exec -n logging <fluent-bit-pod> -- cat /var/log/containers/<target-log>

일반적인 원인과 해결 방법

원인	증상	해결 방법
파일 경로 오류	Input에서 레코드 수 0	`Path` 확인, 와일드카드 패턴 검증
권한 부족	Permission denied 에러	SecurityContext, hostPath 권한 확인
DB 파일 손상	offset이 파일 끝을 가리킴	DB 파일 삭제 후 재시작
Parser 불일치	파싱 실패, 빈 레코드	`stdout` Output으로 원본 확인
Tag 매칭 실패	Filter/Output에 도달 안 함	`Match` 패턴과 `Tag` 일치 확인

14.2 메모리 누수 / OOM

# 메모리 제한 설정
pipeline:
  inputs:
    - name: tail
      path: /var/log/containers/*.log
      mem_buf_limit: 5MB # Input 메모리 한도
      skip_long_lines: on # 긴 라인 스킵
      buffer_chunk_size: 32KB # 청크 크기 제한
      buffer_max_size: 32KB # 최대 버퍼

service:
  storage.path: /var/log/fluent-bit/buffer/
  storage.max_chunks_up: 64 # 메모리 청크 수 제한 (기본 128)

OOM 방지 체크리스트

Mem_Buf_Limit 또는 storage.max_chunks_up이 설정되어 있는지 확인
Skip_Long_Lines: On 활성화
불필요한 Filter 제거 (특히 Lua에서 큰 테이블 생성 시)
Kubernetes Filter의 Buffer_Size 확인 (0=무제한)
Kubernetes Resource Limits 적절히 설정

14.3 Backpressure로 인한 로그 손실

# Backpressure 발생 확인
curl -s http://localhost:2020/api/v1/storage | jq .

# 메트릭으로 확인
curl -s http://localhost:2020/api/v2/metrics/prometheus | grep -E "retries|errors|dropped"

Backpressure 발생 시 대응 순서

Output의 Workers 수를 늘려 병렬 전송 강화
Flush 주기를 줄여 더 자주 전송 (예: 5 -> 1)
storage.type: filesystem으로 전환하여 디스크 버퍼 활용
storage.total_limit_size를 충분히 설정
목적지(Elasticsearch 등)의 처리 능력 확인 및 증설

# Backpressure 대응 설정 예시
service:
  flush: 1
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal
  storage.max_chunks_up: 256

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      storage.type: filesystem
      storage.pause_on_chunks_overlimit: off

  outputs:
    - name: es
      match: kube.*
      host: elasticsearch
      port: 9200
      workers: 4
      storage.total_limit_size: 10G
      retry_limit: false
      net.keepalive: on
      net.keepalive_idle_timeout: 15

14.4 TLS/인증 오류

# TLS 연결 테스트
kubectl exec -n logging <fluent-bit-pod> -- \
  curl -v --cacert /certs/ca.pem https://elasticsearch:9200

# 인증서 만료 확인
kubectl exec -n logging <fluent-bit-pod> -- \
  openssl x509 -in /certs/ca.pem -noout -enddate

일반적인 TLS 오류 및 해결

에러	원인	해결
`SSL_ERROR_SYSCALL`	인증서 경로 오류	`tls.ca_file` 경로 확인
`certificate verify failed`	CA 인증서 불일치	올바른 CA 인증서 사용
`certificate has expired`	인증서 만료	인증서 갱신
`connection refused`	포트/호스트 오류	Host, Port, TLS 포트 확인
`401 Unauthorized`	인증 실패	`http_user`, `http_passwd` 확인

14.5 디버깅 방법

단계 1: Log Level을 debug로 설정

service:
  log_level: debug # error, warn, info, debug, trace

단계 2: stdout Output 추가

pipeline:
  outputs:
    # 디버깅용: 모든 레코드를 표준 출력으로
    - name: stdout
      match: '*'
      format: json_lines

    # 실제 Output
    - name: es
      match: '*'
      host: elasticsearch

단계 3: 파이프라인 단계별 확인

# Input만 테스트
fluent-bit -i tail -p path=/var/log/test.log -o stdout

# Parser 테스트
fluent-bit -i tail -p path=/var/log/test.log -p parser=json -o stdout

# 전체 설정 테스트 (dry-run)
fluent-bit -c fluent-bit.yaml --dry-run

단계 4: Kubernetes에서 실시간 로그 확인

# Fluent Bit 로그 스트리밍
kubectl logs -n logging -l app.kubernetes.io/name=fluent-bit -f --tail=100

# 특정 Pod의 로그
kubectl logs -n logging fluent-bit-xxxxx -f

# 이전 컨테이너 로그 (crash 시)
kubectl logs -n logging fluent-bit-xxxxx --previous

15. 운영 Best Practices

15.1 프로덕션 체크리스트

항목	권장 설정	이유
`storage.type`	filesystem	데이터 손실 방지
`storage.path`	별도 볼륨 마운트	디스크 I/O 분리
`storage.total_limit_size`	디스크 여유 공간의 50-70%	디스크 풀 방지
`Retry_Limit`	false (무한) 또는 충분한 값	일시적 장애 복구
`Workers`	2-4	병렬 전송 성능
`Hot_Reload`	on	무중단 설정 변경
`HTTP_Server`	on	모니터링 활성화
`health_check`	on	Kubernetes Probe 연동
`Skip_Long_Lines`	on	비정상 로그에 의한 장애 방지
Resource Limits	적절한 CPU/Memory	OOM 방지

15.2 보안 권장 사항

# Kubernetes Pod Security
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
  capabilities:
    drop: ['ALL']

# 필요한 볼륨만 마운트
volumes:
  - name: varlog
    hostPath:
      path: /var/log
      type: ''
  - name: buffer
    emptyDir:
      sizeLimit: 2Gi

# TLS 강제 (Output)
pipeline:
  outputs:
    - name: es
      tls: on
      tls.verify: on
      tls.ca_file: /certs/ca.pem

15.3 로그 로테이션 대응

Fluent Bit의 tail Input은 로그 로테이션을 자동으로 감지한다. 다만 다음 설정을 확인해야 한다.

pipeline:
  inputs:
    - name: tail
      path: /var/log/app/*.log
      db: /var/log/flb_app.db # 필수: offset 추적
      rotate_wait: 5 # 로테이션 후 대기 시간 (초)
      refresh_interval: 10 # 파일 목록 갱신 주기

16. 참고 자료

공식 문서 및 저장소

Fluent Bit 공식 문서 - 전체 설정 레퍼런스 및 가이드
Fluent Bit GitHub - 소스 코드 및 이슈 트래커
Fluent Bit Helm Charts - Kubernetes 배포용 Helm Chart
Fluent Bit Performance Tools - 벤치마크 도구

CNCF 관련 자료

비교 및 심화 자료

Fluent Bit Complete Guide: Architecture, Configuration, and Kubernetes Integration for the Lightweight Log Processor

1. Introduction to Fluent Bit
2. Core Architecture: Pipeline Structure
3. Installation Methods
4. Configuration File Structure
5. Input Plugin Details
6. Parser Configuration
7. Filter Plugin Details
8. Output Plugin Details
9. Buffer and Backpressure Management
10. Kubernetes Integration Complete Guide
11. Practical Pipeline Examples
12. Performance Tuning
13. Monitoring and Observability
14. Troubleshooting Guide
15. Operational Best Practices
16. References
Quiz

1. Introduction to Fluent Bit

1.1 What Is Fluent Bit?

Fluent Bit is an ultra-lightweight telemetry agent written in C that collects Logs, Metrics, and Traces from various sources, processes them, and delivers them to desired destinations. It boasts an extremely lightweight footprint with a binary size of approximately 450KB and memory usage under 1MB, making it widely applicable from embedded systems to large-scale Kubernetes clusters.

Fluent Bit is a CNCF (Cloud Native Computing Foundation) Graduated project, recognized at the same maturity level as Kubernetes, Prometheus, and Envoy. Under the Fluentd project umbrella, it achieved graduation status in 2019. As of 2024, it has been downloaded over 13 billion times on DockerHub, establishing itself as the de facto standard for cloud-native log collection.

In March 2024, Fluent Bit v3 was announced at KubeCon + CloudNativeCon EU. In December of the same year, v3.2 was released, followed by v4.0 in March 2025, continuously evolving with YAML standard configuration, Processor support, SIMD-based JSON encoding (2.5x performance improvement), and enhanced OpenTelemetry support.

1.2 Key Features

Ultra-lightweight: C-based, ~450KB binary, under 1MB memory usage
High performance: Asynchronous I/O, multi-threaded pipeline, SIMD optimization
Plugin architecture: Over 100 Input/Filter/Output plugins
Unified telemetry: Logs, Metrics, Traces handled by a single agent
YAML native: YAML is the standard configuration format from v3.2
Hot Reload: Configuration reload without service interruption (SIGHUP / HTTP API)
Cross-platform: Supports Linux, macOS, Windows, BSD, and embedded Linux

1.3 Fluent Bit vs Fluentd Comparison

Category	Fluent Bit	Fluentd
Language	C	Ruby + C
Binary Size	~450KB	~40MB
Memory Usage	~1MB	~30-40MB
Plugin Count	Over 100 (built-in)	Over 1,000 (including gems)
Performance	Very high	High
CPU Usage	Low	4x compared to Fluent Bit
Config Format	INI / YAML	Ruby DSL
Primary Use	Edge/node-level collection	Centralized log aggregation
Kubernetes	DaemonSet per node	Aggregator central deploy
CNCF Status	Graduated (under Fluentd)	Graduated
Best For	IoT, containers, edge	Large-scale aggregation, complex routing

Recommended architecture: The most widely used pattern is a hybrid approach where Fluent Bit is deployed as a DaemonSet on each node to collect logs, forwarding to a central Fluentd Aggregator when needed. However, as Fluent Bit capabilities continue to strengthen, cases where the entire pipeline is built with Fluent Bit alone without Fluentd are rapidly increasing.

1.4 Architecture Overview

+------------------------------------------------------------------+
|                        Fluent Bit Engine                         |
|                                                                  |
|  +--------+   +--------+   +--------+   +--------+   +--------+ |
|  | Input  |-->| Parser |-->| Filter |-->| Buffer |-->| Output | |
|  +--------+   +--------+   +--------+   +--------+   +--------+ |
|  | tail   |   | json   |   | k8s    |   | memory |   | es     | |
|  | systemd|   | regex  |   | grep   |   | filesys|   | loki   | |
|  | forward|   | logfmt |   | modify |   |        |   | s3     | |
|  | http   |   | cri    |   | lua    |   |        |   | kafka  | |
|  | tcp    |   | docker |   | nest   |   |        |   | stdout | |
|  +--------+   +--------+   +--------+   +--------+   +--------+ |
|                                                                  |
|  [Scheduler]  [Router / Tag Matching]  [HTTP Server / Monitoring]|
+------------------------------------------------------------------+

Fluent Bit data processing follows a Pipeline structure where each stage is clearly separated. This structure enables independent extension and replacement on a per-module basis.

2. Core Architecture: Pipeline Structure

2.1 Full Pipeline Flow

The Fluent Bit data processing pipeline consists of the following stages:

  [Data Source]
       |
       v
  +---------+     +---------+     +---------+     +---------+     +---------+
  |  INPUT  | --> | PARSER  | --> | FILTER  | --> | BUFFER  | --> | OUTPUT  |
  |         |     |         |     |         |     |         |     |         |
  | Data    |     | Unstruc |     | Data    |     | Memory/ |     | Final   |
  | Collect |     | -> Struc|     | Process |     | Disk    |     | Deliver |
  +---------+     +---------+     +---------+     +---------+     +---------+
       |               |               |               |               |
    Tag assign    Structuring      Enrichment      Reliability     Destination
                  conversion       Filtering       guarantee       (Tag match)

2.2 Role of Each Stage

Input

The entry point for data. It collects data from various sources such as files, system journals, network sockets, and Kubernetes events. Every input data item is assigned a Tag, which is used for subsequent routing.

Parser

Converts unstructured text data into structured data. It provides various parsers including JSON, Regex, Logfmt, Docker, and CRI, applied at the Input plugin stage.

Filter

The stage for processing collected data. It performs field addition/deletion, Kubernetes metadata enrichment, regex-based filtering, and Lua script transformations. Multiple Filters can be chained to compose complex transformation logic.

Buffer

Data that passes through Filters is stored in a buffer before being delivered to Output. Two buffer modes are supported: memory buffer and filesystem buffer. Using the filesystem buffer prevents data loss even during failures.

Router

Routes data to the appropriate Output based on Tag matching rules. It supports wildcard (*) matching and can simultaneously deliver a single input to multiple Outputs (fan-out).

Output

Transmits data to the final destination. It supports various backends including Elasticsearch, Loki, S3, Kafka, CloudWatch, and Prometheus. An automatic retry mechanism activates upon transmission failure.

2.3 Multi-Pipeline Structure

Fluent Bit can operate multiple independent pipelines simultaneously within a single instance. Each pipeline has its own unique combination of Input, Filter, and Output, with Tag-based routing separating different data flows.

Pipeline A: [tail: app-*.log] --tag:app--> [filter:k8s] --> [output:elasticsearch]
Pipeline B: [tail: sys-*.log] --tag:sys--> [filter:grep] --> [output:loki]
Pipeline C: [forward:24224]   --tag:fwd--> [filter:lua]  --> [output:s3]

This structure provides the following benefits:

Isolation: Independent operation between pipelines prevents failure propagation
Flexibility: Different processing logic and destinations per use case
Efficiency: A single agent handles multiple data flows

3. Installation Methods

3.1 Linux (apt/yum)

Ubuntu/Debian (apt)

# Add GPG key and repository
curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

# Or manual installation
wget -qO - https://packages.fluentbit.io/fluentbit.key | sudo apt-key add -
echo "deb https://packages.fluentbit.io/ubuntu/$(lsb_release -cs) $(lsb_release -cs) main" | \
  sudo tee /etc/apt/sources.list.d/fluent-bit.list

sudo apt-get update
sudo apt-get install -y fluent-bit

# Start service
sudo systemctl start fluent-bit
sudo systemctl enable fluent-bit

CentOS/RHEL (yum)

cat > /etc/yum.repos.d/fluent-bit.repo << 'EOF'
[fluent-bit]
name=Fluent Bit
baseurl=https://packages.fluentbit.io/centos/$releasever/
gpgcheck=1
gpgkey=https://packages.fluentbit.io/fluentbit.key
enabled=1
EOF

sudo yum install -y fluent-bit
sudo systemctl start fluent-bit
sudo systemctl enable fluent-bit

3.2 Docker

# Run latest version
docker run -ti cr.fluentbit.io/fluent/fluent-bit:latest

# Mount configuration file
docker run -ti \
  -v /path/to/fluent-bit.yaml:/fluent-bit/etc/fluent-bit.yaml \
  -v /var/log:/var/log \
  cr.fluentbit.io/fluent/fluent-bit:latest \
  /fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.yaml

# Docker Compose example
cat > docker-compose.yaml << 'EOF'
version: '3.8'
services:
  fluent-bit:
    image: cr.fluentbit.io/fluent/fluent-bit:latest
    volumes:
      - ./fluent-bit.yaml:/fluent-bit/etc/fluent-bit.yaml
      - /var/log:/var/log:ro
    ports:
      - "2020:2020"   # HTTP monitoring
      - "24224:24224"  # Forward protocol
EOF

3.3 macOS (Homebrew)

brew install fluent-bit

# Run
fluent-bit -c /opt/homebrew/etc/fluent-bit/fluent-bit.conf

# Or run with YAML configuration file
fluent-bit -c /path/to/fluent-bit.yaml

3.4 Kubernetes (Helm Chart)

# Add Helm repository
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update

# Basic installation
helm install fluent-bit fluent/fluent-bit \
  --namespace logging \
  --create-namespace

# Install with custom values.yaml
helm install fluent-bit fluent/fluent-bit \
  --namespace logging \
  --create-namespace \
  -f custom-values.yaml

# Upgrade
helm upgrade fluent-bit fluent/fluent-bit \
  --namespace logging \
  -f custom-values.yaml

3.5 Direct Binary Installation

# Download binary from GitHub Releases
FLUENT_BIT_VERSION=3.2.2
wget https://github.com/fluent/fluent-bit/releases/download/v${FLUENT_BIT_VERSION}/fluent-bit-${FLUENT_BIT_VERSION}-linux-x86_64.tar.gz

tar xzf fluent-bit-${FLUENT_BIT_VERSION}-linux-x86_64.tar.gz
cd fluent-bit-${FLUENT_BIT_VERSION}-linux-x86_64/

# Run
./bin/fluent-bit -c conf/fluent-bit.yaml

# Check version
./bin/fluent-bit --version

4. Configuration File Structure

Fluent Bit supports two configuration formats: Classic mode (INI format) and YAML mode. YAML has been the standard configuration format since v3.2, and Classic mode is scheduled for deprecation by the end of 2025.

4.1 Classic Mode (fluent-bit.conf)

# fluent-bit.conf - Classic INI format

[SERVICE]
    Flush         5
    Daemon        Off
    Log_Level     info
    Parsers_File  parsers.conf
    HTTP_Server   On
    HTTP_Listen   0.0.0.0
    HTTP_Port     2020
    Hot_Reload    On

[INPUT]
    Name          tail
    Path          /var/log/containers/*.log
    Parser        cri
    Tag           kube.*
    Mem_Buf_Limit 5MB
    Skip_Long_Lines On
    Refresh_Interval 10
    DB            /var/log/flb_kube.db

[FILTER]
    Name          kubernetes
    Match         kube.*
    Kube_URL      https://kubernetes.default.svc:443
    Kube_CA_File  /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_Log     On
    K8S-Logging.Parser On
    K8S-Logging.Exclude On

[OUTPUT]
    Name          es
    Match         kube.*
    Host          elasticsearch.logging.svc.cluster.local
    Port          9200
    Logstash_Format On
    Logstash_Prefix kube
    Retry_Limit   False

4.2 YAML Mode (fluent-bit.yaml)

# fluent-bit.yaml - YAML format (v3.2+ standard)
service:
  flush: 5
  daemon: off
  log_level: info
  parsers_file: parsers.conf
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2020
  hot_reload: on

pipeline:
  inputs:
    - name: tail
      path: /var/log/containers/*.log
      parser: cri
      tag: kube.*
      mem_buf_limit: 5MB
      skip_long_lines: on
      refresh_interval: 10
      db: /var/log/flb_kube.db

  filters:
    - name: kubernetes
      match: kube.*
      kube_url: https://kubernetes.default.svc:443
      kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      merge_log: on
      k8s-logging.parser: on
      k8s-logging.exclude: on

  outputs:
    - name: es
      match: kube.*
      host: elasticsearch.logging.svc.cluster.local
      port: 9200
      logstash_format: on
      logstash_prefix: kube
      retry_limit: false

4.3 Comparing the Two Formats

Category	Classic (INI)	YAML
File Extension	.conf	.yaml / .yml
Status	Deprecation planned (end 2025)	Standard (v3.2+)
Processor Support	Not supported	Supported
Readability	Moderate	Excellent
Nested Structures	Limited	Fully supported
Arrays/Lists	Not supported	Supported
Comments	#	#

4.4 Using Environment Variables

Both formats can reference environment variables using ${ENV_VAR} syntax.

# YAML environment variable example
pipeline:
  outputs:
    - name: es
      match: '*'
      host: ${ELASTICSEARCH_HOST}
      port: ${ELASTICSEARCH_PORT}
      http_user: ${ES_USER}
      http_passwd: ${ES_PASSWORD}
      tls: ${ES_TLS_ENABLED}

# Classic environment variable example
[OUTPUT]
    Name          es
    Match         *
    Host          ${ELASTICSEARCH_HOST}
    Port          ${ELASTICSEARCH_PORT}
    HTTP_User     ${ES_USER}
    HTTP_Passwd   ${ES_PASSWORD}

From v4.0, the file:// prefix can be used to securely reference secret values from the filesystem.

pipeline:
  outputs:
    - name: es
      http_passwd: file:///run/secrets/es-password

4.5 @INCLUDE Directive

Configuration files can be modularized for management.

# fluent-bit.conf (Classic)
[SERVICE]
    Flush 5

@INCLUDE inputs.conf
@INCLUDE filters.conf
@INCLUDE outputs.conf

In YAML, use the includes section.

# fluent-bit.yaml
includes:
  - inputs.yaml
  - filters.yaml
  - outputs.yaml

service:
  flush: 5

5. Input Plugin Details

Input plugins are the starting point for data collection. Each Input is assigned a unique Tag that serves as the matching criterion for subsequent Filters and Outputs.

5.1 tail: File Log Collection

The most commonly used Input plugin, reading new lines in real time from the end of a file like tail -f.

pipeline:
  inputs:
    - name: tail
      tag: app.logs
      path: /var/log/app/*.log
      path_key: filename # Include file path in records
      exclude_path: /var/log/app/debug.log # Exclude specific files
      parser: json # Default parser
      db: /var/log/flb_app.db # Offset storage DB (resume after restart)
      db.sync: normal # DB sync mode
      refresh_interval: 10 # File list refresh interval (seconds)
      read_from_head: false # true: read from beginning of file
      skip_long_lines: on # Skip very long lines
      mem_buf_limit: 5MB # Memory buffer limit
      rotate_wait: 5 # Rotation wait time (seconds)
      multiline.parser: docker, cri # Multiline parser

Key Configuration Options

Setting	Default	Description
`Path`	(required)	Log file path (supports wildcards)
`Path_Key`	-	Add file path as a key to records
`Exclude_Path`	-	File paths to exclude
`DB`	-	SQLite DB path for file offset storage
`Refresh_Interval`	60	File list refresh interval (seconds)
`Read_from_Head`	false	Whether to read from file beginning
`Skip_Long_Lines`	Off	Skip lines exceeding Buffer_Max_Size
`Mem_Buf_Limit`	-	Memory buffer limit
`Rotate_Wait`	5	Wait time after log rotation

5.2 systemd: systemd Journal Collection

pipeline:
  inputs:
    - name: systemd
      tag: host.systemd
      systemd_filter: _SYSTEMD_UNIT=docker.service
      systemd_filter: _SYSTEMD_UNIT=kubelet.service
      read_from_tail: on
      strip_underscores: on
      db: /var/log/flb_systemd.db

5.3 forward: Fluentd Protocol Reception

pipeline:
  inputs:
    - name: forward
      tag: forward.incoming
      listen: 0.0.0.0
      port: 24224
      buffer_chunk_size: 1M
      buffer_max_size: 6M

5.4 http / tcp / udp: Network Reception

pipeline:
  inputs:
    # HTTP reception
    - name: http
      tag: http.logs
      listen: 0.0.0.0
      port: 9880
      successful_response_code: 201

    # TCP reception
    - name: tcp
      tag: tcp.logs
      listen: 0.0.0.0
      port: 5170
      format: json

    # UDP reception
    - name: udp
      tag: udp.logs
      listen: 0.0.0.0
      port: 5170
      format: json

5.5 kubernetes_events: K8s Event Collection

pipeline:
  inputs:
    - name: kubernetes_events
      tag: kube.events
      kube_url: https://kubernetes.default.svc:443
      kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      interval_sec: 1
      retention_time: 1h

5.6 node_exporter_metrics / prometheus_scrape

pipeline:
  inputs:
    # Node metrics collection
    - name: node_exporter_metrics
      tag: node.metrics
      scrape_interval: 30

    # Prometheus endpoint scraping
    - name: prometheus_scrape
      tag: prom.metrics
      host: 127.0.0.1
      port: 9090
      metrics_path: /metrics
      scrape_interval: 10s

5.7 fluentbit_metrics: Internal Metrics

pipeline:
  inputs:
    - name: fluentbit_metrics
      tag: fb.metrics
      scrape_interval: 30
      scrape_on_start: true

6. Parser Configuration

Parsers are core components that convert unstructured text logs into structured data. They are defined in a separate parsers.conf or YAML file.

6.1 Built-in Parsers

Fluent Bit provides built-in parsers for commonly used log formats.

Parser	Format	Use Case
`json`	JSON	JSON format logs
`docker`	JSON (Docker specific)	Docker container logs
`cri`	Regex	CRI (containerd) logs
`syslog-rfc5424`	Regex	RFC 5424 Syslog
`syslog-rfc3164`	Regex	RFC 3164 Syslog
`apache`	Regex	Apache access logs
`nginx`	Regex	Nginx access logs
`logfmt`	Logfmt	key=value pair format

6.2 Writing Custom Regex Parsers

# parsers.conf

# Nginx error log parser
[PARSER]
    Name        nginx_error
    Format      regex
    Regex       ^(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>\w+)\] (?<pid>\d+).(?<tid>\d+): (?<message>.*)$
    Time_Key    time
    Time_Format %Y/%m/%d %H:%M:%S

# Spring Boot log parser
[PARSER]
    Name        spring_boot
    Format      regex
    Regex       ^(?<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)\s+(?<level>\w+)\s+(?<pid>\d+)\s+---\s+\[(?<thread>[^\]]+)\]\s+(?<logger>\S+)\s+:\s+(?<message>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L

# Apache Combined log parser
[PARSER]
    Name        apache_combined
    Format      regex
    Regex       ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z

The same can be defined in YAML format.

# parsers.yaml
parsers:
  - name: nginx_error
    format: regex
    regex: '^(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>\w+)\] (?<pid>\d+).(?<tid>\d+): (?<message>.*)$'
    time_key: time
    time_format: '%Y/%m/%d %H:%M:%S'

  - name: spring_boot
    format: regex
    regex: '^(?<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)\s+(?<level>\w+)\s+(?<pid>\d+)\s+---\s+\[(?<thread>[^\]]+)\]\s+(?<logger>\S+)\s+:\s+(?<message>.*)$'
    time_key: time
    time_format: '%Y-%m-%dT%H:%M:%S.%L'

6.3 Multiline Parsers

Merges logs spanning multiple lines, such as Java Stack Traces, into a single record.

# Multiline parser definition
[MULTILINE_PARSER]
    Name          java_stacktrace
    Type          regex
    Flush_Timeout 1000
    # First line pattern: starts with timestamp
    Rule          "start_state"  "/^\d{4}-\d{2}-\d{2}/"  "cont"
    # Continuation line pattern: starts with whitespace or Caused by
    Rule          "cont"         "/^\s+|^Caused by:/"      "cont"

[MULTILINE_PARSER]
    Name          python_traceback
    Type          regex
    Flush_Timeout 1000
    Rule          "start_state"  "/^Traceback/"            "python_tb"
    Rule          "python_tb"    "/^\s+/"                   "python_tb"
    Rule          "python_tb"    "/^\w+Error/"              "end"

How to apply a multiline parser in Input:

pipeline:
  inputs:
    - name: tail
      tag: app.java
      path: /var/log/app/application.log
      multiline.parser: java_stacktrace
      read_from_head: true

6.4 Time_Key and Time_Format

Extracts the timestamp from log messages and uses it as the record time.

[PARSER]
    Name        custom_time
    Format      regex
    Regex       ^(?<time>[^ ]+) (?<message>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%LZ
    Time_Keep   On    # Keep the time field after conversion
    Time_Offset +0900  # KST timezone

6.5 Parser Testing and Debugging

The easiest way to verify parser behavior is to use the stdout Output.

# Configuration for parser testing
service:
  flush: 1
  log_level: debug

pipeline:
  inputs:
    - name: tail
      tag: test
      path: /tmp/test.log
      parser: my_custom_parser

  outputs:
    - name: stdout
      match: test
      format: json_lines

# Generate test log and verify
echo '2026-03-01T12:00:00.000Z ERROR [main] App - Connection failed' >> /tmp/test.log
fluent-bit -c test.yaml

7. Filter Plugin Details

Filter plugins are the intermediate stage for processing collected data. Multiple Filters can be chained in order to compose complex transformation pipelines.

7.1 kubernetes: Pod Metadata Enrichment

The most important filter in Kubernetes environments, automatically adding metadata such as Pod name, Namespace, Labels, and Annotations to container logs.

pipeline:
  filters:
    - name: kubernetes
      match: kube.*
      kube_url: https://kubernetes.default.svc:443
      kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      merge_log: on # Merge JSON logs to top level
      merge_log_key: log_parsed # Merge key name
      keep_log: off # Remove original log field
      k8s-logging.parser: on # Use parser settings from Pod annotations
      k8s-logging.exclude: on # Exclude logs via Pod annotations
      labels: on # Include Pod Labels
      annotations: off # Whether to include Pod Annotations
      buffer_size: 0 # API response buffer (0=unlimited)
      kube_meta_cache_ttl: 300 # Metadata cache TTL (seconds)
      use_kubelet: false # Whether to use kubelet API

Example record structure after enrichment:

{
  "log": "Connection established to database",
  "kubernetes": {
    "pod_name": "api-server-7d4b8c6f5-x2k9j",
    "namespace_name": "production",
    "pod_id": "abc-123-def",
    "container_name": "api-server",
    "container_image": "myregistry/api-server:v2.1.0",
    "labels": {
      "app": "api-server",
      "version": "v2.1.0",
      "team": "backend"
    },
    "host": "node-01"
  }
}

7.2 modify: Add/Remove/Rename Fields

pipeline:
  filters:
    - name: modify
      match: "*"
      # Add fields
      add: environment production
      add: cluster_name main-cluster
      # Rename fields
      rename: log message
      rename: stream source
      # Remove fields
      remove: unwanted_field
      # Conditional field addition (only when key doesn't exist)
      set: default_level INFO
      # Hard copy
      copy: source source_backup

7.3 grep: Regex-Based Filtering

Passes only records matching specific patterns or excludes them.

pipeline:
  filters:
    # Pass only ERROR or WARN levels
    - name: grep
      match: app.*
      regex: level (ERROR|WARN)

    # Exclude healthcheck paths
    - name: grep
      match: access.*
      exclude: path /health

    # Pass only specific namespaces
    - name: grep
      match: kube.*
      regex: $kubernetes['namespace_name'] ^(production|staging)$

    # Combine multiple conditions (AND)
    - name: grep
      match: app.*
      regex: level ERROR
      regex: message .*timeout.*

7.4 record_modifier: Record Modification

pipeline:
  filters:
    - name: record_modifier
      match: "*"
      record: hostname ${HOSTNAME}
      record: service_name my-application
      remove_key: unnecessary_field
      allowlist_key: timestamp
      allowlist_key: level
      allowlist_key: message

7.5 nest: Nested Structure Conversion

Converts flat structures to nested structures or vice versa.

pipeline:
  filters:
    # Nest: flat -> nested
    - name: nest
      match: '*'
      operation: nest
      wildcard: 'app_*'
      nest_under: application
      # app_name, app_version -> application: { name, version }

    # Lift: nested -> flat
    - name: nest
      match: '*'
      operation: lift
      nested_under: kubernetes
      # kubernetes: { pod_name, namespace } -> pod_name, namespace

7.6 lua: Custom Transformation with Lua Scripts

The most flexible transformation method, capable of implementing complex logic with Lua scripts.

pipeline:
  filters:
    - name: lua
      match: app.*
      script: /fluent-bit/scripts/transform.lua
      call: process_log

-- /fluent-bit/scripts/transform.lua

function process_log(tag, timestamp, record)
    -- Normalize log level
    if record["level"] then
        record["level"] = string.upper(record["level"])
    end

    -- Mask sensitive information
    if record["message"] then
        record["message"] = string.gsub(
            record["message"],
            "%d%d%d%d%-%d%d%d%d%-%d%d%d%d%-%d%d%d%d",
            "****-****-****-****"
        )
    end

    -- Add grade based on response time
    if record["response_time"] then
        local rt = tonumber(record["response_time"])
        if rt > 5000 then
            record["performance"] = "critical"
        elseif rt > 1000 then
            record["performance"] = "slow"
        else
            record["performance"] = "normal"
        end
    end

    -- Add timestamp field
    record["processed_at"] = os.date("!%Y-%m-%dT%H:%M:%SZ")

    -- 2 = MODIFIED
    return 2, timestamp, record
end

Lua callback function return values:

Code	Meaning
-1	Drop record
0	Keep original
1	Timestamp only modified
2	Record modified

7.7 rewrite_tag: Tag-Based Routing Changes

Dynamically changes Tags based on record content to route to different Outputs.

pipeline:
  filters:
    - name: rewrite_tag
      match: kube.*
      rule: $kubernetes['namespace_name'] ^(production)$ prod.$TAG false
      rule: $kubernetes['namespace_name'] ^(staging)$    stg.$TAG  false
      rule: $level                        ^(ERROR)$      alert.$TAG false

Rule syntax: rule: $KEY REGEX NEW_TAG KEEP_ORIGINAL

$KEY: Field to match
REGEX: Regex pattern
NEW_TAG: New Tag
KEEP_ORIGINAL: Whether to keep the original (true/false)

7.8 throttle: Log Rate Limiting

Protects the system by limiting excessive log generation.

pipeline:
  filters:
    - name: throttle
      match: app.*
      rate: 1000 # Allowed per window
      window: 5 # Window size (seconds)
      interval: 1s # Evaluation interval
      print_status: true # Print status

7.9 multiline: Multiline Log Merging

Merges multiline logs at the Filter stage (separate from Input multiline.parser).

pipeline:
  filters:
    - name: multiline
      match: app.*
      multiline.parser: java_stacktrace
      multiline.key_content: log

8. Output Plugin Details

Output plugins transmit processed data to final destinations. A single Fluent Bit instance can use multiple Outputs simultaneously.

8.1 elasticsearch / opensearch

pipeline:
  outputs:
    # Elasticsearch
    - name: es
      match: kube.*
      host: ${ES_HOST}
      port: 9200
      index: logs
      type: _doc
      http_user: ${ES_USER}
      http_passwd: ${ES_PASSWORD}
      logstash_format: on
      logstash_prefix: kube-logs
      logstash_dateformat: %Y.%m.%d
      time_key: '@timestamp'
      include_tag_key: true
      tag_key: fluentbit_tag
      generate_id: on # Generate deduplication ID
      buffer_size: 512KB
      tls: on
      tls.verify: on
      tls.ca_file: /certs/ca.pem
      retry_limit: 5
      workers: 2
      suppress_type_name: on # ES 8.x compatibility

    # OpenSearch
    - name: opensearch
      match: app.*
      host: opensearch.logging.svc
      port: 9200
      index: app-logs
      http_user: admin
      http_passwd: ${OPENSEARCH_PASSWORD}
      tls: on
      suppress_type_name: on
      trace_output: off

Key Configuration Options

Setting	Default	Description
`Host`	127.0.0.1	Elasticsearch host
`Port`	9200	Port number
`Index`	fluent-bit	Index name
`Logstash_Format`	Off	Use date-based indexing
`Logstash_Prefix`	logstash	Index prefix
`HTTP_User`	-	Basic Auth username
`HTTP_Passwd`	-	Basic Auth password
`TLS`	Off	Enable TLS
`Generate_ID`	Off	Auto-generate document ID
`Workers`	0	Number of parallel Workers
`Retry_Limit`	1	Retry count (False=unlimited)
`Suppress_Type_Name`	Off	Disable ES 8.x _type

8.2 loki: Grafana Loki Integration

pipeline:
  outputs:
    - name: loki
      match: kube.*
      host: loki-gateway.logging.svc
      port: 3100
      uri: /loki/api/v1/push
      tenant_id: my-tenant
      labels: job=fluent-bit
      label_keys: $kubernetes['namespace_name'],$kubernetes['pod_name'],$kubernetes['container_name']
      label_map_path: /fluent-bit/etc/loki-labelmap.json
      remove_keys: kubernetes,stream
      auto_kubernetes_labels: on
      line_format: json
      drop_single_key: on
      http_user: ${LOKI_USER}
      http_passwd: ${LOKI_PASSWORD}
      tls: on
      tls.verify: on
      workers: 2

Loki Label Map File Example

{
  "kubernetes": {
    "namespace_name": "namespace",
    "pod_name": "pod",
    "container_name": "container",
    "labels": {
      "app": "app"
    }
  },
  "stream": "stream"
}

8.3 s3: AWS S3 Storage

pipeline:
  outputs:
    - name: s3
      match: archive.*
      region: ap-northeast-2
      bucket: my-log-bucket
      s3_key_format: /logs/$TAG/%Y/%m/%d/%H/$UUID.gz
      s3_key_format_tag_delimiters: .
      total_file_size: 50M # Upload when this size is reached
      upload_timeout: 10m # Upload after this time even if size not met
      use_put_object: on
      compression: gzip
      content_type: application/gzip
      store_dir: /tmp/fluent-bit-s3 # Local buffer directory
      store_dir_limit_size: 512M
      retry_limit: 5
      # When using IAM roles (IRSA / EKS Pod Identity)
      role_arn: arn:aws:iam::123456789012:role/fluent-bit-s3
      # Static credentials (not recommended)
      # endpoint:
      # sts_endpoint:

8.4 kafka: Kafka Integration

pipeline:
  outputs:
    - name: kafka
      match: app.*
      brokers: kafka-0:9092,kafka-1:9092,kafka-2:9092
      topics: application-logs
      timestamp_key: '@timestamp'
      timestamp_format: iso8601
      format: json
      message_key: log
      queue_full_retries: 10
      rdkafka.request.required.acks: 1
      rdkafka.log.connection.close: false
      rdkafka.compression.codec: snappy

8.5 cloudwatch_logs: AWS CloudWatch

pipeline:
  outputs:
    - name: cloudwatch_logs
      match: kube.*
      region: ap-northeast-2
      log_group_name: /eks/my-cluster/containers
      log_group_template: /eks/my-cluster/$kubernetes['namespace_name']
      log_stream_prefix: from-fluent-bit-
      log_stream_template: $kubernetes['pod_name']
      auto_create_group: on
      extra_user_agent: fluent-bit
      retry_limit: 5
      workers: 1

8.6 stdout: Standard Output for Debugging

pipeline:
  outputs:
    - name: stdout
      match: '*'
      format: json_lines # json_lines | msgpack

8.7 forward: Forward to Fluentd

pipeline:
  outputs:
    - name: forward
      match: '*'
      host: fluentd-aggregator.logging.svc
      port: 24224
      time_as_integer: off
      send_options: true
      require_ack_response: true
      # TLS
      tls: on
      tls.verify: on
      tls.ca_file: /certs/ca.pem
      # Shared key authentication
      shared_key: my-shared-secret
      self_hostname: fluent-bit-node01

8.8 prometheus_exporter: Metrics Exposure

pipeline:
  outputs:
    - name: prometheus_exporter
      match: metrics.*
      host: 0.0.0.0
      port: 2021
      add_label: app fluent-bit

9. Buffer and Backpressure Management

In large-scale environments, log generation rates may exceed transmission rates. Properly configuring Fluent Bit Buffer and Backpressure management ensures stable operation without data loss.

9.1 Memory Buffer vs Filesystem Buffer

Category	Memory Buffer	Filesystem Buffer
Storage Location	RAM	Disk + RAM (hybrid)
Speed	Very fast	Relatively slower
Data Safety	Lost on process exit	Preserved on disk
Capacity Limit	Limited by RAM size	Expandable to disk size
Config Complexity	Simple	Additional config required
Best For	Development/testing	Production

9.2 Service-Level Configuration

service:
  flush: 1
  log_level: info
  # Enable filesystem buffer
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal # normal | full
  storage.checksum: off # Data integrity verification
  storage.max_chunks_up: 128 # Max chunks to keep in memory
  storage.backlog.mem_limit: 5M # Backlog memory limit
  storage.metrics: on # Enable storage metrics

9.3 Input-Level Buffer Configuration

pipeline:
  inputs:
    # Memory buffer (default)
    - name: tail
      tag: app.mem
      path: /var/log/app/*.log
      mem_buf_limit: 10MB # Memory buffer limit

    # Filesystem buffer
    - name: tail
      tag: app.fs
      path: /var/log/app/*.log
      storage.type: filesystem # filesystem | memory
      storage.pause_on_chunks_overlimit: off # Continue writing to disk when limit exceeded

9.4 Output-Level Buffer Configuration

pipeline:
  outputs:
    - name: es
      match: '*'
      host: elasticsearch
      port: 9200
      storage.total_limit_size: 1G # Per-Output filesystem buffer total limit
      retry_limit: 10 # Retry count
      # retry_limit: false          # Unlimited retries

9.5 Backpressure Mechanism

Backpressure occurs when the Output cannot transmit data fast enough. Fluent Bit handles Backpressure in the following stages:

  [Input: Data Collection]
         |
         v
  [Store in Memory Buffer]
         |
    Memory limit reached?
    /              \
   No              Yes
   |                |
   v                v
 [Continue]   storage.type=filesystem?
                /              \
              No               Yes
              |                 |
              v                 v
       [Pause Input]       [Write to Disk]
       (Risk of data loss)      |
                          Disk limit reached?
                          /              \
                        No              Yes
                        |                |
                        v                v
                   [Continue]     [Pause Input]

Recommended Backpressure Configuration (Production)

service:
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal
  storage.max_chunks_up: 128
  storage.backlog.mem_limit: 10M

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      storage.type: filesystem
      storage.pause_on_chunks_overlimit: off

  outputs:
    - name: es
      match: kube.*
      host: elasticsearch
      storage.total_limit_size: 5G
      retry_limit: false
      net.keepalive: on
      net.keepalive_idle_timeout: 30

9.6 Mem_Buf_Limit vs storage.max_chunks_up

Setting	Scope	Buffer Mode	Behavior
`Mem_Buf_Limit`	Per Input	memory only	Pauses Input when limit exceeded
`storage.max_chunks_up`	Global (SERVICE)	filesystem	Limits chunks in memory
`storage.total_limit_size`	Per Output	filesystem	Limits total disk buffer

Note: When using storage.type: filesystem, Mem_Buf_Limit has no effect. Instead, storage.max_chunks_up controls the number of chunks in memory.

10. Kubernetes Integration Complete Guide

10.1 DaemonSet Deployment Architecture

+---------------------------------------------------------------------+
|                      Kubernetes Cluster                             |
|                                                                     |
|  +-------------------+  +-------------------+  +-------------------+|
|  |     Node 1        |  |     Node 2        |  |     Node 3        ||
|  |                    |  |                    |  |                    ||
|  | +------+ +------+ |  | +------+ +------+ |  | +------+ +------+ ||
|  | |Pod A | |Pod B | |  | |Pod C | |Pod D | |  | |Pod E | |Pod F | ||
|  | +--+---+ +--+---+ |  | +--+---+ +--+---+ |  | +--+---+ +--+---+ ||
|  |    |        |      |  |    |        |      |  |    |        |      ||
|  |    v        v      |  |    v        v      |  |    v        v      ||
|  | /var/log/containers|  | /var/log/containers|  | /var/log/containers||
|  |         |          |  |         |          |  |         |          ||
|  |    +----+-----+    |  |    +----+-----+    |  |    +----+-----+    ||
|  |    |Fluent Bit|    |  |    |Fluent Bit|    |  |    |Fluent Bit|    ||
|  |    |(DaemonSet)|   |  |    |(DaemonSet)|   |  |    |(DaemonSet)|   ||
|  |    +----+-----+    |  |    +----+-----+    |  |    +----+-----+    ||
|  +---------|----------+  +---------|----------+  +---------|----------+|
|            |                       |                       |           |
|            +----------+------------+----------+------------+           |
|                       |                       |                        |
|               +-------v--------+     +--------v--------+              |
|               | Elasticsearch  |     |  Grafana Loki   |              |
|               +----------------+     +-----------------+              |
+---------------------------------------------------------------------+

10.2 Helm Chart Installation

# Add repository
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update

# Basic installation
helm install fluent-bit fluent/fluent-bit \
  --namespace logging \
  --create-namespace

# Verify installation
kubectl get pods -n logging -l app.kubernetes.io/name=fluent-bit
kubectl get ds -n logging

10.3 values.yaml Customization

# custom-values.yaml
kind: DaemonSet

image:
  repository: cr.fluentbit.io/fluent/fluent-bit
  tag: '3.2.2'
  pullPolicy: IfNotPresent

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi

tolerations:
  - operator: Exists # Deploy on all nodes (including master)

serviceAccount:
  create: true
  annotations:
    # IRSA (EKS)
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/fluent-bit

# Volume mounts
volumeMounts:
  - name: varlog
    mountPath: /var/log
  - name: varlibdockercontainers
    mountPath: /var/lib/docker/containers
    readOnly: true
  - name: etcmachineid
    mountPath: /etc/machine-id
    readOnly: true

volumes:
  - name: varlog
    hostPath:
      path: /var/log
  - name: varlibdockercontainers
    hostPath:
      path: /var/lib/docker/containers
  - name: etcmachineid
    hostPath:
      path: /etc/machine-id

# Environment variables
env:
  - name: ELASTICSEARCH_HOST
    value: 'elasticsearch-master.logging.svc.cluster.local'
  - name: ELASTICSEARCH_PORT
    value: '9200'

# Fluent Bit configuration
config:
  service: |
    [SERVICE]
        Flush         5
        Log_Level     info
        Daemon        off
        Parsers_File  /fluent-bit/etc/parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020
        storage.path  /var/log/fluent-bit/buffer/
        storage.sync  normal
        storage.max_chunks_up 128
        Hot_Reload    On

  inputs: |
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        multiline.parser  docker, cri
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10
        storage.type      filesystem

  filters: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log           On
        Keep_Log            Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
        Labels              On
        Annotations         Off

  outputs: |
    [OUTPUT]
        Name            es
        Match           kube.*
        Host            ${ELASTICSEARCH_HOST}
        Port            ${ELASTICSEARCH_PORT}
        Logstash_Format On
        Logstash_Prefix kube
        Retry_Limit     False
        Suppress_Type_Name On

# ServiceMonitor for Prometheus metrics collection
serviceMonitor:
  enabled: true
  interval: 30s
  scrapeTimeout: 10s

# Liveness/Readiness Probe
livenessProbe:
  httpGet:
    path: /
    port: http
readinessProbe:
  httpGet:
    path: /api/v1/health
    port: http

10.4 Container Log Paths and CRI Parsing

In Kubernetes, container logs are stored at the following paths:

/var/log/containers/<pod-name>_<namespace>_<container-name>-<container-id>.log
  -> symlink -> /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container-name>/0.log

CRI Log Format (containerd / CRI-O)

2026-03-01T12:00:00.123456789Z stdout F This is a log message
2026-03-01T12:00:00.123456789Z stderr P This is a partial log line

Field	Description
`2026-03-01T12:00:00.123456789Z`	RFC 3339 nanosecond timestamp
`stdout` / `stderr`	Stream type
`F` / `P`	Full / Partial log flag
Remainder	Log message body

Fluent Bit automatically detects and parses Docker format and CRI format when multiline.parser: docker, cri is specified.

10.5 Per-Namespace Log Routing

# Route to different Outputs based on Namespace
pipeline:
  filters:
    - name: kubernetes
      match: kube.*
      merge_log: on
      labels: on

    # Change tags for production namespace
    - name: rewrite_tag
      match: kube.*
      rule: $kubernetes['namespace_name'] ^(production)$ prod.$TAG false
      rule: $kubernetes['namespace_name'] ^(staging)$    stg.$TAG  false
      rule: $kubernetes['namespace_name'] ^(monitoring)$ mon.$TAG  false

  outputs:
    # Production logs -> Elasticsearch
    - name: es
      match: prod.*
      host: es-production.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: prod-logs

    # Staging logs -> Loki
    - name: loki
      match: stg.*
      host: loki.logging.svc
      port: 3100
      labels:
        env=staging

    # Monitoring logs -> S3 archiving
    - name: s3
      match: mon.*
      bucket: monitoring-logs-archive
      region: ap-northeast-2
      total_file_size: 100M
      upload_timeout: 10m
      compression: gzip

    # Other logs -> Default Elasticsearch
    - name: es
      match: kube.*
      host: es-default.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: default-logs

10.6 Multi-Tenant Log Separation

pipeline:
  filters:
    - name: kubernetes
      match: kube.*
      merge_log: on
      labels: on

    # Tag rewriting based on team Label
    - name: rewrite_tag
      match: kube.*
      rule: $kubernetes['labels']['team'] ^(backend)$  team.backend.$TAG  false
      rule: $kubernetes['labels']['team'] ^(frontend)$ team.frontend.$TAG false
      rule: $kubernetes['labels']['team'] ^(data)$     team.data.$TAG     false

  outputs:
    # Backend team -> Dedicated Elasticsearch index
    - name: es
      match: team.backend.*
      host: elasticsearch.logging.svc
      logstash_format: on
      logstash_prefix: team-backend

    # Frontend team -> Dedicated Loki tenant
    - name: loki
      match: team.frontend.*
      host: loki.logging.svc
      tenant_id: frontend-team
      labels:
        team=frontend

    # Data team -> S3 + Elasticsearch dual
    - name: es
      match: team.data.*
      host: elasticsearch.logging.svc
      logstash_format: on
      logstash_prefix: team-data

    - name: s3
      match: team.data.*
      bucket: data-team-logs
      region: ap-northeast-2
      compression: gzip

10.7 ServiceAccount and RBAC Configuration

# fluent-bit-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit
rules:
  - apiGroups: ['']
    resources:
      - namespaces
      - pods
      - pods/logs
      - nodes
      - nodes/proxy
    verbs: ['get', 'list', 'watch']
  - apiGroups: ['']
    resources:
      - events
    verbs: ['get', 'list', 'watch']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit
subjects:
  - kind: ServiceAccount
    name: fluent-bit
    namespace: logging

11. Practical Pipeline Examples

11.1 Example 1: K8s Logs to Elasticsearch + Kibana

The most common EFK (Elasticsearch + Fluent Bit + Kibana) stack configuration.

# fluent-bit-efk.yaml
service:
  flush: 5
  log_level: info
  parsers_file: /fluent-bit/etc/parsers.conf
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2020
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal
  storage.max_chunks_up: 128

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      multiline.parser: docker, cri
      db: /var/log/flb_kube.db
      mem_buf_limit: 5MB
      skip_long_lines: on
      refresh_interval: 10
      storage.type: filesystem

  filters:
    - name: kubernetes
      match: kube.*
      kube_url: https://kubernetes.default.svc:443
      kube_ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kube_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      merge_log: on
      keep_log: off
      k8s-logging.parser: on
      k8s-logging.exclude: on
      labels: on

    # Exclude kube-system and logging namespace logs
    - name: grep
      match: kube.*
      exclude: $kubernetes['namespace_name'] ^(kube-system|logging)$

    # Add environment info
    - name: modify
      match: kube.*
      add: cluster_name my-eks-cluster
      add: environment production

  outputs:
    - name: es
      match: kube.*
      host: elasticsearch-master.logging.svc.cluster.local
      port: 9200
      http_user: elastic
      http_passwd: ${ES_PASSWORD}
      logstash_format: on
      logstash_prefix: kube-logs
      logstash_dateformat: "%Y.%m.%d"
      time_key: "@timestamp"
      include_tag_key: true
      tag_key: fluentbit_tag
      suppress_type_name: on
      generate_id: on
      retry_limit: false
      workers: 2
      tls: on
      tls.verify: on
      tls.ca_file: /certs/es-ca.pem

11.2 Example 2: K8s Logs to Grafana Loki + Grafana

# fluent-bit-loki.yaml
service:
  flush: 5
  log_level: info
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2020

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      multiline.parser: docker, cri
      db: /var/log/flb_kube.db
      mem_buf_limit: 5MB
      skip_long_lines: on

  filters:
    - name: kubernetes
      match: kube.*
      merge_log: on
      keep_log: off
      labels: on
      annotations: off

  outputs:
    - name: loki
      match: kube.*
      host: loki-gateway.logging.svc.cluster.local
      port: 3100
      uri: /loki/api/v1/push
      labels: job=fluent-bit,cluster=my-cluster
      label_keys: $kubernetes['namespace_name'],$kubernetes['pod_name'],$kubernetes['container_name']
      auto_kubernetes_labels: on
      line_format: json
      workers: 2
      retry_limit: false

11.3 Example 3: S3 Archiving + Elasticsearch Dual Output

A pattern that simultaneously sends a single input to two Outputs.

# fluent-bit-dual.yaml
service:
  flush: 5
  log_level: info
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal
  storage.max_chunks_up: 128

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      multiline.parser: docker, cri
      db: /var/log/flb_kube.db
      storage.type: filesystem

  filters:
    - name: kubernetes
      match: kube.*
      merge_log: on
      keep_log: off
      labels: on

  outputs:
    # Elasticsearch for real-time search/analysis
    - name: es
      match: kube.*
      host: elasticsearch.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: kube-logs
      suppress_type_name: on
      retry_limit: false
      workers: 2

    # S3 archiving for long-term retention
    - name: s3
      match: kube.*
      region: ap-northeast-2
      bucket: log-archive-bucket
      s3_key_format: /kubernetes/$TAG/%Y/%m/%d/%H/$UUID.gz
      s3_key_format_tag_delimiters: .
      total_file_size: 100M
      upload_timeout: 10m
      compression: gzip
      content_type: application/gzip
      store_dir: /tmp/fluent-bit-s3
      store_dir_limit_size: 1G
      use_put_object: on
      retry_limit: 5
      workers: 1

11.4 Example 4: Per-Namespace Filtering and Multi-Destination Routing

# fluent-bit-routing.yaml
service:
  flush: 5
  log_level: info
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      multiline.parser: docker, cri
      db: /var/log/flb_kube.db
      storage.type: filesystem

  filters:
    - name: kubernetes
      match: kube.*
      merge_log: on
      labels: on

    # Per-Namespace tag rewriting
    - name: rewrite_tag
      match: kube.*
      rule: $kubernetes['namespace_name'] ^(production)$  route.prod.$TAG  false
      rule: $kubernetes['namespace_name'] ^(staging)$     route.stg.$TAG   false
      rule: $kubernetes['namespace_name'] ^(kube-system)$ route.sys.$TAG   false

    # Extract only ERRORs from production
    - name: rewrite_tag
      match: route.prod.*
      rule: $log ^.*ERROR.*$ alert.prod.$TAG true

  outputs:
    # All production logs -> Elasticsearch
    - name: es
      match: route.prod.*
      host: es-prod.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: prod-all
      suppress_type_name: on
      workers: 2

    # Production ERROR alerts -> Separate index
    - name: es
      match: alert.prod.*
      host: es-prod.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: prod-alerts
      suppress_type_name: on

    # Staging logs -> Loki (cost-efficient)
    - name: loki
      match: route.stg.*
      host: loki.logging.svc
      port: 3100
      labels:
        env=staging
      auto_kubernetes_labels: on
      line_format: json

    # System logs -> S3 archiving (long-term retention)
    - name: s3
      match: route.sys.*
      bucket: system-logs-archive
      region: ap-northeast-2
      total_file_size: 50M
      upload_timeout: 15m
      compression: gzip
      s3_key_format: /kube-system/%Y/%m/%d/$UUID.gz

    # Unclassified logs -> Default Elasticsearch
    - name: es
      match: kube.*
      host: es-default.logging.svc
      port: 9200
      logstash_format: on
      logstash_prefix: default-logs
      suppress_type_name: on

12. Performance Tuning

12.1 Workers Configuration

Setting the Workers parameter on Output plugins enables parallel transmission. Each Worker operates as an independent thread.

pipeline:
  outputs:
    - name: es
      match: '*'
      host: elasticsearch
      port: 9200
      workers: 4 # 4 parallel Workers
      net.keepalive: on
      net.keepalive_idle_timeout: 30

Workers Configuration Guidelines

Scenario	Recommended Workers	Notes
Low throughput	0-1	Default is sufficient
Medium throughput	2-4	Most production environments
High throughput	4-8	Consider CPU core count
Very high throughput	8 or more	Check network/destination bottleneck

12.2 Flush Interval Optimization

The Flush value in seconds determines how often data is transmitted from the buffer to the Output.

service:
  flush: 1 # Flush every 1 second (prioritize real-time)
  # flush: 5  # Flush every 5 seconds (prioritize throughput)

Flush Value	Characteristics
1 second	Low latency, higher CPU usage
5 seconds	Balanced setting (default recommended)
10+ seconds	Higher throughput, larger batch size

12.3 Buffer Size Adjustment

pipeline:
  inputs:
    - name: tail
      path: /var/log/containers/*.log
      # High throughput environment
      buffer_chunk_size: 512KB # Chunk unit size (default 32KB)
      buffer_max_size: 5MB # Maximum buffer size (default 32KB)
      mem_buf_limit: 50MB # Memory buffer limit

  outputs:
    - name: es
      match: '*'
      buffer_size: 512KB # HTTP buffer size

12.4 Pipeline Parallelization

In environments requiring high throughput, Input and Output can be separated to configure parallel pipelines.

pipeline:
  inputs:
    # Application log pipeline
    - name: tail
      tag: app.*
      path: /var/log/containers/app-*.log
      multiline.parser: docker, cri
      threaded: on # Run in a separate thread

    # System log pipeline
    - name: tail
      tag: sys.*
      path: /var/log/containers/kube-*.log
      multiline.parser: docker, cri
      threaded: on

  outputs:
    - name: es
      match: app.*
      host: elasticsearch
      workers: 4

    - name: es
      match: sys.*
      host: elasticsearch
      workers: 2

12.5 Hot Reload

The Hot Reload feature, supported since v2.1, allows configuration reloading without service interruption.

# Enable Hot Reload
service:
  hot_reload: on

There are three ways to trigger a reload:

# Method 1: SIGHUP signal
kill -SIGHUP $(pidof fluent-bit)

# Method 2: HTTP API (v4.0+)
curl -X POST http://localhost:2020/api/v2/reload

# Method 3: Command line option
fluent-bit -c fluent-bit.yaml -Y   # --enable-hot-reload

Hot Reload Considerations

Data in the buffer is preserved
For filesystem buffers, data on disk is also maintained
If the new configuration has syntax errors, the reload fails and the existing configuration is retained
SIGHUP is not supported on Windows

12.6 Memory Usage Monitoring

# Check Fluent Bit process memory
ps aux | grep fluent-bit

# Check internal metrics via HTTP API
curl -s http://localhost:2020/api/v1/storage | jq .

# Check via Prometheus metrics
curl -s http://localhost:2020/api/v2/metrics/prometheus | grep fluentbit_input

12.7 Performance Benchmark Reference

The following are reference performance figures for Fluent Bit in typical environments (results vary depending on hardware, network, and configuration).

Scenario	Throughput	CPU Usage	Memory
Simple forwarding (tail input, stdout output)	~100K events/s	5-10% (1 core)	5-10MB
K8s Filter + ES output	~40-60K events/s	15-25% (1 core)	30-50MB
K8s Filter + Lua transform + ES output	~20-40K events/s	25-40% (1 core)	40-80MB
Complex pipeline (multiple Filters + dual output)	~15-30K events/s	30-50% (1 core)	50-100MB

v4.1 SIMD optimization: JSON encoding is processed with SIMD (Single Instruction, Multiple Data) instructions, improving JSON conversion performance by 2.5x.

13. Monitoring and Observability

13.1 Built-in HTTP Monitoring Endpoints

Fluent Bit can monitor its own status through a built-in HTTP server.

service:
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2020
  health_check: on
  hc_errors_count: 5 # Unhealthy when this many errors occur
  hc_retry_failure_count: 5 # Unhealthy when this many retry failures occur
  hc_period: 60 # Health check interval (seconds)

Available Endpoints

Endpoint	Description
`/`	Fluent Bit build information
`/api/v1/health`	Health check (200=OK, 500=Unhealthy)
`/api/v1/metrics`	JSON format metrics
`/api/v1/metrics/prometheus`	Prometheus format metrics
`/api/v2/metrics`	v2 metrics endpoint
`/api/v2/metrics/prometheus`	v2 Prometheus metrics
`/api/v1/storage`	Storage/buffer status
`/api/v2/reload`	Hot Reload trigger (POST)
`/api/v1/uptime`	Uptime

# Health check
curl -s http://localhost:2020/api/v1/health
# Response: ok (HTTP 200) or error (HTTP 500)

# Check storage status
curl -s http://localhost:2020/api/v1/storage | jq .
# {
#   "storage_layer": {
#     "chunks": {
#       "total_chunks": 15,
#       "mem_chunks": 10,
#       "fs_chunks": 5,
#       "fs_chunks_up": 3,
#       "fs_chunks_down": 2
#     }
#   }
# }

# Check Prometheus metrics
curl -s http://localhost:2020/api/v2/metrics/prometheus

13.2 Prometheus Metrics Collection Configuration

A configuration for collecting Fluent Bit internal metrics with Prometheus and visualizing them in Grafana.

# Expose Fluent Bit's own metrics to Prometheus
pipeline:
  inputs:
    - name: fluentbit_metrics
      tag: fb.metrics
      scrape_interval: 30

  outputs:
    - name: prometheus_exporter
      match: fb.metrics
      host: 0.0.0.0
      port: 2021

Kubernetes ServiceMonitor Configuration

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: fluent-bit
  namespace: logging
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: fluent-bit
  endpoints:
    - port: http
      path: /api/v2/metrics/prometheus
      interval: 30s
      scrapeTimeout: 10s
  namespaceSelector:
    matchNames:
      - logging

Key Prometheus Metrics

Metric	Type	Description
`fluentbit_input_records_total`	Counter	Records collected per Input
`fluentbit_input_bytes_total`	Counter	Bytes collected per Input
`fluentbit_output_proc_records_total`	Counter	Records processed per Output
`fluentbit_output_proc_bytes_total`	Counter	Bytes processed per Output
`fluentbit_output_errors_total`	Counter	Errors per Output
`fluentbit_output_retries_total`	Counter	Retries per Output
`fluentbit_output_retries_failed_total`	Counter	Failed retries per Output
`fluentbit_filter_records_total`	Counter	Records processed per Filter
`fluentbit_uptime`	Gauge	Uptime (seconds)
`fluentbit_storage_chunks`	Gauge	Number of storage chunks

13.3 Grafana Dashboards

Key panels to include when building a Fluent Bit monitoring dashboard in Grafana:

Dashboard Panel Configuration

Panel	PromQL Example	Purpose
Input Throughput	`rate(fluentbit_input_records_total[5m])`	Records collected per second
Output Throughput	`rate(fluentbit_output_proc_records_total[5m])`	Records transmitted per second
Output Error Rate	`rate(fluentbit_output_errors_total[5m])`	Errors per second
Retry Rate	`rate(fluentbit_output_retries_total[5m])`	Retry trends
Buffer Usage	`fluentbit_storage_chunks`	Current buffer chunks
Uptime	`fluentbit_uptime`	Process stability
Input/Output Delta	`rate(input[5m]) - rate(output[5m])`	Backpressure detection

Import Grafana Labs community dashboard (ID: 7752) for a quick setup of a basic Fluent Bit monitoring dashboard.

14. Troubleshooting Guide

14.1 When Logs Are Not Being Collected

Checklist

# 1. Check Fluent Bit process status
kubectl get pods -n logging -l app.kubernetes.io/name=fluent-bit
kubectl logs -n logging <fluent-bit-pod> --tail=50

# 2. Validate configuration file syntax
fluent-bit -c fluent-bit.yaml --dry-run

# 3. Verify log file paths
kubectl exec -n logging <fluent-bit-pod> -- ls -la /var/log/containers/

# 4. Check DB file (offset)
kubectl exec -n logging <fluent-bit-pod> -- ls -la /var/log/flb_kube.db

# 5. Check permissions
kubectl exec -n logging <fluent-bit-pod> -- cat /var/log/containers/<target-log>

Common Causes and Solutions

Cause	Symptom	Solution
Wrong file path	Record count 0 at Input	Verify `Path`, check wildcard patterns
Insufficient perms	Permission denied error	Check SecurityContext, hostPath permissions
Corrupt DB file	Offset points to end of file	Delete DB file and restart
Parser mismatch	Parsing failure, empty records	Verify original with `stdout` Output
Tag match failure	Not reaching Filter/Output	Verify `Match` pattern matches `Tag`

14.2 Memory Leak / OOM

# Memory limit configuration
pipeline:
  inputs:
    - name: tail
      path: /var/log/containers/*.log
      mem_buf_limit: 5MB # Input memory limit
      skip_long_lines: on # Skip long lines
      buffer_chunk_size: 32KB # Chunk size limit
      buffer_max_size: 32KB # Maximum buffer

service:
  storage.path: /var/log/fluent-bit/buffer/
  storage.max_chunks_up: 64 # Limit memory chunk count (default 128)

OOM Prevention Checklist

Verify Mem_Buf_Limit or storage.max_chunks_up is configured
Enable Skip_Long_Lines: On
Remove unnecessary Filters (especially when Lua creates large tables)
Check Kubernetes Filter Buffer_Size (0=unlimited)
Set appropriate Kubernetes Resource Limits

14.3 Log Loss Due to Backpressure

# Check for Backpressure
curl -s http://localhost:2020/api/v1/storage | jq .

# Check via metrics
curl -s http://localhost:2020/api/v2/metrics/prometheus | grep -E "retries|errors|dropped"

Response Steps When Backpressure Occurs

Increase Output Workers count for parallel transmission
Reduce Flush interval for more frequent transmission (e.g., 5 to 1)
Switch to storage.type: filesystem for disk buffer utilization
Set storage.total_limit_size sufficiently
Check and scale up destination (Elasticsearch, etc.) processing capacity

# Backpressure response configuration example
service:
  flush: 1
  storage.path: /var/log/fluent-bit/buffer/
  storage.sync: normal
  storage.max_chunks_up: 256

pipeline:
  inputs:
    - name: tail
      tag: kube.*
      path: /var/log/containers/*.log
      storage.type: filesystem
      storage.pause_on_chunks_overlimit: off

  outputs:
    - name: es
      match: kube.*
      host: elasticsearch
      port: 9200
      workers: 4
      storage.total_limit_size: 10G
      retry_limit: false
      net.keepalive: on
      net.keepalive_idle_timeout: 15

14.4 TLS/Authentication Errors

# TLS connection test
kubectl exec -n logging <fluent-bit-pod> -- \
  curl -v --cacert /certs/ca.pem https://elasticsearch:9200

# Check certificate expiration
kubectl exec -n logging <fluent-bit-pod> -- \
  openssl x509 -in /certs/ca.pem -noout -enddate

Common TLS Errors and Solutions

Error	Cause	Solution
`SSL_ERROR_SYSCALL`	Certificate path error	Verify `tls.ca_file` path
`certificate verify failed`	CA certificate mismatch	Use correct CA certificate
`certificate has expired`	Certificate expired	Renew certificate
`connection refused`	Port/host error	Verify Host, Port, TLS port
`401 Unauthorized`	Authentication failure	Verify `http_user`, `http_passwd`

14.5 Debugging Methods

Step 1: Set Log Level to debug

service:
  log_level: debug # error, warn, info, debug, trace

Step 2: Add stdout Output

pipeline:
  outputs:
    # For debugging: all records to standard output
    - name: stdout
      match: '*'
      format: json_lines

    # Actual Output
    - name: es
      match: '*'
      host: elasticsearch

Step 3: Stage-by-Stage Pipeline Verification

# Test Input only
fluent-bit -i tail -p path=/var/log/test.log -o stdout

# Test Parser
fluent-bit -i tail -p path=/var/log/test.log -p parser=json -o stdout

# Test full configuration (dry-run)
fluent-bit -c fluent-bit.yaml --dry-run

Step 4: Real-Time Log Viewing in Kubernetes

# Stream Fluent Bit logs
kubectl logs -n logging -l app.kubernetes.io/name=fluent-bit -f --tail=100

# Specific Pod logs
kubectl logs -n logging fluent-bit-xxxxx -f

# Previous container logs (on crash)
kubectl logs -n logging fluent-bit-xxxxx --previous

15. Operational Best Practices

15.1 Production Checklist

Item	Recommended Setting	Reason
`storage.type`	filesystem	Prevent data loss
`storage.path`	Separate volume mount	Isolate disk I/O
`storage.total_limit_size`	50-70% of free disk space	Prevent disk full
`Retry_Limit`	false (unlimited) or sufficient	Recover from transient failures
`Workers`	2-4	Parallel transmission performance
`Hot_Reload`	on	Zero-downtime config changes
`HTTP_Server`	on	Enable monitoring
`health_check`	on	Kubernetes Probe integration
`Skip_Long_Lines`	on	Prevent failures from abnormal logs
Resource Limits	Appropriate CPU/Memory	Prevent OOM

15.2 Security Recommendations

# Kubernetes Pod Security
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
  capabilities:
    drop: ['ALL']

# Mount only required volumes
volumes:
  - name: varlog
    hostPath:
      path: /var/log
      type: ''
  - name: buffer
    emptyDir:
      sizeLimit: 2Gi

# Enforce TLS (Output)
pipeline:
  outputs:
    - name: es
      tls: on
      tls.verify: on
      tls.ca_file: /certs/ca.pem

15.3 Log Rotation Handling

Fluent Bit's tail Input automatically detects log rotation. However, the following settings should be verified:

pipeline:
  inputs:
    - name: tail
      path: /var/log/app/*.log
      db: /var/log/flb_app.db # Required: offset tracking
      rotate_wait: 5 # Wait time after rotation (seconds)
      refresh_interval: 10 # File list refresh interval

16. References

Official Documentation and Repositories

Fluent Bit Official Documentation - Complete configuration reference and guides
Fluent Bit GitHub - Source code and issue tracker
Fluent Bit Helm Charts - Helm Charts for Kubernetes deployment
Fluent Bit Performance Tools - Benchmark tools

CNCF Resources

Comparison and Deep Dive Resources

Quiz

Q1: What is the main topic covered in "Fluent Bit Complete Guide: Architecture, Configuration, and Kubernetes Integration for the Lightweight Log Processor"?

A comprehensive guide covering Fluent Bit lightweight architecture and pipeline (Input, Parser, Filter, Buffer, Output) structure, Kubernetes DaemonSet deployment, log routing, parser configuration, Elasticsearch/Loki/S3 integration, Fluentd comparison, performance tuning, and pr...

Q2: Describe the Core Architecture: Pipeline Structure.

2.1 Full Pipeline Flow The Fluent Bit data processing pipeline consists of the following stages: 2.2 Role of Each Stage Input The entry point for data. It collects data from various sources such as files, system journals, network sockets, and Kubernetes events.

Q3: What are the key steps for Installation Methods?

3.1 Linux (apt/yum) Ubuntu/Debian (apt) CentOS/RHEL (yum) 3.2 Docker 3.3 macOS (Homebrew) 3.4 Kubernetes (Helm Chart) 3.5 Direct Binary Installation

Q4: What are the key steps for Configuration File Structure?

Fluent Bit supports two configuration formats: Classic mode (INI format) and YAML mode. YAML has been the standard configuration format since v3.2, and Classic mode is scheduled for deprecation by the end of 2025.

Q5: How does Input Plugin Details work?

Input plugins are the starting point for data collection. Each Input is assigned a unique Tag that serves as the matching criterion for subsequent Filters and Outputs.