Linux Kernel Parameter Tuning Guide: sysctl + Boot Params, Safe Changes and Rollback
Introduction
Kernel parameter tuning is a task that directly impacts server performance and stability. A single incorrect value can trigger OOM kills, drop network connections, or create security vulnerabilities.
This article distinguishes between sysctl (runtime parameters) and boot parameters (kernel command line), covering the meaning, recommended values, and application methods for each, and presents safe change procedures and rollback strategies for production environments.
1. Two Paths for Kernel Parameters
| Category | sysctl (Runtime) | Boot Params (Boot Time) |
|---|---|---|
| When Applied | Immediately (no reboot) | At next boot |
| Config File | /etc/sysctl.d/*.conf | /etc/default/grub -> grub.cfg |
| Check Command | sysctl <param> | cat /proc/cmdline |
| Persist | sysctl.d + sysctl -p | grub2-mkconfig / update-grub |
| Rollback | Restore previous values | Select previous GRUB entry |
| Scope | Items under /proc/sys/ | All kernel boot options |
2. Safe Change Procedures (Production Protocol)
2.1 Pre-Change Checklist
#!/usr/bin/env bash
# pre-tuning-check.sh - Back up state before tuning
BACKUP_DIR="/root/kernel-tuning-backup/$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BACKUP_DIR"
# 1. Full sysctl dump
sysctl -a > "$BACKUP_DIR/sysctl-before.txt" 2>/dev/null
# 2. Current boot parameters
cat /proc/cmdline > "$BACKUP_DIR/cmdline-before.txt"
# 3. GRUB config backup
cp /etc/default/grub "$BACKUP_DIR/grub-before"
[[ -d /etc/sysctl.d ]] && cp -r /etc/sysctl.d "$BACKUP_DIR/sysctl.d-before"
# 4. System state snapshot
free -h > "$BACKUP_DIR/memory-before.txt"
ss -s > "$BACKUP_DIR/socket-stats-before.txt"
vmstat 1 5 > "$BACKUP_DIR/vmstat-before.txt"
cat /proc/net/sockstat > "$BACKUP_DIR/sockstat-before.txt"
echo "Backup complete: $BACKUP_DIR"
2.2 Change Steps
1. Test in staging environment
2. Apply to 1 canary server -> Monitor (minimum 24 hours)
3. If no issues, rolling apply by group
4. Compare metrics after application (before vs after)
2.3 Rollback Procedure
# sysctl rollback - Restore specific parameter from backup
# ("latest" is assumed to be a symlink pointing at the newest timestamped backup dir)
PARAM="net.core.somaxconn"
OLD_VALUE=$(grep "^${PARAM} " /root/kernel-tuning-backup/latest/sysctl-before.txt | awk '{print $3}')
sysctl -w "${PARAM}=${OLD_VALUE}"
# Full sysctl rollback
while IFS='= ' read -r key value; do
    sysctl -w "${key}=${value}" 2>/dev/null
done < /root/kernel-tuning-backup/latest/sysctl-before.txt
# Boot parameter rollback - Restore previous GRUB config
cp /root/kernel-tuning-backup/latest/grub-before /etc/default/grub
grub2-mkconfig -o /boot/grub2/grub.cfg # RHEL
# update-grub # Ubuntu
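Before running the full-rollback loop above on a production host, it can be safer to generate the commands first and review them. The sketch below is not from the original; gen_rollback is a made-up helper name, and the splitting on " = " deliberately preserves multi-value parameters such as net.ipv4.tcp_rmem:

```shell
#!/usr/bin/env bash
# gen-rollback.sh - print (do not run) the sysctl commands needed to restore a dump
gen_rollback() {
    # Each dump line looks like "key = value"; the value may itself contain
    # spaces (multi-value parameters), so split only on the first " = ".
    while IFS= read -r line; do
        case "$line" in
            *" = "*)
                key="${line%% = *}"
                value="${line#* = }"
                printf 'sysctl -w "%s=%s"\n' "$key" "$value"
                ;;
        esac
    done
}

# Example against a two-line sample dump:
printf 'net.core.somaxconn = 4096\nnet.ipv4.tcp_rmem = 4096 131072 6291456\n' | gen_rollback
```

Pipe the real backup file through the helper, inspect the output, then feed it to a shell once it looks right.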
3. Network Tuning
3.1 TCP Connection Management
# /etc/sysctl.d/10-network.conf
# TCP backlog - Essential for high-traffic servers
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
# TCP socket buffers (bytes)
# min / default / max
net.core.rmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_default = 262144
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 262144 16777216
net.ipv4.tcp_wmem = 4096 262144 16777216
# TCP congestion control
net.ipv4.tcp_congestion_control = bbr
net.core.default_qdisc = fq
# TIME_WAIT management
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_max_tw_buckets = 2000000
# Keepalive (servers behind load balancers)
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6
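The keepalive stanza determines how long a dead peer can go unnoticed: probing starts after tcp_keepalive_time seconds of idle, then tcp_keepalive_probes probes are sent tcp_keepalive_intvl seconds apart. With the values above:

```shell
# Worst-case dead-peer detection time = idle threshold + (probes * interval)
time_s=60
intvl_s=10
probes=6
total=$(( time_s + probes * intvl_s ))
echo "dead peer detected after at most ${total}s"   # 120s
```

Keep this total below the load balancer's idle timeout, or the balancer will silently drop the connection before the kernel ever notices.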
3.2 Network Parameter Reference Table
| Parameter | Default | Recommended | Description |
|---|---|---|---|
| net.core.somaxconn | 4096 (128 before kernel 5.4) | 65535 | Max listen() backlog |
| net.ipv4.tcp_max_syn_backlog | 1024 | 65535 | Max SYN queue size |
| net.core.rmem_max | 212992 | 16777216 (16 MB) | Max receive socket buffer |
| net.core.wmem_max | 212992 | 16777216 (16 MB) | Max send socket buffer |
| net.ipv4.tcp_congestion_control | cubic | bbr | Congestion control algorithm |
| net.ipv4.tcp_tw_reuse | 2 (loopback only, since 4.12) | 1 | TIME_WAIT socket reuse |
| net.ipv4.tcp_fin_timeout | 60 | 15 | FIN-WAIT-2 timeout |
| net.ipv4.tcp_keepalive_time | 7200 | 60 | Keepalive start time (sec) |
| net.ipv4.ip_local_port_range | 32768-60999 | 1024-65535 | Outbound port range |
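The 16 MB buffer maximums in the table roughly track a bandwidth-delay product (BDP): a TCP connection needs at least bandwidth x RTT of buffer to keep the pipe full. A sketch of the arithmetic with assumed link figures (1 Gbit/s, 100 ms RTT; neither number is from the original):

```shell
# BDP = (bandwidth in bytes/s) * RTT in seconds
bw_bits_per_s=1000000000    # assumed 1 Gbit/s path
rtt_ms=100                  # assumed 100 ms round trip
bdp_bytes=$(( bw_bits_per_s / 8 * rtt_ms / 1000 ))
echo "BDP: ${bdp_bytes} bytes"   # 12500000 (~12 MB); 16777216 leaves headroom
```

Short-RTT intra-datacenter traffic needs far less; oversizing buffers on thousands of sockets mostly wastes memory.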
3.3 Enabling BBR
# Load BBR kernel module
modprobe tcp_bbr
echo "tcp_bbr" >> /etc/modules-load.d/bbr.conf
# Apply sysctl
sysctl -w net.core.default_qdisc=fq
sysctl -w net.ipv4.tcp_congestion_control=bbr
# Verify
sysctl net.ipv4.tcp_congestion_control
# net.ipv4.tcp_congestion_control = bbr
BBR vs CUBIC: BBR estimates available bandwidth and RTT instead of reacting to packet loss, which significantly improves throughput, especially on long-distance, high-latency networks.
4. Memory Tuning
4.1 Virtual Memory Management
# /etc/sysctl.d/20-memory.conf
# Swap tendency (0=minimal, 100=aggressive)
# DB servers: 1-10, Web servers: 10-30
vm.swappiness = 10
# Dirty page ratio - disk write delay
vm.dirty_ratio = 40 # Max dirty page ratio relative to total memory
vm.dirty_background_ratio = 10 # Background flush start ratio
# OOM-related
vm.overcommit_memory = 0 # 0=default(heuristic), 1=always allow, 2=restrict
vm.panic_on_oom = 0 # Whether to panic on OOM (0=run OOM Killer)
# Max memory map areas (Elasticsearch, MongoDB, etc.)
vm.max_map_count = 262144
# Filesystem cache release (emergency only)
# echo 3 > /proc/sys/vm/drop_caches # 1=pagecache, 2=dentries+inodes, 3=all
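Because the dirty_* knobs are percentages of total memory, their absolute effect scales with RAM; on large boxes a high vm.dirty_ratio can let tens of GiB of dirty pages pile up before writers block, causing long I/O stalls when the flush finally happens. A sketch with an assumed 64 GiB host (the size is illustrative, not from the original):

```shell
mem_gib=64          # assumed machine size
dirty_bg=10         # vm.dirty_background_ratio
dirty_ratio=40      # vm.dirty_ratio
echo "background flush starts at: $(( mem_gib * dirty_bg / 100 )) GiB dirty"
echo "writers block at:           $(( mem_gib * dirty_ratio / 100 )) GiB dirty"
```

On very large hosts, vm.dirty_bytes / vm.dirty_background_bytes can be used instead to set absolute limits.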
4.2 Memory Parameter Guide
| Parameter | DB Server | Web/API Server | ML Workload |
|---|---|---|---|
| vm.swappiness | 1~5 | 10~30 | 1 |
| vm.dirty_ratio | 40 | 20 | 40 |
| vm.dirty_background_ratio | 10 | 5 | 10 |
| vm.overcommit_memory | 0 | 0 | 1 |
| vm.max_map_count | 262144 | 65530 | 262144 |
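For vm.overcommit_memory = 2 the kernel enforces a hard ceiling: CommitLimit = swap + overcommit_ratio% of RAM (vm.overcommit_ratio defaults to 50). A worked example with assumed sizes:

```shell
ram_mib=65536       # assumed 64 GiB RAM
swap_mib=8192       # assumed 8 GiB swap
ratio=50            # vm.overcommit_ratio default
echo "CommitLimit: $(( swap_mib + ram_mib * ratio / 100 )) MiB"
```

Compare the result against the CommitLimit line in /proc/meminfo; allocations beyond it fail with ENOMEM rather than triggering the OOM killer later.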
4.3 Huge Pages
# Transparent Huge Pages (THP) - Recommended to disable for DB
# Set via boot parameter
# Add to GRUB_CMDLINE_LINUX:
# transparent_hugepage=never
# Runtime check/change
cat /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/enabled
# Static Huge Pages (Oracle DB, DPDK, etc.)
# /etc/sysctl.d/20-memory.conf
vm.nr_hugepages = 1024 # 2MB * 1024 = 2GB
# Verify
grep -i huge /proc/meminfo
5. Filesystem and I/O Tuning
# /etc/sysctl.d/30-fs.conf
# Max open files (system-wide)
fs.file-max = 2097152
# inotify watch limits (IDE, file watch services)
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 8192
# AIO (Asynchronous I/O) max request count
fs.aio-max-nr = 1048576
ulimit Integration
# /etc/security/limits.d/99-app.conf
# Effective only together with sysctl's fs.file-max; the per-process nofile
# hard limit is additionally capped by fs.nr_open (default 1048576)
* soft nofile 1048576
* hard nofile 1048576
* soft nproc 65535
* hard nproc 65535
* soft memlock unlimited
* hard memlock unlimited
I/O Scheduler Configuration
# Check current scheduler
cat /sys/block/sda/queue/scheduler
# SSD: none or mq-deadline recommended
echo mq-deadline > /sys/block/sda/queue/scheduler
# Persistent config (udev rule)
# /etc/udev/rules.d/60-scheduler.rules
# ACTION=="add|change", KERNEL=="sd*", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="mq-deadline"
# ACTION=="add|change", KERNEL=="sd*", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="bfq"
| Scheduler | Disk Type | Characteristics |
|---|---|---|
| none (formerly noop) | NVMe SSD | Minimal overhead |
| mq-deadline | SATA SSD | Guaranteed latency |
| bfq | HDD | Fair bandwidth allocation |
| kyber | Fast SSD | Read/write latency control |
6. Security-Related Parameters
# /etc/sysctl.d/40-security.conf
# ASLR (Address Space Layout Randomization)
kernel.randomize_va_space = 2 # 0=off, 1=partial, 2=full
# SysRq restriction (allow emergency recovery only)
kernel.sysrq = 176 # Bitmask: sync + remount-ro + reboot
# Core dump restriction
kernel.core_pattern = |/bin/false
fs.suid_dumpable = 0
# dmesg access restriction
kernel.dmesg_restrict = 1
# Kernel pointer hiding
kernel.kptr_restrict = 2
# BPF restriction (unprivileged users)
kernel.unprivileged_bpf_disabled = 1
# Network security
net.ipv4.conf.all.rp_filter = 1 # Reverse Path Filtering
net.ipv4.conf.all.accept_redirects = 0 # Reject ICMP redirects
net.ipv4.conf.all.send_redirects = 0
net.ipv6.conf.all.accept_redirects = 0
net.ipv4.conf.all.accept_source_route = 0 # Reject Source Routing
net.ipv4.conf.all.log_martians = 1 # Log suspicious packets
net.ipv4.icmp_echo_ignore_broadcasts = 1 # Smurf attack defense
# IP forwarding (disable unless router/container host)
net.ipv4.ip_forward = 0
# For container hosts:
# net.ipv4.ip_forward = 1
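The kernel.sysrq value 176 above is a bitmask (16 = sync, 32 = remount read-only, 64 = signal processes, 128 = reboot/poweroff). A small decoder makes the comment verifiable; decode_sysrq is a made-up helper name:

```shell
decode_sysrq() {
    # Print the SysRq functions enabled by the given bitmask
    local mask=$1
    [ $(( mask & 16 ))  -ne 0 ] && echo "sync"
    [ $(( mask & 32 ))  -ne 0 ] && echo "remount-ro"
    [ $(( mask & 64 ))  -ne 0 ] && echo "signal-procs"
    [ $(( mask & 128 )) -ne 0 ] && echo "reboot-poweroff"
    return 0
}

decode_sysrq 176    # -> sync, remount-ro, reboot-poweroff
```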
Security Parameter Checklist
| Parameter | Secure Value | CIS Benchmark | Notes |
|---|---|---|---|
| kernel.randomize_va_space | 2 | Required | Full ASLR enabled |
| kernel.dmesg_restrict | 1 | Recommended | Block dmesg for regular users |
| kernel.kptr_restrict | 2 | Recommended | Prevent kernel address exposure |
| net.ipv4.conf.all.rp_filter | 1 | Required | Prevent IP spoofing |
| net.ipv4.conf.all.accept_redirects | 0 | Required | Prevent MITM attacks |
| net.ipv4.conf.all.log_martians | 1 | Recommended | Audit abnormal packets |
| fs.suid_dumpable | 0 | Required | Prevent SUID core dumps |
7. Boot Parameters (Kernel Command Line)
7.1 Configuration Method
# Check current boot parameters
cat /proc/cmdline
# RHEL / Rocky
vi /etc/default/grub
# GRUB_CMDLINE_LINUX="... parameters_to_add"
grub2-mkconfig -o /boot/grub2/grub.cfg
# Ubuntu
vi /etc/default/grub
# GRUB_CMDLINE_LINUX_DEFAULT="... parameters_to_add"
update-grub
7.2 Key Boot Parameters
| Parameter | Value | Purpose |
|---|---|---|
| transparent_hugepage=never | never / always / madvise | Disable THP for DB servers |
| mitigations=auto | off / auto / auto,nosmt | CPU vulnerability mitigation |
| numa_balancing=disable | disable / enable | NUMA auto-balancing |
| isolcpus=2-7 | CPU list | Isolate specific CPUs from the scheduler |
| nohz_full=2-7 | CPU list | Tick-less mode (real-time workloads) |
| intel_iommu=on | on / off | Enable IOMMU (SR-IOV, VFIO) |
| iommu=pt | pt / off | IOMMU pass-through |
| default_hugepagesz=1G | 2M / 1G | Default Huge Page size |
| hugepagesz=1G hugepages=16 | size + count | 1GB Huge Page allocation |
| crashkernel=256M | size | kdump memory reservation |
| audit=1 | 0 / 1 | Kernel audit logging |
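After the reboot, confirm that the parameter actually landed on the command line. A minimal check, shown against a sample cmdline string rather than the live /proc/cmdline (has_param is a made-up helper):

```shell
has_param() {
    # $1 = cmdline string, $2 = exact parameter token to look for
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

cmdline="BOOT_IMAGE=/vmlinuz-5.14 root=/dev/sda2 ro transparent_hugepage=never mitigations=auto"
has_param "$cmdline" "transparent_hugepage=never" && echo "present"
```

On a live host, call it as has_param "$(cat /proc/cmdline)" "transparent_hugepage=never" in a post-reboot validation script.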
7.3 CPU Vulnerability Mitigation vs Performance
# Check currently applied mitigations
grep -r . /sys/devices/system/cpu/vulnerabilities/ 2>/dev/null
# Disable mitigations (benchmark/isolated environments only!)
# Add to GRUB_CMDLINE_LINUX:
# mitigations=off
# Performance impact (varies by workload)
# mitigations=auto: 5~30% overhead on syscall-intensive workloads
# mitigations=off: Security risk - not recommended for production
Warning: mitigations=off disables all security mitigations, including those for Spectre, Meltdown, and MDS. Use it only in isolated benchmark environments, never in production.
8. Workload-Specific Tuning Profiles
8.1 Web Server / API Server
# /etc/sysctl.d/99-web-server.conf
# Network
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6
net.ipv4.tcp_congestion_control = bbr
net.core.default_qdisc = fq
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Files
fs.file-max = 2097152
# Memory
vm.swappiness = 10
8.2 Database Server
# /etc/sysctl.d/99-database.conf
# Memory
vm.swappiness = 1
vm.dirty_ratio = 40
vm.dirty_background_ratio = 10
vm.overcommit_memory = 0
vm.max_map_count = 262144
# Files
fs.file-max = 2097152
fs.aio-max-nr = 1048576
# Network (primarily internal communication)
net.core.somaxconn = 65535
net.ipv4.tcp_keepalive_time = 60
# Huge Pages (PostgreSQL, Oracle, etc.)
# vm.nr_hugepages calculation: shared_buffers / 2MB + some headroom
# Example: shared_buffers=8GB -> vm.nr_hugepages = 4200
vm.nr_hugepages = 4200
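The nr_hugepages figure above can be derived mechanically; the 104-page headroom below is an assumed ~2% slack chosen to land on the example's 4200, not a value from the original:

```shell
shared_buffers_mb=8192          # 8 GB shared_buffers, matching the example above
page_mb=2                       # default huge page size (2 MiB)
headroom=104                    # assumed slack (~2%)
echo "vm.nr_hugepages = $(( shared_buffers_mb / page_mb + headroom ))"
```

Verify the allocation took effect with grep -i huge /proc/meminfo; HugePages_Total should match the computed count.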
Boot parameters:
transparent_hugepage=never
8.3 Container Host (Docker/K8s)
# /etc/sysctl.d/99-container-host.conf
# IP forwarding required
net.ipv4.ip_forward = 1
# net.bridge.* keys exist only once the br_netfilter module is loaded
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
# Network
net.core.somaxconn = 65535
net.ipv4.ip_local_port_range = 1024 65535
net.netfilter.nf_conntrack_max = 1048576
# inotify (when running many Pods)
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 8192
# PID limit
kernel.pid_max = 4194304
# Files
fs.file-max = 2097152
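nf_conntrack_max is not free: each tracked connection consumes unswappable kernel memory. The per-entry size varies by kernel and architecture; ~320 bytes is a common ballpark and an assumption here, not a figure from the original. Rough worst-case cost of the table sized above:

```shell
entries=1048576                 # net.netfilter.nf_conntrack_max from above
entry_bytes=320                 # assumed per-entry cost
echo "conntrack table worst case: ~$(( entries * entry_bytes / 1024 / 1024 )) MiB"
```

Watch net.netfilter.nf_conntrack_count in production; raise the max only when the count actually approaches it.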
9. Automation and Verification
Managing sysctl with Ansible
# roles/sysctl/tasks/main.yml
- name: Apply sysctl parameters
  ansible.posix.sysctl:
    name: "{{ item.key }}"
    value: "{{ item.value }}"
    sysctl_file: /etc/sysctl.d/99-tuning.conf
    reload: true
    state: present
  loop: "{{ sysctl_params | dict2items }}"

# roles/sysctl/defaults/main.yml
sysctl_params:
  net.core.somaxconn: 65535
  net.ipv4.tcp_max_syn_backlog: 65535
  vm.swappiness: 10
  fs.file-max: 2097152
Post-Application Verification Script
#!/usr/bin/env bash
# verify-tuning.sh - Verify tuning values
declare -A EXPECTED=(
    ["net.core.somaxconn"]="65535"
    ["net.ipv4.tcp_congestion_control"]="bbr"
    ["vm.swappiness"]="10"
    ["fs.file-max"]="2097152"
)
FAILED=0
for param in "${!EXPECTED[@]}"; do
    actual=$(sysctl -n "$param" 2>/dev/null)
    expected="${EXPECTED[$param]}"
    if [[ "$actual" != "$expected" ]]; then
        echo "FAIL: $param = $actual (expected: $expected)"
        (( FAILED++ ))
    else
        echo "OK: $param = $actual"
    fi
done
echo "---"
if (( FAILED > 0 )); then
    echo "Verification failed: ${FAILED} item(s)"
    exit 1
else
    echo "All parameters verified successfully"
fi
10. Troubleshooting
| Symptom | Check Command | Related Parameter |
|---|---|---|
| "Too many open files" | ulimit -n, sysctl fs.file-max | fs.file-max, limits.conf |
| "Connection refused" (backlog full) | ss -lnt, netstat -s | grep overflow | net.core.somaxconn |
| TIME_WAIT explosion | ss -s | tcp_tw_reuse, tcp_fin_timeout |
| Frequent OOM kills | dmesg | grep -i oom, /proc/meminfo | vm.swappiness, vm.overcommit_memory |
| High I/O wait | iostat -x 1, vmstat 1 | vm.dirty_ratio, I/O scheduler |
| "Cannot allocate memory" (mmap) | sysctl vm.max_map_count | vm.max_map_count |
| nf_conntrack table full | dmesg | grep conntrack, sysctl net.netfilter.nf_conntrack_count | nf_conntrack_max |
Conclusion
Here are the key principles of kernel parameter tuning:
- Measure first, tune later: Do not change values unless a bottleneck has been identified.
- One at a time: Changing multiple parameters simultaneously makes it impossible to isolate effects.
- Always back up: Record current values before making changes. Never make changes without a rollback path.
- Canary deployment: Do not apply to all servers at once -- verify on 1-2 servers first.
- Document everything: Record why you changed to this value and what effect was observed.
Proper tuning can dramatically improve server performance, but incorrect tuning can directly cause outages. Always approach tuning safely, incrementally, and measurably.