Docker & Kubernetes Essentials — From Containers to Orchestration

1. What Are Containers

Virtual Machines vs Containers

Traditional virtual machines (VMs) run an entire guest OS on top of a hypervisor. Each VM includes its own kernel, libraries, and binaries, requiring several GB of disk space and tens of seconds to boot.

Containers share the host OS kernel and provide process-level isolation. Image sizes are typically tens of MB, and startup time is measured in milliseconds.

Aspect               Virtual Machine        Container
Isolation level      Full OS                Process
Image size           Several GB             Tens to hundreds of MB
Startup time         Tens of seconds        Milliseconds
Resource overhead    High                   Low
Portability          Hypervisor dependent   Runs anywhere with a matching kernel

Linux Namespaces and cgroups

Container isolation is built on two Linux kernel features.

Namespaces limit the scope of system resources a process can see.

  • PID namespace: the container sees its own process tree, starting at PID 1
  • NET namespace: its own network interfaces, IP addresses, and routing tables
  • MNT namespace: its own filesystem mount points
  • UTS namespace: its own hostname
  • IPC namespace: its own IPC resources
  • USER namespace: its own UID/GID mappings

cgroups (Control Groups) limit how much CPU, memory, disk I/O, and other hardware resources a group of processes may use.

# Check a container's memory limit via cgroup (v1 layout)
cat /sys/fs/cgroup/memory/docker/CONTAINER_ID/memory.limit_in_bytes
# On cgroup v2 hosts, look for memory.max under the container's cgroup instead

Thanks to these two features, a container behaves like an independent machine while sharing the host kernel, which is what keeps it lightweight.
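You can poke at these primitives directly from a shell, without Docker. The sketch below is a minimal demonstration, assuming a Linux host with util-linux's unshare available and root privileges:

```shell
# Start a shell in new PID and mount namespaces (requires root, Linux only).
# --fork runs the child inside the new namespace; --mount-proc remounts /proc
# so that ps reflects the namespaced view.
sudo unshare --fork --pid --mount-proc sh -c 'echo "shell PID: $$"; ps -e'
```

Inside the namespace the shell sees itself as PID 1, and ps lists only the processes started within that namespace.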


2. Docker Fundamentals

Three Core Concepts

An image is a read-only template. Application code, the runtime, system libraries, and configuration files are stored as a stack of layers.

A container is a running instance of an image, with a writable layer added on top.

A registry is a service that stores and distributes images. Docker Hub is the best known; private registries such as AWS ECR and GitHub Container Registry are also common.

Essential Commands

# Pull an image
docker pull nginx:1.25

# Run a container
docker run -d --name my-nginx -p 8080:80 nginx:1.25

# List running containers
docker ps

# View container logs
docker logs my-nginx

# Open a shell inside the container
docker exec -it my-nginx /bin/bash

# Stop and remove the container
docker stop my-nginx
docker rm my-nginx

# Build an image
docker build -t my-app:1.0 .

# Push an image to a registry
docker tag my-app:1.0 registry.example.com/my-app:1.0
docker push registry.example.com/my-app:1.0

Understanding Image Layers

A Docker image is a stack of read-only layers; each instruction in a Dockerfile produces one layer.

# Inspect image layers
docker history my-app:1.0

Layers are cached: unchanged layers are not rebuilt, so placing the instructions that change least often near the top of the Dockerfile is the key to fast builds.


3. Dockerfile Best Practices

Basic Dockerfile Structure

# Base image
FROM node:20-alpine

# Working directory
WORKDIR /app

# Copy dependency manifests first (cache optimization)
COPY package.json package-lock.json ./
# --only=production is deprecated in npm 8+; prefer --omit=dev there
RUN npm ci --only=production

# Copy source code
COPY . .

# Document the listening port
EXPOSE 3000

# Default command
CMD ["node", "server.js"]

Multi-stage Builds

Separating build tooling from the runtime environment dramatically shrinks the final image.

# Stage 1: build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: production
FROM node:20-alpine AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./

EXPOSE 3000
USER node
CMD ["node", "dist/server.js"]

devDependencies, source code, and build tools that are needed only at build time never reach the final image.
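To verify the savings, each stage can be built and measured separately with docker build's --target flag (the tags here are illustrative):

```shell
# Build just the builder stage, then the final stage, and compare sizes
docker build --target builder -t my-app:builder .
docker build --target production -t my-app:1.0 .
docker images my-app
```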

Layer Cache Optimization

# Bad: npm ci re-runs every time any source file changes
COPY . .
RUN npm ci

# Good: the cached layer is reused unless package.json or package-lock.json changes
COPY package.json package-lock.json ./
RUN npm ci
COPY . .

.dockerignore File

Keeps unnecessary files out of the build context.

node_modules
.git
.env
*.md
dist
.DS_Store
coverage

Security Best Practices

# 1. Run as a non-root user
FROM node:20-alpine
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

# 2. Pin the base image to a specific version (avoid the latest tag)
FROM node:20.11.1-alpine3.19

# 3. Prefer COPY over ADD
COPY ./config /app/config

# 4. Keep secrets out of the image
# Build-time ARG values can leak into image history; make sure none
# survive in the final image (or use BuildKit secret mounts instead)

4. Docker Compose

Multi-container Orchestration

Docker Compose defines and manages multiple containers in a single YAML file.

version: "3.9"  # the top-level version key is obsolete in Compose v2 and may be dropped

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgres://user:pass@db:5432/mydb
      - REDIS_URL=redis://cache:6379
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_started
    volumes:
      - ./src:/app/src
    networks:
      - backend

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: mydb
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend

  cache:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    networks:
      - backend

volumes:
  postgres_data:

networks:
  backend:
    driver: bridge

Core Compose Commands

# Start all services in the background
docker compose up -d

# Follow logs
docker compose logs -f app

# Rebuild and restart one service
docker compose up -d --build app

# Check service status
docker compose ps

# Stop everything; -v also removes named volumes (data is lost)
docker compose down -v

Networking and Volumes

Networking: services in the same Compose file reach each other by service name. In the example above, the app service connects to the database at db:5432.

Volumes: data survives container removal. The named volume postgres_data holds the database files.

Development vs Production Environments

# docker-compose.override.yml (applied automatically in development)
services:
  app:
    build:
      target: development
    volumes:
      - ./src:/app/src
    environment:
      - NODE_ENV=development
    command: npm run dev

# Deploy to production without the override file
docker compose -f docker-compose.yml up -d

5. Kubernetes Architecture

Why Kubernetes

Docker Compose is great for running several containers on a single host, but real production environments need more:

  • Container deployment across multiple servers
  • Auto-scaling
  • Service discovery and load balancing
  • Rolling updates and rollbacks
  • Self-healing

Kubernetes (K8s) is the container orchestration platform that provides all of this.

Control Plane Components

API Server (kube-apiserver): the central gateway for every cluster request. kubectl commands, internal components, and external clients all go through the API Server.

etcd: a distributed key-value store holding all cluster state: which Pods run where, which Services exist, and so on.

Scheduler (kube-scheduler): decides which node each newly created Pod lands on, weighing resource requirements, node health, affinity rules, and more.

Controller Manager (kube-controller-manager): drives the cluster's current state toward the desired state. It bundles the ReplicaSet Controller, Node Controller, Job Controller, and others.

Worker Node Components

kubelet: runs on every node, starting containers and reporting their status as instructed by the API Server.

kube-proxy: maintains network rules on each node to implement Service load balancing.

Container Runtime: the software that actually runs containers; containerd is the most widely used.

Control Plane
  +------------------+
  | API Server       |<--- kubectl, clients
  | etcd             |
  | Scheduler        |
  | Controller Mgr   |
  +------------------+
        |
  Worker Node 1          Worker Node 2
  +-----------------+   +-----------------+
  | kubelet         |   | kubelet         |
  | kube-proxy      |   | kube-proxy      |
  | containerd      |   | containerd      |
  | [Pod] [Pod]     |   | [Pod] [Pod]     |
  +-----------------+   +-----------------+

6. Kubernetes Core Objects

Pod

A Pod is the smallest deployable unit in K8s. It holds one or more containers that share the same network and storage.

apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  containers:
    - name: app
      image: my-app:1.0
      ports:
        - containerPort: 3000
      resources:
        requests:
          memory: "128Mi"
          cpu: "250m"
        limits:
          memory: "256Mi"
          cpu: "500m"
      livenessProbe:
        httpGet:
          path: /healthz
          port: 3000
        initialDelaySeconds: 10
        periodSeconds: 5
      readinessProbe:
        httpGet:
          path: /ready
          port: 3000
        initialDelaySeconds: 5
        periodSeconds: 3

ReplicaSet

Ensures that the specified number of Pod replicas is always running. You rarely create one directly; Deployments manage them for you.
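Even though you rarely author one, it helps to see the ReplicaSets a Deployment creates behind the scenes (the resource names shown depend on your cluster):

```shell
# Each Deployment rollout creates a new ReplicaSet; old ones are kept for rollback
kubectl get replicasets
kubectl describe deployment my-app | grep -i replicaset
```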

Deployment

Declaratively manages Pods and ReplicaSets, supporting rolling updates, rollbacks, and scaling.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: my-app:1.0
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"

# Check rollout status
kubectl rollout status deployment/my-app

# Update the image (triggers a rolling update)
kubectl set image deployment/my-app app=my-app:2.0

# Roll back to the previous revision
kubectl rollout undo deployment/my-app

# Scale out
kubectl scale deployment/my-app --replicas=5

Service

Gives Pods a stable network endpoint. Pods are ephemeral, but a Service's IP and DNS name stay fixed.

ClusterIP (the default): reachable only from inside the cluster.

apiVersion: v1
kind: Service
metadata:
  name: my-app-svc
spec:
  type: ClusterIP
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 3000

NodePort: exposes the Service externally on a fixed port of every node.

apiVersion: v1
kind: Service
metadata:
  name: my-app-nodeport
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 3000
      nodePort: 30080

LoadBalancer: provisions a cloud provider's load balancer automatically.

apiVersion: v1
kind: Service
metadata:
  name: my-app-lb
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 3000

Ingress

Routes HTTP/HTTPS traffic to Services inside the cluster, with rules based on hostname and path.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-svc
                port:
                  number: 80
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-svc
                port:
                  number: 80
  tls:
    - hosts:
        - app.example.com
      secretName: tls-secret

7. Kubernetes Configuration Management

ConfigMap

Keeps environment-specific configuration out of the container image.

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DATABASE_HOST: "db-service"
  DATABASE_PORT: "5432"
  LOG_LEVEL: "info"
  app.properties: |
    server.port=3000
    cache.ttl=300

# Using the ConfigMap in a Pod
spec:
  containers:
    - name: app
      image: my-app:1.0
      envFrom:
        - configMapRef:
            name: app-config
      volumeMounts:
        - name: config-volume
          mountPath: /app/config
  volumes:
    - name: config-volume
      configMap:
        name: app-config
        items:
          - key: app.properties
            path: app.properties

Secret

Holds sensitive data such as passwords and API keys. Values are stored base64-encoded.

# Create a Secret from literals
kubectl create secret generic db-secret \
  --from-literal=username=admin \
  --from-literal=password=s3cret

# The equivalent Secret manifest
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  username: YWRtaW4=
  password: czNjcmV0

# Using the Secret in a Pod
spec:
  containers:
    - name: app
      env:
        - name: DB_USERNAME
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: username
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: password
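Keep in mind that base64 is an encoding, not encryption: anyone who can read the Secret object can recover the plaintext. For example, decoding the username value from above:

```shell
# base64 provides no confidentiality; this recovers the original value
echo 'YWRtaW4=' | base64 -d
# prints: admin
```

For real protection, restrict access with RBAC and consider enabling encryption at rest for etcd.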

PersistentVolume (PV) and PersistentVolumeClaim (PVC)

Keeps data alive independently of the Pod lifecycle.

# PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard

# Using the PVC in a Deployment
spec:
  containers:
    - name: postgres
      image: postgres:16-alpine
      volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: postgres-storage
      persistentVolumeClaim:
        claimName: postgres-pvc

8. Helm - The Kubernetes Package Manager

What Is Helm

Helm bundles K8s manifests into packages called charts, so a group of YAML files can be installed, upgraded, and rolled back as one unit.
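Helm can scaffold a working chart skeleton to start from:

```shell
# Generates Chart.yaml, values.yaml, and a templates/ directory with examples
helm create my-app-chart
```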

Chart Structure

my-app-chart/
  Chart.yaml          # Chart metadata
  values.yaml         # Default configuration values
  templates/          # K8s manifest templates
    deployment.yaml
    service.yaml
    ingress.yaml
    configmap.yaml
    _helpers.tpl      # Template helper functions
  charts/             # Dependency charts

values.yaml

# values.yaml
replicaCount: 3

image:
  repository: my-app
  tag: "1.0"
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 80

ingress:
  enabled: true
  hostname: app.example.com

resources:
  requests:
    cpu: 250m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilization: 70
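These values are consumed by the chart's templates through Go templating. The fragment below is an illustrative sketch of what a templates/deployment.yaml might contain, not part of the original chart:

```yaml
# templates/deployment.yaml (illustrative fragment)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: app
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
```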

Core Helm Commands

# Install a chart
helm install my-release ./my-app-chart

# Install with custom values
helm install my-release ./my-app-chart -f production-values.yaml

# Upgrade a release
helm upgrade my-release ./my-app-chart --set image.tag=2.0

# List releases
helm list

# Check release status
helm status my-release

# Release history
helm history my-release

# Roll back to revision 1
helm rollback my-release 1

# Uninstall a release
helm uninstall my-release

Managing values Files per Environment

# File layout per environment
values.yaml              # Shared defaults
values-dev.yaml          # Development
values-staging.yaml      # Staging
values-prod.yaml         # Production

# Deploy to staging
helm upgrade --install my-app ./my-app-chart \
  -f values.yaml \
  -f values-staging.yaml \
  --namespace staging

# Deploy to production
helm upgrade --install my-app ./my-app-chart \
  -f values.yaml \
  -f values-prod.yaml \
  --namespace production

9. Hands-on Deployment Example - Web App + DB + Redis

Overall Setup

A complete example deploying a web application, a PostgreSQL database, and a Redis cache to Kubernetes.

Creating the Namespace

kubectl create namespace my-app

Deploying PostgreSQL

# postgres-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16-alpine
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: mydb
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: username
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: password
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
      volumes:
        - name: postgres-data
          persistentVolumeClaim:
            claimName: postgres-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: my-app
spec:
  selector:
    app: postgres
  ports:
    - port: 5432
      targetPort: 5432

Deploying Redis

# redis-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          ports:
            - containerPort: 6379
          resources:
            requests:
              memory: "64Mi"
              cpu: "100m"
            limits:
              memory: "128Mi"
              cpu: "250m"
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: my-app
spec:
  selector:
    app: redis
  ports:
    - port: 6379
      targetPort: 6379

Deploying the Web Application

# app-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app
          image: my-web-app:1.0
          ports:
            - containerPort: 3000
          env:
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: username
            - name: DB_PASS
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: password
            # $(VAR) expansion only works for variables defined earlier in this list,
            # so DB_USER and DB_PASS must come before DATABASE_URL
            - name: DATABASE_URL
              value: "postgres://$(DB_USER):$(DB_PASS)@postgres:5432/mydb"
            - name: REDIS_URL
              value: "redis://redis:6379"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
---
apiVersion: v1
kind: Service
metadata:
  name: web-app
  namespace: my-app
spec:
  type: ClusterIP
  selector:
    app: web-app
  ports:
    - port: 80
      targetPort: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-app-ingress
  namespace: my-app
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-app
                port:
                  number: 80

Deployment Order

# 1. Create the Secret
kubectl create secret generic db-credentials \
  --from-literal=username=admin \
  --from-literal=password=secure-password-here \
  -n my-app

# 2. Create the PVC
kubectl apply -f postgres-pvc.yaml

# 3. Deploy the database
kubectl apply -f postgres-deployment.yaml

# 4. Deploy Redis
kubectl apply -f redis-deployment.yaml

# 5. Deploy the web app
kubectl apply -f app-deployment.yaml

# 6. Verify
kubectl get all -n my-app
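In CI pipelines it is useful to block until the rollout actually completes rather than just listing resources once:

```shell
# Exits non-zero if the Deployment is not fully rolled out within the timeout
kubectl rollout status deployment/web-app -n my-app --timeout=120s
```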

10. Troubleshooting

CrashLoopBackOff

The Pod starts, crashes, and restarts in a loop.

# Find the cause
kubectl describe pod POD_NAME -n my-app
kubectl logs POD_NAME -n my-app --previous

# Common causes:
# - The application errors out on startup
# - Wrong environment variables
# - A dependent service is unreachable
# - The livenessProbe keeps failing

Resolution: read the logs to pin down the error. Increase the livenessProbe's initialDelaySeconds, or double-check the environment variables.
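The namespace's event stream often reveals the failure reason faster than the logs, e.g. failed probes or mount errors:

```shell
# Recent events in the namespace, oldest first
kubectl get events -n my-app --sort-by=.lastTimestamp
```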

ImagePullBackOff

The container image cannot be pulled.

# Find the cause
kubectl describe pod POD_NAME -n my-app

# Common causes:
# - Typo in the image name or tag
# - Missing credentials for a private registry
# - The image does not exist

# Configure private registry credentials
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=user \
  --docker-password=pass \
  -n my-app

OOMKilled

The container exceeded its memory limit and was killed.

# Find the cause
kubectl describe pod POD_NAME -n my-app

# Check live resource usage (requires metrics-server)
kubectl top pod -n my-app

# Fix: raise the memory limit
resources:
  limits:
    memory: "512Mi"  # up from 256Mi
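You can confirm the kill reason from the container's last recorded state; an OOM-killed container typically shows reason OOMKilled and exit code 137:

```shell
# Prints the termination reason of the previous container instance
kubectl get pod POD_NAME -n my-app \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```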

General Debugging Commands

# List Pods with status and node placement
kubectl get pods -n my-app -o wide

# Pod details, including events
kubectl describe pod POD_NAME -n my-app

# Stream logs
kubectl logs -f POD_NAME -n my-app

# Open a shell inside a Pod
kubectl exec -it POD_NAME -n my-app -- /bin/sh

# Check Service endpoints
kubectl get endpoints -n my-app

# Resolve DNS from inside the cluster
kubectl run debug --rm -it --image=busybox -- nslookup web-app.my-app.svc.cluster.local

# Node resource usage
kubectl top nodes

Wrapping Up

To recap what this guide covered:

  1. Container basics: lightweight isolation via namespaces and cgroups
  2. Docker: image builds, multi-stage builds, security practices
  3. Docker Compose: multi-container management for development
  4. K8s architecture: the roles of the Control Plane and Worker Nodes
  5. K8s objects: Pods, Deployments, Services, and Ingress in practice
  6. Configuration: ConfigMap, Secret, PV/PVC
  7. Helm: release management with charts
  8. Hands-on deployment: a 3-tier web app + DB + Redis stack
  9. Troubleshooting: handling CrashLoopBackOff, ImagePullBackOff, and OOMKilled

Docker and Kubernetes are the foundation of modern infrastructure. Try the examples from this guide on a local cluster (minikube or kind): actually deploying Pods, wiring up Services, and debugging failures is the fastest way to learn.
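A disposable local cluster for practicing is only a couple of commands away, assuming minikube or kind is installed:

```shell
# Option A: minikube
minikube start
kubectl get nodes

# Option B: kind (Kubernetes in Docker)
kind create cluster
kubectl cluster-info
```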

Docker & Kubernetes Essentials — From Containers to Orchestration

1. What Are Containers

Virtual Machines vs Containers

Traditional virtual machines (VMs) run an entire guest OS on top of a hypervisor. Each VM includes its own kernel, libraries, and binaries, requiring several GB of disk space and tens of seconds to boot.

Containers share the host OS kernel and provide process-level isolation. Image sizes are typically tens of MB, and startup time is measured in milliseconds.

AspectVirtual MachineContainer
Isolation levelFull OSProcess
Image sizeSeveral GBTens to hundreds of MB
Startup timeTens of secondsMilliseconds
Resource overheadHighLow
PortabilityHypervisor dependentRuns anywhere with matching kernel

Linux Namespaces and cgroups

Container isolation is implemented through two Linux kernel features.

Namespaces limit the scope of system resources that a process can see.

  • PID namespace: The container sees an independent process tree starting from PID 1
  • NET namespace: Independent network interfaces, IP addresses, and routing tables
  • MNT namespace: Independent filesystem mount points
  • UTS namespace: Independent hostname
  • IPC namespace: Independent IPC resources
  • USER namespace: Independent UID/GID mappings

cgroups (Control Groups) limit hardware resource usage such as CPU, memory, and disk I/O.

# Check memory limit via cgroup
cat /sys/fs/cgroup/memory/docker/CONTAINER_ID/memory.limit_in_bytes

Thanks to these two features, containers behave like independent machines while sharing the kernel, keeping them lightweight.


2. Docker Fundamentals

Three Core Concepts

Image is a read-only template. Application code, runtime, system libraries, and configuration files are stored in a layered structure.

Container is a running instance of an image. A writable layer is added on top of the image.

Registry is a service for storing and distributing images. Docker Hub is the most common, with private registries like AWS ECR and GitHub Container Registry also available.

Essential Commands

# Pull an image
docker pull nginx:1.25

# Run a container
docker run -d --name my-nginx -p 8080:80 nginx:1.25

# List running containers
docker ps

# View container logs
docker logs my-nginx

# Access container shell
docker exec -it my-nginx /bin/bash

# Stop and remove container
docker stop my-nginx
docker rm my-nginx

# Build an image
docker build -t my-app:1.0 .

# Push image to registry
docker tag my-app:1.0 registry.example.com/my-app:1.0
docker push registry.example.com/my-app:1.0

Understanding Image Layers

Docker images consist of multiple read-only layers. Each instruction in a Dockerfile creates one layer.

# Inspect image layers
docker history my-app:1.0

Layers are cached. Unchanged layers are not rebuilt, so placing less frequently changed instructions at the top of your Dockerfile is key to optimizing build speed.


3. Dockerfile Best Practices

Basic Dockerfile Structure

# Specify base image
FROM node:20-alpine

# Set working directory
WORKDIR /app

# Copy dependency files first (cache optimization)
COPY package.json package-lock.json ./
RUN npm ci --only=production

# Copy source code
COPY . .

# Expose port
EXPOSE 3000

# Run command
CMD ["node", "server.js"]

Multi-stage Builds

Separate build tools from the runtime environment to dramatically reduce the final image size.

# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: Production
FROM node:20-alpine AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./

EXPOSE 3000
USER node
CMD ["node", "dist/server.js"]

devDependencies, source code, and build tools needed only during the build stage are excluded from the final image.

Layer Cache Optimization

# Bad: npm install re-runs every time source changes
COPY . .
RUN npm ci

# Good: leverages cache when package.json hasn't changed
COPY package.json package-lock.json ./
RUN npm ci
COPY . .

.dockerignore File

Prevent unnecessary files from being included in the build context.

node_modules
.git
.env
*.md
dist
.DS_Store
coverage

Security Best Practices

# 1. Run as non-root user
FROM node:20-alpine
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

# 2. Use specific base image versions (never use latest tag)
FROM node:20.11.1-alpine3.19

# 3. Use COPY instead of ADD
COPY ./config /app/config

# 4. Never include sensitive data in the image
# Use build ARGs carefully, ensuring they don't remain in the final image

4. Docker Compose

Multi-container Orchestration

Docker Compose defines and manages multiple containers with a single YAML file.

version: "3.9"

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgres://user:pass@db:5432/mydb
      - REDIS_URL=redis://cache:6379
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_started
    volumes:
      - ./src:/app/src
    networks:
      - backend

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: mydb
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend

  cache:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    networks:
      - backend

volumes:
  postgres_data:

networks:
  backend:
    driver: bridge

Core Compose Commands

# Start all services
docker compose up -d

# View logs
docker compose logs -f app

# Rebuild and restart a specific service
docker compose up -d --build app

# Check service status
docker compose ps

# Stop all services and clean up resources
docker compose down -v

Networking and Volumes

Networking: Services within the same Compose file can communicate using service names. In the example above, the app service accesses the database via db:5432.

Volumes: Data persists even when containers are deleted. The named volume postgres_data preserves database files.

Development vs Production Environments

# docker-compose.override.yml (auto-applied for development)
services:
  app:
    build:
      target: development
    volumes:
      - ./src:/app/src
    environment:
      - NODE_ENV=development
    command: npm run dev
# Production deployment without override
docker compose -f docker-compose.yml up -d

5. Kubernetes Architecture

Why Kubernetes

Docker Compose is great for managing multiple containers on a single host, but production environments require:

  • Container deployment across multiple servers
  • Auto-scaling
  • Service discovery and load balancing
  • Rolling updates and rollbacks
  • Self-healing

Kubernetes (K8s) is a container orchestration platform that provides all of this.

Control Plane Components

API Server (kube-apiserver): The central gateway that handles all cluster requests. kubectl commands, internal components, and external clients all go through the API Server.

etcd: A distributed key-value store that holds all cluster state. Information about which Pods run where, which Services exist, and more is stored here.

Scheduler (kube-scheduler): Decides which node to place newly created Pods on. It considers resource requirements, node status, affinity rules, and more.

Controller Manager (kube-controller-manager): Ensures the cluster's current state matches the desired state. Includes the ReplicaSet Controller, Node Controller, Job Controller, and others.

Worker Node Components

kubelet: Runs on each node, executing containers and reporting status as directed by the API Server.

kube-proxy: Manages network rules on each node to handle Service load balancing.

Container Runtime: The software that actually runs containers. containerd is the most widely used.

Control Plane
  +------------------+
  | API Server       |<--- kubectl, clients
  | etcd             |
  | Scheduler        |
  | Controller Mgr   |
  +------------------+
        |
  Worker Node 1          Worker Node 2
  +-----------------+   +-----------------+
  | kubelet         |   | kubelet         |
  | kube-proxy      |   | kube-proxy      |
  | containerd      |   | containerd      |
  | [Pod] [Pod]     |   | [Pod] [Pod]     |
  +-----------------+   +-----------------+

6. Kubernetes Core Objects

Pod

A Pod is the smallest deployable unit in K8s. It contains one or more containers that share the same network and storage.

apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  containers:
    - name: app
      image: my-app:1.0
      ports:
        - containerPort: 3000
      resources:
        requests:
          memory: "128Mi"
          cpu: "250m"
        limits:
          memory: "256Mi"
          cpu: "500m"
      livenessProbe:
        httpGet:
          path: /healthz
          port: 3000
        initialDelaySeconds: 10
        periodSeconds: 5
      readinessProbe:
        httpGet:
          path: /ready
          port: 3000
        initialDelaySeconds: 5
        periodSeconds: 3

ReplicaSet

Ensures that a specified number of Pod replicas are always running. Typically not used directly, but managed through Deployments.

Deployment

Declaratively manages Pods and ReplicaSets. Supports rolling updates, rollbacks, and scaling.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: my-app:1.0
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
# Check deployment status
kubectl rollout status deployment/my-app

# Update image (triggers rolling update)
kubectl set image deployment/my-app app=my-app:2.0

# Rollback
kubectl rollout undo deployment/my-app

# Scaling
kubectl scale deployment/my-app --replicas=5

Service

Provides stable network endpoints for Pods. Pods are ephemeral, but a Service's IP and DNS remain fixed.

ClusterIP (default): Only accessible within the cluster.

apiVersion: v1
kind: Service
metadata:
  name: my-app-svc
spec:
  type: ClusterIP
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 3000

NodePort: Exposes the service externally through a specific port on each node.

apiVersion: v1
kind: Service
metadata:
  name: my-app-nodeport
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 3000
      nodePort: 30080

LoadBalancer: Automatically provisions a cloud provider's load balancer.

apiVersion: v1
kind: Service
metadata:
  name: my-app-lb
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 3000

Ingress

Routes HTTP/HTTPS traffic to internal cluster Services. You can define routing rules based on hostnames and paths.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-svc
                port:
                  number: 80
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-svc
                port:
                  number: 80
  tls:
    - hosts:
        - app.example.com
      secretName: tls-secret

7. Kubernetes Configuration Management

ConfigMap

Separates environment configuration from container images.

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DATABASE_HOST: "db-service"
  DATABASE_PORT: "5432"
  LOG_LEVEL: "info"
  app.properties: |
    server.port=3000
    cache.ttl=300
# Using ConfigMap in a Pod
spec:
  containers:
    - name: app
      image: my-app:1.0
      envFrom:
        - configMapRef:
            name: app-config
      volumeMounts:
        - name: config-volume
          mountPath: /app/config
  volumes:
    - name: config-volume
      configMap:
        name: app-config
        items:
          - key: app.properties
            path: app.properties

Secret

Manages sensitive data such as passwords and API keys. Values are stored base64-encoded, which is an encoding, not encryption: anyone who can read the Secret can decode it, so restrict access with RBAC and consider enabling encryption at rest.

# Create a Secret
kubectl create secret generic db-secret \
  --from-literal=username=admin \
  --from-literal=password=s3cret

apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  username: YWRtaW4=
  password: czNjcmV0

# Using Secret in a Pod
spec:
  containers:
    - name: app
      env:
        - name: DB_USERNAME
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: username
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: password
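The base64 values in the manifest above can be produced (and audited) with the standard base64 tool; again, this is encoding, not encryption:

```shell
# Encode values for the data: section. The -n matters: without it a
# trailing newline gets baked into the Secret.
echo -n 'admin' | base64     # YWRtaW4=
echo -n 's3cret' | base64    # czNjcmV0

# Decode to verify what a Secret actually contains
echo -n 'YWRtaW4=' | base64 -d   # admin
```

In practice, `kubectl create secret generic db-secret --from-literal=... --dry-run=client -o yaml` generates the same manifest with the encoding done for you.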

PersistentVolume (PV) and PersistentVolumeClaim (PVC)

Keeps data alive independently of the Pod lifecycle. A PersistentVolumeClaim is a request for storage; the cluster binds it to a matching PersistentVolume, which the StorageClass typically provisions on demand.

# PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard

# Using PVC in a Deployment
spec:
  containers:
    - name: postgres
      image: postgres:16-alpine
      volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: postgres-storage
      persistentVolumeClaim:
        claimName: postgres-pvc
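On most managed clusters the PVC above is satisfied by dynamic provisioning, so you never write a PV by hand. On clusters without a provisioner you would declare one statically. A minimal sketch (hostPath is suitable only for single-node test clusters like minikube):

```yaml
# Static PersistentVolume matching the PVC above
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  hostPath:
    path: /data/postgres   # single-node test clusters only
```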

8. Helm - The Kubernetes Package Manager

What Is Helm

Helm is a tool that bundles K8s manifests into packages called charts. You can install, upgrade, and roll back a set of YAML files as a single unit called a release.

Chart Structure

my-app-chart/
  Chart.yaml          # Chart metadata
  values.yaml         # Default configuration values
  templates/          # K8s manifest templates
    deployment.yaml
    service.yaml
    ingress.yaml
    configmap.yaml
    _helpers.tpl      # Template helper functions
  charts/             # Dependency charts

values.yaml

# values.yaml
replicaCount: 3

image:
  repository: my-app
  tag: "1.0"
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 80

ingress:
  enabled: true
  hostname: app.example.com

resources:
  requests:
    cpu: 250m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilization: 70
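These values are consumed by the files under templates/ through Go templating. A hypothetical fragment of templates/deployment.yaml showing the wiring (toYaml and nindent are built-in Helm template functions):

```yaml
# templates/deployment.yaml (fragment, illustrative)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
```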

Core Helm Commands

# Install a chart
helm install my-release ./my-app-chart

# Install with custom values
helm install my-release ./my-app-chart -f production-values.yaml

# Upgrade a release
helm upgrade my-release ./my-app-chart --set image.tag=2.0

# List releases
helm list

# Check release status
helm status my-release

# Release history
helm history my-release

# Rollback
helm rollback my-release 1

# Uninstall release
helm uninstall my-release

# Render templates locally without installing (useful for debugging)
helm template my-release ./my-app-chart

# Validate the chart for common mistakes
helm lint ./my-app-chart

Managing Environment-specific Values Files

When several -f files are passed, later files override earlier ones (and --set overrides them all), so common defaults live in values.yaml and each environment file contains only its overrides.

# File structure by environment
values.yaml              # Common defaults
values-dev.yaml          # Development
values-staging.yaml      # Staging
values-prod.yaml         # Production

# Staging deployment
helm upgrade --install my-app ./my-app-chart \
  -f values.yaml \
  -f values-staging.yaml \
  --namespace staging

# Production deployment
helm upgrade --install my-app ./my-app-chart \
  -f values.yaml \
  -f values-prod.yaml \
  --namespace production

9. Real-world Deployment - Web App + DB + Redis

Overall Architecture

A complete example of deploying a web application, PostgreSQL database, and Redis cache to Kubernetes.

Create Namespace

kubectl create namespace my-app

PostgreSQL Deployment

# postgres-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16-alpine
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: mydb
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: username
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: password
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
      volumes:
        - name: postgres-data
          persistentVolumeClaim:
            claimName: postgres-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: my-app
spec:
  selector:
    app: postgres
  ports:
    - port: 5432
      targetPort: 5432

Redis Deployment

# redis-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          ports:
            - containerPort: 6379
          resources:
            requests:
              memory: "64Mi"
              cpu: "100m"
            limits:
              memory: "128Mi"
              cpu: "250m"
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: my-app
spec:
  selector:
    app: redis
  ports:
    - port: 6379
      targetPort: 6379

Web Application Deployment

# app-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app
          image: my-web-app:1.0
          ports:
            - containerPort: 3000
          env:
            # $(VAR) references are only expanded against variables defined
            # earlier in this list, so DB_USER and DB_PASS must come
            # before DATABASE_URL
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: username
            - name: DB_PASS
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: password
            - name: DATABASE_URL
              value: "postgres://$(DB_USER):$(DB_PASS)@postgres:5432/mydb"
            - name: REDIS_URL
              value: "redis://redis:6379"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
---
apiVersion: v1
kind: Service
metadata:
  name: web-app
  namespace: my-app
spec:
  type: ClusterIP
  selector:
    app: web-app
  ports:
    - port: 80
      targetPort: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-app-ingress
  namespace: my-app
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-app
                port:
                  number: 80

Deployment Order

# 1. Create Secret
kubectl create secret generic db-credentials \
  --from-literal=username=admin \
  --from-literal=password=secure-password-here \
  -n my-app

# 2. Create PVC
kubectl apply -f postgres-pvc.yaml

# 3. Deploy database
kubectl apply -f postgres-deployment.yaml

# 4. Deploy Redis
kubectl apply -f redis-deployment.yaml

# 5. Deploy web app
kubectl apply -f app-deployment.yaml

# 6. Verify status
kubectl get all -n my-app

10. Troubleshooting

CrashLoopBackOff

The container starts, crashes, and is restarted in a loop, with Kubernetes backing off exponentially between restarts.

# Identify the cause
kubectl describe pod POD_NAME -n my-app
kubectl logs POD_NAME -n my-app --previous

# Common causes:
# - Application errors during startup
# - Incorrect environment variables
# - Failed connection to dependent services
# - livenessProbe failure

Resolution: Check the logs (--previous shows the output of the crashed container) to find the error, verify environment variables and dependent services, and increase the livenessProbe initialDelaySeconds if the probe fires before the app finishes starting.

ImagePullBackOff

The container image cannot be pulled.

# Identify the cause
kubectl describe pod POD_NAME -n my-app

# Common causes:
# - Typo in image name or tag
# - Private registry authentication not configured
# - Image does not exist

# Configure private registry authentication
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=user \
  --docker-password=pass \
  -n my-app
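The registry Secret has no effect on its own; the Pod spec must reference it via imagePullSecrets. A minimal fragment (image name is illustrative):

```yaml
# Pod spec fragment referencing the registry credentials
spec:
  containers:
    - name: app
      image: registry.example.com/my-app:1.0
  imagePullSecrets:
    - name: regcred
```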

OOMKilled

The container was forcefully terminated for exceeding its memory limit.

# Identify the cause
kubectl describe pod POD_NAME -n my-app

# Check real-time resource usage
kubectl top pod -n my-app

# Resolution: increase memory limits (after checking whether
# usage growth is actually a leak)
resources:
  limits:
    memory: "512Mi"  # Increased from 256Mi

General Debugging Commands

# List Pods and their status
kubectl get pods -n my-app -o wide

# Detailed Pod information (including events)
kubectl describe pod POD_NAME -n my-app

# Real-time log streaming
kubectl logs -f POD_NAME -n my-app

# Access Pod shell
kubectl exec -it POD_NAME -n my-app -- /bin/sh

# Check Service endpoints
kubectl get endpoints -n my-app

# DNS resolution within the cluster
kubectl run debug --rm -it --image=busybox -- nslookup web-app.my-app.svc.cluster.local

# Node resource status
kubectl top nodes

Conclusion

Here is a summary of what we covered:

  1. Container fundamentals: Lightweight isolation through namespaces and cgroups
  2. Docker: Image building, multi-stage builds, security practices
  3. Docker Compose: Multi-container management for development environments
  4. K8s architecture: Roles of the Control Plane and Worker Nodes
  5. K8s objects: Using Pod, Deployment, Service, and Ingress
  6. Configuration management: ConfigMap, Secret, PV/PVC
  7. Helm: Release management with charts
  8. Real-world deployment: Web app + DB + Redis 3-tier architecture
  9. Troubleshooting: Handling CrashLoopBackOff, ImagePullBackOff, and OOMKilled

Docker and Kubernetes are the foundation of modern infrastructure. I encourage you to practice the examples in this guide on a local environment such as minikube or kind. Hands-on experience deploying Pods, configuring Services, and troubleshooting issues is the fastest way to learn.