Karpenter Practical Guide — A New Paradigm for Kubernetes Node Autoscaling


Introduction

Node autoscaling in Kubernetes clusters is central to both cost optimization and service reliability. While Cluster Autoscaler (CA) has long been the standard, Karpenter, developed by AWS, has revolutionized node provisioning with a fundamentally different approach.

In 2025, Salesforce migrated over 1,000 EKS clusters from Cluster Autoscaler to Karpenter — a landmark case that demonstrates Karpenter's maturity.

Cluster Autoscaler vs Karpenter

Limitations of Cluster Autoscaler

# Cluster Autoscaler scales only at the Node Group level
# Bound to pre-defined instance types
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
managedNodeGroups:
  - name: general
    instanceType: m5.xlarge # Fixed instance type
    minSize: 2
    maxSize: 10
    desiredCapacity: 3

Key limitations of CA:

  • Node Group dependency: Scales only within pre-defined ASG/Node Groups
  • Slow response: Scheduling failure → CA detection → ASG update → EC2 provisioning (takes several minutes)
  • Inefficient bin-packing: Resource waste due to node group-level scaling
  • Manual instance selection: Requires manually configuring optimal instances for workloads

Karpenter's Approach

# Karpenter automatically selects optimal instances based on Pod requirements
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ['amd64', 'arm64']
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['on-demand', 'spot']
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ['c', 'm', 'r']
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ['4']
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  limits:
    cpu: '1000'
    memory: 1000Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m

Key advantages of Karpenter:

| Feature             | Cluster Autoscaler | Karpenter        |
| ------------------- | ------------------ | ---------------- |
| Scaling unit        | Node Group (ASG)   | Per-Pod basis    |
| Instance selection  | Manual config      | Auto-optimized   |
| Provisioning speed  | Minutes            | Tens of seconds  |
| Bin-packing         | Limited            | Optimized        |
| Spot handling       | Separate setup     | Native support   |
| Consolidation       | Not supported      | Native support   |

Understanding Karpenter Architecture

Core Resources

Karpenter v1 (GA) uses two core CRDs:

1. NodePool — Defines node provisioning policies

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-workloads
spec:
  template:
    metadata:
      labels:
        workload-type: gpu
    spec:
      requirements:
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ['p4d', 'p5', 'g5']
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['on-demand']
      taints:
        - key: nvidia.com/gpu
          effect: NoSchedule
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: gpu-nodes
  limits:
    cpu: '100'
    nvidia.com/gpu: '16'
  weight: 10 # Priority (higher value = tried first)
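
Because this pool taints its nodes with nvidia.com/gpu, only Pods that tolerate the taint (and request GPUs) will land there. A minimal Pod sketch — the Pod name, image, and GPU count here are illustrative, not from the original:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job # hypothetical name
spec:
  nodeSelector:
    workload-type: gpu # matches the NodePool template label
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
  containers:
    - name: trainer
      image: my-training-image:latest # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1 # the GPU request drives instance selection
```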

2. EC2NodeClass — AWS infrastructure configuration

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  role: 'KarpenterNodeRole-my-cluster'
  amiSelectorTerms:
    - alias: al2023@latest
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: 'my-cluster'
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: 'my-cluster'
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 3000
        throughput: 125
        encrypted: true
  userData: |
    #!/bin/bash
    echo "Custom bootstrap script"

Provisioning Flow

Pod created (Pending)
  ↓
Karpenter Controller detects the pending Pod
  ↓
Analyzes Pod requirements
(CPU, Memory, GPU, Topology, Affinity)
  ↓
Calculates the optimal instance type
(price, availability, bin-packing optimization)
  ↓
Creates the EC2 instance directly
(EC2 Fleet API, no ASG)
  ↓
Creates NodeClaim → registers Node
  ↓
Pod is scheduled
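
The "calculates the optimal instance type" step can be approximated as a price-ordered fit: among instance types that satisfy the requirements, pick the cheapest one whose capacity covers the pending Pods. A toy sketch of that idea — the catalog and prices below are made up, and real Karpenter also bin-packs across multiple nodes and weighs Spot availability:

```python
# Toy model of Karpenter-style instance selection: cheapest type that fits
# the summed requests of the pending pods. Real Karpenter packs across
# multiple nodes and considers Spot capacity; this shows only the core idea.
from dataclasses import dataclass

@dataclass
class InstanceType:
    name: str
    vcpu: float
    mem_gib: float
    hourly_usd: float  # illustrative prices, not real AWS quotes

CATALOG = [
    InstanceType("c6i.large", 2, 4, 0.085),
    InstanceType("m6i.large", 2, 8, 0.096),
    InstanceType("m6i.xlarge", 4, 16, 0.192),
    InstanceType("r6i.xlarge", 4, 32, 0.252),
]

def pick_instance(pending_pods, catalog=CATALOG):
    """Cheapest type whose capacity covers the summed pod requests."""
    need_cpu = sum(p["cpu"] for p in pending_pods)
    need_mem = sum(p["mem_gib"] for p in pending_pods)
    fits = [t for t in catalog if t.vcpu >= need_cpu and t.mem_gib >= need_mem]
    return min(fits, key=lambda t: t.hourly_usd, default=None)

pods = [{"cpu": 1.5, "mem_gib": 3}, {"cpu": 1.0, "mem_gib": 6}]
print(pick_instance(pods).name)  # m6i.xlarge
```

This is also why allowing broad instance categories (as in the NodePool above) matters: a larger candidate set gives the price comparison more room to work.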

Installation and Configuration

Installing with Helm

# Install Karpenter (this example pins v1.1.x; check the release notes for the current version)
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "1.1.2" \
  --namespace kube-system \
  --set "settings.clusterName=my-cluster" \
  --set "settings.interruptionQueue=my-cluster" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait

IAM Configuration

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "KarpenterController",
      "Effect": "Allow",
      "Action": [
        "ec2:CreateFleet",
        "ec2:CreateLaunchTemplate",
        "ec2:CreateTags",
        "ec2:DescribeAvailabilityZones",
        "ec2:DescribeImages",
        "ec2:DescribeInstances",
        "ec2:DescribeInstanceTypeOfferings",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeLaunchTemplates",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeSubnets",
        "ec2:DeleteLaunchTemplate",
        "ec2:RunInstances",
        "ec2:TerminateInstances",
        "iam:PassRole",
        "ssm:GetParameter",
        "pricing:GetProducts"
      ],
      "Resource": "*"
    },
    {
      "Sid": "KarpenterInterruption",
      "Effect": "Allow",
      "Action": ["sqs:DeleteMessage", "sqs:GetQueueUrl", "sqs:ReceiveMessage"],
      "Resource": "arn:aws:sqs:*:*:my-cluster"
    }
  ]
}

Practical NodePool Patterns

Pattern 1: General-Purpose Workloads

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ['amd64']
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['on-demand', 'spot']
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ['c', 'm', 'r']
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ['5']
        - key: karpenter.k8s.aws/instance-size
          operator: NotIn
          values: ['metal', '24xlarge', '16xlarge']
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  limits:
    cpu: '500'
    memory: 2000Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s

Pattern 2: Spot-First with On-Demand Fallback

# Spot-only NodePool (high priority)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-first
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['spot']
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ['c', 'm', 'r']
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  weight: 80 # High priority
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 15s
---
# On-Demand fallback NodePool
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: on-demand-fallback
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['on-demand']
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ['c', 'm']
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  weight: 20 # Low priority (used when Spot fails)
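
Karpenter evaluates compatible NodePools in descending weight order and falls through to the next pool when a launch cannot be satisfied. A toy sketch of that fallback loop — `try_launch` is a stand-in for an EC2 Fleet request that may fail, not Karpenter's actual API:

```python
def provision(nodepools, try_launch):
    """Try compatible NodePools in descending weight order; return the
    name of the first pool whose launch succeeds, else None."""
    for pool in sorted(nodepools, key=lambda p: p["weight"], reverse=True):
        if try_launch(pool):
            return pool["name"]
    return None

pools = [
    {"name": "spot-first", "weight": 80, "capacity": "spot"},
    {"name": "on-demand-fallback", "weight": 20, "capacity": "on-demand"},
]

# Simulate a Spot capacity shortage: only on-demand launches succeed.
print(provision(pools, lambda p: p["capacity"] == "on-demand"))  # on-demand-fallback
```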

Pattern 3: Leveraging ARM64 Graviton

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: graviton
spec:
  template:
    metadata:
      labels:
        arch: arm64
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ['arm64']
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ['c7g', 'm7g', 'r7g', 'c8g', 'm8g']
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['on-demand', 'spot']
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: graviton
  weight: 90 # Graviton preferred
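
To land on this pool, a workload selects arm64 and must ship an image built for that architecture. A sketch of the relevant Pod-template fields — the container name and image are placeholders:

```yaml
# Deployment Pod-template fragment for the graviton pool: the arch selector
# matches the NodePool requirement; the image must include an arm64 variant
# (e.g. a multi-arch manifest built with docker buildx).
nodeSelector:
  kubernetes.io/arch: arm64
containers:
  - name: web
    image: my-app:latest # placeholder; must be built for arm64
```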

Consolidation Strategy

One of Karpenter's most powerful features is automatic consolidation.

Consolidation Policies

disruption:
  # WhenEmpty: Only remove empty nodes
  # WhenEmptyOrUnderutilized: Remove empty nodes + consolidate underutilized nodes
  consolidationPolicy: WhenEmptyOrUnderutilized
  consolidateAfter: 30s

  # Allow consolidation only during specific times
  budgets:
    - nodes: '10%' # Consolidate at most 10% of nodes at a time
    - nodes: '0'
      schedule: '0 9 * * 1-5' # Pause consolidation at 9 AM on weekdays
      duration: 8h # For 8 hours
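
The budget semantics reduce to simple arithmetic: each active budget allows either an absolute node count or a percentage of the pool's current nodes, and the most restrictive active budget wins. A toy evaluation — schedule/duration matching is replaced by an `active` flag, and the flooring of percentages is my assumption, not confirmed behavior:

```python
import math

def allowed_disruptions(total_nodes, budgets):
    """Effective disruption cap: the minimum across currently active
    budgets; percentages apply to the pool's current node count.
    (Flooring percentages is an assumption of this sketch.)"""
    caps = []
    for b in budgets:
        if not b.get("active", True):  # stands in for schedule/duration matching
            continue
        v = b["nodes"]
        if isinstance(v, str) and v.endswith("%"):
            caps.append(math.floor(total_nodes * int(v[:-1]) / 100))
        else:
            caps.append(int(v))
    return min(caps, default=total_nodes)

budgets = [
    {"nodes": "10%"},                 # always active
    {"nodes": "0", "active": False},  # business-hours freeze, inactive right now
]
print(allowed_disruptions(47, budgets))  # 4
```

During the business-hours window the `nodes: '0'` budget becomes active and drives the cap to zero, pausing consolidation entirely.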

How Consolidation Works

Current state:
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Node A   │ │ Node B   │ │ Node C   │
│ CPU: 80% │ │ CPU: 20% │ │ CPU: 15% │
│ m5.2xl   │ │ m5.2xl   │ │ m5.2xl   │
└──────────┘ └──────────┘ └──────────┘

After consolidation:
┌──────────┐ ┌──────────┐
│ Node A   │ │ Node D   │   → Nodes B, C removed
│ CPU: 80% │ │ CPU: 35% │   → Replaced with a smaller instance
│ m5.2xl   │ │ m5.xl    │
└──────────┘ └──────────┘

Do-Not-Disrupt Annotation

# Apply to Pods with critical workloads
apiVersion: v1
kind: Pod
metadata:
  annotations:
    karpenter.sh/do-not-disrupt: 'true'
spec:
  containers:
    - name: critical-job
      image: batch-processor:latest

Spot Interruption Handling

# Spot interruption notifications via SQS queue
# Karpenter handles automatically:
# 1. Receives Spot interruption notification (2 minutes ahead)
# 2. Cordons + Drains Pods on the affected node
# 3. Provisions a new node
# 4. Reschedules Pods

# EventBridge Rule (CloudFormation)
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  SpotInterruptionRule:
    Type: AWS::Events::Rule
    Properties:
      EventPattern:
        source: ['aws.ec2']
        detail-type: ['EC2 Spot Instance Interruption Warning']
      Targets:
        - Arn: !GetAtt InterruptionQueue.Arn
          Id: SpotInterruptionTarget

  RebalanceRule:
    Type: AWS::Events::Rule
    Properties:
      EventPattern:
        source: ['aws.ec2']
        detail-type: ['EC2 Instance Rebalance Recommendation']
      Targets:
        - Arn: !GetAtt InterruptionQueue.Arn
          Id: RebalanceTarget

  InterruptionQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: my-cluster
      MessageRetentionPeriod: 300

Monitoring

Prometheus Metrics

# ServiceMonitor for Karpenter
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: karpenter
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: karpenter
  endpoints:
    - port: http-metrics
      interval: 30s

Key metrics:

# Total provisioned nodes
karpenter_nodeclaims_created_total

# Node provisioning latency (p99; quantiles need a rate over the histogram buckets)
histogram_quantile(0.99, sum by (le) (rate(karpenter_nodeclaims_registered_duration_seconds_bucket[5m])))

# Nodes removed by consolidation
karpenter_nodeclaims_disrupted_total{reason="consolidation"}

# Pending Pods (waiting for Karpenter)
karpenter_pods_state{state="pending"}

# Node distribution by instance type
count by (instance_type) (karpenter_nodeclaims_instance_type)
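
These metrics can back a simple alert, for example on Pods that stay pending. A sketch of a PrometheusRule, assuming the Prometheus Operator CRDs are installed — the alert name, threshold, and duration are illustrative:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: karpenter-alerts # hypothetical name
  namespace: kube-system
spec:
  groups:
    - name: karpenter
      rules:
        - alert: KarpenterPodsStuckPending
          expr: karpenter_pods_state{state="pending"} > 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: Pods pending for 10+ minutes; check NodePool limits and capacity errors.
```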

Migrating from CA to Karpenter

Step-by-Step Transition

# Step 1: Install Karpenter (coexisting with CA)
helm install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace kube-system \
  --set "settings.clusterName=my-cluster"

# Step 2: Create NodePool (avoid overlap with CA Node Groups)
kubectl apply -f nodepool-general.yaml

# Step 3: Label legacy nodes so workloads can be steered away from them
kubectl label nodes -l eks.amazonaws.com/nodegroup=old-ng \
  migration-phase=legacy

# Step 4: Gradually migrate existing workloads

# Step 5: Scale down and remove CA Node Groups
eksctl scale nodegroup --cluster my-cluster \
  --name old-ng --nodes 0 --nodes-min 0
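
Step 3 only labels the legacy nodes; workloads still need a scheduling rule that avoids them. A sketch of the matching nodeAffinity for a Deployment's Pod template, using the migration-phase label applied above:

```yaml
# Pod-template fragment: avoid nodes labeled migration-phase=legacy so new
# replicas land on Karpenter-provisioned capacity instead.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: migration-phase
              operator: NotIn
              values: ['legacy']
```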

Important Considerations

# Verify PodDisruptionBudgets
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: critical-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: critical-app
# Karpenter respects PDBs, so it will skip
# consolidation if it would violate a PDB
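
The PDB check Karpenter performs reduces to simple arithmetic: a voluntary eviction is allowed only while healthy replicas stay at or above minAvailable. A toy version of that check (not the actual eviction API):

```python
def can_evict(healthy_replicas, min_available):
    """A voluntary eviction is allowed only if the app still meets
    minAvailable after losing one replica."""
    return healthy_replicas - 1 >= min_available

# With minAvailable: 2, as in the PDB above:
print(can_evict(3, 2))  # True: one replica can be drained
print(can_evict(2, 2))  # False: consolidation must skip this node
```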

Cost Optimization Tips

1. Allow a Wide Range of Instance Types

# Bad example: Restricting instance types
requirements:
  - key: node.kubernetes.io/instance-type
    operator: In
    values: ["m5.xlarge"]  # Low Spot availability

# Good example: Allowing a broad range
requirements:
  - key: karpenter.k8s.aws/instance-category
    operator: In
    values: ["c", "m", "r"]
  - key: karpenter.k8s.aws/instance-generation
    operator: Gt
    values: ["5"]

2. Accurate Resource Requests

# The more accurate the Pod resource requests, the better the bin-packing efficiency
resources:
  requests:
    cpu: '500m'
    memory: '512Mi'
  limits:
    cpu: '1000m'
    memory: '1Gi'
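
The cost of inflated requests is easy to quantify: the scheduler packs by requests, not actual usage, so in the idealized case nodes needed is roughly ceil(total requested / node capacity). A toy calculation (it ignores DaemonSet overhead and memory, which also constrain packing):

```python
import math

def nodes_needed(pod_cpu_request, pod_count, node_cpu=4.0):
    """Idealized node count: scheduling packs by requests, not usage."""
    return math.ceil(pod_count * pod_cpu_request / node_cpu)

print(nodes_needed(0.5, 40))  # 5 nodes when requests match real usage
print(nodes_needed(1.0, 40))  # 10 nodes when every request is padded 2x
```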

3. Leverage Topology Spread

topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: web

Troubleshooting

Common Issues

# When NodeClaims are not being created
kubectl describe nodepool default
kubectl get events --field-selector reason=FailedProvisioning

# Insufficient capacity errors (ICE)
# → Allow more instance types
# → Allow more Availability Zones

# Slow provisioning
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter \
  --tail=100 | grep -i "provision"

# AMI issues
kubectl get ec2nodeclass default -o yaml | grep -A5 amiSelector

Quiz

Q1. What fundamentally differentiates Karpenter from Cluster Autoscaler?

Karpenter directly analyzes Pod requirements and provisions optimal EC2 instances without relying on Node Groups (ASGs). In contrast, CA can only scale within pre-defined Node Groups.

Q2. What are the two core CRDs used in Karpenter v1?

NodePool (node provisioning policies) and EC2NodeClass (AWS infrastructure configuration).

Q3. What does the WhenEmptyOrUnderutilized consolidation policy do?

It removes empty nodes and migrates Pods from underutilized nodes to other nodes, then either removes those nodes or replaces them with smaller instances.

Q4. What AWS services are required for Karpenter to handle Spot interruptions?

An SQS queue and EventBridge Rules are required. EC2 Spot interruption warnings and rebalance recommendation events are forwarded to SQS, which Karpenter processes automatically.

Q5. What is the purpose of the karpenter.sh/do-not-disrupt annotation?

It protects the node running the annotated Pod from being disrupted by Karpenter's consolidation or expiration actions. It is used for batch jobs and critical workloads.

Q6. What role does the weight field play in a NodePool?

It determines priority when multiple NodePools exist. Higher weight values are tried first. For example, a Spot NodePool (weight: 80) is used before an On-Demand NodePool (weight: 20).

Q7. How do you configure Karpenter to use Graviton (ARM64) instances?

Specify kubernetes.io/arch: arm64 and Graviton instance families (c7g, m7g, etc.) in the NodePool requirements. The application must support ARM64.

Conclusion

Karpenter is transforming the paradigm of Kubernetes node autoscaling. By eliminating the Node Group intermediary layer, it responds directly to Pod requirements to provision optimal instances. As of 2026, v1 GA is running stably in production, and its reliability at scale has been proven by Salesforce's migration of over 1,000 clusters.
