Authors

- Youngju Kim (@fjvbn20031)
1. What is Karpenter
Karpenter is an open-source Kubernetes node provisioner developed by AWS. Unlike Cluster Autoscaler, which manages nodes indirectly through Auto Scaling Groups (ASGs), Karpenter calls EC2 APIs directly to provision nodes optimized for your workloads within seconds.
Key Features
- Direct EC2 Provisioning: Calls EC2 Fleet API directly without ASGs
- Fast Scaling: Nodes come online in approximately 45-60 seconds (vs. 3-5 minutes with CA)
- Intelligent Instance Selection: Automatically selects optimal instance types matching workload requirements
- Bin-Packing Optimization: Advanced bin-packing algorithms maximize cluster utilization
- Auto Consolidation: Automatically removes unused nodes and replaces with lower-cost alternatives
- Drift Detection: Automatically detects and replaces out-of-date nodes when configurations change
2. Why Karpenter: Limitations of Cluster Autoscaler
Problems with Cluster Autoscaler
```
+--------------------------------------------+
|             Cluster Autoscaler             |
|                                            |
|  Pod Pending                               |
|      |                                     |
|      v                                     |
|  CA requests scale-out from ASG            |
|      |                                     |
|      v                                     |
|  ASG launches EC2 instance using           |
|  pre-defined Launch Template               |
|      |                                     |
|      v                                     |
|  Node registration takes 3-5 minutes       |
+--------------------------------------------+
```
Cluster Autoscaler has the following limitations:
- ASG Dependency: Tied to pre-defined Node Groups, reducing flexibility
- Slow Scaling: 3-5 minute provisioning time due to ASG intermediary
- Limited Instance Types: Only fixed instance types per Node Group
- Inefficient Bin-Packing: Scales at the Node Group level, causing resource waste
- Manual Management Overhead: Multiple Node Groups needed for diverse workloads
Karpenter's Approach
```
+--------------------------------------------+
|                 Karpenter                  |
|                                            |
|  Pod Pending                               |
|      |                                     |
|      v                                     |
|  Karpenter analyzes pod requirements       |
|  (CPU, Memory, GPU, Topology, etc.)        |
|      |                                     |
|      v                                     |
|  Auto-selects optimal instance type        |
|  (Cost optimization from 200+ types)       |
|      |                                     |
|      v                                     |
|  Calls EC2 Fleet API directly              |
|      |                                     |
|      v                                     |
|  Node Ready within 45-60 seconds           |
+--------------------------------------------+
```
3. Karpenter Architecture
Overall Structure
```
+----------------------------------------------------------------+
|                           EKS Cluster                          |
|                                                                |
|  +------------------+      +-----------------------------+     |
|  |    Karpenter     |      |    Kubernetes API Server    |     |
|  |    Controller    |----->|    (Pod Watch, Node Mgmt)   |     |
|  |   (Deployment)   |      +-----------------------------+     |
|  +--------+---------+                                          |
|           |                                                    |
|           | References NodePool + EC2NodeClass                 |
|           |                                                    |
|  +--------v---------+      +-----------------------------+     |
|  |  Instance Type   |      |         AWS Services        |     |
|  | Selection Engine |----->|  - EC2 Fleet API            |     |
|  |  (Cost/Capacity  |      |  - SSM (AMI Discovery)      |     |
|  |   Optimization)  |      |  - Pricing API              |     |
|  +------------------+      |  - SQS (Interruption)       |     |
|                            |  - EventBridge              |     |
|                            +-----------------------------+     |
+----------------------------------------------------------------+
```
Core Components
Karpenter uses three primary Custom Resource Definitions (CRDs):
| CRD | API Version | Description |
|---|---|---|
| NodePool | karpenter.sh/v1 | Defines node provisioning constraints |
| EC2NodeClass | karpenter.k8s.aws/v1 | AWS-specific instance settings |
| NodeClaim | karpenter.sh/v1 | Runtime node request object |
Provisioning Flow
1. Pod is detected in the Pending state
2. Karpenter analyzes the pod's resource requests, nodeSelector, affinity, tolerations, etc.
3. Determines the matching NodePool (weight-based priority)
4. References AWS settings from the EC2NodeClass (subnets, security groups, AMIs, etc.)
5. Selects the optimal instance type (based on cost, capacity, and requirements)
6. Launches the instance via the EC2 Fleet API
7. Creates and tracks a NodeClaim object
8. Node registration completes and the pod is scheduled
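To make the flow concrete, here is a minimal pending-pod sketch: the resource requests and nodeSelector below are illustrative values (not from any specific cluster) that Karpenter would match against NodePool requirements in steps 1-5.

```yaml
# Hypothetical workload: the requests and selector drive Karpenter's
# instance-type decision (the sizes and selector values are examples).
apiVersion: v1
kind: Pod
metadata:
  name: demo-workload
spec:
  nodeSelector:
    karpenter.sh/capacity-type: spot      # only run on Spot capacity
    kubernetes.io/arch: arm64             # request an ARM (Graviton) node
  containers:
    - name: app
      image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
      resources:
        requests:
          cpu: '2'        # Karpenter picks an instance with >= 2 vCPU free
          memory: 4Gi
```

If no existing node satisfies both the selector and the requests, the pod stays Pending and Karpenter provisions a matching node.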
4. NodePool Configuration in Detail
NodePool is the core CRD in Karpenter v1 that replaces the legacy Provisioner.
Basic NodePool Example
```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        team: platform
        environment: production
    spec:
      requirements:
        # Instance category constraint
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ['c', 'm', 'r']
        # Instance generation constraint
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ['5']
        # Capacity type (on-demand or spot)
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['on-demand', 'spot']
        # Availability zones
        - key: topology.kubernetes.io/zone
          operator: In
          values: ['us-east-1a', 'us-east-1b', 'us-east-1c']
        # Architecture
        - key: kubernetes.io/arch
          operator: In
          values: ['amd64', 'arm64']
      # EC2NodeClass reference
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      # Node expiration (auto-replace after 72 hours)
      expireAfter: 72h
  # Resource limits
  limits:
    cpu: '1000'
    memory: 1000Gi
  # Disruption policy
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
    budgets:
      - nodes: '10%'
      - nodes: '0'
        schedule: '0 9 * * MON-FRI'
        duration: 1h
  # NodePool weight (higher = higher priority)
  weight: 50
```
Key Requirement Keys
| Key | Description |
|---|---|
| karpenter.sh/capacity-type | on-demand or spot |
| karpenter.k8s.aws/instance-category | Instance family (c, m, r, etc.) |
| karpenter.k8s.aws/instance-generation | Instance generation (5, 6, 7) |
| karpenter.k8s.aws/instance-size | Instance size (large, xlarge) |
| karpenter.k8s.aws/instance-gpu-count | GPU count |
| karpenter.k8s.aws/instance-gpu-name | GPU name (a10g, t4, etc.) |
| topology.kubernetes.io/zone | Availability zone |
| kubernetes.io/arch | CPU architecture |
| kubernetes.io/os | Operating system |
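As an illustration of combining these keys, here is a hedged requirements fragment for a GPU-oriented NodePool; the GPU names and count are example values drawn from the table above, not a complete or verified list.

```yaml
# Illustrative requirements block for a GPU NodePool, composed from the
# keys above (the specific GPU values shown are examples).
requirements:
  - key: karpenter.k8s.aws/instance-gpu-name
    operator: In
    values: ['a10g', 't4']
  - key: karpenter.k8s.aws/instance-gpu-count
    operator: In
    values: ['1']
  - key: kubernetes.io/os
    operator: In
    values: ['linux']
```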
Multi-NodePool Strategy
```yaml
# Production workloads (On-Demand only, high priority)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: production
spec:
  template:
    metadata:
      labels:
        workload-type: production
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['on-demand']
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ['m', 'r']
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ['5']
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: production
      expireAfter: 168h
  limits:
    cpu: '500'
    memory: 500Gi
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 5m
  weight: 100
---
# Dev/Test workloads (Spot allowed, low priority)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: development
spec:
  template:
    metadata:
      labels:
        workload-type: development
      annotations:
        dev-team: 'true'
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['spot', 'on-demand']
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ['c', 'm']
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: development
      expireAfter: 24h
  limits:
    cpu: '200'
    memory: 200Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s
  weight: 10
```
5. EC2NodeClass Configuration in Detail
EC2NodeClass is the CRD that defines AWS-specific instance settings.
Complete EC2NodeClass Example
```yaml
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  # AMI configuration
  amiSelectorTerms:
    - alias: al2023@latest
  # IAM role
  role: KarpenterNodeRole-my-cluster
  # Subnet selection
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
        network-type: private
  # Security group selection
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  # Block device mappings
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 3000
        throughput: 125
        encrypted: true
        deleteOnTermination: true
  # Metadata options
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required
  # Tags
  tags:
    Environment: production
    ManagedBy: karpenter
    Team: platform
  # User data (bootstrap script)
  userData: |
    #!/bin/bash
    echo "Karpenter managed node"
    # Additional bootstrap logic
```
AMI Selection Options
```yaml
# Option 1: Alias (recommended)
amiSelectorTerms:
  - alias: al2023@latest          # Amazon Linux 2023 latest
  # - alias: al2@latest           # Amazon Linux 2
  # - alias: bottlerocket@latest  # Bottlerocket

# Option 2: Tag-based selection
amiSelectorTerms:
  - tags:
      environment: production
      ami-type: custom-al2023

# Option 3: Direct AMI ID
amiSelectorTerms:
  - id: ami-0123456789abcdef0
```
Supported AMI Families
| AMI Family | Description |
|---|---|
| AL2023 | Amazon Linux 2023 (recommended) |
| AL2 | Amazon Linux 2 |
| Bottlerocket | AWS Bottlerocket (container-optimized) |
| Windows2019 | Windows Server 2019 |
| Windows2022 | Windows Server 2022 |
| Windows2025 | Windows Server 2025 |
6. Consolidation: The Core of Cost Optimization
Karpenter's consolidation optimizes cluster cost by automatically removing or replacing nodes that are empty or underutilized.
How Consolidation Works
Consolidation comes in two forms:

1. Delete Consolidation: when all pods on a node can run on other existing nodes, Karpenter safely removes the node.
2. Replace Consolidation: when the current node can be replaced with a smaller, cheaper instance, Karpenter provisions the new node, migrates the pods, then removes the old node.
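Individual pods can opt out of voluntary disruption. A minimal sketch using Karpenter's `karpenter.sh/do-not-disrupt` pod annotation (the pod name and image are illustrative):

```yaml
# Pods carrying this annotation prevent Karpenter from voluntarily
# disrupting (consolidating) the node they are running on.
apiVersion: v1
kind: Pod
metadata:
  name: batch-job        # hypothetical example workload
  annotations:
    karpenter.sh/do-not-disrupt: 'true'
spec:
  containers:
    - name: job
      image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
```

This is useful for batch or stateful pods that should run to completion even when the node becomes a consolidation candidate.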
Consolidation Policy Settings
```yaml
# Policy 1: Consolidate only empty nodes
disruption:
  consolidationPolicy: WhenEmpty
  consolidateAfter: 30s

# Policy 2: Consolidate empty + underutilized nodes (recommended)
disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized
  consolidateAfter: 1m
```
Rate Limiting with Disruption Budgets
```yaml
disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized
  consolidateAfter: 1m
  budgets:
    # Allow disruption of up to 10% of total nodes simultaneously
    - nodes: '10%'
    # Block disruptions during business hours
    - nodes: '0'
      schedule: '0 9 * * MON-FRI'
      duration: 8h
    # Apply budget only for specific reasons (v1.0+)
    - nodes: '5%'
      reasons:
        - 'Underutilized'
```
Spot-to-Spot Consolidation
Spot-to-Spot consolidation requires at least 15 instance types to be configured.
```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-optimized
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['spot']
        # Diverse instance types to enable Spot-to-Spot consolidation
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ['c', 'm', 'r']
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ['4']
        - key: karpenter.k8s.aws/instance-size
          operator: In
          values: ['large', 'xlarge', '2xlarge', '4xlarge']
```
7. Drift Detection
Drift detection identifies when existing nodes no longer match the current NodePool or EC2NodeClass configuration and automatically replaces them.
Scenarios That Trigger Drift
- The AMI has been updated
- NodePool requirements have changed
- EC2NodeClass security groups have changed
- EC2NodeClass subnets have changed
- Block device settings have changed
- Metadata options have changed
- Tags have changed
Drift Replacement Process
1. Karpenter compares the NodeClaim with the current NodePool/EC2NodeClass
2. If differences are found, marks the NodeClaim as Drifted
3. Checks the Disruption Budget
4. Provisions a new node (with the latest settings)
5. Safely drains pods from the old node
6. Terminates the old node
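Because an AMI update is the most common drift trigger, a cautious pattern is to pin the AMI alias to a specific version and bump it deliberately rather than tracking `@latest`. A sketch (the version string shown is illustrative, not a real release):

```yaml
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: pinned-ami
spec:
  role: KarpenterNodeRole-my-cluster
  amiSelectorTerms:
    # With a pinned alias version (illustrative value below), drift fires
    # only when you change this line, not on every upstream AMI release.
    - alias: al2023@v20240807
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
```

This turns AMI rollout into an explicit, reviewable change while keeping the rest of drift detection automatic.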
8. Interruption Handling
Karpenter automatically handles various EC2 interruption events.
Supported Interruption Types
| Interruption Type | Description |
|---|---|
| Spot Interruption | 2-minute warning for Spot reclamation |
| Rebalance Recommendation | Pre-alert when disruption risk increases |
| Scheduled Maintenance | AWS scheduled maintenance events |
| Instance State Change | State changes (stopping, stopped) |
SQS-Based Interruption Handling Architecture
```
+-------------------+     +-------------------+     +------------------+
|     EC2 Spot      |     |      Amazon       |     |      Amazon      |
|   Interruption    |---->|    EventBridge    |---->|    SQS Queue     |
|      Notice       |     |      Rules        |     |                  |
+-------------------+     +-------------------+     +--------+---------+
                                                             |
+-------------------+     +-------------------+              |
|   EC2 Rebalance   |---->|    EventBridge    |----+         |
|  Recommendation   |     |                   |    |         |
+-------------------+     +-------------------+    |         |
                                                   v         v
                                              +----+---------+----+
                                              |     Karpenter     |
                                              |     Controller    |
                                              |                   |
                                              | 1. Receive event  |
                                              | 2. Cordon node    |
                                              | 3. Drain pods     |
                                              | 4. Launch new node|
                                              | 5. Terminate old  |
                                              +-------------------+
```
Interruption Handling Configuration
Specify the SQS queue name during Helm installation:
```bash
helm install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "1.0.0" \
  --namespace karpenter \
  --create-namespace \
  --set "settings.clusterName=my-cluster" \
  --set "settings.interruptionQueue=my-cluster-karpenter" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi
```
9. Spot Instance Best Practices
Diversified Instance Types
```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-diverse
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['spot']
        # Diverse instance families for better Spot availability
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ['c', 'm', 'r']
        # Multiple generations
        - key: karpenter.k8s.aws/instance-generation
          operator: In
          values: ['5', '6', '7']
        # Various sizes
        - key: karpenter.k8s.aws/instance-size
          operator: In
          values: ['large', 'xlarge', '2xlarge', '4xlarge']
        # Multiple availability zones
        - key: topology.kubernetes.io/zone
          operator: In
          values: ['us-east-1a', 'us-east-1b', 'us-east-1c']
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
```
Spot + On-Demand Mixed Strategy
```yaml
# Spot-first NodePool (high weight)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-first
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['spot']
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  weight: 100
  limits:
    cpu: '500'
---
# On-Demand fallback NodePool (low weight)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: on-demand-fallback
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['on-demand']
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  weight: 1
  limits:
    cpu: '200'
```
Spot Usage Best Practices
- Diversify instance types: Allow at least 15 instance types to maximize Spot availability
- Use multiple availability zones: Avoid single-AZ dependency for Spot capacity
- Set PDB (Pod Disruption Budget): Ensure minimum availability for critical workloads
- Handle graceful shutdown: Set appropriate terminationGracePeriodSeconds
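The last two practices can be sketched in plain Kubernetes manifests; the names, replica counts, and grace period below are illustrative values, not recommendations for any specific workload.

```yaml
# Minimum availability guard for a Spot-backed workload.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2            # keep at least 2 pods up during node drains
  selector:
    matchLabels:
      app: web
---
# Give containers time to finish in-flight work inside the
# 2-minute Spot interruption window.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      terminationGracePeriodSeconds: 90   # must fit inside the 2-minute notice
      containers:
        - name: web
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
```

Karpenter honors the PDB while draining, so Spot reclamation never takes the workload below the floor you set here.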
10. Installing Karpenter with Helm
Prerequisites
```bash
# Set environment variables
export KARPENTER_NAMESPACE="karpenter"
export KARPENTER_VERSION="1.0.0"
export CLUSTER_NAME="my-eks-cluster"
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export TEMPOUT="$(mktemp)"
```
IAM Role Creation
```bash
# Create Karpenter controller role (replace ACCOUNT_ID, REGION, and the
# OIDC provider ID with your cluster's values)
aws iam create-role \
  --role-name "KarpenterControllerRole-my-cluster" \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/oidc.eks.REGION.amazonaws.com/id/EXAMPLE"
      },
      "Action": "sts:AssumeRoleWithWebIdentity"
    }]
  }'

# Create Karpenter node role
aws iam create-role \
  --role-name "KarpenterNodeRole-my-cluster" \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }]
  }'
```
Helm Installation
# Install via OCI registry
helm install karpenter oci://public.ecr.aws/karpenter/karpenter \
--version "1.0.0" \
--namespace karpenter \
--create-namespace \
--set "settings.clusterName=my-cluster" \
--set "settings.interruptionQueue=my-cluster-karpenter" \
--set "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn=arn:aws:iam::123456789012:role/KarpenterControllerRole-my-cluster" \
--set controller.resources.requests.cpu=1 \
--set controller.resources.requests.memory=1Gi \
--set controller.resources.limits.cpu=1 \
--set controller.resources.limits.memory=1Gi \
--wait
Verify Installation
```bash
# Check Karpenter pod status
kubectl get pods -n karpenter

# Verify CRDs
kubectl get crd | grep karpenter

# Check logs
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f
```
11. Complete Deployment Example
Full Configuration (NodePool + EC2NodeClass)
```yaml
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: production
spec:
  amiSelectorTerms:
    - alias: al2023@latest
  role: KarpenterNodeRole-my-cluster
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
        network-type: private
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 3000
        throughput: 125
        encrypted: true
        deleteOnTermination: true
  metadataOptions:
    httpEndpoint: enabled
    httpPutResponseHopLimit: 2
    httpTokens: required
  tags:
    Environment: production
    ManagedBy: karpenter
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: production
spec:
  template:
    metadata:
      labels:
        environment: production
        managed-by: karpenter
    spec:
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ['c', 'm', 'r']
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ['5']
        - key: karpenter.sh/capacity-type
          operator: In
          values: ['on-demand']
        - key: topology.kubernetes.io/zone
          operator: In
          values: ['us-east-1a', 'us-east-1b', 'us-east-1c']
        - key: kubernetes.io/arch
          operator: In
          values: ['amd64']
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: production
      expireAfter: 168h
  limits:
    cpu: '1000'
    memory: 2000Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 2m
    budgets:
      - nodes: '10%'
      - nodes: '0'
        schedule: '0 2 * * *'
        duration: 1h
  weight: 100
```
Test Workload Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
  namespace: default
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: '1'
              memory: 1Gi
```
```bash
# Scale up to trigger Karpenter provisioning
kubectl scale deployment inflate --replicas=10

# Watch node provisioning
kubectl get nodes -w

# Check NodeClaim status
kubectl get nodeclaims

# Scale down to observe consolidation
kubectl scale deployment inflate --replicas=0
```
12. Karpenter vs Cluster Autoscaler Comparison
| Feature | Karpenter | Cluster Autoscaler |
|---|---|---|
| Provisioning Method | Direct EC2 API calls | Indirect via ASG |
| Provisioning Speed | 45-60 seconds | 3-5 minutes |
| Instance Selection | Auto-optimized (200+ types) | Fixed per Node Group |
| Bin Packing | Pod-level optimization | Node Group level |
| Consolidation | Built-in (automatic) | Limited (scale-down only) |
| Drift Detection | Automatic | Not supported |
| Spot Instances | Native, auto-diversified | ASG Mixed Instances |
| Spot Interruption Handling | SQS-based, automatic | Separate tools needed |
| Multi-Architecture | Auto amd64 + arm64 | Separate Node Groups |
| Configuration Complexity | NodePool + EC2NodeClass | ASG + Launch Template |
| Multi-Cloud | AWS only (community ext.) | Official multi-cloud |
| Cost Savings | 25-40% (bin-pack + Spot) | 10-20% |
| Official K8s Project | Core donated to SIG Autoscaling (AWS-led) | Yes (SIG Autoscaling) |
13. Common Issues and Troubleshooting
Pods Stuck in Pending State
```bash
# Check NodePool requirements
kubectl describe nodepool default

# Check Karpenter logs for provisioning failure reasons
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter | grep -i "error\|failed"

# Check pod events
kubectl describe pod my-pod-name
```
Common Causes and Solutions
- Missing subnet or security group tags: Verify EC2NodeClass selector tags match actual AWS resources
- Insufficient IAM permissions: Check Karpenter controller role has required EC2, IAM, SSM permissions
- Resource limits exceeded: Verify NodePool limits are not exceeded by current usage
- Instance type unavailability: Certain instance types may have insufficient capacity in specific AZs
14. Summary
Karpenter is transforming the paradigm of Kubernetes node provisioning. Moving away from static ASG-based node management, it enables dynamic, workload-centric infrastructure provisioning.
Recommended Adoption Scenarios
- Workloads requiring diverse instance types
- Environments actively leveraging Spot instances
- Event-driven workloads needing rapid scaling
- Clusters where cost optimization is a primary goal
- Environments including GPU/ML workloads
Important Considerations
- AWS EKS exclusive; consider Cluster Autoscaler alongside for multi-cloud environments
- Recommended to use v1.0+ stable version
- Always enable Spot interruption handling by configuring the SQS queue
- Set appropriate NodePool limits to prevent cost overruns