Karpenter Node Autoscaling

Karpenter replaces Cluster Autoscaler for workload nodes. It provisions right-sized EC2 instances directly through the EC2 API rather than through Auto Scaling groups, typically in under 60 seconds. It bin-packs pending pods onto nodes, drains nodes ahead of Spot interruptions, and consolidates underutilized nodes automatically.

Karpenter vs Cluster Autoscaler

| Feature | Karpenter | Cluster Autoscaler |
| --- | --- | --- |
| Provisioning speed | ~30–60 s | 2–5 minutes |
| Instance selection | Any instance type, dynamic | Fixed ASG instance type |
| Bin-packing | Yes: right-sizes nodes for pending pods | No: fixed instance type |
| Spot interruption handling | Native, graceful drain | Manual setup required |
| Node consolidation | Automatic (WhenUnderutilized) | Limited |
| Multi-architecture | Yes (arm64/amd64) | Via separate ASGs |

NodePool Configuration

kubernetes/karpenter/nodepool-default.yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        role: workloads
    spec:
      nodeClassRef:
        name: default
      requirements:
        # Allow both capacity types; Karpenter prioritizes Spot and falls back to On-Demand
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand", "spot"]
        # Allow a range of instance types for flexibility
        - key: node.kubernetes.io/instance-type
          operator: In
          values:
            - m6i.xlarge
            - m6i.2xlarge
            - m6i.4xlarge
            - m6a.xlarge
            - m6a.2xlarge
            - m6a.4xlarge
            - m7i.xlarge
            - m7i.2xlarge
            - c6i.2xlarge
            - c6i.4xlarge
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["us-east-1a", "us-east-1b", "us-east-1c"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  # Upper bound on the total capacity this NodePool may provision
  limits:
    cpu: "200"
    memory: 400Gi
  disruption:
    consolidationPolicy: WhenUnderutilized
    # consolidateAfter is omitted: v1beta1 accepts it only with consolidationPolicy: WhenEmpty
    # Rotate nodes every 30 days (security best practice)
    expireAfter: 720h
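
Workloads reach this pool through the labels it places on its nodes. As a minimal sketch, the Deployment below uses a nodeSelector on the `role: workloads` label defined in the NodePool and restricts itself to On-Demand capacity through the `karpenter.sh/capacity-type` label, which can suit pods that tolerate interruption poorly. The name `checkout-api`, the replica count, the resource requests, and the placeholder image are all hypothetical.

```yaml
# Hypothetical workload; only the two nodeSelector labels relate to the NodePool
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout-api
  template:
    metadata:
      labels:
        app: checkout-api
    spec:
      nodeSelector:
        role: workloads                        # label set by the NodePool template
        karpenter.sh/capacity-type: on-demand  # skip Spot for this workload
      containers:
        - name: app
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7  # placeholder image
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
```

Pods without the capacity-type selector remain free to land on either Spot or On-Demand nodes from the same pool.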

EC2NodeClass Configuration

kubernetes/karpenter/nodeclass-default.yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: "KarpenterNodeRole-devopsgenie-production"

  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: devopsgenie-production

  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: devopsgenie-production

  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 3000
        throughput: 125
        encrypted: true
        deleteOnTermination: true

  # User data runs at instance boot, before the node joins the cluster.
  # For the AL2 family Karpenter also generates its own bootstrap.sh call,
  # so verify that this explicit call is still required.
  userData: |
    #!/bin/bash
    /etc/eks/bootstrap.sh devopsgenie-production \
      --container-runtime containerd \
      --kubelet-extra-args '--max-pods=110'

  tags:
    Environment: production
    ManagedBy: karpenter
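
Kubelet settings such as the pod limit can also be declared rather than passed through user data. As a sketch, assuming the v1beta1 API used on this page, the NodePool accepts a `kubelet` block under `spec.template.spec`. The fragment below is not a complete manifest and would be merged into the existing `default` NodePool.

```yaml
# Fragment of a v1beta1 NodePool (assumption: merged into the "default" NodePool)
spec:
  template:
    spec:
      kubelet:
        maxPods: 110   # same limit as --max-pods=110 in the user data
```

Declaring the limit here keeps it visible to Karpenter's scheduling simulation, which a flag hidden in user data may not be.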

Spot Interruption Handling

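Karpenter handles Spot interruptions automatically. EventBridge forwards EC2 Spot interruption warnings, which arrive two minutes before the instance is reclaimed, to an SQS queue. Karpenter, once configured with the queue name, reads the queue, then cordons and drains the affected node while it provisions replacement capacity: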

terraform/environments/production/karpenter-spot.tf
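resource "aws_sqs_queue" "karpenter_interruption" {
  name                      = "karpenter-interruption-devopsgenie-production"
  message_retention_seconds = 300
  sqs_managed_sse_enabled   = true
}

# Allow EventBridge to deliver events to the queue; without a queue policy delivery fails
resource "aws_sqs_queue_policy" "karpenter_interruption" {
  queue_url = aws_sqs_queue.karpenter_interruption.url
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "events.amazonaws.com" }
      Action    = "sqs:SendMessage"
      Resource  = aws_sqs_queue.karpenter_interruption.arn
    }]
  })
}

resource "aws_cloudwatch_event_rule" "spot_interruption" {
  name        = "karpenter-spot-interruption"
  description = "Spot instance interruption notification"
  event_pattern = jsonencode({
    source      = ["aws.ec2"]
    detail-type = ["EC2 Spot Instance Interruption Warning"]
  })
}

resource "aws_cloudwatch_event_target" "spot_interruption" {
  rule = aws_cloudwatch_event_rule.spot_interruption.name
  arn  = aws_sqs_queue.karpenter_interruption.arn
}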
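
The queue only helps once the Karpenter controller is told to read from it. As a hedged sketch, assuming Karpenter is installed from its Helm chart in a version that serves the v1beta1 API, the queue name is passed through the chart's `settings` values. Older chart versions used different key names, so check the values reference for the installed version. The cluster name is taken from the bootstrap call on this page.

```yaml
# values.yaml for the Karpenter Helm chart (assumed install method and key names)
settings:
  clusterName: devopsgenie-production
  # Must match the name of the aws_sqs_queue resource
  interruptionQueue: karpenter-interruption-devopsgenie-production
```

The controller's IAM role also needs permission to receive and delete messages on this queue; otherwise interruption warnings accumulate unread until they expire.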

Verifying Karpenter

# Create pending pods so that Karpenter has to provision a node
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 10
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
EOF

# Watch nodes being provisioned
kubectl get nodes --watch

# Check Karpenter logs
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f --tail=50

# Clean up
kubectl delete deployment inflate

Useful Karpenter Commands

# List NodePools
kubectl get nodepools

# List Karpenter-managed nodes
kubectl get nodes -l karpenter.sh/nodepool

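# Karpenter has no manual consolidation trigger. To make a node eligible again,
# remove the karpenter.sh/do-not-disrupt annotation if it was set on that node
kubectl annotate node <node-name> karpenter.sh/do-not-disrupt-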

# View node utilization
kubectl top nodes -l karpenter.sh/nodepool=default