Hibernating EKS Node Groups¶
This guide covers how to hibernate AWS EKS managed node groups using the eks executor.
Prerequisites¶
- A
CloudProviderresource configured for your AWS account - IAM permissions:
eks:ListNodegroups,eks:DescribeNodegroup,eks:UpdateNodegroupConfig - The EKS cluster must have at least one managed node group
Basic Setup¶
1. Create the CloudProvider¶
apiVersion: hibernator.ardikabs.com/v1alpha1
kind: CloudProvider
metadata:
name: aws-production
namespace: hibernator-system
spec:
type: aws
aws:
accountId: "123456789012"
region: us-west-2
assumeRoleArn: arn:aws:iam::123456789012:role/HibernatorRole
auth:
serviceAccount: {} # IRSA-based authentication (recommended)
2. Create the HibernatePlan¶
apiVersion: hibernator.ardikabs.com/v1alpha1
kind: HibernatePlan
metadata:
name: eks-hibernate
namespace: hibernator-system
spec:
schedule:
timezone: America/New_York
offHours:
- start: "20:00"
end: "06:00"
daysOfWeek: ["MON", "TUE", "WED", "THU", "FRI"]
execution:
strategy:
type: Sequential
behavior:
mode: Strict
targets:
- name: eks-nodegroups
type: eks
connectorRef:
kind: CloudProvider
name: aws-production
parameters:
clusterName: production-cluster
nodeGroups:
- name: app-nodes
- name: worker-nodes
awaitCompletion:
enabled: true
timeout: "10m"
Use Cases¶
Hibernate All Node Groups in a Cluster¶
Leave nodeGroups empty to target every managed node group in the cluster:
targets:
- name: all-nodegroups
type: eks
connectorRef:
kind: CloudProvider
name: aws-production
parameters:
clusterName: production-cluster
nodeGroups: [] # discovers all node groups
awaitCompletion:
enabled: true
Hibernate Specific Node Groups¶
Target only certain node groups by name:
targets:
- name: dev-nodegroups
type: eks
connectorRef:
kind: CloudProvider
name: aws-production
parameters:
clusterName: dev-cluster
nodeGroups:
- name: app-tier
- name: worker-tier
awaitCompletion:
enabled: true
timeout: "15m"
Hibernate Multiple EKS Clusters¶
Use multiple targets within the same plan:
targets:
- name: staging-eks
type: eks
connectorRef:
kind: CloudProvider
name: aws-staging
parameters:
clusterName: staging-cluster
nodeGroups: []
- name: dev-eks
type: eks
connectorRef:
kind: CloudProvider
name: aws-dev
parameters:
clusterName: dev-cluster
nodeGroups: []
EKS with Karpenter (DAG Strategy)¶
When a cluster uses both managed node groups and Karpenter, shut down Karpenter pools first to avoid rescheduling onto managed nodes:
apiVersion: hibernator.ardikabs.com/v1alpha1
kind: HibernatePlan
metadata:
name: full-eks-hibernate
namespace: hibernator-system
spec:
schedule:
timezone: Asia/Jakarta
offHours:
- start: "20:00"
end: "06:00"
daysOfWeek: ["MON", "TUE", "WED", "THU", "FRI"]
execution:
strategy:
type: DAG
dependencies:
- from: karpenter-pools
to: managed-nodegroups # Karpenter first, then node groups
targets:
- name: karpenter-pools
type: karpenter
connectorRef:
kind: K8SCluster
name: eks-production
parameters:
nodePools: []
awaitCompletion:
enabled: true
- name: managed-nodegroups
type: eks
connectorRef:
kind: CloudProvider
name: aws-production
parameters:
clusterName: production-cluster
nodeGroups: []
awaitCompletion:
enabled: true
What Happens During Hibernation¶
- Node groups are scaled to
minSize=0,desiredSize=0(maxSize stays unchanged) - AWS begins terminating nodes in the node group
- Pods running on those nodes are evicted
- The cluster API server remains available throughout
What Happens During Wakeup¶
- Node groups are restored to their original
desiredSize,minSize, andmaxSize - AWS provisions new nodes matching the node group configuration
- The Kubernetes scheduler places pods onto the new nodes
Troubleshooting¶
Nodes not scaling down¶
- Verify the IAM role has
eks:UpdateNodegroupConfigpermission - Check for Pod Disruption Budgets that prevent eviction
- Review runner logs:
kubectl logs -l hibernator.ardikabs.com/plan=eks-hibernate -n hibernator-system
Timeout during await¶
- EKS node group operations can take time depending on instance count
- Increase the timeout:
awaitCompletion.timeout: "20m" - Check the AWS Console for node group update status
Node group not found¶
- Ensure the
clusterNamematches the actual EKS cluster name - Verify node group names match (case-sensitive)
- Confirm the CloudProvider region is correct