Hibernating Karpenter NodePools

This guide covers how to hibernate Karpenter-managed NodePools using the karpenter executor.

Prerequisites

A K8SCluster resource configured for the target cluster
Karpenter v1 (karpenter.sh/v1) installed on the target cluster
RBAC permissions on the target cluster: get, list, delete, and create on nodepools (API group karpenter.sh); list and get on nodes (core v1)
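
The RBAC requirement above could be met with a ClusterRole along these lines. This is a minimal sketch: the role name hibernator-karpenter-executor is a hypothetical placeholder, and the binding to whichever identity the K8SCluster connector authenticates as is omitted because it depends on your setup.

```yaml
# Sketch only: metadata.name is a hypothetical placeholder.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: hibernator-karpenter-executor
rules:
  # NodePools are read, deleted on hibernation, and recreated on wakeup.
  - apiGroups: ["karpenter.sh"]
    resources: ["nodepools"]
    verbs: ["get", "list", "delete", "create"]
  # Nodes belong to the core API group, written as an empty string.
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["list", "get"]
```

A ClusterRole is used here rather than a Role because both NodePools and Nodes are cluster-scoped resources.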

Basic Setup

1. Create the K8SCluster Connector

apiVersion: hibernator.ardikabs.com/v1alpha1
kind: K8SCluster
metadata:
  name: eks-production
  namespace: hibernator-system
spec:
  providerRef:
    name: aws-production
  eks:
    name: production-cluster
    region: us-west-2

2. Create the HibernatePlan

apiVersion: hibernator.ardikabs.com/v1alpha1
kind: HibernatePlan
metadata:
  name: karpenter-hibernate
  namespace: hibernator-system
spec:
  schedule:
    timezone: America/New_York
    offHours:
      - start: "20:00"
        end: "06:00"
        daysOfWeek: ["MON", "TUE", "WED", "THU", "FRI"]
  execution:
    strategy:
      type: Sequential
  behavior:
    mode: Strict
  targets:
    - name: karpenter-pools
      type: karpenter
      connectorRef:
        kind: K8SCluster
        name: eks-production
      parameters:
        nodePools:
          - default
          - gpu-pool
        awaitCompletion:
          enabled: true
          timeout: "5m"

Use Cases

Hibernate All NodePools

Leave nodePools empty to discover and hibernate every NodePool:

targets:
  - name: all-karpenter
    type: karpenter
    connectorRef:
      kind: K8SCluster
      name: eks-production
    parameters:
      nodePools: []  # discovers all NodePools
      awaitCompletion:
        enabled: true

Hibernate Specific NodePools

Target only named pools:

targets:
  - name: dev-pools
    type: karpenter
    connectorRef:
      kind: K8SCluster
      name: eks-dev
    parameters:
      nodePools:
        - batch-processing
        - dev-workloads
      awaitCompletion:
        enabled: true
        timeout: "10m"

Select NodePools by Labels

Use nodeSelector to discover NodePools dynamically by the Kubernetes labels set on the NodePool objects, in the same way workloadSelector works in the WorkloadScaler executor:

targets:
  - name: hibernatable-pools
    type: karpenter
    connectorRef:
      kind: K8SCluster
      name: eks-production
    parameters:
      nodeSelector:
        matchLabels:
          hibernator.ardikabs.com/enabled: "true"
        matchExpressions:
          - key: karpenter.sh/capacity-type
            operator: In
            values: ["spot", "on-demand"]
      awaitCompletion:
        enabled: true

Note

nodePools and nodeSelector are mutually exclusive — use one or the other.
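
Because the two fields cannot be combined in one target, one way to cover both a fixed list and a label-based set is to declare two targets. The fragment below is a sketch that assumes a plan may contain more than one karpenter target; the target names are illustrative and the pool name is borrowed from the earlier example.

```yaml
targets:
  # Target 1: a fixed list of pools selected by name.
  - name: named-pools
    type: karpenter
    connectorRef:
      kind: K8SCluster
      name: eks-production
    parameters:
      nodePools:
        - batch-processing
  # Target 2: pools discovered by label.
  - name: labeled-pools
    type: karpenter
    connectorRef:
      kind: K8SCluster
      name: eks-production
    parameters:
      nodeSelector:
        matchLabels:
          hibernator.ardikabs.com/enabled: "true"
```

Keep the two sets disjoint. This sketch does not address what happens when the same NodePool is matched by both targets.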

Protect Critical NodePools

To hibernate most pools while keeping a critical one running, list only the non-critical pools explicitly:

targets:
  - name: non-critical-pools
    type: karpenter
    connectorRef:
      kind: K8SCluster
      name: eks-production
    parameters:
      nodePools:
        - batch-workers
        - dev-workloads
        # "monitoring" pool is NOT listed — stays running
      awaitCompletion:
        enabled: true
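
An alternative to maintaining the list by hand is to label the critical pool and exclude it with a selector. The fragment below is a sketch, not confirmed executor behavior: it assumes nodeSelector accepts the standard Kubernetes label-selector operators (this guide only shows In), and hibernator.ardikabs.com/critical is a hypothetical label you would apply to the monitoring NodePool yourself.

```yaml
targets:
  - name: non-critical-pools
    type: karpenter
    connectorRef:
      kind: K8SCluster
      name: eks-production
    parameters:
      nodeSelector:
        matchExpressions:
          # Assumed operator: matches every NodePool WITHOUT this label key.
          - key: hibernator.ardikabs.com/critical
            operator: DoesNotExist
      awaitCompletion:
        enabled: true
```

With this approach a newly added pool is hibernated by default unless it carries the label, which is the opposite default from the explicit list above.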

Combined EKS + Karpenter with Dependencies

A common pattern is to hibernate Karpenter pools before EKS managed node groups. If the node groups were scaled down first, the evicted pods would become pending and Karpenter could provision new nodes for them:

apiVersion: hibernator.ardikabs.com/v1alpha1
kind: HibernatePlan
metadata:
  name: full-cluster-hibernate
  namespace: hibernator-system
spec:
  schedule:
    timezone: Asia/Jakarta
    offHours:
      - start: "20:00"
        end: "06:00"
        daysOfWeek: ["MON", "TUE", "WED", "THU", "FRI"]
  execution:
    strategy:
      type: DAG
      dependencies:
        - from: karpenter-pools
          to: eks-nodegroups
  targets:
    - name: karpenter-pools
      type: karpenter
      connectorRef:
        kind: K8SCluster
        name: eks-production
      parameters:
        nodePools: []
        awaitCompletion:
          enabled: true
          timeout: "5m"

    - name: eks-nodegroups
      type: eks
      connectorRef:
        kind: CloudProvider
        name: aws-production
      parameters:
        clusterName: production-cluster
        nodeGroups: []
        awaitCompletion:
          enabled: true
          timeout: "10m"

What Happens During Hibernation

The executor retrieves the full NodePool spec (template, limits, disruption budget, labels)
The NodePool resource is deleted from the cluster
Karpenter detects the deleted pool and begins draining nodes managed by that pool
Nodes are cordoned, pods are evicted, and underlying EC2 instances are terminated
The complete NodePool definition is stored in restore data for exact reconstruction
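
For reference, a NodePool of the kind the executor captures might look like the manifest below. It is an illustrative karpenter.sh/v1 example rather than one taken from this project: the names, the limit, and the EC2NodeClass reference are placeholders. The comments mark the parts named in the first step above.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
  labels:                      # labels (also what nodeSelector matches on)
    hibernator.ardikabs.com/enabled: "true"
spec:
  template:                    # template
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default          # placeholder node class name
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
  limits:                      # limits
    cpu: "1000"
  disruption:                  # disruption settings, including budgets
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
    budgets:
      - nodes: "10%"
```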

What Happens During Wakeup

The executor recreates each NodePool with the exact spec and labels from the restore data
Karpenter detects the new pool and begins provisioning nodes based on pending pod requirements
New nodes register with the cluster and pods are scheduled

Troubleshooting

Nodes not draining within timeout

Check for Pod Disruption Budgets blocking eviction
Verify Karpenter's disruption budget settings on the NodePool
Increase the timeout on the target, for example awaitCompletion.timeout: "15m"
Inspect Karpenter controller logs for eviction errors
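
As an illustration of the first check, a PodDisruptionBudget such as the following blocks every voluntary eviction of the pods it selects, so a node running them cannot finish draining. The name, namespace, and labels are hypothetical.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb        # hypothetical name
  namespace: default
spec:
  # maxUnavailable: 0 means no selected pod may be evicted voluntarily.
  maxUnavailable: 0
  selector:
    matchLabels:
      app: api         # hypothetical workload label
```

A budget whose minAvailable equals the workload's replica count has the same effect. Relaxing such a budget for the hibernation window is usually a better fix than raising the timeout.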

NodePool recreation fails

Check if a NodePool with the same name already exists
Verify RBAC grants create permission for karpenter.sh nodepools
Review Karpenter webhook logs for admission errors

Wrong Karpenter API version

This executor uses karpenter.sh/v1. If your cluster runs an older Karpenter version using v1beta1, the executor may fail to discover or recreate pools.