Hibernating RDS Databases
This guide covers how to hibernate AWS RDS database instances and Aurora clusters using the rds executor.
Prerequisites
- A CloudProvider resource configured for your AWS account
- IAM permissions: rds:DescribeDBInstances, rds:DescribeDBClusters, rds:StopDBInstance, rds:StartDBInstance, rds:StopDBCluster, rds:StartDBCluster
- If using snapshots: rds:CreateDBSnapshot or rds:CreateDBClusterSnapshot
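The permissions above could be packaged as a managed policy. The following is a minimal CloudFormation sketch, not a template shipped with the project: the logical ID HibernatorRdsPolicy, the policy name, and the wildcard Resource are assumptions for illustration, and HibernatorRole is simply the role name from the example ARN used in this guide.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  HibernatorRdsPolicy:                    # hypothetical logical ID
    Type: AWS::IAM::ManagedPolicy
    Properties:
      ManagedPolicyName: hibernator-rds   # hypothetical policy name
      Roles:
        - HibernatorRole                  # role name taken from the example ARN in this guide
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: HibernateRds
            Effect: Allow
            Action:
              - rds:DescribeDBInstances
              - rds:DescribeDBClusters
              - rds:StopDBInstance
              - rds:StartDBInstance
              - rds:StopDBCluster
              - rds:StartDBCluster
              # Only needed when snapshotBeforeStop is enabled
              - rds:CreateDBSnapshot
              - rds:CreateDBClusterSnapshot
            Resource: "*"                 # assumption: narrow to specific ARNs where possible
```

If snapshots are never taken, the last two actions can be dropped.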
Basic Setup
1. Create the CloudProvider
apiVersion: hibernator.ardikabs.com/v1alpha1
kind: CloudProvider
metadata:
  name: aws-production
  namespace: hibernator-system
spec:
  type: aws
  aws:
    accountId: "123456789012"
    region: us-west-2
    assumeRoleArn: arn:aws:iam::123456789012:role/HibernatorRole
    auth:
      serviceAccount: {}
2. Create the HibernatePlan
apiVersion: hibernator.ardikabs.com/v1alpha1
kind: HibernatePlan
metadata:
  name: rds-hibernate
  namespace: hibernator-system
spec:
  schedule:
    timezone: America/New_York
    offHours:
      - start: "20:00"
        end: "06:00"
        daysOfWeek: ["MON", "TUE", "WED", "THU", "FRI"]
  execution:
    strategy:
      type: Sequential
  behavior:
    mode: Strict
  targets:
    - name: staging-db
      type: rds
      connectorRef:
        kind: CloudProvider
        name: aws-production
      parameters:
        selector:
          instanceIds:
            - staging-db-01
        snapshotBeforeStop: false
        awaitCompletion:
          enabled: true
          timeout: "15m"
Understanding RDS Selectors
The RDS executor has three mutually exclusive selection modes. Choosing the right one depends on your use case.
Mode 1: Explicit IDs
Target specific database instances and/or clusters by their identifiers. This is the simplest and most predictable mode.
parameters:
  selector:
    instanceIds:
      - my-db-instance-1
      - my-db-instance-2
    clusterIds:
      - my-aurora-cluster
Resource types are inferred automatically:
- instanceIds present → discovers DB instances
- clusterIds present → discovers DB clusters
- Both present → discovers both
Mode 2: Tag-Based Selection
Target databases by their AWS resource tags. Requires explicit opt-in via discovery flags.
parameters:
  selector:
    tags:
      Environment: staging
      Team: backend
    discoverInstances: true   # opt-in to discover DB instances
    discoverClusters: false   # do not discover DB clusters
Discovery flags are required
Setting tags alone without discoverInstances or discoverClusters is a no-op — nothing will be discovered. This is a safety measure to prevent accidentally targeting more resources than intended.
Use excludeTags to match everything except certain tags:
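A minimal sketch of an excludeTags selector, mirroring the "Hibernate Everything Except Critical Databases" use case in this guide. As in that use case, the discovery flags are set alongside excludeTags.

```yaml
parameters:
  selector:
    excludeTags:
      Critical: "true"        # skip any database tagged Critical=true
    discoverInstances: true   # opt-in to discover DB instances
    discoverClusters: true    # opt-in to discover DB clusters
```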
Note
tags and excludeTags are mutually exclusive — you cannot use both in the same selector.
Mode 2b: Expression-Based Tag Selection
For more flexible matching, use tagSelector with operators and glob patterns:
parameters:
  selector:
    tagSelector:
      matchTags:
        Environment: "staging-*"
      matchExpressions:
        - key: Team
          operator: In
          values: ["backend", "frontend"]
        - key: Critical
          operator: DoesNotExist
    discoverInstances: true
    discoverClusters: false
Supported operators:
| Operator | Behavior |
|---|---|
| In | Tag value is in the provided list |
| NotIn | Tag value is NOT in the provided list |
| Exists | Tag key exists (any value) |
| DoesNotExist | Tag key does not exist |
| Matches | Tag value matches a glob pattern (*, ?) |
| NotMatches | Tag value does NOT match any glob pattern |
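As an illustration of the glob operators in the table, the sketch below selects instances whose Name tag starts with tmp- while skipping anything whose Owner tag ends in -prod. The tag keys and patterns are made up for the example, and it assumes glob patterns are passed through the same values list that In and NotIn use.

```yaml
parameters:
  selector:
    tagSelector:
      matchExpressions:
        - key: Name               # hypothetical tag key
          operator: Matches
          values: ["tmp-*"]       # assumption: patterns go in the values list
        - key: Owner              # hypothetical tag key
          operator: NotMatches
          values: ["*-prod"]
    discoverInstances: true
    discoverClusters: false
```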
Note
tagSelector is mutually exclusive with tags and excludeTags.
Migrating from excludeTags to tagSelector
| Old (excludeTags) | New (tagSelector) |
|---|---|
| excludeTags: {Critical: "true"} | matchExpressions: [{key: Critical, operator: DoesNotExist}] |
| excludeTags: {Env: "prod"} | matchExpressions: [{key: Env, operator: NotIn, values: ["prod"]}] |
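Applied to a full selector, the first row of the table might look like the sketch below. It assumes the discovery flags stay alongside tagSelector unchanged, as in the tagSelector example earlier in this guide.

```yaml
# Before: excludeTags
parameters:
  selector:
    excludeTags:
      Critical: "true"
    discoverInstances: true
    discoverClusters: true
---
# After: tagSelector
parameters:
  selector:
    tagSelector:
      matchExpressions:
        - key: Critical
          operator: DoesNotExist
    discoverInstances: true
    discoverClusters: true
```

Per the operator table, DoesNotExist skips every database that carries a Critical tag, whatever its value.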
Mode 3: Include All
Discover all databases in the account and region:
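A minimal sketch of what this mode could look like. The includeAll field name comes from the warning in this section; that it is a boolean set directly under selector is an assumption, so check the executor's parameter reference before relying on it.

```yaml
parameters:
  selector:
    includeAll: true   # assumed shape: boolean flag directly under selector
```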
Danger
Use includeAll with caution in production accounts. It will target every RDS instance and cluster visible to the IAM role in the configured region.
Use Cases
Stop a Single Production Database with Snapshot
targets:
  - name: prod-db
    type: rds
    connectorRef:
      kind: CloudProvider
      name: aws-production
    parameters:
      selector:
        instanceIds:
          - production-db-primary
      snapshotBeforeStop: true
      awaitCompletion:
        enabled: true
        timeout: "20m"
The executor creates a snapshot named production-db-primary-hibernate-{timestamp} and waits for it to complete before stopping the instance.
Hibernate All Staging Databases by Tag
targets:
  - name: staging-databases
    type: rds
    connectorRef:
      kind: CloudProvider
      name: aws-staging
    parameters:
      selector:
        tags:
          Environment: staging
        discoverInstances: true
        discoverClusters: true
      snapshotBeforeStop: false
      awaitCompletion:
        enabled: true
Hibernate Aurora Clusters
targets:
  - name: aurora-clusters
    type: rds
    connectorRef:
      kind: CloudProvider
      name: aws-production
    parameters:
      selector:
        clusterIds:
          - aurora-staging
          - aurora-dev
      snapshotBeforeStop: true
      awaitCompletion:
        enabled: true
        timeout: "20m"
Hibernate Everything Except Critical Databases
targets:
  - name: non-critical-dbs
    type: rds
    connectorRef:
      kind: CloudProvider
      name: aws-production
    parameters:
      selector:
        excludeTags:
          Critical: "true"
        discoverInstances: true
        discoverClusters: true
      snapshotBeforeStop: false
      awaitCompletion:
        enabled: true
Full Stack: Apps → Database (DAG Order)
Ensure application servers are stopped before the database:
apiVersion: hibernator.ardikabs.com/v1alpha1
kind: HibernatePlan
metadata:
  name: full-stack-hibernate
  namespace: hibernator-system
spec:
  schedule:
    timezone: America/New_York
    offHours:
      - start: "20:00"
        end: "06:00"
        daysOfWeek: ["MON", "TUE", "WED", "THU", "FRI"]
  execution:
    strategy:
      type: DAG
      dependencies:
        - from: app-workloads
          to: database
  behavior:
    mode: BestEffort
    retries: 3
  targets:
    - name: app-workloads
      type: workloadscaler
      connectorRef:
        kind: K8SCluster
        name: eks-production
      parameters:
        namespace:
          literals: [default]
        workloadSelector:
          matchLabels:
            tier: application
        awaitCompletion:
          enabled: true
    - name: database
      type: rds
      connectorRef:
        kind: CloudProvider
        name: aws-production
      parameters:
        selector:
          instanceIds:
            - production-db
        snapshotBeforeStop: true
        awaitCompletion:
          enabled: true
          timeout: "15m"
What Happens During Hibernation
- The executor discovers databases based on the selector mode
- Only databases with status available are eligible for stopping
- If snapshotBeforeStop is enabled, a snapshot is created and the executor waits for it to complete (up to 30 minutes)
- The database is stopped via StopDBInstance or StopDBCluster
- State is saved: instance/cluster ID, previous status, snapshot ID, instance type
What Happens During Wakeup
- Databases that were running before hibernation are started via StartDBInstance or StartDBCluster
- Databases that were already stopped before hibernation remain stopped
- The executor polls until databases return to available status
Important Considerations
AWS 7-day auto-restart
AWS automatically restarts any RDS instance that has been stopped for more than 7 consecutive days. If your hibernation schedule leaves databases stopped for longer (e.g., over a long holiday), AWS will restart them automatically. Plan accordingly.
Snapshot cleanup
Snapshots created by snapshotBeforeStop are not automatically deleted. You are responsible for managing snapshot lifecycle and cleanup to avoid unexpected storage costs.
Troubleshooting
Database not stopping
- Verify the database status is available; databases in modifying, backing-up, or other intermediate states cannot be stopped
- Check that IAM permissions include rds:StopDBInstance (or rds:StopDBCluster for Aurora clusters)
- Multi-AZ failover may temporarily put the instance in a non-stoppable state
Snapshot taking too long
- RDS snapshots for large databases can take significant time
- The executor uses a 30-minute internal timeout for snapshot creation
- Consider disabling snapshotBeforeStop if automated backups are already configured
Tags not matching
- RDS tag matching is case-sensitive
- Verify tags in the AWS Console match exactly
- Remember: empty tag value matches any instance with that tag key
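Going by the last point, a selector like the sketch below would match every instance that carries an Environment tag, whatever its value. The tag key is illustrative.

```yaml
parameters:
  selector:
    tags:
      Environment: ""         # empty value: match on the tag key only
    discoverInstances: true
    discoverClusters: false
```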