Roadmap¶
Current status and future plans for the Hibernator Operator.
For detailed design documents, see the proposals directory.
Quick Status¶
| Component | Status | Details |
|---|---|---|
| Core Operator | v1.x complete | |
| Schedule Exceptions | Phases 1–5 complete | |
| Stateless Error Reporting | Runner writes to termination-log | |
| E2E Tests | Living document, grows over time | |
| Async Reconciler | Default; legacy reconciler removed | |
| Helm Chart | Available | |
| CI/CD Pipeline | GitHub Actions | |
| kubectl hibernator CLI | Available | |
| Notification System | Slack, email, webhook (RFC-0006) | |
| WorkloadScaler Executor | Kubernetes scale subresource | |
| CLI One-Line Installer | curl + bash installer with tarball distribution and checksum verification | |
| GCP Executors | Implemented when use case arises | |
| Azure Executors | Implemented when use case arises | |
Completed (v1.x)¶
Core Infrastructure¶
- HibernatePlan CRD with full lifecycle management
- Controller with phase-based state machine
- All 4 execution strategies: Sequential, Parallel, DAG, Staged
- DAG dependency resolver with Kahn's algorithm and cycle detection
- Bounded concurrency control via `maxConcurrency`
- Validation webhook for schedule format and DAG validation
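The DAG strategy listed above orders targets with Kahn's algorithm and rejects cyclic graphs. The following is a minimal, illustrative sketch of that technique in Python, not the operator's actual implementation. The function name `topo_order` and the shape of the `deps` mapping are assumptions made for this example.

```python
from collections import deque


def topo_order(deps):
    """Return targets in dependency order using Kahn's algorithm.

    `deps` (hypothetical shape) maps each target name to the list of
    targets it depends on. Raises ValueError if the graph has a cycle.
    """
    # Collect every node, including ones that appear only as dependencies.
    nodes = set(deps)
    for requires in deps.values():
        nodes.update(requires)

    indegree = {n: 0 for n in nodes}
    dependents = {n: [] for n in nodes}
    for target, requires in deps.items():
        for dep in requires:
            indegree[target] += 1
            dependents[dep].append(target)

    # Start with targets that have no unmet dependencies (sorted for determinism).
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for nxt in sorted(dependents[current]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Nodes never emitted still have an unmet dependency: they sit on a
    # cycle or downstream of one.
    if len(order) != len(nodes):
        stuck = sorted(n for n in nodes if n not in order)
        raise ValueError(f"dependency cycle involving: {stuck}")
    return order
```

For example, `topo_order({"app": ["db"], "db": []})` places `db` before `app`, while a mapping in which two targets depend on each other raises `ValueError`.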
Executor Ecosystem¶
- AWS Executors: EKS (node groups + Karpenter), RDS, EC2
- Kubernetes Executor: WorkloadScaler (scale subresource)
- Executor registration and pluggable interface
- Per-executor parameter validation
- Restore metadata capture and persistence
Scheduling & Time Management¶
- User-friendly schedule format: `start`/`end` (HH:MM) + `daysOfWeek`
- Cron conversion with timezone support
- Multi-window schedule support (OR-logic evaluation)
- Timezone-aware schedule evaluation
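As a rough illustration of multi-window, timezone-aware evaluation with OR logic, the sketch below checks whether a given instant falls inside any window. It is written in Python for brevity and is not the operator's code. The field names `start`, `end`, and `daysOfWeek` come from the list above. The three-letter day abbreviations, the dict layout, the function names, and the rule that an overnight window belongs to the day on which it starts are all assumptions.

```python
from datetime import datetime, time, timedelta, timezone

# Assumed day encoding; index matches datetime.weekday() (Monday == 0).
DAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]


def _parse_hhmm(value):
    """Parse an 'HH:MM' string into a naive time object."""
    hour, minute = value.split(":")
    return time(int(hour), int(minute))


def in_window(window, now_local):
    """Return True if the local time falls inside a single window."""
    start = _parse_hhmm(window["start"])
    end = _parse_hhmm(window["end"])
    today = DAYS[now_local.weekday()]
    yesterday = DAYS[(now_local.weekday() - 1) % 7]
    t = now_local.time()

    if start <= end:
        # Same-day window, end exclusive.
        return today in window["daysOfWeek"] and start <= t < end
    # Overnight window (assumption: it is attributed to its start day).
    if t >= start:
        return today in window["daysOfWeek"]
    if t < end:
        return yesterday in window["daysOfWeek"]
    return False


def should_hibernate(windows, now, tz):
    """OR-logic: active if any window matches, evaluated in timezone tz."""
    now_local = now.astimezone(tz)
    return any(in_window(w, now_local) for w in windows)
```

Here `now` must be a timezone-aware `datetime` and `tz` any `tzinfo` object, for instance `zoneinfo.ZoneInfo("Asia/Jakarta")` or a fixed offset such as `timezone(timedelta(hours=7))`.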
Schedule Exceptions (Phases 1–5)¶
- Independent `ScheduleException` CRD
- Three exception types: extend, suspend, replace
- Lead-time configuration for suspensions
- Automatic time-based expiration
- Composable multi-exception semantics (mergeByType)
Security & Authentication¶
- Projected ServiceAccount tokens with custom audience
- TokenReview validation for streaming requests
- RBAC enforcement for controller and runner
- IRSA integration for cloud provider credentials
Observability¶
- Structured logging with logr
- Prometheus metrics for execution, reconciliation, pipeline, and notifications
- Per-target execution ledger in plan status
- Streaming infrastructure: gRPC + HTTP webhook fallback
Reliability & Operations¶
- Stateless error reporting via Kubernetes Termination Messages
- Async phase-driven reconciler (Coordinator/Worker actor model)
- Error recovery with exponential backoff and manual retry (`retry-now` annotation)
- E2E test framework (lifecycle, execution strategies, schedule exceptions, error recovery)
- Helm Chart packaging
- CI/CD pipeline (GitHub Actions)
- kubectl hibernator CLI plugin
- One-line CLI installer (`curl | bash`) with tarball distribution and checksum verification
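The error-recovery item above combines exponential backoff with a manual retry override. A minimal sketch of how such a delay could be computed follows, in Python. The base delay, the cap, the function name, and the modelling of the `retry-now` annotation as a boolean flag are all assumptions for illustration. The operator's real values and logic are not stated in this roadmap.

```python
def next_retry_delay(attempt, base_seconds=30.0, cap_seconds=1800.0, retry_now=False):
    """Return the wait in seconds before the next retry.

    attempt is the number of failed attempts so far (0 for the first retry).
    base_seconds and cap_seconds are hypothetical defaults.
    retry_now stands in for a manual override such as a retry annotation.
    """
    if retry_now:
        # A manual retry request skips the backoff entirely.
        return 0.0
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    exponent = min(attempt, 30)
    # Double the delay on each attempt, never exceeding the cap.
    return min(base_seconds * (2 ** exponent), cap_seconds)
```

With these assumed defaults the delay doubles from 30 seconds on each failure until it reaches the 30-minute cap, and a manual retry request returns a delay of zero regardless of the attempt count.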
Planned¶
Near-Term¶
- Lifecycle Processors for Connectors: Introduce active status monitoring and lifecycle management for `K8SCluster` and `CloudProvider` resources.
- Concrete E2E Tests with Floci: Integrate floci for real-world end-to-end validation against live cloud environments (AWS, GCP, Azure), validating full hibernation/wakeup cycles on actual infrastructure rather than mocks.
Medium-Term¶
- ScheduleException Target Override: Allow a `ScheduleException` to override specific targets within a plan rather than the entire schedule. For example, keep the database running during a maintenance window while hibernating compute resources, or vice versa. This enables fine-grained per-target exception control.
- Exception Approval Workflows: Slack/email-based approvals (Phase 6+)
Long-Term¶
- Multi-Cluster Management — Cross-cluster hibernation coordination
- Web Dashboard — UI for monitoring and managing plans
- Custom Executor SDK — Framework for building out-of-tree executors
On-Demand¶
The following items are not scheduled but will be implemented when a concrete use case arises:
- GCP Executors — GKE node pool, Cloud SQL, and Compute Engine support
- Azure Executors — AKS, Azure SQL, and VM management