Roadmap

Current status and future plans for the Hibernator Operator.

For detailed design documents, see the proposals directory.

Quick Status

Component | Status | Details
Core Operator | ✅ Shipped | v1.x complete
Schedule Exceptions | ✅ Implemented | Phases 1–5 complete
Stateless Error Reporting | ✅ Implemented | Runner writes to termination-log
E2E Tests | ✅ Implemented | Living document, grows over time
Async Reconciler | ✅ Implemented | Default; legacy reconciler removed
Helm Chart | ✅ Implemented | Available
CI/CD Pipeline | ✅ Implemented | GitHub Actions
kubectl hibernator CLI | ✅ Implemented | Available
Notification System | ✅ Implemented | Slack, email, webhook (RFC-0006)
WorkloadScaler Executor | ✅ Implemented | Kubernetes scale subresource
CLI One-Line Installer | ✅ Implemented | curl + bash installer with tarball distribution and checksum verification
GCP Executors | 💤 On-Demand | Implemented when use case arises
Azure Executors | 💤 On-Demand | Implemented when use case arises

Completed (v1.x)

Core Infrastructure

  • HibernatePlan CRD with full lifecycle management
  • Controller with phase-based state machine
  • All 4 execution strategies: Sequential, Parallel, DAG, Staged
  • DAG dependency resolver with Kahn's algorithm and cycle detection
  • Bounded concurrency control via maxConcurrency
  • Validation webhook for schedule format and DAG validation
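
As an illustration of the DAG strategy listed above, the following is a minimal sketch of Kahn's algorithm with cycle detection. It is not the operator's actual resolver: the names TopoOrder and ErrCycle and the map-based input shape are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// ErrCycle is a hypothetical sentinel error for this sketch.
var ErrCycle = errors.New("dependency cycle detected")

// TopoOrder returns the targets in an order where every target appears
// after all of its dependencies. deps maps a target to the targets it
// depends on. Ties are broken alphabetically so the result is stable.
func TopoOrder(deps map[string][]string) ([]string, error) {
	inDegree := make(map[string]int)
	dependents := make(map[string][]string)
	for node, requires := range deps {
		if _, ok := inDegree[node]; !ok {
			inDegree[node] = 0
		}
		for _, r := range requires {
			if _, ok := inDegree[r]; !ok {
				inDegree[r] = 0
			}
			inDegree[node]++
			dependents[r] = append(dependents[r], node)
		}
	}

	// Seed the queue with every target that has no dependencies.
	var ready []string
	for node, d := range inDegree {
		if d == 0 {
			ready = append(ready, node)
		}
	}
	sort.Strings(ready)

	order := make([]string, 0, len(inDegree))
	for len(ready) > 0 {
		node := ready[0]
		ready = ready[1:]
		order = append(order, node)

		next := dependents[node]
		sort.Strings(next)
		for _, m := range next {
			inDegree[m]--
			if inDegree[m] == 0 {
				ready = append(ready, m)
			}
		}
	}

	// Any target left unprocessed still has an unmet dependency,
	// which can only happen when the graph contains a cycle.
	if len(order) != len(inDegree) {
		return nil, ErrCycle
	}
	return order, nil
}

func main() {
	order, err := TopoOrder(map[string][]string{
		"app":   {"db", "cache"},
		"db":    nil,
		"cache": nil,
	})
	fmt.Println(order, err)
}
```

The cycle check falls out of the algorithm itself: a target inside a cycle never reaches an in-degree of zero, so it is never emitted and the output comes up short.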

Executor Ecosystem

  • AWS Executors: EKS (node groups + Karpenter), RDS, EC2
  • Kubernetes Executor: WorkloadScaler (scale subresource)
  • Executor registration and pluggable interface
  • Per-executor parameter validation
  • Restore metadata capture and persistence

Scheduling & Time Management

  • User-friendly schedule format: start/end (HH:MM) + daysOfWeek
  • Cron conversion with timezone support
  • Multi-window schedule support (OR-logic evaluation)
  • Timezone-aware schedule evaluation

Schedule Exceptions (Phases 1–5)

  • Independent ScheduleException CRD
  • Three exception types: extend, suspend, replace
  • Lead-time configuration for suspensions
  • Automatic time-based expiration
  • Composable multi-exception semantics (mergeByType)

Security & Authentication

  • Projected ServiceAccount tokens with custom audience
  • TokenReview validation for streaming requests
  • RBAC enforcement for controller and runner
  • IRSA integration for cloud provider credentials

Observability

  • Structured logging with logr
  • Prometheus metrics for execution, reconciliation, pipeline, and notifications
  • Per-target execution ledger in plan status
  • Streaming infrastructure: gRPC + HTTP webhook fallback

Reliability & Operations

  • Stateless error reporting via Kubernetes Termination Messages
  • Async phase-driven reconciler (Coordinator/Worker actor model)
  • Error recovery with exponential backoff and manual retry (retry-now annotation)
  • E2E test framework (lifecycle, execution strategies, schedule exceptions, error recovery)
  • Helm Chart packaging
  • CI/CD pipeline (GitHub Actions)
  • kubectl hibernator CLI plugin
  • One-line CLI installer (curl | bash) with tarball distribution and checksum verification

Planned

Near-Term

  • Lifecycle Processors for Connectors — Introduce active status monitoring and lifecycle management for K8SCluster and CloudProvider resources.
  • Concrete E2E Tests with Floci — Integrate floci for real-world end-to-end validation against live cloud environments (AWS, GCP, Azure), validating full hibernation/wakeup cycles on actual infrastructure rather than mocks.

Medium-Term

  • ScheduleException Target Override — Allow ScheduleException to override specific targets within a plan rather than the entire schedule. For example, keep the database running during a maintenance window while hibernating compute resources, or vice versa. This enables fine-grained per-target exception control.
  • Exception Approval Workflows — Slack/email-based approvals (Phase 6+)

Long-Term

  • Multi-Cluster Management — Cross-cluster hibernation coordination
  • Web Dashboard — UI for monitoring and managing plans
  • Custom Executor SDK — Framework for building out-of-tree executors

On-Demand

The following items are not scheduled and will be implemented when a concrete use case arises:

  • GCP Executors — GKE node pool, Cloud SQL, and Compute Engine support
  • Azure Executors — AKS, Azure SQL, and VM management