Roadmap

Current version: v0.5.0. See the changelog for release history.

v0.5.0 — Observability & Reliability ✓

Metrics, dashboards, and monitoring for production deployments.

  • ✓ Prometheus metrics for controller (reconcile duration, ref resolution, gateway counts, CR info, conditions, per-gateway status)
  • ✓ Prometheus metrics for agent (sync duration, files changed/added/modified/deleted, git fetch, scan, designer sessions, sync skips, gateway startup)
  • ✓ Two Grafana dashboards shipped in Helm chart (fleet overview + per-CR detail with drill-down)
  • ✓ ServiceMonitor and PodMonitor templates with honorLabels support
  • ✓ SSH host key verification with optional knownHosts on SSH auth (fixes the earlier use of InsecureIgnoreHostKey)
  • ✓ Exponential backoff for transient git and API errors (30s → 60s → 120s → 5m cap)
  • ✓ Graceful shutdown with in-flight sync completion deadline (prevent partial file writes on SIGTERM)

v0.6.0 — Scale & Operability

Remove scaling walls and make the agent more reactive.

  • Informer-based ConfigMap watch replacing 3s polling in agent
  • Downward API annotation reader — enables stoker.io/ref-override and profile switching without pod restart
  • Per-gateway status ConfigMap sharding (eliminate write contention at 10+ gateways)
  • emptyDir size limit on agent repo volume (prevent node disk pressure from large repos)
  • Webhook receiver rate limiting

v0.7.0 — Conditions & Validation

Operational visibility and safety for fleet management.

  • New condition types: AgentReady, RefSkew
  • Drift detection (re-sync same commit reports unexpected changes)
  • Post-sync health verification (project state, tag providers; not just an HTTP 200 from the scan call)
  • Sync diff report in changes ConfigMap
  • Conflict detection when multiple profiles map to the same destination path
  • Validating admission webhook for GatewaySync CRs (reject invalid CRs at apply time)
  • Structured audit logging (per-sync JSON record: timestamp, commit, author, gateway, files, result)

Future Ideas

These are valuable but not yet scoped into versioned milestones. They'll be prioritized based on user feedback.

Safety & Trust:

  • Designer session project-level granularity (sync Project B while designer has Project A open)
  • Pre-sync backup with auto-rollback on scan failure
  • Module management (.modl sync to modules/ with postAction: restart)
  • Per-CR webhook HMAC secrets (replace global HMAC)
  • Git commit signature verification (GPG/SSH, IEC 62443 compliance)
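
To illustrate the per-CR webhook HMAC idea from the list above, here is a minimal sketch that looks up a secret by CR name and checks a request body against a signature. The roadmap does not define the signature format or how secrets are stored; hex-encoded HMAC-SHA256, the in-memory map, and the names sign and verify are all assumptions.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns the hex-encoded HMAC-SHA256 of body under secret.
// Hypothetical helper, used here to produce a valid signature for the demo.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify checks sigHex against the secret registered for crName.
// An unknown CR or a malformed signature is rejected.
func verify(secrets map[string][]byte, crName string, body []byte, sigHex string) bool {
	secret, ok := secrets[crName]
	if !ok {
		return false
	}
	got, err := hex.DecodeString(sigHex)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	// hmac.Equal compares in constant time to avoid timing leaks.
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	// Made-up CR names and secrets for illustration.
	secrets := map[string][]byte{
		"site-a": []byte("secret-a"),
		"site-b": []byte("secret-b"),
	}
	body := []byte(`{"ref":"refs/heads/main"}`)
	sig := sign(secrets["site-a"], body)

	fmt.Println("site-a accepts:", verify(secrets, "site-a", body, sig))
	fmt.Println("site-b accepts:", verify(secrets, "site-b", body, sig))
}
```

With one secret per CR, a leaked secret only lets an attacker trigger syncs for that single CR rather than for every CR behind the receiver.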

Reach:

  • Standalone agent mode (systemd/Windows service for bare-metal Ignition servers)
  • Approval annotation gate for production gateways

Enterprise:

  • Maintenance windows and change freeze schedules
  • External audit sink (SIEM integration via webhook/syslog)
  • Drift detection with configurable action (report / restore / alert)
  • Resource quotas and rate limiting for concurrent syncs