Kubernetes Cluster Upgrade Policy

To maintain a secure, stable, and supported platform, we regularly upgrade our Kubernetes clusters. We use RKE2 as our Kubernetes distribution.

Upgrade Flow

Phased Rollout

  • Upgrades are first applied to TDS clusters (Test and Development Systems).
  • After the upgrade has run on TDS for at least 2 weeks with no critical issues observed, the same upgrade is applied to PROD clusters.
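
The two-week gate can be expressed as a small date check. The following is a minimal sketch, not our actual tooling: the function names and the `critical_issues_open` counter are hypothetical and used only for illustration.

```python
from datetime import date, timedelta

# Minimum time an upgrade must run on TDS before PROD (from the policy above).
MIN_SOAK = timedelta(weeks=2)


def earliest_prod_date(tds_upgrade_date: date) -> date:
    """First day on which the PROD upgrade may take place."""
    return tds_upgrade_date + MIN_SOAK


def prod_upgrade_allowed(tds_upgrade_date: date, today: date,
                         critical_issues_open: int) -> bool:
    """True once the soak period has elapsed and no critical issues are open."""
    soak_done = today >= earliest_prod_date(tds_upgrade_date)
    return soak_done and critical_issues_open == 0


if __name__ == "__main__":
    tds = date(2025, 3, 3)  # hypothetical TDS upgrade date
    print(earliest_prod_date(tds))
    print(prod_upgrade_allowed(tds, date(2025, 3, 10), 0))
```

Two weeks is a lower bound, so the check only says when PROD may be upgraded, not when it will be.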

No Fixed Schedule

  • Upgrades are not done on a strict calendar basis.
  • Timing may depend on compatibility with other infrastructure components (e.g., storage, CNI plugins, monitoring tools).
  • However, every cluster will be upgraded before the Kubernetes version it runs reaches End of Life (EOL).
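
Because PROD follows TDS by at least 2 weeks, the EOL deadline implies a latest possible start date for the TDS upgrade. The sketch below illustrates the arithmetic only; the function name and the EOL date in the example are hypothetical, and real EOL dates come from the upstream release schedule.

```python
from datetime import date, timedelta

MIN_SOAK = timedelta(weeks=2)  # minimum TDS period before PROD, per the policy


def latest_tds_start(eol: date, soak: timedelta = MIN_SOAK) -> date:
    """Latest TDS upgrade date that still lets PROD be upgraded before EOL.

    PROD must be upgraded strictly before the EOL date, so the last usable
    PROD day is eol - 1 day, and TDS must start at least `soak` earlier.
    """
    return eol - timedelta(days=1) - soak


def days_until_eol(eol: date, today: date) -> int:
    """Days remaining before EOL (negative if EOL has already passed)."""
    return (eol - today).days


if __name__ == "__main__":
    eol = date(2025, 6, 30)  # hypothetical EOL date, for illustration only
    print(latest_tds_start(eol))
    print(days_until_eol(eol, date(2025, 6, 1)))
```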

Upgrade Impact

The impact of a Kubernetes upgrade can vary, depending on the nature of the changes involved:

Minimal Impact

  • For example, upgrades that affect only the kubelet may be transparent to workloads.
  • Rolling restarts may occur, but no downtime is expected for well-configured applications.

Potentially Disruptive

  • Upgrades involving components such as the CNI (Container Network Interface) may cause temporary network interruptions.
  • Other control plane or critical component updates might cause short-lived disruption to scheduling or connectivity.

Applications that follow cloud-native best practices (e.g., readiness probes, multiple replicas, graceful shutdown handling) are less likely to be impacted by upgrades.
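
As a rough self-check, two of these practices can be verified directly from a Deployment manifest. The sketch below assumes the manifest has already been loaded into a Python dict (for example from YAML); the function name and the threshold of 2 replicas are illustrative choices, not part of the policy. Graceful shutdown is handled in application code and cannot be confirmed from the manifest alone.

```python
def upgrade_readiness_findings(deployment: dict) -> list[str]:
    """Return a list of findings that make a Deployment more upgrade-sensitive."""
    findings = []
    spec = deployment.get("spec", {})

    # Kubernetes defaults spec.replicas to 1 when it is omitted.
    if spec.get("replicas", 1) < 2:
        findings.append("fewer than 2 replicas")

    pod_spec = spec.get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        if "readinessProbe" not in container:
            findings.append(f"container {name}: no readinessProbe")

    return findings


if __name__ == "__main__":
    # Hypothetical manifest used only to demonstrate the check.
    example = {
        "spec": {
            "replicas": 1,
            "template": {"spec": {"containers": [{"name": "web"}]}},
        }
    }
    for finding in upgrade_readiness_findings(example):
        print(finding)
```

An empty result does not guarantee zero impact; it only means these two common gaps are absent.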

What You Can Expect

  • Upgrades follow tested procedures that have already been run on TDS clusters, to minimize risk to production workloads.
  • TDS clusters serve as a canary environment, allowing us to identify issues early.
  • All clusters are kept aligned with supported Kubernetes versions.