Zero-Downtime Kubernetes Upgrades: How Navos Does It and Why It Matters
A technical deep-dive into how Navos manages zero-downtime Kubernetes cluster upgrades — the sequencing, primitives, and operational process behind every upgrade.
Kubernetes upgrades have a reputation. Engineers who've managed their own clusters know the feeling: a version bump is overdue, a maintenance window gets scheduled, stakeholders get briefed, and you spend the next two hours hoping the ingress controller comes back cleanly.
That's not how upgrades should work.
This is the operational reality of zero-downtime Kubernetes upgrades — the primitives, the sequencing, the pre-conditions, and what we actually do end-to-end when a cluster under our management is due for a version update.
Kubernetes releases a new minor version roughly every four months. Each release gets a 14-month support window — 12 months of active support, followed by a 2-month upgrade period. Once a version falls out of that window, it stops receiving security patches. Running end-of-life Kubernetes in a regulated environment — healthcare, finance, government — isn't a technical debt problem. It's a compliance failure.
There's also a compounding risk. Kubernetes does not support skipping minor versions. You go 1.28 → 1.29 → 1.30, not 1.28 → 1.30. Miss two upgrade cycles and you're looking at a sequential, multi-step migration under pressure.
The operational conclusion: upgrades aren't a project. They're a cadence. And a cadence only runs smoothly when it's automated, tested, and built into the operational model from the start — which is exactly what VEXXHOST's managed Kubernetes is built around.
Let's be precise. Zero-downtime doesn't mean nothing changes. Nodes get replaced. Control plane components restart. etcd migrations run. A significant amount of infrastructure is in motion.
What it means is that for correctly configured workloads, traffic never drops. Pods continue serving requests. Persistent volumes remain accessible. Your application doesn't know a version change happened.
The operative phrase is correctly configured. Zero-downtime applies to workloads that are running multiple replicas, have readiness probes defined, are covered by Pod Disruption Budgets, and are deployed on a cluster with adequate spare scheduling capacity and healthy ingress and storage components. Single-replica deployments, stateful workloads without proper disruption budgets, or anything misconfigured going into the upgrade may still experience disruption — and no upgrade mechanism, managed or otherwise, can compensate for that at runtime. That's precisely why we cover pre-conditions in detail below.
That outcome doesn't happen by accident. It requires the right primitives configured correctly before the upgrade starts, and a sequencing model that Kubernetes itself enforces.
The upgrade order is governed by the Kubernetes version skew policy. In HA clusters, kube-apiserver instances can only differ by one minor version. kubelet can be up to three minor versions older than kube-apiserver, but never newer. Violate these rules and you're running an untested, unsupported component combination.
The conclusion is non-negotiable: control plane first, worker nodes second.
In a production-grade HA cluster, the upgrade happens sequentially — one control plane node at a time — so the API remains accessible throughout. In Navos, this is driven by Cluster API, the CNCF project that manages Kubernetes cluster lifecycle as declarative Kubernetes-native resources.
To upgrade the control plane, we modify the spec.topology.version field on the Cluster resource, and Cluster API propagates the new version to the KubeadmControlPlane. It then provisions a new control plane node at the target version, waits for it to reach Ready, and deprovisions an old one, repeating for each control plane node. The API server load balancer ensures continuity throughout — the Kubernetes API never becomes unavailable.
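For a ClusterClass-managed cluster, that version bump is a single-field change on the Cluster object. A minimal sketch — all names, replica counts, and versions below are illustrative, not Navos-specific values:

```yaml
# Illustrative Cluster resource; changing spec.topology.version is the
# entire upgrade trigger. Cluster API reconciles everything else.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: prod-cluster
  namespace: default
spec:
  topology:
    class: example-cluster-class   # ClusterClass name (assumed)
    version: v1.30.4               # bump this field to start the rolling upgrade
    controlPlane:
      replicas: 3
    workers:
      machineDeployments:
        - class: default-worker
          name: pool-1
          replicas: 5
```

Because the desired state is declared in one place, the same controller reconciles control plane and worker pools in the correct order without any imperative scripting.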
Workloads run continuously on untouched nodes throughout. At no point is cluster capacity reduced to zero.
The sequence is necessary. But sequencing alone doesn't guarantee availability. The cluster has to be configured correctly before the first node is drained. This is where most self-managed upgrade attempts break down.
Without PDBs, a node drain can evict every replica of a service simultaneously, causing an outage. PDBs define a minimum availability floor. The drain process respects these constraints — it will not proceed if evicting a pod would breach the budget. VEXXHOST's Kubernetes Enablement builds PDB configuration guidance directly into cluster onboarding.
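A minimal PDB for this looks like the sketch below — the app label and availability floor are illustrative:

```yaml
# Keep at least 2 replicas of "api" serving at all times.
# A node drain that would evict below this floor blocks until
# replacement pods are Ready elsewhere.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api
```

Note that `minAvailable` can also be a percentage (e.g. `50%`), which scales the floor automatically as replica counts change.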
A pod being scheduled is not the same as a pod being ready to serve traffic. Without readiness probes, your load balancer can route requests to a pod that hasn't finished initializing. During the rescheduling that happens in a node drain, this is exactly when tight readiness checks matter most.
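A readiness probe sketch, assuming a hypothetical HTTP health endpoint — the path, port, and image are placeholders:

```yaml
# The pod is added to Service endpoints only after /healthz succeeds,
# so traffic never reaches a replica that is still initializing
# after being rescheduled during a drain.
apiVersion: v1
kind: Pod
metadata:
  name: api-pod
  labels:
    app: api
spec:
  containers:
    - name: api
      image: example.com/api:1.2.3   # hypothetical image
      readinessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
        failureThreshold: 3          # ~15s of failures removes the pod from endpoints
```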
Replicas spread across nodes and availability zones mean a single node drain can't wipe out your entire service capacity. Topology spread constraints enforce this declaratively — the scheduler maintains the distribution continuously, not just at initial placement.
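A sketch of such a constraint on a Deployment — label keys are the standard well-known topology labels; the app name and replica count are illustrative:

```yaml
# Keep replicas within one pod of even across availability zones,
# so draining any single node or zone cannot remove most of the capacity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 6
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: api
      containers:
        - name: api
          image: example.com/api:1.2.3   # hypothetical image
```

Using `whenUnsatisfiable: DoNotSchedule` makes the constraint hard; `ScheduleAnyway` would downgrade it to a soft preference.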
Each Kubernetes minor version can remove APIs deprecated in earlier releases. Manifests still using apps/v1beta1-style API versions will break the moment that API is dropped. Before any Navos cluster is upgraded, we run a pre-upgrade validation pass across all workloads to catch deprecated API usage before it becomes a production incident.
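As a concrete example of the kind of migration this validation catches: apps/v1beta1 Deployments were removed in Kubernetes 1.16, and the apps/v1 replacement additionally requires an explicit selector. The manifest below is illustrative:

```yaml
# Migrated manifest: apiVersion updated from apps/v1beta1 to apps/v1.
# apps/v1 makes spec.selector mandatory and it must match the pod labels.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: legacy-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: legacy-app
  template:
    metadata:
      labels:
        app: legacy-app
    spec:
      containers:
        - name: app
          image: example.com/legacy-app:2.0   # hypothetical image
```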
Before a single node is touched: all cluster nodes must be in Ready state, no existing PDB violations that would block drains, workload manifests audited for deprecated API usage, and an etcd snapshot captured as a recovery anchor. If any check fails, the upgrade doesn't start. A clean upgrade requires a healthy starting state.
We modify spec.topology.version on the Cluster resource. Cluster API orchestrates from there — new node up, Ready confirmed, old node deprovisioned. Repeated per control plane node, with the API server load balancer maintaining availability throughout.
MachineDeployment resources are updated for each node pool. Nodes cycle through the cordon → drain → reschedule → replace sequence individually. PDBs are respected at every step. Capacity is never zero.
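For a standalone (non-ClusterClass) node pool, the equivalent declaration looks roughly like the sketch below. All names are illustrative, and the infrastructure template kind is provider-specific — OpenStack is assumed here purely for illustration:

```yaml
# Bumping spec.template.spec.version causes Cluster API to roll the pool:
# with maxSurge 1 / maxUnavailable 0, a new node joins and reaches Ready
# before any old node is cordoned, drained, and deleted.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: pool-1
spec:
  clusterName: prod-cluster
  replicas: 5
  selector:
    matchLabels:
      cluster.x-k8s.io/deployment-name: pool-1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # surge a replacement node first
      maxUnavailable: 0    # never dip below desired capacity
  template:
    spec:
      clusterName: prod-cluster
      version: v1.30.4     # target node/kubelet version
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: pool-1-bootstrap
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: OpenStackMachineTemplate   # provider-specific; assumed
        name: pool-1-template
```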
Full health check pass: node status, system pod versions, CNI and CSI driver compatibility, ingress controller health, resource utilization baselines. Integrated monitoring powered by a modern Prometheus stack means anomalies surface to our operations team immediately — not after you open a support ticket.
Navos is built on unmodified upstream Kubernetes and Cluster API — no forks, no proprietary layers, no lock-in. This architectural choice directly shapes how upgrades work. The entire lifecycle — provisioning, scaling, upgrading, deprovisioning — is driven by the same reconciliation loops that govern any Kubernetes workload.
In Navos, a cluster upgrade is initiated by updating a single field: spec.topology.version on the Cluster resource. From there, Cluster API takes over — reconciling the generated control-plane and worker resources, upgrading control plane machines first and worker machines afterward, in strict order. No scripts, no manual sequencing, no intervention required. The process is reproducible and auditable because it's declared. It runs the same way every time because the same controller drives it every time.
This also means your workloads are fully portable. VEXXHOST only uses upstream Kubernetes, so there's no dependency on proprietary abstractions — deploy on AWS, Azure, GCP, OpenStack, or bare metal and your manifests travel with you.
There's a framing problem in how some teams approach managed Kubernetes. They receive a cluster, then treat production-readiness as a post-provisioning project: add monitoring here, configure PDBs there, wire up alerting eventually.
Navos doesn't work that way. When monitoring, security scanning, upgrades, and compliance reporting come built in rather than bolted on, platform teams can focus on what they're actually building instead of what's keeping them up at night.
VEXXHOST's CNCF-certified Kubernetes service integrates natively with block storage, supports auto-healing, auto-scaling, rolling upgrades, and fully isolated clusters inside private networks — present from day one, not assembled after the fact.
When an upgrade window arrives, we're not starting from "is this cluster ready?" We already know. That's what managed operations actually looks like.
The drain-and-cordon sequence works. Cluster API's rolling replacement works. The version skew policy enforces correct ordering. The hard part isn't the mechanics of the upgrade itself — it's whether the cluster was built to survive one before it was ever needed.
PDBs configured. Readiness probes defined. Replicas spread. Deprecated APIs replaced. Monitoring in place.
If your current upgrade process involves maintenance windows, stakeholder briefings, and hope — that's a configuration problem, not a process problem. The cluster wasn't built to upgrade without downtime.
Navos clusters are.
That's not a promise. That's an architecture decision made before the first workload was ever deployed.
Ready to see what production-ready Kubernetes management looks like under the hood? Contact us!