Zero-Downtime OpenStack Upgrades: Open vSwitch Restarts

Reduce dataplane impact during OpenStack upgrades with Atmosphere’s smarter OVS management, on-demand image builds, and x86_64-v2 performance optimizations.

Upgrading OpenStack has traditionally come with an unavoidable risk: brief but frustrating data plane interruptions. One of the most common causes is Open vSwitch (OVS) restarts — especially during control plane or packaging updates that shouldn’t impact networking at all.

Atmosphere, our production-hardened and fully open-source OpenStack distribution, changes that. We’ve introduced a major optimization that dramatically reduces data-plane disruption during upgrades by preventing unnecessary OVS restarts. And it’s already in production.

The OpenStack + Open vSwitch Problem Nobody Talks About

In standard OpenStack deployments, ovs-vswitchd restarts can happen during updates, reconfigurations, or image rollouts — even when nothing in OVS itself changed. When that process restarts:

New flows that miss the kernel datapath cache need a userspace upcall to ovs-vswitchd, so they fail until the daemon is back.
Certain east-west and overlay traffic drops packets.
Operators see unexplained blips during maintenance windows.

It's not a bug; it's the default behavior. But it shouldn't be triggered every time you update part of the control plane.

Why This Affected OpenStack Upgrades in Atmosphere (Before)

Atmosphere packages Open vSwitch as a container image. Previously, that image lived inside the main Atmosphere build pipeline. So any Atmosphere release — even for Keystone, Cinder, or Cluster API updates — shipped a “new” Open vSwitch image with a new digest.

Kubernetes would detect the changed image reference and trigger a rollout of the OVS DaemonSet. That restart caused a data-plane flap on every node, every time, even if nothing had changed in Open vSwitch.

This behavior is not unique to Atmosphere — many containerized OpenStack platforms trigger the same problem quietly.
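
To make the trigger concrete, here is a minimal sketch of why a rebuilt but functionally identical image still causes a rollout. It assumes a rolling update strategy, where any change to the pod template replaces the running pods. The PodTemplate class, the registry name, and the digests are hypothetical stand-ins, not Atmosphere or Kubernetes code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodTemplate:
    """Tiny stand-in for the part of a DaemonSet spec that matters here."""
    image: str  # full image reference, including the digest


def triggers_rollout(current: PodTemplate, desired: PodTemplate) -> bool:
    # Assumption: with a rolling update strategy, any change to the pod
    # template (including the image reference) replaces the running pods.
    return current != desired


# Hypothetical digests: the same OVS source rebuilt by two platform releases.
release_a = PodTemplate(image="registry.example/openvswitch@sha256:aaaa")
release_b = PodTemplate(image="registry.example/openvswitch@sha256:bbbb")

print(triggers_rollout(release_a, release_b))  # True
print(triggers_rollout(release_a, release_a))  # False
```

The comparison never looks at what is inside the image, which is why a new digest alone is enough to restart OVS on every node.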

The Fix: Decoupling OVS Builds from OpenStack Releases

Atmosphere now builds and maintains its Open vSwitch image in a dedicated repository. That one change solves a key operational reliability problem in OpenStack-based clouds:

Open vSwitch images only rebuild when there are actual OVS changes
OpenStack/Atmosphere upgrades no longer force Open vSwitch rollouts
Data plane stability is preserved during maintenance

This is a concrete example of how Atmosphere is designed through lived production experience — not theoretical packaging.
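
One way to picture "only rebuild when OVS changes" is to derive the image tag from the OVS build inputs alone. The sketch below illustrates that idea under stated assumptions; it is not Atmosphere's actual pipeline, and the function name and the input names are hypothetical.

```python
import hashlib


def ovs_image_tag(build_inputs: dict[str, bytes]) -> str:
    """Derive an image tag from OVS build inputs only (hypothetical helper)."""
    digest = hashlib.sha256()
    # Sort by name so the tag does not depend on dictionary ordering.
    for name in sorted(build_inputs):
        content = build_inputs[name]
        # Length-prefix each field so adjacent fields cannot be confused.
        for field in (name.encode(), content):
            digest.update(len(field).to_bytes(8, "big"))
            digest.update(field)
    return "ovs-" + digest.hexdigest()[:12]


# Hypothetical inputs: nothing here refers to Keystone, Cinder, or any other
# OpenStack service, so releasing those cannot change the tag.
inputs = {"Dockerfile": b"FROM base\n", "ovs-version": b"example-version"}
tag = ovs_image_tag(inputs)
```

If a platform release leaves these inputs untouched, the tag stays the same, the DaemonSet's pod template is unchanged, and no rollout is triggered.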

Built for Performance, Not Just Stability

While restructuring the build, we optimized the image for modern CPUs:

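Compiled for the x86_64-v2 microarchitecture level, which adds SSE3, SSSE3, SSE4.1, SSE4.2, and POPCNT to the x86-64 baseline
Lets the compiler use those instructions on processors that support them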
Boosts DPDK-backed deployments without extra tuning

This means operators get:

Higher throughput
Lower packet-processing overhead
Better CPU efficiency — automatically
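
A binary built for x86_64-v2 may use instructions that older CPUs lack, so it can be worth checking hosts before rollout. The following is a minimal sketch, assuming a Linux host and the flag names reported in /proc/cpuinfo; the function name is hypothetical and this check is not part of Atmosphere.

```python
# /proc/cpuinfo flag names for the features x86-64-v2 adds to the baseline:
# CMPXCHG16B, LAHF/SAHF, POPCNT, SSE3 (reported as "pni"), SSE4.1, SSE4.2, SSSE3.
X86_64_V2_FLAGS = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}


def supports_x86_64_v2(cpuinfo_text: str) -> bool:
    """Return True if the first 'flags' line lists every x86-64-v2 feature."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            return X86_64_V2_FLAGS <= flags
    # No flags line found (for example, a non-x86 host): report unsupported.
    return False
```

On a Linux node this could be called as supports_x86_64_v2(open("/proc/cpuinfo").read()). Only the first flags line is inspected, which assumes all cores on the host report the same feature set.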

Why This Matters for OpenStack Operators

Whether you're running Neutron with OVS, OVN, or DPDK, this solves a class of silent upgrade-impact scenarios that most OpenStack environments simply accept as normal.

With Atmosphere:

OpenStack upgrades stop causing silent data-plane restarts
OVS only rolls when OVS changes
Operators regain control over when data-plane changes happen
Performance improves out of the box

This isn’t just “less downtime” — it’s a better operational model for OpenStack.

Atmosphere: OpenStack Built by People Who Run OpenStack at Scale

Atmosphere is not a wrapper around OpenStack — it is OpenStack, engineered and operated the way real production environments demand. It’s used today in large-scale, performance-sensitive deployments, and that operational experience directly shapes every improvement we ship.

If you're exploring how to reduce data-plane impact during upgrades, modernize OVS handling, or improve OpenStack lifecycle management, this is the kind of hardening you should expect from your platform — and it's exactly what Atmosphere delivers.

Where many OpenStack distributions stop at packaging, Atmosphere goes further by solving the day-2 and day-100 problems you only encounter when you're actually running OpenStack in production. Decoupling OVS image rebuilds, preventing silent data-plane rollouts, and optimizing for modern CPUs are the kinds of changes that come from operating experience, not guesswork.
