Choosing between active-active, active-passive, and federated Kubernetes across regions? A practical breakdown of when each pattern fits, how each fails, and what to get right before you deploy.
VEXXHOST has been running production OpenStack since 2011, managing Kubernetes clusters for Fortune 500 businesses and smaller teams alike. We've seen multi-region setups designed beautifully on whiteboards fail in production in ways that no whiteboard anticipated. We've also seen simpler setups hold together under conditions that should have destroyed them.
This post is an assessment of three common deployment patterns — active-active, active-passive, and federated — covering when each is the right call and how each breaks in practice.
We'll also cover the service mesh question that every team asks eventually, and the latency/consistency trade-off that determines which pattern your application can actually support.
Multi-region architecture is an answer. The question you need to ask first is: what failure are you trying to prevent?
There are four legitimate reasons to go multi-region:
Each of these has a different best-fit pattern. The mistake most teams make is picking a pattern based on what sounds impressive rather than what their failure scenarios require. An active-active deployment sounds more resilient than active-passive (and it is), but it comes with consistency trade-offs that can turn a regional network hiccup into a data integrity incident.
Ask your team before choosing a pattern:
Both regions serve production traffic simultaneously. Load is distributed between them either by geography (closest region wins) or by workload type. If one region fails, the other absorbs its traffic.
Active-active is the right choice when:
The word "stateless" matters more than anything else in that list. If your application doesn't write to a shared data store, active-active is clean and tractable. If it does, you're about to learn about CAP theorem the hard way.
Network partition with divergent writes. This is the scenario that most derails production. Your GeoDNS routes Region A traffic to Region A and Region B traffic to Region B. A BGP issue causes the inter-region link to drop for 90 seconds. Both regions keep accepting writes. When the link recovers, you have conflicting state. If you haven't designed an explicit conflict resolution strategy, you now have a correctness incident that is orders of magnitude harder to debug than an outage.
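The explicit conflict resolution strategy the paragraph calls for can be as simple as a deterministic last-write-wins merge, applied to every key that diverged during the partition. A minimal Python sketch of that idea — all names are hypothetical, and real systems often need per-field or application-level merges instead:

```python
from dataclasses import dataclass

@dataclass
class VersionedWrite:
    key: str
    value: str
    ts: float      # wall-clock timestamp of the write
    region: str    # region that accepted the write

def merge_lww(a: VersionedWrite, b: VersionedWrite) -> VersionedWrite:
    """Last-write-wins merge for writes that diverged during a partition.
    Ties break deterministically on region name so both sides converge
    to the same winner when they replay the merge independently."""
    if a.ts != b.ts:
        return a if a.ts > b.ts else b
    return a if a.region < b.region else b
```

The deterministic tie-break matters: if both regions run the merge independently after the link recovers, they must pick the same winner, or the "resolution" itself diverges.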
Asymmetric health. Region B is "up" in the health check sense. That means pods are running, endpoints are responding. But its underlying database replica is 15 minutes behind. GeoDNS still routes traffic there. Users in Amsterdam are reading stale data but getting 200 OK responses. This scenario is especially insidious because your monitoring won't flag it as an outage.
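The fix is to make data freshness part of the health signal GeoDNS consumes, not just pod readiness. A sketch of the check (the lag threshold is an assumed SLO, not a universal value):

```python
MAX_LAG_SECONDS = 30.0  # assumed freshness SLO; tune per application

def region_healthy(pods_ready: bool, endpoints_ok: bool,
                   replica_lag_seconds: float) -> bool:
    """A health check GeoDNS can trust: 'up' must include data
    freshness, or a region serving stale reads keeps receiving
    traffic while returning 200 OK."""
    return pods_ready and endpoints_ok and replica_lag_seconds <= MAX_LAG_SECONDS
```

With this shape of check, the Amsterdam scenario above fails health checking at 15 minutes of lag and traffic drains away, even though every pod is technically running.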
Split-brain on distributed control planes. If you're running a distributed stateful system — etcd, Consul, CockroachDB — across regions, a partition will force a leader election. Depending on your quorum configuration, this can result in one region losing write capability entirely, or in two primaries accepting writes simultaneously. Neither is fun.
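The quorum arithmetic behind that failure mode is worth internalizing. Systems like etcd require a strict majority to keep accepting writes, which is why even-sized clusters split across two regions can leave *both* sides read-only after a partition — a minimal sketch:

```python
def has_quorum(reachable_members: int, total_members: int) -> bool:
    """Strict-majority quorum: the partition side holding a majority
    keeps write capability; the minority side must refuse writes."""
    return reachable_members > total_members // 2
```

With 3 members split 2/1, one side retains writes. With 4 members split 2/2, neither side has a majority and the whole system loses write capability — one reason odd-sized quorums placed across three failure domains are the usual recommendation.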
Thundering herd on failover. Region A goes down. GeoDNS detects it in 30–60 seconds and reroutes traffic to Region B. Region B now receives 2x its normal load. If it wasn't provisioned for this (and most environments aren't, because running at 50% capacity in each region costs money), you've just turned a regional failure into a global outage.
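The capacity math is simple but unforgiving: survivors must absorb the failed region's share of global peak. A sketch of the sizing rule (assuming load is evenly distributed and you tolerate exactly one regional failure):

```python
def per_region_capacity_needed(regions: int, global_peak: float) -> float:
    """Capacity each region must be provisioned for so that the
    surviving regions absorb a single-region failure without
    overload. Assumes even load distribution."""
    if regions < 2:
        raise ValueError("need at least two regions for failover")
    return global_peak / (regions - 1)
```

With two regions, each must be sized for the *full* global peak (50% normal utilization). With three, each needs capacity for half the peak — one reason three smaller regions can be cheaper to run resiliently than two large ones.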
One region (active) serves all production traffic. The other region (passive) is a warm or cold standby that takes over when the active region fails. The passive region is typically not serving end-user traffic under normal conditions.
Active-passive is the right choice when:
Active-passive is significantly simpler to reason about and audit than active-active. When something breaks, the failure modes are more contained. It's also what most DR compliance frameworks expect when they talk about a "disaster recovery site."
Replication lag that nobody monitors. Your passive region is supposed to be 30 seconds behind. It's actually 4 hours behind because a network event caused the replication stream to restart from scratch, and nobody noticed. You fail over, thinking you'll lose 30 seconds of data, and you end up losing 4 hours. This is not hypothetical — it happens every time replication lag isn't actively monitored and alerted on.
Untested failover. The passive region exists and the runbook exists. However, nobody has run the failover in 18 months. The runbook has a step that says "update the load balancer endpoint" but the person who wrote that used terminology from an old provider, and the new provider's interface is different. Failover tests should be on your regular operational calendar, not saved for actual disasters.
Passive cluster drift. The active cluster is running Kubernetes 1.31. The passive cluster is still on 1.29 because "we'll upgrade it next quarter." You fail over. Your admission webhooks reference APIs that behave differently between versions. Some workloads fail to schedule. This is a mess.
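A cheap guardrail against this drift is an automated check that fails CI or pages when the standby falls behind the active cluster. A sketch, assuming a policy of at most one minor version of skew (the policy itself is an assumption to set per organization):

```python
def skew_acceptable(active_minor: int, passive_minor: int,
                    max_skew: int = 1) -> bool:
    """Guardrail against passive-cluster drift: keep the standby
    within `max_skew` minor versions of the active cluster, so a
    failover doesn't land workloads on APIs that behave differently."""
    return abs(active_minor - passive_minor) <= max_skew
```

The 1.31-active / 1.29-passive situation in the paragraph above fails this check, which is exactly when you want the alert — before the failover, not during it.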
DNS TTL gotchas. You updated your DNS to point at the passive region. Your customers' DNS resolvers are still caching the old record because someone set a 24-hour TTL to reduce DNS load years ago. Some customers experience the outage for an extra day after you've "failed over."
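The arithmetic makes the problem obvious: the worst-case time for a client to stop hitting the dead region is failure detection time plus the full cached TTL. A one-line sketch:

```python
def worst_case_cutover_seconds(detection_s: float, dns_ttl_s: float) -> float:
    """Upper bound on how long some clients keep resolving to the
    failed region: detection time plus the full TTL still cached
    by their resolvers at the moment of failover."""
    return detection_s + dns_ttl_s
```

With 60-second detection and a 24-hour TTL, the bound is 86,460 seconds — the TTL dominates everything else in the failover path, which is why DR-critical records usually carry TTLs of 60 seconds or less.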
Storage replication and write ordering. Ceph RBD mirroring between regions is asynchronous. During a failover, you need to ensure the replica is promoted to primary and that all in-flight I/O from the former primary is confirmed lost or replayed, depending on your consistency requirements. Getting this wrong produces corrupted application state.
Federation (or its modern descendants, implemented via tools like Liqo, Admiralty, or the Kubernetes SIG Multicluster work) treats multiple clusters as a single logical control plane. You define workloads once and they are scheduled and distributed across member clusters by a federation layer.
Federation is the right choice when:
Federation is not a replacement for active-active or active-passive. It's an orchestration layer that can implement either of those patterns at scale, across more clusters than you could reasonably manage individually.
Complexity without payoff for small deployments. Federation adds a layer of abstraction. That abstraction has a learning curve, operational overhead, and its own failure modes. If you have two regions, federation is almost certainly not worth it. The two-cluster problem is tractable without federation. Start with a hard cluster count of three before even evaluating federation tooling.
Federation control plane becomes a single point of failure. Your clusters are multi-region. Your federation control plane is single-region. The control plane region fails. You can't schedule new workloads or update policies across the remaining clusters. Make sure the federation control plane is itself highly available and ideally spread across a different failure domain.
Policy drift and namespace confusion. Federated namespaces must be carefully managed. It's easy to end up in a state where a namespace exists in the federation layer but not in one member cluster (or vice versa), and federation tooling silently fails to propagate workloads there. Invest in federation status monitoring from day one.
Multi-cluster service discovery. Services in Cluster A can't natively talk to services in Cluster B. You need either a service mesh with multi-cluster capabilities (Istio, Cilium Cluster Mesh) or a custom DNS setup. This is solvable but it's not free, and it adds latency to every cross-cluster call.
Version skew between clusters. The federation API assumes reasonable version parity between member clusters. If one cluster is several minor versions behind, federation controllers may not correctly reconcile workload state.
Every team doing multi-region eventually asks about service mesh. Usually the question is framed as "should we deploy Istio?" The right question is "what problem are we really trying to solve?"
Service meshes in a multi-region context solve specific problems:
mTLS between clusters. If services in Region A are calling services in Region B, those calls traverse public (or semi-public) network paths. A service mesh encrypts and authenticates those calls with mutual TLS without requiring application-level changes.
Traffic policy and retries. You can define circuit breakers and retry budgets at the mesh level, which are especially important for cross-region calls where latency variance is high.
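The "retry budget" concept is worth making concrete: instead of a fixed retry count per request, the mesh caps retries at a fraction of total request volume, which prevents retry storms from amplifying a cross-region brownout. A simplified Python sketch of the mechanism (not any particular mesh's implementation):

```python
class RetryBudget:
    """Mesh-style retry budget: retries may not exceed a fixed
    fraction of original requests, so a degraded downstream sees
    bounded extra load rather than a multiplying retry storm."""

    def __init__(self, ratio: float = 0.2):
        self.ratio = ratio      # retries allowed per original request
        self.requests = 0
        self.retries = 0

    def record_request(self) -> None:
        self.requests += 1

    def can_retry(self) -> bool:
        """Consume one unit of budget if available."""
        if self.retries < self.ratio * self.requests:
            self.retries += 1
            return True
        return False
```

With a 0.2 ratio, 10 original requests buy at most 2 retries across the whole window — a property per-request retry counts cannot give you.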
Multi-cluster service discovery. Istio's east-west gateway and Cilium Cluster Mesh both enable services to discover and call each other across cluster boundaries using Kubernetes-native service naming.
Observability. Distributed tracing across cluster boundaries requires the mesh to propagate trace context, which it does automatically once you're on the mesh.
What service meshes don't solve: the CAP theorem. A service mesh with a circuit breaker can prevent your service from taking cascading failures from a degraded downstream — but it can't make a network partition safe for your stateful data. Don't add a service mesh expecting it to solve consistency problems.
⚠️ If you need strong consistency and active-active: The answer is redesigning the parts of your application that require strong consistency, so they don't require it across regions. This is architectural work, not infrastructure work.
Before you put any multi-region setup into production, you must have:
VEXXHOST's Kubernetes platform ships with Prometheus monitoring, Grafana dashboards, and log aggregation built in. That covers within-cluster observability. You also need monitoring for the space between clusters.
At minimum, monitor:
Atmosphere deploys OpenStack services as Kubernetes workloads using Helm charts, enabling rolling upgrades with no tenant disruption. The key principle for multi-region upgrades: sequence them, don't parallelize.
Upgrade the passive/secondary region first. Validate it. Then upgrade the primary. This gives you a rollback target if the upgrade breaks something unexpected. Rolling upgrades within a cluster don't eliminate risk across a multi-cluster upgrade sequence.
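The sequencing rule reduces to a one-liner: everything that is not the primary upgrades first, and the primary goes last so there is always a validated rollback target. A trivial sketch (region names hypothetical):

```python
def upgrade_sequence(regions: list[str], primary: str) -> list[str]:
    """Order regions for a multi-cluster upgrade: secondaries first,
    primary last, so every step has an already-upgraded (or
    not-yet-touched) cluster to fail over to."""
    return [r for r in regions if r != primary] + [primary]
```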
Start with active-passive (Montreal primary, Amsterdam passive). Add Cloudflare or another GeoDNS with health checking at the edge to serve static assets from the nearest region. Evaluate active-active in 12 months after you've operated the passive setup and understand your actual failure modes.
Active-passive with Montreal as primary, Amsterdam or Santa Clara as DR. VEXXHOST's Montreal data center provides Canadian data residency. Configure Ceph RBD mirroring for DR data; keep application data writes locked to Montreal under normal operations.
Federation with Argo ApplicationSets across all three regions. Define cluster standards centrally; let teams deploy to region-labeled clusters. Add Cilium Cluster Mesh for service discovery when inter-team service dependencies cross cluster boundaries.
Active-passive with automated failover, tested monthly. The temptation after an outage is to over-engineer toward active-active but if your workloads aren't designed for it, you'll trade one type of incident for another. Test the failover first.
Multi-region Kubernetes on arbitrary infrastructure is a significant integration project. You're assembling OpenStack, Kubernetes, and Ceph from independent sources, validating their interoperability, and then multiplying that complexity across regions.
Atmosphere collapses that integration. OpenStack, Ceph, and Kubernetes are pre-validated as a single unit. No surprise breakage between components.
In a multi-region context, this matters for a range of reasons including:
For teams that need the control of open source with the reliability of managed operations, VEXXHOST runs Atmosphere in fully managed mode. Our engineers contribute to OpenStack, Kubernetes, and Ceph upstream, so when something breaks, we fix it at the source.
Multi-region Kubernetes is a spectrum. On one end: a simple active-passive setup with a tested failover runbook, costing you one extra Ceph mirror and a monthly failover drill. On the other end: active-active with a distributed data layer, a service mesh, cross-region traffic management, and a team that has genuinely internalized the consistency trade-offs of a distributed system.
Most organizations should start at the simple end and move right only when they have clear evidence that the simple approach isn't meeting a specific requirement.
Atmosphere is fully open source on GitHub. If you're designing a multi-region Kubernetes deployment and want an architecture review, we're happy to talk.