Zero Trust Networking on Kubernetes: A Practical Guide

Learn how mTLS, workload identity, and fine-grained authorization enforce Zero Trust in Kubernetes: a practical guide using Istio, Cilium, and the Navos stack.

There's a gap between how Zero Trust gets described and how it gets built.

The architecture diagrams look clean. Certificates here. Policies there. Everything verified. Nothing trusted. But the moment you're standing in front of a running Kubernetes cluster trying to figure out which services are encrypting traffic and what would happen if one pod got compromised, the diagram stops being useful.

This post closes that gap. Here's what Zero Trust networking looks like in a real Kubernetes cluster: which tools enforce it, what those tools do, and how the Navos stack, built on Cilium and Istio, makes it operational rather than aspirational.

The Problem with Kubernetes's Default Network

Kubernetes gives every pod a unique IP and a flat network. By default, every pod can talk to every other pod, with no restrictions unless you explicitly say otherwise.

That's fine for getting started. It's not fine for production workloads with sensitive data and compliance obligations. A compromised pod has the same default network access as a legitimate one. That's the core flaw in perimeter-based thinking, and it's exactly what Zero Trust is designed to fix.

Zero Trust eliminates the concept of a trusted interior. Every service-to-service call requires cryptographic proof of identity, regardless of origin. In Kubernetes, that means rethinking how identity, encryption, and authorization are managed at the workload level, not just the cluster edge.

Four Things Zero Trust in Kubernetes Actually Requires

1. Workload identity decoupled from IP address. Pods are ephemeral, and their IPs change every time they are rescheduled. Each workload needs a cryptographic credential tied to what it is, typically a short-lived certificate bound to its service account, so policies can express which identities are permitted to communicate.

2. Mutual TLS on every service-to-service connection. Standard TLS authenticates only the server. mTLS requires both sides to present and verify certificates before a connection is established. When strict mesh policy and traffic capture are correctly configured, all traffic between mesh-enrolled workloads is encrypted with mTLS and only authenticated workloads can communicate. The mesh issues and rotates the certificates, so no manual certificate management is required.

3. Fine-grained, identity-based authorization. Authentication proves who a workload is. Authorization determines what it's allowed to do. "Any pod in namespace A can call B" isn't Zero Trust. "Only workload identity X can call service Y on port Z via POST" is.

4. Continuous enforcement. Policy configured once and never revisited isn't a Zero Trust posture; it's a snapshot. Policies need to be reviewed, audited, and enforced in real time as the cluster evolves.

Why a Service Mesh Is the Right Tool

Kubernetes-native NetworkPolicies and RBAC get you part of the way there. But they operate at L3/L4 and the API control plane; they don't give you cryptographic identity or per-connection encryption between workloads.

A service mesh handles this at the data plane. By injecting sidecar proxies, it manages certificate issuance, rotation, and revocation, and enforces authorization at the application layer without requiring any code changes to your services. The mesh provides these controls uniformly across every microservice, giving you a single place to configure and audit them.

mTLS in Strict Mode: The Starting Point

In Navos, Istio runs as part of the default stack. The starting point for Zero Trust is enabling strict mTLS cluster-wide via a PeerAuthentication policy scoped to istio-system with mode: STRICT.

This single policy changes the cluster's security baseline. Any client that cannot present a valid mesh certificate, including a pod that is not enrolled in the mesh, has its connections to mesh workloads rejected. The cluster stops trusting the network.
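As a minimal sketch, the mesh-wide policy described above could look like the following. It assumes istio-system is the mesh's root namespace (the Istio default) and uses the conventional policy name default; adjust both if your installation differs.

```yaml
# Mesh-wide strict mTLS.
# Assumption: istio-system is the root namespace of this mesh.
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT   # reject plaintext; accept only mutually authenticated TLS
```

A namespace-scoped or workload-scoped PeerAuthentication takes precedence over the mesh-wide one, so it's worth checking that no narrower policy quietly reintroduces PERMISSIVE.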

From here, Istio uses SPIFFE-based workload identity. Each service receives a short-lived X.509 certificate embedding a SPIFFE ID — a stable, cryptographically verifiable identity tied to the workload's Kubernetes ServiceAccount, not its IP. Certificate lifetimes are intentionally short, commonly around 24 hours, with proactive rotation to minimize exposure if a credential is compromised.
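To make the identity model concrete, here is a sketch of a workload running under its own ServiceAccount. The names, image, and port are hypothetical, and the SPIFFE ID shown in the comment assumes the default cluster.local trust domain.

```yaml
# One ServiceAccount per workload, so each workload gets a distinct identity.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: checkout-service
  namespace: production
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-service
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: checkout-service
  template:
    metadata:
      labels:
        app: checkout-service
    spec:
      # The mesh derives the workload certificate from this ServiceAccount.
      # Resulting identity (assuming trust domain cluster.local):
      #   spiffe://cluster.local/ns/production/sa/checkout-service
      serviceAccountName: checkout-service
      containers:
        - name: checkout
          image: registry.example.com/checkout-service:1.0.0  # hypothetical image
          ports:
            - containerPort: 8080  # hypothetical application port
```

Workloads that share the namespace's default ServiceAccount would all present the same identity, which makes identity-based policy much less useful.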

L7 Authorization: Who Can Call What

mTLS authenticates workloads. AuthorizationPolicy determines what authenticated workloads are permitted to do. For example, to lock down payment-service so that only checkout-service can reach it, the policy's source.principals field references the caller's SPIFFE identity (trust domain, namespace, and ServiceAccount, written without the spiffe:// prefix):

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-checkout-to-payment
  namespace: production
spec:
  selector:
    matchLabels:
      app: payment-service
  action: ALLOW
  rules:
    - from:
        - source:
            principals:
              - "cluster.local/ns/production/sa/checkout-service"
      to:
        - operation:
            methods: ["POST"]
            paths: ["/v1/payments/*"]

The network path doesn't grant access. The authenticated workload identity does. In this policy, the pod's IP address is not the source of trust; the caller's cryptographically verified identity is what the policy evaluates.
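An ALLOW policy like this one is commonly paired with a namespace-wide baseline that denies anything not explicitly permitted. A minimal sketch, assuming the same production namespace and a hypothetical policy name:

```yaml
# Namespace-wide baseline: an ALLOW policy with no rules matches no request,
# so every request to workloads in this namespace is denied unless another
# ALLOW policy permits it.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-nothing   # hypothetical name
  namespace: production
spec:
  {}
```

With this in place, each service needs its own explicit ALLOW policy before it can receive traffic, which is the behavior the four requirements above call for.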

Cilium at L3/L4: Defense in Depth

Istio handles L7. Zero Trust also needs enforcement at L3/L4. This is where Cilium, the other core networking component in Navos, plays its role.

Cilium is an eBPF-based CNI that intercepts and filters network traffic directly within the Linux kernel. Like Istio, it decouples identity from IP addresses: Cilium derives workload identity from labels rather than network location, so policies stay accurate even as pods are rescheduled across nodes.

The layered approach starts with a default-deny NetworkPolicy that uses podSelector: {} and sets policyTypes: [Ingress, Egress] explicitly. Both entries are required, because when policyTypes is omitted Kubernetes infers it from which rule sections are present, and a policy with no rules would then deny ingress only. The follow-on allow rules should include DNS egress, or name resolution will silently break. From there, CiliumNetworkPolicy adds explicit L7-aware allowlisting at the CNI level, specifying not just which endpoints can talk, but which HTTP methods and paths are permitted between them.
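The three pieces described above can be sketched as follows. Everything specific here is an assumption for illustration: the production namespace, the k8s-app: kube-dns label on the cluster DNS pods in kube-system, the app labels, and port 8080.

```yaml
# 1. Default deny for every pod in the namespace, both directions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# 2. Allow DNS egress so name resolution keeps working.
# Assumption: cluster DNS runs in kube-system with label k8s-app: kube-dns.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
---
# 3. Cilium allowlist: only checkout-service may send POST requests under
#    /v1/payments/ to payment-service (port 8080 is hypothetical).
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-checkout-to-payment-l7
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: payment-service
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: checkout-service
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: "POST"
                path: "/v1/payments/.*"   # Cilium HTTP paths are regular expressions
```

Two things are worth verifying in your own cluster. First, the allow rules must also cover whatever else the workloads depend on, such as egress to the mesh control plane. Second, if the mesh has already encrypted the traffic by the time it reaches the Cilium datapath, HTTP-level rules at the CNI layer may have nothing to parse. In that case the L3/L4 portion is what Cilium enforces, and L7 decisions stay with Istio.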

Together, these two layers give you defense in depth: if one layer has a misconfiguration, the other still holds. Cilium's Hubble component adds real-time flow-level observability, supporting policy decision inspection, troubleshooting, and anomaly surfacing when paired with alerts or dashboards. In a Zero Trust environment, the ability to verify enforcement is as important as the enforcement itself.

RBAC Is Not Enough on Its Own

Teams often ask: "We already have RBAC, isn't that enough?"

RBAC controls who can interact with the Kubernetes API. It doesn't encrypt traffic between pods. It doesn't verify service identity at the connection level. It doesn't prevent a compromised pod from calling other services laterally.

Think of RBAC as controlling access to the front door. Zero Trust also secures the rooms and hallways inside. A complete posture requires RBAC and mTLS and NetworkPolicy and admission controls, each addressing a different part of the threat model. No single layer is sufficient alone.

How This Operates in Navos

Navos is built on unmodified upstream Kubernetes and Cluster API. No forks, no proprietary enforcement layer. Every policy you write is a Kubernetes resource, either a built-in kind or an upstream Istio or Cilium custom resource, and the full stack is auditable.

For Zero Trust, the tooling is present from day one:

Cilium — eBPF-based L3/L4 enforcement with Hubble observability
Istio — strict mTLS and L7 AuthorizationPolicy
cert-manager — certificate lifecycle management, integrates with external CAs and your existing PKI
Prometheus + Grafana — policy violation detection and traffic anomaly surfacing in real time
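As one hedged illustration of the detection side, a Prometheus alerting rule could watch for sustained authorization denials. This sketch assumes Istio's standard istio_requests_total metric is being scraped; the group name, alert name, and threshold are hypothetical, and a 403 can also come from the application itself rather than from policy.

```yaml
groups:
  - name: zero-trust-policy          # hypothetical group name
    rules:
      - alert: AuthorizationDenialsSustained
        # Rate of 403 responses seen by destination-side proxies, per workload.
        expr: |
          sum by (destination_workload, destination_workload_namespace) (
            rate(istio_requests_total{reporter="destination", response_code="403"}[5m])
          ) > 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Sustained 403 responses at {{ $labels.destination_workload }}"
```

The threshold of one denied request per second is arbitrary; tune it against the baseline for your own traffic.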

Navos supports deployment on AWS, Azure, GCP, OpenStack, or bare metal, with built-in monitoring, security scanning, and expert support. Run it yourself with VEXXHOST's guidance, or let the team operate it entirely, on VEXXHOST's cloud or in your own data center.

The result is a Zero Trust posture that's built in, not bolted on.

Rolling It Out Without an Incident

You don't have to flip everything to strict mode in a single change window. Start with Istio in PERMISSIVE mode: mTLS where available, plaintext still accepted. Audit which services send plaintext. Remediate. Then switch to STRICT.
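One way to stage this, sketched under the assumption that istio-system is the root namespace and that production is a namespace you have already audited and remediated:

```yaml
# Stage 1: mesh-wide PERMISSIVE, so mTLS is used where available and
# plaintext is still accepted while you audit.
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: PERMISSIVE
---
# Stage 2: namespace-level STRICT for a namespace that is ready.
# A namespace-scoped policy overrides the mesh-wide one.
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT
```

Once every namespace has been moved to STRICT this way, changing the mesh-wide policy itself to STRICT closes the gap for anything deployed later.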

Apply default-deny NetworkPolicies at the namespace level early. Assign unique ServiceAccounts per workload before tightening RBAC. The tooling supports a staged rollout — the important thing is making sure permissive mode doesn't become permanent.

What You Actually Get

With this stack in place:

Every enrolled service-to-service connection is mutually authenticated with short-lived certificates
Traffic is encrypted within the cluster, not just at the edge
Authorization is enforced by workload identity, not IP
A compromised pod can reach a protected service only if it holds an allowed identity and has an explicitly permitted path through both the mesh authorization layer and the CNI network policy layer, so lateral movement is constrained structurally rather than by runtime detection alone

That's the real value. Not theoretical Zero Trust purity, but practical blast-radius containment when something goes wrong.

If you're building on Navos, Cilium and Istio are already in the stack. The foundation is there. Enforcing Zero Trust becomes an operational decision, not an engineering project.

Ready to talk about Zero Trust architecture for your Kubernetes environment? Get in touch with the VEXXHOST team.
