Does your AI workload data really stay in the EU? With EU AI compliance getting stricter, see where hyperscaler data flows create risk and how to keep AI compute inside your jurisdiction.
When you send a prompt to a managed AI service, where does your data actually go?
If you're running AI workloads on customer data, patient records, or employee information through a hyperscaler's managed service, the answer matters for GDPR compliance. And in many cases, the answer is "across borders" even when you've selected an EU region.
This post looks at the specific regulatory requirements, where the friction points are, and what infrastructure options exist if you need to keep AI compute within your control boundary.
The Regulatory Requirements
GDPR Article 44 prohibits transfers of personal data to third countries unless specific safeguards are met. "Third country" means anywhere outside the EU/EEA. The regulation applies to data "undergoing processing or intended for processing after transfer."
For AI workloads, this matters because training data containing personal information triggers transfer rules if it leaves the EU. Inference requests that include personal data are transfers if processed outside the EU. Even metadata and telemetry can constitute personal data under GDPR's broad definition.
The EU-US Data Privacy Framework provides an adequacy mechanism for transfers to certified US companies. But adequacy decisions can be invalidated, and not all data categories or processing scenarios are covered, so coverage has to be assessed case by case.
The EU AI Act adds another layer. It entered into force in August 2024. Prohibited AI practices were banned and AI literacy obligations took effect in February 2025. Transparency requirements for general-purpose AI models became active in August 2025. Full compliance for high-risk AI systems, including conformity assessments and EU database registration, is required 36 months after entry into force, in August 2027.
High-risk categories include AI used in healthcare, insurance underwriting, employment decisions, and financial services.
The AI Act requires documentation of data governance practices, audit trails, and human oversight mechanisms. If your AI infrastructure runs on shared multi-tenant systems where you don't control (or even fully see) how data flows, demonstrating compliance gets harder.
Where the Problem Actually Shows Up
Most hyperscalers offer EU regions. But "EU region" doesn't always mean "data stays in EU" for managed AI services.
Backend systems for model training pipelines, support tooling, and operational infrastructure may span regions. Even if your inference happens in Frankfurt, training data or logs might move elsewhere. When you file a support ticket, who accesses your environment? From where? Under which jurisdiction's laws?
The CLOUD Act creates structural exposure for US-headquartered providers, who can be compelled to produce data held anywhere in the world. In a French Senate hearing, one executive stated that the company couldn't guarantee customer data would never be transferred to US authorities.
Managed AI services also involve chains of subprocessors. Each link is a potential transfer you need to account for in your GDPR documentation.
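Accounting for those links doesn't have to stay abstract. A minimal sketch of a subprocessor inventory check, with hypothetical vendor names and deliberately abbreviated country lists (a real assessment needs the full EU/EEA and adequacy lists, and legal review):

```python
# Illustrative sketch: flag links in a subprocessor chain that imply a
# GDPR Chapter V transfer. All names and entries are hypothetical.
EEA = {"DE", "FR", "NL", "IE", "SE"}   # abbreviated; the real EU/EEA list is longer
ADEQUACY = {"CH", "JP", "GB"}          # countries with adequacy decisions (partial)

subprocessors = [
    {"name": "managed-ai-api",  "country": "DE", "handles_personal_data": True},
    {"name": "support-tooling", "country": "US", "handles_personal_data": True},
    {"name": "billing-backend", "country": "US", "handles_personal_data": False},
]

def transfer_risks(chain):
    """Return subprocessors that process personal data outside the EEA
    without an adequacy decision, i.e. links needing SCCs or another safeguard."""
    return [
        s["name"] for s in chain
        if s["handles_personal_data"]
        and s["country"] not in EEA
        and s["country"] not in ADEQUACY
    ]

print(transfer_risks(subprocessors))  # ['support-tooling']
```

Even a toy model like this makes the point: the billing backend outside the EU is fine if it never sees personal data, while the support tooling that does see it needs a documented transfer mechanism.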
None of this means you can't use hyperscaler AI services. It means your legal and compliance teams need to assess the actual data flows, not just the region label in your console.
Bringing Compute Home
Cloud repatriation is a growing movement. A Barclays CIO survey found 86% of respondents planning to move at least some workloads off public cloud. The drivers vary:
- cost optimization for predictable workloads
- latency requirements
- increasingly, data sovereignty
For AI specifically, on-premise or hosted private cloud means your data stays in your datacenter (or a datacenter you contractually control). No ambiguity about transfers, no dependence on adequacy decisions, no CLOUD Act exposure. You control the infrastructure stack, so you know exactly what's running, who has access, and where logs go. This makes compliance documentation straightforward because you're documenting your own systems, not trying to extract guarantees from a vendor.
You also own the economics. GPU-intensive AI workloads on consumption-based cloud pricing get expensive at scale. On dedicated hardware, your cost is predictable.
The trade-off: you take on operational responsibility. Managed AI services handle scaling, model updates, and infrastructure operations for you. Private cloud means you need that capability in-house or through a managed services provider.
How This Works with OpenStack and Atmosphere
Atmosphere is an OpenStack-based distribution. It runs on-premise in your datacenter or hosted in VEXXHOST's EU datacenters with managed operations.
- Identity and access management. Atmosphere integrates Keycloak with MFA, session management, LDAP/AD connectivity, and SAML 2.0/OpenID Connect for SSO, so you control authentication policy. OpenStack's multi-tenancy model isolates projects at the network level, so your AI training environment doesn't share network segments with other workloads unless you configure it that way.
- Key management. Barbican (OpenStack's key management service) handles the encryption key lifecycle: generation, rotation, and revocation, with HSM support available for on-premise deployments that require hardware-backed key storage.
- Monitoring. A Prometheus-based stack ships with ~300 preconfigured metrics and alerts, capturing what you need for compliance reviews. GPU passthrough is supported for AI/ML workloads on dedicated or bare metal instances.
- Three deployment options. Cloud (multi-tenant, VEXXHOST datacenters, per-minute billing), Hosted (single-tenant dedicated infrastructure, EU datacenters, monthly billing), or On-premise (your datacenter, support-only or full remote operations). For AI workloads on regulated data, the hosted or on-premise options give you the jurisdiction control GDPR requires while the platform handles day-2 operations: everything from upgrades to monitoring and patching.
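Under the hood, Keycloak-to-Keystone SSO is standard OpenStack identity federation. As a rough illustration only (Atmosphere configures this for you, and hostnames here are placeholders), the Keystone side of an OpenID Connect setup looks something like:

```ini
# keystone.conf (sketch; values are placeholders, not Atmosphere defaults)
[auth]
methods = password,token,openid

[federation]
# Dashboard allowed to receive web SSO tokens
trusted_dashboard = https://dashboard.example.com/auth/websso/

[openid]
# Attribute set by the Apache OIDC module identifying the issuer (Keycloak)
remote_id_attribute = HTTP_OIDC_ISS
```

The identity provider and its attribute mapping are then registered with `openstack identity provider create` and `openstack mapping create`, while the Keycloak side holds the client definition and the MFA policy.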
What This Doesn't Solve
Private cloud isn't cheaper for all workloads. Bursty, unpredictable workloads often make more economic sense on public cloud. The repatriation math works for sustained, predictable compute: model training, batch inference, persistent services.
You still need AI/ML expertise. Atmosphere gives you infrastructure, not managed AI services. You're running your own training jobs, managing your own models, handling your own MLOps. This requires engineers who know what they're doing.
It's not instant. Deploying private cloud infrastructure takes longer than spinning up a managed service. Plan for weeks to months depending on scope, not hours.
Hyperscaler AI services have capabilities you won't replicate easily. Foundation models, pre-trained APIs, managed fine-tuning pipelines—you'd need to build or source these separately on private infrastructure.
The question isn't "private cloud vs. hyperscaler" for everything. It's which workloads need to stay within your jurisdiction, and what infrastructure supports that requirement while remaining operationally viable.
Making the Assessment
If you're evaluating whether to move AI compute on-premise, start by mapping your AI data flows: what personal data enters AI systems, where it goes during training and inference, and which vendors and subprocessors touch it.
Get a legal assessment of transfer mechanisms
Are your current transfers covered by adequacy decisions, SCCs, or other GDPR Chapter V mechanisms? What's the risk exposure if those mechanisms are challenged?
Identify high-risk AI use cases under the AI Act
Which of your AI applications will require conformity assessment? These are candidates for infrastructure you fully control.
Run the cost model for your AI workload profile
What's the total cost of ownership across hyperscaler managed services vs. hosted private cloud vs. on-premise? Include not just compute but compliance overhead, legal exposure, and operational staffing.
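The compute part of that comparison reduces to simple arithmetic. A back-of-envelope sketch, where every figure is an illustrative assumption rather than a quote from any provider:

```python
# Back-of-envelope monthly cost comparison. All prices, rates, and the
# ops-overhead figure are illustrative assumptions, not real quotes.
HOURS_PER_MONTH = 730

def monthly_cost_cloud(gpus, price_per_gpu_hour, utilization):
    # Consumption pricing: you pay only for the hours you actually run.
    return gpus * price_per_gpu_hour * HOURS_PER_MONTH * utilization

def monthly_cost_dedicated(gpus, monthly_rate_per_gpu, ops_overhead):
    # Dedicated hardware: flat rate regardless of utilization,
    # plus staffing/operations overhead.
    return gpus * monthly_rate_per_gpu + ops_overhead

cloud = monthly_cost_cloud(gpus=8, price_per_gpu_hour=3.0, utilization=0.85)
dedicated = monthly_cost_dedicated(gpus=8, monthly_rate_per_gpu=1200, ops_overhead=4000)

print(f"cloud: ${cloud:,.0f}/mo, dedicated: ${dedicated:,.0f}/mo")
```

With these made-up numbers, dedicated hardware wins at 85% sustained utilization and loses badly at 20%, which is the whole bursty-vs-sustained argument in two function calls. A real model adds the harder-to-quantify terms: compliance overhead, legal exposure, and staffing.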
Finally, evaluate operational readiness
Do you have the team to operate AI infrastructure? If not, what managed services model makes sense?
There's no universal answer. But for European enterprises running AI on regulated data, the regulatory pressure points are real and increasing. Understanding your options, including private cloud infrastructure that keeps compute in your jurisdiction, is part of responsible planning. If you'd like to evaluate the right strategy for you, schedule a free consultation with a VEXXHOST expert.