
As AI moves from experimentation to production, enterprises face a critical challenge: most data they need exists outside the public cloud. Patient records, market research, legacy systems containing enterprise knowledge—all this sensitive information creates a fundamental trust problem when deploying AI at scale.

NVIDIA’s latest reference architecture addresses this head-on with a zero-trust approach to AI factories powered by confidential computing. Let’s break down why this matters and how it works.

The AI Factory Trust Dilemma

When deploying proprietary frontier models on shared infrastructure, three stakeholders each have legitimate security concerns:

1. Model Owners vs. Infrastructure Providers

Model owners need to protect their IP—model weights and algorithmic logic. They can’t trust that the host OS, hypervisor, or root administrator won’t inspect or extract their model.

2. Infrastructure Providers vs. Model Owners

Infrastructure providers running the hardware can’t trust that a model owner’s workload is benign. It might contain malicious code or attempt privilege escalation.

3. Data Owners (Tenants) vs. Everyone

Data owners must ensure their regulated data remains confidential. They can’t trust the infrastructure provider won’t view data during execution, or that the model provider won’t misuse it.

The root cause? In traditional computing, data in use isn’t encrypted. Sensitive data and proprietary models sit exposed in plaintext memory, visible to system administrators.

Confidential Computing: The Solution

Confidential computing solves this by encrypting data throughout the entire lifecycle of execution, not just at rest or in transit. Using hardware-backed Trusted Execution Environments (TEEs), data and models remain cryptographically protected even while being processed.
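To make the idea concrete, here is a toy Python sketch (not NVIDIA's implementation) of sealing a model to an environment measurement: the decryption key is derived from a root of trust plus the expected launch measurement, so only an untampered environment can recover the plaintext. The XOR keystream below is purely illustrative, not a real cipher; production TEEs use hardware memory encryption and authenticated ciphers such as AES-GCM.

```python
import hashlib
import hmac

def measure(code: bytes) -> bytes:
    """Measurement of the workload, akin to a TEE's launch digest."""
    return hashlib.sha256(code).digest()

def keystream(key: bytes, n: int) -> bytes:
    # Toy stream cipher (SHA-256 in counter mode), for illustration only.
    out = b""
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def seal(model: bytes, expected_measurement: bytes, root_key: bytes) -> bytes:
    # The key is bound to the expected measurement: only an environment
    # that reproduces the same measurement can derive the same key.
    k = hmac.new(root_key, expected_measurement, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(model, keystream(k, len(model))))

def unseal(blob: bytes, actual_measurement: bytes, root_key: bytes) -> bytes:
    k = hmac.new(root_key, actual_measurement, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(blob, keystream(k, len(blob))))
```

Sealing with the measurement of trusted code and unsealing with the same measurement round-trips the weights; a tampered environment derives a different key and recovers only garbage.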

NVIDIA’s approach combines:

  - Hardware-backed TEEs on both CPUs and GPUs, so computation stays encrypted end to end
  - Confidential Containers (CoCo), which wraps each pod in a lightweight Kata Containers VM
  - Remote attestation, in which a Key Broker Service and vendor services like the NVIDIA Remote Attestation Service verify the environment before releasing any secrets

How It Works: The Attestation Flow

When you deploy an encrypted model, here’s what happens:

  1. Workload requests secrets (like model decryption keys)
  2. Attestation Agent inside the Kata VM gathers hardware evidence from the TEE
  3. Key Broker Service (KBS) forwards evidence to the Attestation Service
  4. Attestation Service validates against security policies and delegates to vendor services (like NVIDIA Remote Attestation Service)
  5. Cryptographically signed evidence confirms the environment is genuine and untampered
  6. Keys released into protected memory—model decrypts exclusively inside the TEE

The host OS, hypervisor, and administrators never see the plaintext model or data.
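The six steps above can be sketched as a small simulation. This is a simplified model, not the real CoCo components: the class names mirror the flow, and an HMAC over a shared root key stands in for hardware-signed attestation evidence.

```python
import hashlib
import hmac
import secrets

ROOT_KEY = secrets.token_bytes(32)   # hardware root of trust (simulated)
MODEL_KEY = secrets.token_bytes(32)  # the secret the KBS guards

def sign(evidence: bytes) -> bytes:
    # Stand-in for the TEE hardware signing its own evidence.
    return hmac.new(ROOT_KEY, evidence, hashlib.sha256).digest()

class AttestationAgent:
    """Runs inside the Kata VM; gathers evidence from the TEE (step 2)."""
    def __init__(self, launch_measurement: bytes):
        self.evidence = launch_measurement
        self.signature = sign(launch_measurement)

class AttestationService:
    """Validates evidence against security policy (steps 4-5)."""
    def __init__(self, allowed_measurements):
        self.allowed = allowed_measurements
    def verify(self, evidence: bytes, signature: bytes) -> bool:
        genuine = hmac.compare_digest(signature, sign(evidence))
        return genuine and evidence in self.allowed

class KeyBrokerService:
    """Forwards evidence and releases keys only on success (steps 3, 6)."""
    def __init__(self, attestation_service: AttestationService):
        self.svc = attestation_service
    def request_key(self, agent: AttestationAgent) -> bytes:
        if self.svc.verify(agent.evidence, agent.signature):
            return MODEL_KEY  # released into protected memory only
        raise PermissionError("attestation failed: key withheld")
```

A workload in an approved environment gets the model key; any other measurement is refused, which is the whole point of gating key release on attestation.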

What CoCo Protects (and Doesn’t)

Protected ✅

  - Model weights and data in use: memory inside the TEE is encrypted and integrity-protected
  - Secrets such as decryption keys, released only after successful attestation
  - Confidentiality from the host OS, hypervisor, and infrastructure administrators

Not Protected ⚠️

  - Vulnerabilities inside the workload itself (a buggy model server can still leak its own data)
  - Availability: the infrastructure provider can always decline to run, or stop, the workload
  - Side channels, whose mitigations depend on the specific hardware TEE

Real-World Impact

This architecture enables:

  - Running AI on regulated data (patient records, market research) without exposing it to the infrastructure provider
  - Deploying proprietary frontier models on shared or rented GPUs without leaking model IP
  - Multi-tenant AI factories in which model owners, data owners, and operators need not trust one another

The Ecosystem

NVIDIA is building this with partners including Red Hat, Intel, Anjuna Security, Fortanix, Edgeless Systems, Dell, HPE, Lenovo, Cisco, and Supermicro. The approach leverages open source projects like Kata Containers and works with standard Kubernetes primitives.

Critically, this is a “lift-and-shift” deployment—no need to rewrite manifests or applications. The NVIDIA GPU Operator manages the stack using familiar Kubernetes workflows.
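As a sketch of what lift-and-shift looks like in practice, a pod opts into a confidential runtime with a single field. The runtime class name `kata-cc` and the image below are illustrative assumptions; actual names depend on how CoCo and the GPU Operator are installed in a given cluster.

```yaml
# Minimal sketch of a confidential pod spec (names are illustrative).
apiVersion: v1
kind: Pod
metadata:
  name: confidential-inference
spec:
  runtimeClassName: kata-cc        # routes the pod into a Kata confidential VM
  containers:
  - name: model-server
    image: registry.example.com/encrypted-model-server:latest
    resources:
      limits:
        nvidia.com/gpu: 1          # GPU exposed by the NVIDIA GPU Operator
```

Everything else in the manifest stays as it would for an ordinary Kubernetes deployment, which is what makes the migration lift-and-shift.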

Why This Matters

As AI adoption accelerates, trust becomes infrastructure. Organizations won’t deploy AI at scale if they can’t guarantee data privacy and model IP protection. By shifting the trust boundary from infrastructure administrators to hardware-backed cryptography, confidential computing removes the blocker.

The result? AI factories that can:

  - Process sensitive, regulated data without exposing it to system administrators
  - Host proprietary models without risking extraction of their weights
  - Share infrastructure across tenants while keeping every party’s assets confidential

Zero-trust isn’t just a security posture anymore—it’s the foundation for the next generation of AI infrastructure.


Learn more: NVIDIA Confidential Computing Reference Architecture

Source: NVIDIA Technical Blog