
As AI moves from experimentation to production, enterprises face a critical challenge: most data they need exists outside the public cloud. Patient records, market research, legacy systems containing enterprise knowledge—all this sensitive information creates a fundamental trust problem when deploying AI at scale.

NVIDIA’s latest reference architecture addresses this head-on with a zero-trust approach to AI factories powered by confidential computing. Let’s break down why this matters and how it works.

The AI Factory Trust Dilemma

When deploying proprietary frontier models on shared infrastructure, three stakeholders each have legitimate security concerns:

1. Model Owners vs. Infrastructure Providers

Model owners need to protect their IP—model weights and algorithmic logic. They can’t trust that the host OS, hypervisor, or root administrator won’t inspect or extract their model.

2. Infrastructure Providers vs. Model Owners

Infrastructure providers running the hardware can’t trust that a model owner’s workload is benign. It might contain malicious code or attempt privilege escalation.

3. Data Owners (Tenants) vs. Everyone

Data owners must ensure their regulated data remains confidential. They can’t trust the infrastructure provider won’t view data during execution, or that the model provider won’t misuse it.

The root cause? In traditional computing, data in use isn’t encrypted. Sensitive data and proprietary models sit exposed in plaintext memory, visible to system administrators.

Confidential Computing: The Solution

Confidential computing solves this by encrypting data throughout the entire lifecycle of execution, not just at rest or in transit. Using hardware-backed Trusted Execution Environments (TEEs), data and models remain cryptographically protected even while being processed.
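To make the idea concrete, here is a toy Python sketch (not NVIDIA's implementation) of sealing a model to an environment measurement: the decryption key is derived from a root of trust plus the expected launch measurement, so only an untampered environment can recover the plaintext. The XOR keystream below is purely illustrative, not a real cipher; production TEEs use hardware memory encryption and authenticated ciphers such as AES-GCM.

```python
import hashlib
import hmac

def measure(code: bytes) -> bytes:
    """Measurement of the workload, akin to a TEE's launch digest."""
    return hashlib.sha256(code).digest()

def keystream(key: bytes, n: int) -> bytes:
    # Toy stream cipher (SHA-256 in counter mode), for illustration only.
    out = b""
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def seal(model: bytes, expected_measurement: bytes, root_key: bytes) -> bytes:
    # The key is bound to the expected measurement: only an environment
    # that reproduces the same measurement can derive the same key.
    k = hmac.new(root_key, expected_measurement, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(model, keystream(k, len(model))))

def unseal(blob: bytes, actual_measurement: bytes, root_key: bytes) -> bytes:
    k = hmac.new(root_key, actual_measurement, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(blob, keystream(k, len(blob))))
```

Sealing with the measurement of trusted code and unsealing with the same measurement round-trips the weights; a tampered environment derives a different key and recovers only garbage.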

NVIDIA’s approach combines:

  - Hardware-backed TEEs on both CPUs and GPUs, so computation stays encrypted end to end
  - Confidential Containers (CoCo), which wraps each pod in a lightweight Kata Containers VM
  - Remote attestation, in which a Key Broker Service and vendor services like the NVIDIA Remote Attestation Service verify the environment before releasing any secrets

How It Works: The Attestation Flow

When you deploy an encrypted model, here’s what happens:

  1. Workload requests secrets (like model decryption keys)
  2. Attestation Agent inside the Kata VM gathers hardware evidence from the TEE
  3. Key Broker Service (KBS) forwards evidence to the Attestation Service
  4. Attestation Service validates against security policies and delegates to vendor services (like NVIDIA Remote Attestation Service)
  5. Cryptographically signed evidence confirms the environment is genuine and untampered
  6. Keys released into protected memory—model decrypts exclusively inside the TEE

The host OS, hypervisor, and administrators never see the plaintext model or data.
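The six steps above can be sketched as a small simulation. This is a simplified model, not the real CoCo components: the class names mirror the flow, and an HMAC over a shared root key stands in for hardware-signed attestation evidence.

```python
import hashlib
import hmac
import secrets

ROOT_KEY = secrets.token_bytes(32)   # hardware root of trust (simulated)
MODEL_KEY = secrets.token_bytes(32)  # the secret the KBS guards

def sign(evidence: bytes) -> bytes:
    # Stand-in for the TEE hardware signing its own evidence.
    return hmac.new(ROOT_KEY, evidence, hashlib.sha256).digest()

class AttestationAgent:
    """Runs inside the Kata VM; gathers evidence from the TEE (step 2)."""
    def __init__(self, launch_measurement: bytes):
        self.evidence = launch_measurement
        self.signature = sign(launch_measurement)

class AttestationService:
    """Validates evidence against security policy (steps 4-5)."""
    def __init__(self, allowed_measurements):
        self.allowed = allowed_measurements
    def verify(self, evidence: bytes, signature: bytes) -> bool:
        genuine = hmac.compare_digest(signature, sign(evidence))
        return genuine and evidence in self.allowed

class KeyBrokerService:
    """Forwards evidence and releases keys only on success (steps 3, 6)."""
    def __init__(self, attestation_service: AttestationService):
        self.svc = attestation_service
    def request_key(self, agent: AttestationAgent) -> bytes:
        if self.svc.verify(agent.evidence, agent.signature):
            return MODEL_KEY  # released into protected memory only
        raise PermissionError("attestation failed: key withheld")
```

A workload in an approved environment gets the model key; any other measurement is refused, which is the whole point of gating key release on attestation.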

What CoCo Protects (and Doesn’t)

Protected ✅

  - Model weights and data in use: memory inside the TEE is encrypted and integrity-protected
  - Secrets such as decryption keys, released only after successful attestation
  - Confidentiality from the host OS, hypervisor, and infrastructure administrators

Not Protected ⚠️

  - Vulnerabilities inside the workload itself (a buggy model server can still leak its own data)
  - Availability: the infrastructure provider can always decline to run, or stop, the workload
  - Side channels, whose mitigations depend on the specific hardware TEE

Real-World Impact

This architecture enables:

  - Running AI on regulated data (patient records, market research) without exposing it to the infrastructure provider
  - Deploying proprietary frontier models on shared or rented GPUs without leaking model IP
  - Multi-tenant AI factories in which model owners, data owners, and operators need not trust one another

The Ecosystem

NVIDIA is building this with partners including Red Hat, Intel, Anjuna Security, Fortanix, Edgeless Systems, Dell, HPE, Lenovo, Cisco, and Supermicro. The approach leverages open source projects like Kata Containers and works with standard Kubernetes primitives.

Critically, this is a “lift-and-shift” deployment—no need to rewrite manifests or applications. The NVIDIA GPU Operator manages the stack using familiar Kubernetes workflows.
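As a sketch of what lift-and-shift looks like in practice, a pod opts into a confidential runtime with a single field. The runtime class name `kata-cc` and the image below are illustrative assumptions; actual names depend on how CoCo and the GPU Operator are installed in a given cluster.

```yaml
# Minimal sketch of a confidential pod spec (names are illustrative).
apiVersion: v1
kind: Pod
metadata:
  name: confidential-inference
spec:
  runtimeClassName: kata-cc        # routes the pod into a Kata confidential VM
  containers:
  - name: model-server
    image: registry.example.com/encrypted-model-server:latest
    resources:
      limits:
        nvidia.com/gpu: 1          # GPU exposed by the NVIDIA GPU Operator
```

Everything else in the manifest stays as it would for an ordinary Kubernetes deployment, which is what makes the migration lift-and-shift.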

Why This Matters

As AI adoption accelerates, trust becomes infrastructure. Organizations won’t deploy AI at scale if they can’t guarantee data privacy and model IP protection. By shifting the trust boundary from infrastructure administrators to hardware-backed cryptography, confidential computing removes the blocker.

The result? AI factories that can:

  - Process sensitive, regulated data without exposing it to system administrators
  - Host proprietary models without risking extraction of their weights
  - Share infrastructure across tenants while keeping every party’s assets confidential

Zero-trust isn’t just a security posture anymore—it’s the foundation for the next generation of AI infrastructure.


Learn more: NVIDIA Confidential Computing Reference Architecture

Source: NVIDIA Technical Blog