[Virtualization] 10. The Future of Virtualization: From Confidential Computing to GPU Disaggregation
Author: Youngju Kim (@fjvbn20031)
- Introduction
- Confidential Computing
- ARM Virtualization
- GPU Disaggregation
- NVIDIA BlueField DPU
- WebAssembly (Wasm) Virtualization
- Kata Containers
- Unikernels
- Serverless GPU
- Edge Computing Virtualization
- Green Computing and Virtualization
- Series Summary
- Conclusion
Introduction
Virtualization technology has continuously evolved from the mainframe era to today's cloud-native age. This final post in the series explores the key technologies shaping the future of virtualization.
Confidential Computing
Overview
Confidential computing processes data in an encrypted state even while in use. Previously, encryption was only possible for data at rest and in transit.
Traditional Encryption:
+------------------+     +------------------+     +------------------+
|     At Rest      |     |    In Transit    |     |      In Use      |
|  AES Encryption  | --> |     TLS/SSL      | --> |    Plaintext     |
|   [Protected]    |     |   [Protected]    |     |   [Vulnerable]   |
+------------------+     +------------------+     +------------------+
Confidential Computing:
+------------------+     +------------------+     +------------------+
|     At Rest      |     |    In Transit    |     |      In Use      |
|  AES Encryption  | --> |     TLS/SSL      | --> | Memory Encrypted |
|   [Protected]    |     |   [Protected]    |     |   [Protected]    |
+------------------+     +------------------+     +------------------+
Hardware Technologies
AMD SEV (Secure Encrypted Virtualization)
+------------------------------------------+
|           AMD SEV Architecture           |
|                                          |
|  +-----------------+ +-----------------+ |
|  |      VM 1       | |      VM 2       | |
|  | Unique Enc. Key | | Unique Enc. Key | |
|  | Memory Encrypted| | Memory Encrypted| |
|  +-----------------+ +-----------------+ |
|                                          |
| +--------------------------------------+ |
| | AMD Secure Processor (PSP)           | |
| |  - Key management                    | |
| |  - Memory Encryption Engine (SME)    | |
| |  - SEV-ES: Register state encryption | |
| |  - SEV-SNP: Integrity protection     | |
| +--------------------------------------+ |
+------------------------------------------+
- SEV: VM memory encryption
- SEV-ES: Adds CPU register state encryption
- SEV-SNP: Adds memory integrity protection (defends against hypervisor attacks)
Intel TDX (Trust Domain Extensions)
- Provides isolated execution environments called Trust Domains (TD)
- Protects VMs from the hypervisor
- Supports remote attestation
ARM CCA (Confidential Compute Architecture)
- Introduces a new execution environment called Realm
- Applied to ARM-based servers and edge devices
- Unified security model from mobile to data center
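All three hardware approaches share a remote attestation flow: the CPU measures the guest at launch and signs a report, which a relying party verifies before releasing secrets. The sketch below imitates that flow in plain Python; the HMAC is a stand-in for the hardware's asymmetric signing key (real platforms use key chains such as AMD's VCEK or Intel's quoting key), and all names are illustrative, not any vendor's API.

```python
import hashlib
import hmac
import os

# Stand-in for the silicon root key; real reports are signed asymmetrically.
PLATFORM_KEY = b"simulated-hardware-root-key"

def issue_report(guest_image: bytes, nonce: bytes) -> dict:
    """Simulate the CPU producing a signed attestation report at launch."""
    measurement = hashlib.sha384(guest_image).hexdigest()  # launch measurement
    payload = (measurement + nonce.hex()).encode()
    signature = hmac.new(PLATFORM_KEY, payload, hashlib.sha384).hexdigest()
    return {"measurement": measurement, "nonce": nonce.hex(), "signature": signature}

def verify_report(report: dict, expected_measurement: str, nonce: bytes) -> bool:
    """Relying party: check freshness, expected code identity, and signature."""
    if report["nonce"] != nonce.hex():
        return False  # replayed or stale report
    if report["measurement"] != expected_measurement:
        return False  # guest is not the image we expected
    payload = (report["measurement"] + report["nonce"]).encode()
    expected = hmac.new(PLATFORM_KEY, payload, hashlib.sha384).hexdigest()
    return hmac.compare_digest(report["signature"], expected)

guest_image = b"my-confidential-workload"
nonce = os.urandom(16)  # verifier-chosen challenge for freshness
report = issue_report(guest_image, nonce)
ok = verify_report(report, hashlib.sha384(guest_image).hexdigest(), nonce)
print(ok)  # only after this check does the verifier release secrets
```

The essential property is that trust is rooted in the hardware key, not in the hypervisor: a compromised host cannot forge a report for a guest image it did not actually launch.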
Cloud Services
| Service | Foundation | Features |
|---|---|---|
| AWS Nitro Enclaves | AWS Nitro System | Isolated compute, no networking |
| Azure Confidential VMs | AMD SEV-SNP / Intel TDX | Full VM memory encryption |
| GCP Confidential VMs | AMD SEV | Lift-and-shift migration |
ARM Virtualization
Apple Virtualization.framework
A native framework for running lightweight VMs on macOS.
+------------------------------------------+
|          macOS (Apple Silicon)           |
|                                          |
|  +------------------------------------+  |
|  | Virtualization.framework           |  |
|  |                                    |  |
|  |  +----------+    +----------+      |  |
|  |  | Linux VM |    | macOS VM |      |  |
|  |  | (ARM64)  |    | (ARM64)  |      |  |
|  |  +----------+    +----------+      |  |
|  |                                    |  |
|  |  - Rosetta 2 x86-64 translation    |  |
|  |  - virtio device models            |  |
|  |  - GPU acceleration (Metal)        |  |
|  +------------------------------------+  |
+------------------------------------------+
Ampere Altra / AWS Graviton
ARM-based server processors are rapidly expanding in data centers.
| Processor | Core Count | Use Case | Virtualization |
|---|---|---|---|
| Ampere Altra Max | 128 | Cloud native | KVM/QEMU |
| AWS Graviton 4 | 96 | AWS EC2 | Nitro |
| NVIDIA Grace | 72 | AI/HPC | KVM |
Benefits of ARM virtualization include the following.
- Excellent power efficiency (performance per watt)
- High VM density relative to core count
- Cloud cost reduction (20-40% vs x86)
GPU Disaggregation
Concept
An architecture that separates GPUs from compute nodes and shares them over the network.
Traditional Architecture:
+------------------+     +------------------+     +------------------+
|     Server 1     |     |     Server 2     |     |     Server 3     |
|   CPU + GPU x4   |     |   CPU + GPU x4   |     |   CPU (no GPU)   |
| [GPU may waste]  |     |  [GPU shortage]  |     |   [needs GPU]    |
+------------------+     +------------------+     +------------------+
Disaggregation:
+------------------+     +------------------+     +------------------+
|     Server 1     |     |     Server 2     |     |     Server 3     |
|       CPU        |     |       CPU        |     |       CPU        |
+---------+--------+     +---------+--------+     +---------+--------+
          |                        |                        |
          v                        v                        v
+------------------------------------------------------------------+
|                    GPU Pool (Network Connected)                    |
|   +------+   +------+   +------+   +------+   +------+   +------+  |
|   | GPU  |   | GPU  |   | GPU  |   | GPU  |   | GPU  |   | GPU  |  |
|   +------+   +------+   +------+   +------+   +------+   +------+  |
+------------------------------------------------------------------+
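The pooling idea can be sketched as a toy allocator in which servers lease GPUs from a shared pool on demand instead of owning fixed local devices. All names and counts here are illustrative, not any vendor's API:

```python
from dataclasses import dataclass, field

@dataclass
class GpuPool:
    """Toy model of a disaggregated GPU pool shared across servers."""
    total: int
    leases: dict = field(default_factory=dict)  # server name -> leased GPU count

    def available(self) -> int:
        return self.total - sum(self.leases.values())

    def acquire(self, server: str, count: int) -> bool:
        if count > self.available():
            return False  # pool exhausted; caller can queue or scale out
        self.leases[server] = self.leases.get(server, 0) + count
        return True

    def release(self, server: str) -> None:
        self.leases.pop(server, None)  # GPUs return to the shared pool

pool = GpuPool(total=6)
assert pool.acquire("server-1", 4)      # training job grabs 4 GPUs
assert not pool.acquire("server-2", 3)  # only 2 left, request denied
pool.release("server-1")                # job done, GPUs freed
assert pool.acquire("server-2", 3)
print(pool.available())  # 3
```

Compared with the fixed-attachment picture above, the same six GPUs now serve whichever server currently needs them, which is exactly the utilization win disaggregation targets.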
Interconnect Technologies
| Technology | Bandwidth | Latency | Use Case |
|---|---|---|---|
| PCIe Gen5 | 128 GB/s (x16) | Very low | Local |
| CXL 3.0 | 128 GB/s | Low | Intra-rack |
| NVLink 5.0 | 1.8 TB/s (GPU-to-GPU) | Very low | GPU cluster |
| InfiniBand NDR | 400 Gb/s | Medium | Data center |
CXL (Compute Express Link)
CXL is the key interconnect technology for GPU disaggregation.
CXL Memory Pooling:
+--------+    +--------+    +--------+
| CPU 1  |    | CPU 2  |    |  GPU   |
+---+----+    +---+----+    +---+----+
    |             |             |
+---v-------------v-------------v----+
|             CXL Switch             |
+---+-------------+-------------+----+
    |             |             |
+---v----+    +---v----+    +---v----+
| Memory |    | Memory |    | Memory |
| Pool 1 |    | Pool 2 |    | Pool 3 |
+--------+    +--------+    +--------+
NVIDIA BlueField DPU
Offloading Virtualization to SmartNICs
BlueField DPU offloads virtualization, networking, and storage processing from the host CPU to the SmartNIC.
+------------------------------------------+
| Host Server                              |
|  +------------------------------------+  |
|  | CPU                                |  |
|  |  - Runs application workloads only |  |
|  |  - No virtualization overhead      |  |
|  +------------------------------------+  |
|                    |                     |
|  +------------------------------------+  |
|  | BlueField DPU                      |  |
|  |  - Hypervisor on ARM cores         |  |
|  |  - OVS acceleration (networking)   |  |
|  |  - SNAP acceleration (storage)     |  |
|  |  - IPsec/TLS hardware acceleration |  |
|  |  - Zero-trust security             |  |
|  +------------------------------------+  |
+------------------------------------------+
Benefits
- 100% host CPU cycles for applications
- Eliminates network/storage virtualization overhead
- Hardware-level isolation between infrastructure and tenants
- GPU Direct RDMA acceleration
WebAssembly (Wasm) Virtualization
Lightweight Virtualization Alternative
WebAssembly offers an execution environment that is lighter-weight than containers and starts orders of magnitude faster than VMs.
Comparison:
       VM                Container            Wasm
+--------------+     +--------------+     +--------------+
|     App      |     |     App      |     |     App      |
+--------------+     +--------------+     +--------------+
|   Guest OS   |     |     Libs     |     | Wasm Runtime |
+--------------+     +--------------+     +--------------+
|  Hypervisor  |     |  Container   |     |   Host OS    |
+--------------+     |   Runtime    |     +--------------+
|   Host OS    |     +--------------+
+--------------+     |   Host OS    |
                     +--------------+
Startup:   sec~min       millisec          microsec
Memory:    GB            MB                KB~MB
Isolation: Strong        Medium            Sandbox
Key Wasm Runtimes
| Runtime | Features | Use Case |
|---|---|---|
| Wasmtime | Bytecode Alliance, standards-compliant | Server-side |
| WasmEdge | CNCF Sandbox, lightweight | Edge/IoT |
| Spin | Fermyon, serverless framework | Microservices |
| WAMR | Intel, ultra-lightweight | Embedded |
Kubernetes Integration
# Wasm workload example via SpinKube
apiVersion: core.spinoperator.dev/v1alpha1
kind: SpinApp
metadata:
  name: my-wasm-app
spec:
  image: ghcr.io/my-org/my-wasm-app:latest
  replicas: 3
  executor: containerd-shim-spin
Kata Containers
VM-Isolated Containers
Kata Containers runs each container (or Pod) inside a lightweight VM, providing hardware-level isolation.
Standard Containers:
+------------------+------------------+
|   Container 1    |   Container 2    |
|                  |                  |
| Share same kernel| Share same kernel|
+------------------+------------------+
|            Host Kernel              |
+-------------------------------------+
Kata Containers:
+------------------+  +------------------+
|   Container 1    |  |   Container 2    |
+------------------+  +------------------+
|  Guest Kernel 1  |  |  Guest Kernel 2  |
+------------------+  +------------------+
| Lightweight VM 1 |  | Lightweight VM 2 |
+------------------+  +------------------+
|            Host Kernel + KVM           |
+----------------------------------------+
Benefits
- Container ease of use + VM-level isolation
- OCI compatible, uses existing container images
- Strong security in multi-tenant environments
- Directly usable as a Kubernetes runtime
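As a sketch of the Kubernetes integration: a RuntimeClass maps a runtime handler to Pods that request it. This assumes a node whose container runtime (e.g. containerd) already has a kata handler configured; the Pod name and image below are illustrative.

```yaml
# RuntimeClass exposing the kata handler to the cluster
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata
---
apiVersion: v1
kind: Pod
metadata:
  name: isolated-app
spec:
  runtimeClassName: kata   # this Pod's containers run inside a lightweight VM
  containers:
    - name: app
      image: nginx:alpine
```

Pods without runtimeClassName keep using the default runtime, so Kata can be adopted selectively for untrusted or multi-tenant workloads.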
Unikernels
Purpose-Built Minimal VMs
Unikernels are ultra-lightweight VMs that include only the OS functions required by the application.
Standard VM:
+----------------------------------+
|           Application            |
+----------------------------------+
|  Full OS (kernel + userspace)    |
|  - Filesystem, networking, procs |
|  - Unnecessary services, daemons |
|  - Large attack surface          |
+----------------------------------+
Size: Hundreds of MB to GB
Unikernel:
+----------------------------------+
|  Application + required OS only  |
|  - Minimal network stack         |
|  - Minimal memory management     |
|  - Single address space          |
+----------------------------------+
Size: A few MB to tens of MB
Key Unikernel Projects
| Project | Language | Features |
|---|---|---|
| MirageOS | OCaml | Academic, type-safe |
| Unikraft | C/C++ | Modular, POSIX compatible |
| Nanos | C | General purpose, easy to run |
| OSv | Java/C | JVM optimized |
Serverless GPU
On-Demand GPU Allocation
A paradigm for using GPUs at the function level without managing VMs or containers.
Traditional:
User --> Provision VM/Container --> Allocate GPU --> Run application
               (minutes)            (fixed alloc)
Serverless GPU:
User --> API Call --> Auto GPU alloc --> Execute function --> Auto GPU release
         (millisec)     (on-demand)                           (minimize cost)
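The cost difference can be sketched with back-of-envelope arithmetic. The prices below are hypothetical placeholders, not any provider's actual rates; the point is that bursty workloads pay only for busy seconds:

```python
# Hypothetical prices for illustration only
DEDICATED_PER_HOUR = 2.00    # $/hour for an always-on GPU instance
SERVERLESS_PER_SEC = 0.0010  # $/GPU-second billed per invocation

def monthly_cost(invocations_per_day: int, seconds_per_invocation: float):
    """Compare an always-on GPU instance to per-second serverless billing."""
    dedicated = DEDICATED_PER_HOUR * 24 * 30           # billed around the clock
    busy_seconds = invocations_per_day * seconds_per_invocation * 30
    serverless = SERVERLESS_PER_SEC * busy_seconds     # billed only while running
    return dedicated, serverless

# 500 inference calls a day, 3 GPU-seconds each
dedicated, serverless = monthly_cost(invocations_per_day=500, seconds_per_invocation=3)
print(f"dedicated ${dedicated:.0f}/mo vs serverless ${serverless:.0f}/mo")
```

At sustained high utilization the comparison flips, which is why dedicated instances remain the norm for long training runs while serverless GPU suits spiky inference.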
Key Services
| Service | Features |
|---|---|
| AWS Lambda (no GPU yet) | Serverless standard, GPU coming |
| Modal | Python native, GPU serverless |
| Banana | ML inference specialized |
| RunPod Serverless | GPU serverless containers |
Edge Computing Virtualization
Challenges
Virtualization at the edge faces unique challenges of resource constraints and management complexity.
        Cloud                     Edge
+--------------------+   +--------------------+
| Abundant resources |   | Limited resources  |
| Stable network     |   | Unstable connection|
| Central mgmt       |   | Distributed mgmt   |
| Physical security  |   | Physical risk      |
+--------------------+   +--------------------+
Edge-Suitable Virtualization Technologies
| Technology | Resource Needs | Startup Time | Suitability |
|---|---|---|---|
| QEMU/KVM (microVM) | Low | Under 100ms | High |
| Kata Containers | Low | Under 100ms | High |
| WebAssembly | Very low | Microseconds | Very high |
| Unikernels | Very low | Milliseconds | High |
| Standard VM | High | Seconds to minutes | Low |
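The suitability column above amounts to a simple decision rule. As an illustrative sketch (the thresholds are assumptions for demonstration, not measured limits):

```python
def pick_runtime(mem_mb: int, cold_start_ms: float, need_full_os: bool) -> str:
    """Pick an edge virtualization technology from rough resource constraints."""
    if need_full_os:
        return "QEMU/KVM (microVM)"  # full kernel, still sub-100ms start
    if cold_start_ms < 1:
        return "WebAssembly"         # microsecond starts, KB-MB footprint
    if mem_mb < 64:
        return "Unikernel"           # few-MB images, millisecond starts
    return "Kata Containers"         # OCI images with VM-level isolation

# A tiny sensor-processing function with a sub-millisecond start budget
print(pick_runtime(mem_mb=16, cold_start_ms=0.5, need_full_os=False))
```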
Green Computing and Virtualization
Energy Efficiency
Virtualization is a key technology for reducing energy consumption by increasing hardware utilization.
Without Virtualization:
+----------+  +----------+  +----------+  +----------+
| Server 1 |  | Server 2 |  | Server 3 |  | Server 4 |
|  Usage   |  |  Usage   |  |  Usage   |  |  Usage   |
|   15%    |  |   10%    |  |   20%    |  |    5%    |
+----------+  +----------+  +----------+  +----------+
Total: 4 servers, Average usage: 12.5%
With Virtualization:
+----------+
| Server 1 |
|  Usage   |
|   70%    | <-- Consolidate 4 servers
+----------+
Total: 1 server, Power savings: 60-75%
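The savings follow from the fact that servers draw most of their peak power even when idle. A common linear power model makes this concrete; the wattages below are illustrative, not measurements:

```python
# Linear server power model: P = P_idle + (P_max - P_idle) * utilization
P_IDLE, P_MAX = 100.0, 200.0  # illustrative wattages for one server

def power(util: float) -> float:
    return P_IDLE + (P_MAX - P_IDLE) * util

before = sum(power(u) for u in [0.15, 0.10, 0.20, 0.05])  # four lightly loaded hosts
after = power(0.70)                                       # one consolidated host
savings = 1 - after / before
print(f"{before:.0f} W -> {after:.0f} W ({savings:.0%} saved)")
```

With these numbers, four hosts drawing about 450 W collapse into one at about 170 W, roughly 62% saved, consistent with the 60-75% range quoted above.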
Future Technologies
- Workload-aware GPU power management
- Carbon-aware scheduling
- Batch jobs aligned with renewable energy availability
- AI-based VM placement optimization
Series Summary
Here is a summary of the topics covered in this series.
| Post | Topic | Key Content |
|---|---|---|
| 06 | KubeVirt | K8s native VM execution, CRDs, CDI, live migration |
| 07 | GPU Operator | GPU software stack automation, ClusterPolicy, MIG |
| 08 | KubeVirt + GPU | VM GPU passthrough/vGPU, VFIO, Sandbox Plugin |
| 09 | Platform Comparison | QEMU/VBox/VMware/KubeVirt comprehensive comparison |
| 10 | Future Tech | Confidential computing, GPU disaggregation, Wasm |
Conclusion
Virtualization technology is evolving beyond simple hardware abstraction into diverse directions: security (confidential computing), efficiency (GPU disaggregation), lightweight execution (Wasm/unikernels), and automation (KubeVirt/GPU Operator).
The explosive growth of AI/ML workloads makes GPU virtualization and disaggregation technologies increasingly important. In the Kubernetes ecosystem, the combination of KubeVirt and GPU Operator represents a key technology stack for preparing for this future.
Quiz: Future Virtualization Technology Knowledge Check
Q1. What data state does confidential computing protect?
A) At rest B) In transit C) In use D) At deletion
Answer: C) Confidential computing encrypts data in use at the hardware level, which was previously difficult to protect.
Q2. What is the key interconnect technology for GPU disaggregation?
A) USB 4.0 B) CXL (Compute Express Link) C) Thunderbolt 5 D) SATA Express
Answer: B) CXL provides low-latency, high-bandwidth connections between CPUs, GPUs, and memory, making it the key technology for GPU disaggregation.
Q3. What is the core feature of Kata Containers?
A) Sharing GPUs over the network B) Providing a WebAssembly runtime C) Running each container in a lightweight VM for isolation D) Ultra-lightweight OS based on unikernels
Answer: C) Kata Containers runs containers inside lightweight VMs, providing both container convenience and VM-level isolation.
Q4. Which virtualization technology is most suitable for edge computing?
A) Standard VM (full OS) B) VMware ESXi C) WebAssembly D) VirtualBox
Answer: C) WebAssembly is most suitable for resource-constrained edge environments with microsecond startup times and KB-level memory usage.