Authors
- Youngju Kim (@fjvbn20031)

Contents
- Introduction
- Why Guest Resources and Pod Resources Differ
- The API Schema Already Reveals This Problem
- Who Calculates Launcher Pod Resources?
- Why Is Memory Overhead Important?
- Why Can CPU Topology Differ from Guest Numbers?
- Why Is NUMA in the API?
- What Changes with HugePages?
- How Are Migration and the Resource Model Connected?
- Common Misconceptions
- What Operators Should Look at First
- Conclusion
Introduction
The VM resource model is more demanding than that of containers. A user might want to give 4 vCPUs and 16 GiB of memory to a guest, but in practice, the launcher Pod, QEMU overhead, emulator threads, hugepages, and NUMA locality must all be considered together. KubeVirt bridges this gap by managing both the guest resource model and the Pod resource model simultaneously.
This post focuses on staging/src/kubevirt.io/api/core/v1/schema.go, pkg/virt-controller/services/template.go, and pkg/virt-launcher/virtwrap/manager.go to examine this resource translation layer.
Why Guest Resources and Pod Resources Differ
For containers, the resources a process consumes usually correspond directly to the Pod's resources. VMs are different: a running VM's footprint includes
- Memory as seen by the guest
- Additional memory consumed by QEMU and virtualization infrastructure
- I/O threads and emulator threads
- Page table, device emulation, virtio queue overhead
Because of these factors, the launcher Pod may need to request more memory than the guest memory.
If KubeVirt did not handle this, the scheduler would place VMs too optimistically, and nodes would end up overcommitted.
The API Schema Already Reveals This Problem
Looking at schema.go, the CPU- and memory-related fields are quite rich: CPU, CPUTopology, NUMA, Hugepages, and MemoryOverhead.
This means KubeVirt does not settle for a simple abstraction of "give me X CPUs." It is a system that aims to address execution performance and placement stability through the API.
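To make the shape of this API concrete, here is a deliberately simplified Go sketch of these fields. It mirrors the spirit of schema.go but trims most fields and uses plain types (uint64 bytes instead of resource.Quantity); it is not the upstream definition.

```go
package main

import "fmt"

// Simplified echoes of the schema.go types; the real API has many more
// fields and uses resource.Quantity for sizes.
type CPU struct {
	Sockets, Cores, Threads uint32
	DedicatedCPUPlacement   bool
	NUMA                    *NUMA
}

type NUMA struct {
	GuestMappingPassthrough bool
}

type Hugepages struct {
	PageSize string // e.g. "2Mi" or "1Gi"
}

type Memory struct {
	GuestBytes uint64
	Hugepages  *Hugepages
}

// VCPUs is the total number of guest vCPUs implied by the topology.
func (c CPU) VCPUs() uint32 {
	return c.Sockets * c.Cores * c.Threads
}

func main() {
	cpu := CPU{Sockets: 2, Cores: 2, Threads: 1, DedicatedCPUPlacement: true}
	mem := Memory{GuestBytes: 16 << 30, Hugepages: &Hugepages{PageSize: "1Gi"}}
	fmt.Printf("vCPUs=%d hugepage=%s guest=%d GiB\n",
		cpu.VCPUs(), mem.Hugepages.PageSize, mem.GuestBytes>>30)
}
```

Even in this trimmed form, the key point is visible: the API describes guest hardware (topology, page size), not Pod resources.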
Who Calculates Launcher Pod Resources?
This role is primarily handled by pkg/virt-controller/services/template.go. Here, CalculateMemoryOverhead is called, and the actual resource requests and limits needed for the launcher Pod are generated.
The key points are:
- It does not reflect only guest memory
- It adds virtualization infrastructure overhead
- It can also account for additional memory required by network binding plugins
- Depending on whether hugepages are used, the type of Pod resource itself changes
In other words, the VMI spec does not directly become a Pod spec. There is a resource adjustment stage in between.
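That adjustment stage can be pictured as a pure function over the inputs described above. The function and variable names below are illustrative, not KubeVirt's actual code; the point is only that the Pod request is derived, not copied.

```go
package main

import "fmt"

const mib = 1 << 20

// buildLauncherMemoryRequest is an illustrative stand-in for the adjustment
// stage in template.go: the Pod's memory request is the guest memory plus
// virtualization overhead plus any extra memory a network binding plugin
// declares. All names and numbers here are hypothetical.
func buildLauncherMemoryRequest(guestBytes, virtOverheadBytes, bindingPluginBytes uint64) uint64 {
	return guestBytes + virtOverheadBytes + bindingPluginBytes
}

func main() {
	guest := uint64(8 << 30)      // 8 GiB guest memory
	overhead := uint64(300 * mib) // virtualization overhead (illustrative)
	plugin := uint64(50 * mib)    // binding plugin extra memory (illustrative)
	req := buildLauncherMemoryRequest(guest, overhead, plugin)
	fmt.Printf("guest=%d MiB, pod request=%d MiB\n", guest/mib, req/mib)
}
```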
Why Is Memory Overhead Important?
schema.go has a MemoryOverhead description, and template.go also handles memory overhead through annotations and status. The migration state even has a separate target memory overhead.
This matters in practice. If the guest sees 8 GiB and the launcher Pod also requests only 8 GiB:
- It becomes vulnerable to node pressure
- QEMU or auxiliary threads may hit OOM
- Resource calculations may also be off on the migration target
In other words, KubeVirt views "guest memory" and "launcher envelope memory" as separate things.
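As a rough mental model, the launcher envelope is guest memory plus several overhead terms: a fixed base for the QEMU/libvirt/launcher processes, a per-vCPU cost, and page-table overhead that scales with guest memory. The constants below are illustrative, not KubeVirt's actual values.

```go
package main

import "fmt"

const mib = 1 << 20

// estimateOverhead sketches how a memory-overhead calculation like the one
// behind template.go can be composed from parts. Constants are illustrative.
func estimateOverhead(guestBytes uint64, vcpus uint32) uint64 {
	base := uint64(256 * mib)                // launcher/libvirt/QEMU processes (illustrative)
	perVCPU := uint64(8*mib) * uint64(vcpus) // per-vCPU structures (illustrative)
	pagetables := guestBytes / 512           // ~1 byte of page tables per 512 bytes of RAM
	return base + perVCPU + pagetables
}

func main() {
	guest := uint64(8 << 30) // the guest sees 8 GiB
	over := estimateOverhead(guest, 4)
	// The launcher envelope must cover guest memory plus the overhead.
	fmt.Printf("guest=8192 MiB, overhead=%d MiB, envelope=%d MiB\n",
		over/mib, (guest+over)/mib)
}
```

Even with toy numbers, the envelope comes out larger than 8 GiB, which is exactly the gap the scheduler must be told about.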
Why Can CPU Topology Differ from Guest Numbers?
KubeVirt's CPUTopology represents sockets, cores, and threads. However, what the Kubernetes scheduler sees is ultimately the launcher Pod's CPU request and limit.
The important case here is dedicated CPU. When dedicated CPU is requested:
- CPU pinning is needed
- The launcher Pod requires stricter resource guarantees
- The migration target must also find a node with a suitable CPU topology
Looking at UpdateVCPUs in manager.go, when dedicated CPU is active, it reads the domain spec and the Pod's cpuset and calls PinVcpuFlags and PinEmulator. This is not a simple quota issue but a pCPU placement problem.
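Conceptually, the pinning step assigns each vCPU to one pCPU taken from the Pod's cpuset, which is the kind of mapping UpdateVCPUs ultimately hands to PinVcpuFlags. The helper below is a simplified illustration of that 1:1 assignment, not the real implementation.

```go
package main

import "fmt"

// pinVCPUs maps vCPU i to the i-th pCPU in the Pod's cpuset. The real code
// in manager.go reads the domain spec and calls libvirt to apply the pins;
// this only sketches the assignment idea.
func pinVCPUs(vcpus int, cpuset []int) (map[int]int, error) {
	if len(cpuset) < vcpus {
		return nil, fmt.Errorf("cpuset has %d pCPUs, need %d", len(cpuset), vcpus)
	}
	pins := make(map[int]int, vcpus)
	for v := 0; v < vcpus; v++ {
		pins[v] = cpuset[v]
	}
	return pins, nil
}

func main() {
	// A 4-vCPU guest pinned onto the pCPUs the Pod was granted.
	pins, err := pinVCPUs(4, []int{12, 13, 44, 45})
	if err != nil {
		panic(err)
	}
	for v := 0; v < 4; v++ {
		fmt.Printf("vCPU %d -> pCPU %d\n", v, pins[v])
	}
}
```

The error branch is the interesting part: if the cpuset is smaller than the vCPU count, no amount of CPU quota can make pinning work.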
Why Is NUMA in the API?
The NUMA and NUMAGuestMappingPassthrough descriptions in schema.go are telling: KubeVirt models the guest NUMA topology so that it can be aligned with host CPU pinning.
The reason this matters is performance.
- When NUMA locality matches, memory access latency decreases
- If CPU and memory are scattered across different NUMA nodes, performance can fluctuate
- Combined with device passthrough, this becomes even more sensitive
In other words, KubeVirt treats NUMA not as an "advanced option" but as an essential topology constraint for high-performance VM operations.
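A toy illustration of what alignment means: given which host NUMA node each pinned pCPU belongs to, a guest NUMA cell should contain only vCPUs whose pCPUs share one host node. The function and names below are hypothetical, not KubeVirt code.

```go
package main

import "fmt"

// cellsByHostNode groups pinned vCPUs into guest NUMA cells so that every
// cell maps onto exactly one host NUMA node. pins maps vCPU -> pCPU and
// hostNodeOf maps pCPU -> host NUMA node. A hypothetical sketch.
func cellsByHostNode(pins map[int]int, hostNodeOf map[int]int) map[int][]int {
	cells := map[int][]int{}
	for vcpu, pcpu := range pins {
		node := hostNodeOf[pcpu]
		cells[node] = append(cells[node], vcpu)
	}
	return cells
}

func main() {
	pins := map[int]int{0: 12, 1: 13, 2: 44, 3: 45}
	hostNodeOf := map[int]int{12: 0, 13: 0, 44: 1, 45: 1}
	for node, vcpus := range cellsByHostNode(pins, hostNodeOf) {
		fmt.Printf("host node %d backs vCPUs %v\n", node, vcpus)
	}
}
```

If the pCPUs granted to the Pod straddle host nodes in a way that cannot be split into clean cells, the guest ends up with cross-node memory access, which is exactly the latency problem above.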
What Changes with HugePages?
When HugePages are enabled, guest memory is backed by a different resource class than regular pages, and schema.go and template.go reflect the hugepage page size as a distinct Pod resource.
This means:
- Guest memory policy directly affects the Pod scheduling resource type
- If the node does not have the corresponding hugepage pool, scheduling itself may fail
- Free page reporting and some memory feature behaviors may also change
In other words, hugepages are not a "performance checkbox" but a choice that changes the entire scheduling and kernel memory model.
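In Kubernetes, hugepages appear as a distinct resource named by page size (hugepages-2Mi, hugepages-1Gi), so a missing pool on the node makes the Pod unschedulable. The helper names below are illustrative; the resource naming convention is Kubernetes' own.

```go
package main

import "fmt"

// hugepagesResourceName builds the Kubernetes resource name for a given
// page size, e.g. "2Mi" -> "hugepages-2Mi".
func hugepagesResourceName(pageSize string) string {
	return "hugepages-" + pageSize
}

// fits reports whether a node's allocatable pool can back the guest memory
// with the requested page size. Illustrative helper, not KubeVirt code.
func fits(allocatable map[string]uint64, pageSize string, guestBytes uint64) bool {
	return allocatable[hugepagesResourceName(pageSize)] >= guestBytes
}

func main() {
	nodeA := map[string]uint64{"hugepages-1Gi": 16 << 30, "memory": 64 << 30}
	nodeB := map[string]uint64{"memory": 64 << 30} // no hugepage pool configured
	fmt.Println(fits(nodeA, "1Gi", 8<<30)) // true
	fmt.Println(fits(nodeB, "1Gi", 8<<30)) // false: plenty of RAM, still unschedulable
}
```

Note that nodeB has 64 GiB of ordinary memory free and still fails: the resource class, not the amount, is what blocks scheduling.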
How Are Migration and the Resource Model Connected?
During migration, the target node must accommodate the same guest as the source. However, with dedicated CPU, NUMA, hugepages, and memory overhead, the target conditions become much more demanding.
In practice, the migration status includes:
- Target node topology
- Target memory overhead
This shows that migration is not simply a "move to any empty node" operation, but rather a task of moving to a node that can maintain the same performance characteristics.
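Putting the pieces together, target selection can be thought of as a conjunction of the same constraints discussed above. The sketch below is hypothetical (KubeVirt's actual migration logic lives elsewhere); it only shows why the target check is stricter than "has free memory".

```go
package main

import "fmt"

// node is a toy view of the scheduling-relevant state of a migration target.
type node struct {
	freeDedicatedCPUs int
	hugepages1GiBytes uint64
	freeMemoryBytes   uint64
}

// canHost checks whether a target can take over a VM with the same
// performance characteristics: enough pCPUs to pin, a matching hugepage
// pool for guest memory, and room for the target memory overhead.
// A hypothetical sketch, not KubeVirt's migration code.
func canHost(n node, vcpus int, guestBytes, targetOverheadBytes uint64) bool {
	return n.freeDedicatedCPUs >= vcpus &&
		n.hugepages1GiBytes >= guestBytes &&
		n.freeMemoryBytes >= targetOverheadBytes
}

func main() {
	target := node{freeDedicatedCPUs: 4, hugepages1GiBytes: 16 << 30, freeMemoryBytes: 1 << 30}
	fmt.Println(canHost(target, 4, 8<<30, 512<<20)) // true
	fmt.Println(canHost(target, 8, 8<<30, 512<<20)) // false: not enough pCPUs to pin
}
```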
Common Misconceptions
Misconception 1: If the guest has 8 GiB, 8 GiB for the Pod is sufficient
No. There is virtualization overhead, and binding plugins or auxiliary features can consume additional memory.
Misconception 2: If CPU requests match, dedicated CPU is fine
No. Pinning, topology, and cpuset must all align.
Misconception 3: NUMA and hugepages are just performance tuning options
No. They also change scheduling conditions and migration feasibility.
What Operators Should Look at First
- Distinguish between guest memory and launcher memory overhead.
- If dedicated CPU is used, check cpuset and pinning paths.
- If hugepages are requested, verify the node's hugepage pool first.
- If migration fails, check the target node topology and target memory overhead.
Conclusion
KubeVirt does not simply pass VMI CPU and memory requests as Pod requests. Instead, it adds memory overhead and reflects dedicated CPU, NUMA, and hugepages to align both the launcher Pod and guest hardware model. Thanks to this structure, the Kubernetes scheduler can place VMs reasonably correctly, and guests get more predictable performance characteristics.
In the next post, we will look at the host primitives that make this resource model possible: kernel technologies such as /dev/kvm, namespaces, cgroups, TAP, and netlink.