Chaos and Order

Introduction

The most intuitive question from a user's perspective is: "Where does a VM get its IP? Does CNI give the IP directly to the VM?" Understanding KubeVirt networking starts with a short answer to that question.

**By default, CNI first gives the network to the `virt-launcher` Pod. KubeVirt then configures TAP, bridge, DHCP, and NAT inside that Pod's network namespace to connect the guest NIC.**

In other words, the guest typically does not call CNI directly. KubeVirt **rewires** the network that CNI prepared for the Pod.

Why This Architecture Was Chosen

KubeVirt's design philosophy is to reuse Kubernetes and CNI as much as possible rather than build a separate network stack. `docs/components.md` and `docs/network/libvirt-pod-networking.md` explain this clearly.

With this architecture:

- Pod scheduling and network attachment are handled by Kubernetes as-is

- KubeVirt only needs to handle NIC wiring from the guest's perspective

- Compatibility with Services, NetworkPolicy, and the Pod network ecosystem improves

The core of KubeVirt networking is not a "VM-dedicated network system" but rather **guest-side adaptation of Pod networking**.

What Happens First: The Pod Receives a Network

When the `virt-launcher` Pod is created, CNI first attaches the Pod network. The default primary interface typically appears as `eth0`. Up to this point, it's not much different from a regular Pod.

What's important is that **KubeVirt has not yet created the guest NIC at this point**. First the Pod has its network, then KubeVirt adds guest-specific configuration inside that network namespace.

What `pkg/network/setup` Does

The core code is gathered in `pkg/network/setup/network.go`, `podnic.go`, `netpod/*`, `dhcp/*`, and `link/*`.

The flow visible here is:

1. Read the VMI's network spec and interface spec.

2. Find the actual interface name from the Pod.

3. Determine wiring based on binding method: bridge, masquerade, passt, SR-IOV, etc.

4. Create TAP devices if needed and generate the libvirt domain NIC spec.

5. Start the KubeVirt internal DHCP server if needed.

Network setup is the stage that prepares, inside the Pod namespace, how this VM's NIC will appear to the guest.
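The five steps above can be sketched as a small planning function. This is a minimal illustration, not KubeVirt's code: `NicPlan`, `plan_nic`, and the step names are hypothetical, and only the bridge and masquerade bindings are modeled because those are the ones this article describes in detail.

```python
from dataclasses import dataclass, field


@dataclass
class NicPlan:
    """Hypothetical summary of the wiring needed for one guest NIC."""
    network: str
    pod_interface: str
    binding: str
    steps: list = field(default_factory=list)


def plan_nic(network: str, binding: str, pod_interfaces: dict) -> NicPlan:
    """Return an ordered list of wiring steps for one guest NIC.

    pod_interfaces maps a logical network name to the interface name
    found inside the launcher Pod (step 2 in the list above).
    """
    if network not in pod_interfaces:
        raise LookupError(f"no Pod interface found for network {network!r}")
    plan = NicPlan(network, pod_interfaces[network], binding)

    # Step 3: the binding method decides the wiring.
    if binding not in ("bridge", "masquerade"):
        raise ValueError(f"binding {binding!r} is not modeled in this sketch")

    # Step 4: both modeled bindings use a bridge and a TAP device.
    plan.steps += ["create-bridge", "create-tap", "generate-domain-nic"]
    # Masquerade additionally hides the guest behind the Pod IP with NAT.
    if binding == "masquerade":
        plan.steps.append("configure-nat")
    # Step 5: the guest learns its address from KubeVirt's DHCP server.
    plan.steps.append("start-dhcp")
    return plan
```

Calling `plan_nic("default", "masquerade", {"default": "eth0"})` yields a plan bound to `eth0` whose steps include NAT, while the same call with `"bridge"` omits it.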

Why TAP Is Needed

The guest OS sends and receives packets through a virtual NIC. On the QEMU side, a TAP device is commonly used as the backend. Looking at `cmd/virt-chroot/tap-device-maker.go` and `pkg/network/setup/podnic.go`, it becomes clear that KubeVirt uses TAP device creation as an important primitive.

Simplified:

- The Pod-side interface or bridge serves as the host-side endpoint

- TAP serves as the QEMU-side endpoint

- The libvirt domain spec contains the configuration information connecting the two

TAP is one of the practical contact points between the guest and Pod network.
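To make the third bullet concrete, here is a sketch of how a libvirt domain NIC that points at a pre-created TAP device could be expressed. The element and attribute names follow libvirt's domain XML format for `ethernet` interfaces; the helper `tap_interface_xml`, the device name `tap0`, and the MAC address are assumptions for illustration, and the XML KubeVirt actually generates may differ.

```python
import xml.etree.ElementTree as ET


def tap_interface_xml(tap_name: str, mac: str) -> str:
    """Build a libvirt <interface> element backed by an existing TAP device."""
    iface = ET.Element("interface", {"type": "ethernet"})
    ET.SubElement(iface, "mac", {"address": mac})
    # managed="no" tells libvirt the TAP device was created by someone else
    # and should be used as-is rather than created by libvirt.
    ET.SubElement(iface, "target", {"dev": tap_name, "managed": "no"})
    ET.SubElement(iface, "model", {"type": "virtio"})
    return ET.tostring(iface, encoding="unicode")


if __name__ == "__main__":
    print(tap_interface_xml("tap0", "52:54:00:12:34:56"))
```

The design point is the split of responsibilities: something outside libvirt creates the TAP device, and the domain spec only references it by name.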

Why KubeVirt Runs DHCP Itself

In many cases, the guest does not talk directly to CNI. Instead, KubeVirt tells the guest its network settings. Looking at `pkg/network/setup/podnic.go`, `newDHCPConfigurator` creates a DHCP configurator for bridge or masquerade bindings, and calls `EnsureDHCPServerStarted`.

This means:

- Pod interface information lives on the host side, inside the Pod's network namespace

- KubeVirt calculates the IP and gateway the guest should receive

- KubeVirt delivers them to the guest via DHCP
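As a rough picture of what "delivering settings via DHCP" means, the sketch below packs a calculated guest address and gateway into the fields a DHCP reply carries. The option codes are the standard ones from RFC 2132; the function `build_dhcp_answer` and its dictionary layout are hypothetical and are not taken from KubeVirt's DHCP server.

```python
import ipaddress

# Standard DHCP option codes (RFC 2132).
OPT_SUBNET_MASK = 1
OPT_ROUTER = 3
OPT_DNS = 6
OPT_INTERFACE_MTU = 26


def build_dhcp_answer(guest_cidr: str, gateway: str, dns_servers, mtu: int) -> dict:
    """Turn calculated guest settings into a DHCP-style answer.

    guest_cidr is the guest address with prefix length, e.g. "10.0.2.2/24".
    """
    iface = ipaddress.ip_interface(guest_cidr)
    return {
        # "yiaddr" is the DHCP header field carrying the client's address.
        "yiaddr": str(iface.ip),
        "options": {
            OPT_SUBNET_MASK: str(iface.netmask),
            OPT_ROUTER: gateway,
            OPT_DNS: list(dns_servers),
            OPT_INTERFACE_MTU: mtu,
        },
    }
```

For example, `build_dhcp_answer("10.0.2.2/24", "10.0.2.1", ["10.96.0.10"], 1450)` describes a guest that receives `10.0.2.2` with a `255.255.255.0` mask; the DNS address and MTU here are placeholder values.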

Who Assigns the Guest IP

This needs to be distinguished by binding method.

Masquerade Binding

Looking at `pkg/network/dhcp/masquerade.go` and `pkg/network/link/address.go`, KubeVirt calculates the gateway and guest IP from an internal CIDR for the guest. For IPv4, if `VMNetworkCIDR` is absent, a default CIDR is used, and gateway and guest IP are determined from it.

In masquerade:

- Pod IP is still assigned by CNI to the Pod

- Guest IP is calculated by KubeVirt from an internal CIDR

- KubeVirt DHCP delivers it to the guest

- Egress traffic and forwarded inbound ports are handled via NAT

This is the most common "VM gets a private IP and hides behind the Pod IP" model.
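The address calculation can be sketched with Python's `ipaddress` module. The assumptions here are that the default IPv4 CIDR is `10.0.2.0/24` and that the gateway and guest take the first and second addresses of the network; both match commonly documented KubeVirt behavior but should be verified against `pkg/network/link/address.go` for the version in use. The function name is hypothetical.

```python
import ipaddress

# Assumed default when VMNetworkCIDR is not set.
DEFAULT_VM_NETWORK_CIDR = "10.0.2.0/24"


def masquerade_addresses(vm_network_cidr=None):
    """Return (gateway, guest_ip, prefix_length) for a masquerade guest."""
    network = ipaddress.ip_network(vm_network_cidr or DEFAULT_VM_NETWORK_CIDR)
    if network.num_addresses < 4:
        raise ValueError("CIDR too small to hold a gateway and a guest address")
    # Index 0 is the network address, so the first usable address is index 1.
    gateway = network[1]
    guest_ip = network[2]
    return gateway, guest_ip, network.prefixlen


if __name__ == "__main__":
    gateway, guest_ip, prefix = masquerade_addresses()
    print(f"gateway={gateway} guest={guest_ip}/{prefix}")
```

Note that the result does not depend on the Pod IP at all, which is why every masquerade guest can see the same internal address regardless of which Pod it runs in.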

Bridge Binding

Bridge binding attaches the guest more directly to the Pod network. Here, bridge and TAP are used to make the guest act as an endpoint of the Pod network. The core is that the guest has more **direct L2 or Pod network connectivity**.

The point of bridge binding is not "hiding the guest behind the Pod IP with NAT" but rather **exposing the guest more directly to the network**.

How NAT Is Implemented in Masquerade Binding

Looking at `pkg/network/setup/netpod/masquerade/masquerade.go`, KubeVirt configures `nftables`-based NAT rules for masquerade.

The important flow is:

- Create NAT table and chains

- Add masquerade rules for guest source address

- Add port forwarding or DNAT rules

- Handle loopback and migration-related exception ports

Masquerade is not simply "one bridge" but a combination of:

- Internal guest address calculation

- DHCP

- nftables NAT

- DNAT or SNAT for required ports
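To show how these pieces fit together, the sketch below generates an `nft` script containing a NAT table, a source masquerade rule for the guest address, and one DNAT rule per forwarded TCP port. The rule syntax is standard nftables; the chain names, the function `masquerade_nft_script`, and the choice to forward only listed TCP ports are assumptions, and the rules KubeVirt really installs (including the loopback and migration exceptions) are more involved.

```python
def masquerade_nft_script(guest_ip: str, pod_iface: str, tcp_ports) -> list:
    """Return nft script lines (suitable for `nft -f`) for a simple NAT setup."""
    lines = [
        "add table ip nat",
        "add chain ip nat prerouting { type nat hook prerouting priority -100; }",
        "add chain ip nat postrouting { type nat hook postrouting priority 100; }",
        # Traffic leaving the guest is rewritten to the Pod's own address.
        f"add rule ip nat postrouting ip saddr {guest_ip} counter masquerade",
    ]
    for port in tcp_ports:
        # Inbound traffic to the Pod on this port is redirected to the guest.
        lines.append(
            f'add rule ip nat prerouting iifname "{pod_iface}" '
            f"tcp dport {port} counter dnat to {guest_ip}:{port}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(masquerade_nft_script("10.0.2.2", "eth0", [22, 80])))
```

When debugging a real launcher Pod, comparing output like this against `nft list ruleset` is a quick way to see whether the masquerade and DNAT rules are present at all.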

Why Link and Interface Status Matter

KubeVirt must track which Pod interface each guest NIC is connected to. So `pkg/network/controllers/vmi.go` reads Multus network status from Pod annotations and updates the VMI status `Interfaces` field.

This status contains:

- Guest-side logical network name

- Pod interface name

- Whether the info source is Pod status, domain, or guest agent

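This status is the first place to look when debugging the network, because it shows which Pod interface backs each guest NIC and where the reported information came from.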
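The Multus side of this can be sketched as follows. Multus records attachment results in the `k8s.v1.cni.cncf.io/network-status` Pod annotation as a JSON list whose entries carry `name` and `interface` keys. The helper name and the sample annotation value are made up for illustration, and this is not how `pkg/network/controllers/vmi.go` is written.

```python
import json

NETWORK_STATUS_ANNOTATION = "k8s.v1.cni.cncf.io/network-status"


def pod_interfaces_by_network(annotations: dict) -> dict:
    """Map each attached network name to its interface name inside the Pod."""
    raw = annotations.get(NETWORK_STATUS_ANNOTATION, "[]")
    result = {}
    for entry in json.loads(raw):
        # Entries without an interface name are skipped rather than guessed.
        if entry.get("interface"):
            result[entry["name"]] = entry["interface"]
    return result


if __name__ == "__main__":
    sample = {
        NETWORK_STATUS_ANNOTATION: json.dumps([
            {"name": "cluster-default", "interface": "eth0", "default": True},
            {"name": "default/storage-net", "interface": "net1"},
        ])
    }
    print(pod_interfaces_by_network(sample))
```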

Operational Advantages of This Architecture

1. Does Not Significantly Break the Kubernetes Network Model

Pods are still attached through CNI.

2. Guest Configuration Can Be Controlled

Since KubeVirt directly handles DHCP and bridge or NAT, the guest-side experience can be made consistent.

3. Per-Binding Trade-offs Can Be Chosen

Bindings can be selected based on performance, reachability, service exposure, and migration suitability.

Common Misconceptions

Misconception 1: VMs Get IP Directly from CNI

No. By default, the `virt-launcher` Pod receives the network from CNI first, and KubeVirt then connects the guest on top of it.

Misconception 2: Pod IP and Guest IP Are Always the Same

No. Especially in masquerade, they differ. Pod IP is handled by CNI, guest IP by KubeVirt DHCP.

Misconception 3: KubeVirt Networking Is Done by libvirt Internal Features Alone

No. Pod namespace manipulation, TAP, DHCP, nftables, and netlink are all needed together.

Debugging Checklist

- Check Pod IP existence and guest IP existence separately.

- For masquerade, check guest CIDR, DHCP, nftables rules.

- For bridge, check Pod interface, bridge, TAP, guest DHCP path.

- Check whether VMI status `Interfaces` match the actual interface names on the launcher Pod.
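The last check can be automated with a few lines. This sketch assumes the VMI status entries and the launcher Pod's interface names have already been collected (for example from `kubectl get vmi -o json` and from `ip link` inside the Pod); the dictionary keys `name` and `pod_interface` are hypothetical stand-ins for whatever field names the KubeVirt version in use exposes.

```python
def find_interface_mismatches(status_interfaces, launcher_interfaces) -> list:
    """Report guest NICs whose recorded Pod interface is missing.

    status_interfaces: list of dicts with "name" and "pod_interface" keys.
    launcher_interfaces: interface names that exist in the launcher Pod.
    """
    present = set(launcher_interfaces)
    problems = []
    for nic in status_interfaces:
        pod_iface = nic.get("pod_interface")
        if not pod_iface:
            problems.append(f"{nic['name']}: no Pod interface recorded")
        elif pod_iface not in present:
            problems.append(f"{nic['name']}: {pod_iface} not found on launcher Pod")
    return problems
```

An empty result means every guest NIC in the status maps to an interface that actually exists in the Pod; any other result points at the NIC to investigate first.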

Conclusion

The core of KubeVirt networking is not that the VM uses CNI directly, but that KubeVirt rewires the network the Pod received first so that it suits the guest. IP assignment therefore differs by binding: CNI assigns the Pod IP, while KubeVirt's DHCP and its internal bridge or NAT logic can determine the guest IP. With this principle in mind, the differences between the bridge, masquerade, passt, and SR-IOV bindings become much easier to follow.

In the next article, we will compare the characteristics and trade-offs of those bindings.