VMI Status, Metrics, Guest Agent, Debugging: How KubeVirt Exposes Internal State
- Author: Youngju Kim (@fjvbn20031)
- Introduction
- 1. VMI Status Is the Most Important Operational Surface
- 2. Phase Alone Is Insufficient -- Conditions Must Be Checked Together
- 3. activePods Is Especially Important During Migration
- 4. Network Status Combines Pod Annotation and Guest Information
- 5. Guest Agent Is the Window into Guest Internal Information
- 6. Domain Stats Is the Middle Layer Between Host and Guest Observability
- 7. Prometheus Metrics Largely Come from virt-handler
- 8. What Is Reduced When Guest Agent Is Absent
- 9. Debugging Must Separate Control Plane, Node, and Guest
- 10. Status Does Not Always Immediately Reflect Reality
- Key Points for Operators
- Conclusion
Introduction
When operating KubeVirt, the hardest question is: "How alive is this VM right now?" The Pod may be Running while the guest has hung. The guest may be healthy while a migration is about to fail. That is why KubeVirt collects state from several layers rather than one:
- Kubernetes object state
- libvirt domain state
- Guest internal information reported by the guest agent
- Network status and migration status
- Prometheus metrics
This post examines how these observation layers connect.
1. VMI Status Is the Most Important Operational Surface
The VirtualMachineInstanceStatus type in staging/src/kubevirt.io/api/core/v1/types.go carries much of the information operators want to see, including these fields:
- phase
- conditions
- interfaces
- guestOSInfo
- migrationState
- qosClass
- activePods
- selinuxContext
- memory
- currentCPUTopology
Reading just this type reveals that KubeVirt does not view state as simply "on or off." VM state is the combined result of Kubernetes phase, guest internal information, migration progress, and network interface status.
2. Phase Alone Is Insufficient -- Conditions Must Be Checked Together
phase summarizes the high-level flow. Values like Pending, Scheduling, Scheduled, Running, Succeeded, Failed, and Unknown show the general direction.
But actual operational decisions come from conditions, reason, and message. The API types predefine conditions and reasons such as:
- LiveMigratable
- StorageLiveMigratable
- MigrationRequired
- EvictionRequested
- DataVolumesReady
- DisksNotLiveMigratable
- InterfaceNotLiveMigratable
- HostDeviceNotLiveMigratable
- SEVNotLiveMigratable
- SecureExecutionNotLiveMigratable
KubeVirt does not just report that live migration is impossible; it standardizes the reason in the API types.
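Reading phase together with conditions can be sketched as a small helper. This is a minimal sketch, assuming the VMI has been fetched as JSON (for example with kubectl get vmi NAME -o json) and its status loaded into a Python dict. The function name summarize_vmi and the sample data are hypothetical and not part of KubeVirt.

```python
from typing import Any


def summarize_vmi(status: dict[str, Any]) -> dict[str, Any]:
    """Combine phase with the LiveMigratable condition from a VMI status dict."""
    conditions = {c["type"]: c for c in status.get("conditions", [])}
    live = conditions.get("LiveMigratable")
    return {
        "phase": status.get("phase", "Unknown"),
        # A missing condition is reported as None, because "not yet known"
        # is different from "not migratable".
        "live_migratable": None if live is None else live.get("status") == "True",
        "reason": None if live is None else live.get("reason"),
        "message": None if live is None else live.get("message"),
    }


# Hypothetical sample: a Running VMI that cannot be live migrated.
sample_status = {
    "phase": "Running",
    "conditions": [
        {"type": "Ready", "status": "True"},
        {
            "type": "LiveMigratable",
            "status": "False",
            "reason": "DisksNotLiveMigratable",
            "message": "illustrative message text",
        },
    ],
}

summary = summarize_vmi(sample_status)
print(summary["phase"], summary["live_migratable"], summary["reason"])
```

The phase says Running, yet the condition and its reason explain why a drain of this node would not be able to move the VM.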
3. activePods Is Especially Important During Migration
VirtualMachineInstanceStatus.ActivePods is a mapping of pod UIDs to node names. As noted in the comments, during migration, multiple Pods can be associated with a single VMI simultaneously.
This field tells you which virt-launcher Pod is currently the source and which is the target. In practice, confusion about migration timing often starts here. During a brief window, what looks like a single VM is backed by both a source and a target launcher from the control plane's perspective.
In other words, activePods is a hidden key field in migration debugging.
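One way to read activePods during a migration is to match its node names against migrationState. The sketch below assumes a status dict loaded from the VMI's JSON, with activePods as a map of pod UID to node name and migrationState carrying sourceNode and targetNode. The function name and the sample UIDs are hypothetical.

```python
def classify_active_pods(status):
    """Label each entry of status.activePods as source, target, or unknown."""
    migration = status.get("migrationState") or {}
    source_node = migration.get("sourceNode")
    target_node = migration.get("targetNode")
    roles = {}
    for pod_uid, node in (status.get("activePods") or {}).items():
        # Matching is done by node name only, so this sketch cannot tell two
        # launcher Pods apart if they happen to run on the same node.
        if target_node and node == target_node:
            roles[pod_uid] = "target"
        elif source_node and node == source_node:
            roles[pod_uid] = "source"
        else:
            roles[pod_uid] = "unknown"
    return roles


# Hypothetical sample taken in the middle of a migration.
migrating_status = {
    "activePods": {"uid-a": "node-1", "uid-b": "node-2"},
    "migrationState": {"sourceNode": "node-1", "targetNode": "node-2"},
}

roles = classify_active_pods(migrating_status)
print(roles)
```

Outside a migration, migrationState may be absent, in which case every Pod is reported as unknown rather than guessed.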
4. Network Status Combines Pod Annotation and Guest Information
Looking at pkg/network/controllers/vmi.go, VMI status interfaces do not come from just one source.
- Pod Multus network status is read for pod interface names
- Primary and secondary interfaces are calculated
- Existing status entries not in the spec are also preserved
The API type VirtualMachineInstanceNetworkInterface contains:
- Guest IP
- MAC
- Network name
- Pod interface name
- VM internal interface name
- Info source
In particular, infoSource distinguishes whether information came from the guest-agent, domain, or multus-status. Thanks to this design, operators can determine "whether this IP is a value reported from inside the guest or a value reported by CNI."
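The provenance check described above can be sketched as follows. This assumes the interfaces list from the VMI's JSON status, with name, ipAddress, and infoSource keys, and assumes that infoSource may hold several comma-separated sources. The function name and sample values are hypothetical.

```python
def ip_provenance(interfaces):
    """Map each interface name to its IP and the set of sources that reported it."""
    result = {}
    for iface in interfaces:
        raw = iface.get("infoSource", "")
        # Assumption: multiple sources are joined with commas.
        sources = {part.strip() for part in raw.split(",") if part.strip()}
        result[iface.get("name", "")] = {
            "ip": iface.get("ipAddress"),
            "sources": sources,
            "guest_reported": "guest-agent" in sources,
        }
    return result


# Hypothetical sample: one interface seen by the guest agent, one only by Multus.
sample_interfaces = [
    {"name": "default", "ipAddress": "10.0.0.5", "infoSource": "domain, guest-agent"},
    {"name": "secondary", "ipAddress": "192.168.1.7", "infoSource": "multus-status"},
]

provenance = ip_provenance(sample_interfaces)
for name, info in provenance.items():
    print(name, info["ip"], sorted(info["sources"]), info["guest_reported"])
```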
5. Guest Agent Is the Window into Guest Internal Information
Looking at the DomainManager interface in pkg/virt-launcher/virtwrap/manager.go, there are quite a few guest-related methods.
- GetGuestInfo
- GetUsers
- GetFilesystems
- GetGuestOSInfo
- GuestPing
This is an important signal. KubeVirt considers libvirt and QEMU level state alone insufficient, and separately collects information from inside the OS via the guest agent.
pkg/virt-handler/rest/lifecycle.go receives this data through the launcher client and exposes it as API responses. In other words, the guest information operators see ultimately passes through:
- virt-handler REST endpoint
- Launcher client RPC
- virt-launcher internal domain manager
- QEMU guest agent
6. Domain Stats Is the Middle Layer Between Host and Guest Observability
The same DomainManager interface also has GetDomainStats and GetDomainDirtyRateStats. This means it pulls domain-level statistics reported by libvirt separately from the guest agent.
Much of this layer stays visible even when the guest agent inside the guest does not respond:
- CPU usage
- Memory state
- Block I/O
- Network traffic
- Dirty page rate
In other words, the guest agent tells you the meaning inside the guest, while domain stats tells you the execution facts observed by the hypervisor. They are not competitors but complementary.
7. Prometheus Metrics Largely Come from virt-handler
Looking at pkg/monitoring/metrics/virt-handler/domainstats, there are collectors that convert domain statistics like CPU, memory, block, and vcpu into Prometheus metrics.
This structure is quite practical.
- The closest point to the actual VM process is the node.
- Collecting domain stats is easiest from the node.
- So metrics export is also attached close to virt-handler.
In other words, in KubeVirt's observability model the node-local agent, not the central controller, collects most of the execution facts.
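To look at what a node exports, one option is to scrape the metrics endpoint and group the VM-related series. The sketch below parses Prometheus text exposition format with the standard library only. The kubevirt_vmi_ prefix, the metric names, and the labels in the sample text are illustrative assumptions; actual names depend on the KubeVirt version.

```python
def group_vmi_metrics(exposition_text, prefix="kubevirt_vmi_"):
    """Group sample values by metric name from Prometheus text exposition format."""
    grouped = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumes samples carry no trailing timestamp, as in the text below.
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]
        if name.startswith(prefix):
            grouped.setdefault(name, []).append(float(value))
    return grouped


# Illustrative sample text, not captured from a real cluster.
sample_text = """
# TYPE kubevirt_vmi_memory_available_bytes gauge
kubevirt_vmi_memory_available_bytes{name="vm-a",namespace="default"} 2147483648
kubevirt_vmi_vcpu_seconds_total{name="vm-a",namespace="default",id="0"} 12.5
kubevirt_vmi_vcpu_seconds_total{name="vm-a",namespace="default",id="1"} 7.5
go_goroutines 42
"""

grouped = group_vmi_metrics(sample_text)
for metric_name, values in grouped.items():
    print(metric_name, values)
```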
8. What Is Reduced When Guest Agent Is Absent
A VM starts fine without a guest agent. But operators lose a significant amount of meaningful information:
- Guest internal user list
- Filesystem list
- OS pretty name
- Interface names and some guest IP information
In other words, the guest agent is not a required boot dependency but an extension layer that enriches operational visibility and automation.
Therefore, in situations where "Pod is normal but VM internals are not visible," guest agent installation and connection status should be suspected first.
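That first check can be sketched against the VMI status. The sketch assumes the status dict exposes an AgentConnected condition and a guestOSInfo.prettyName field; the function name and the returned labels are hypothetical.

```python
def guest_agent_visibility(status):
    """Classify how much guest-internal information the VMI status carries."""
    conditions = {c["type"]: c.get("status") for c in status.get("conditions", [])}
    connected = conditions.get("AgentConnected") == "True"
    has_os_info = bool((status.get("guestOSInfo") or {}).get("prettyName"))
    if connected and has_os_info:
        return "guest-visible"
    if connected:
        # The agent is connected but has not reported OS details (yet).
        return "agent-connected-no-os-info"
    return "agent-not-connected"


# Hypothetical samples.
with_agent = {
    "conditions": [{"type": "AgentConnected", "status": "True"}],
    "guestOSInfo": {"prettyName": "Example Linux 1.0"},
}
without_agent = {"conditions": [{"type": "Ready", "status": "True"}]}

print(guest_agent_visibility(with_agent))
print(guest_agent_visibility(without_agent))
```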
9. Debugging Must Separate Control Plane, Node, and Guest
The most common mistake when looking at KubeVirt problems is mixing layers. It is better to split them as follows.
What to look at in the control plane
- VMI phase
- conditions
- migrationState
- activePods
- Events and migration CR status
What to look at on the node
- virt-handler logs
- virt-launcher logs
- libvirt domain state
- Domain stats
- Pod network and TAP state
What to look at in the guest
- QEMU guest agent response status
- Guest OS info
- Users
- Filesystems
- Actual service health
In other words, KubeVirt debugging is ultimately the work of distinguishing "which layer's truth am I looking at?"
10. Status Does Not Always Immediately Reflect Reality
As explicitly noted in the API type comments, VirtualMachineInstanceStatus can lag behind the actual system state. This is a very important operational point.
Because status is updated through informers, controllers, launcher, libvirt, and guest agent, in very brief moments:
- The Pod may have already changed but status is delayed
- The migration target is up but phase still has the old value
- The guest agent is dead but the domain shows Running
In other words, KubeVirt does not offer a strongly consistent view of state; operators must combine several observation surfaces to reach a judgment.
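Because a single read of status can be stale, automation may prefer to act only after several consecutive reads agree. The helper below is a generic sketch of that idea and is not a KubeVirt API. The read callable, the thresholds, and the sample readings are all hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for_stable(
    read: Callable[[], T],
    required: int = 3,
    attempts: int = 20,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Return a value once `read` yields it `required` times in a row, else None."""
    last = None
    streak = 0
    for attempt in range(attempts):
        value = read()
        if streak and value == last:
            streak += 1
        else:
            last, streak = value, 1
        if streak >= required:
            return value
        if attempt < attempts - 1:
            sleep(interval)
    return None


# Hypothetical phase readings that flap before settling.
readings = iter(["Scheduled", "Running", "Scheduled", "Running", "Running", "Running"])
stable_phase = wait_for_stable(
    lambda: next(readings), required=3, attempts=6, sleep=lambda _seconds: None
)
print(stable_phase)
```

In real use, read would fetch the VMI again and return whichever field the automation depends on; the sleep parameter is injected here only so the example runs without waiting.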
Key Points for Operators
- phase alone is insufficient.
- conditions, reason, and migrationState must be checked together.
- activePods is important for reading source and target Pods during migration.
- Network status is the combined result of Multus, domain, and guest-agent information.
- Guest agent and domain stats are not substitutes but complements.
Conclusion
KubeVirt's observability is built by combining information from multiple layers, not from a single state value. VMI status shows the current state from a Kubernetes resource perspective, the guest agent reveals meaning inside the guest, and domain stats together with Prometheus metrics expose the actual execution data plane. Therefore, operating KubeVirt is less about asking "is the VM up" and more about distinguishing "which signal broke at which layer."
In the next post, we will use this observation model to organize actual failure modes such as drain, eviction, migration failure, and non-migratable conditions.