Live Migration 1: From Migration CRD to Target Pod Creation

Introduction
Migration Is a Work Object, Not an API Action
What the Migration Controller Watches
Why a Priority Queue Is Used
How Is the Target Pod Created?
Why Are There So Many Timeouts?
Where Are Migration Policy and Cluster Config Reflected?
Why Storage and Quota Enter the Migration Control Plane
- Storage
- Quota
When Does Handoff Occur?
Automatic Migration Requests and Dynamic Network Changes
Common Misconceptions
What Operators Should Check First
Conclusion

Introduction

When learning how live migration works, many people jump straight to pre-copy or dirty pages. But a more basic set of questions comes first: who creates the target Pod, when is the handoff triggered on the source VM, and which policies decide whether a migration is allowed to start? These are control plane problems, not data plane ones.

This post focuses on pkg/virt-controller/watch/migration/migration.go to examine the orchestration from when a migration CR is created until the target Pod is ready.

Migration Is a Work Object, Not an API Action

As seen in the previous post, when a user requests migration, virt-api does not perform the move directly. Instead, it creates a VirtualMachineInstanceMigration object.

From that point on, the migration controller in virt-controller watches this object. The advantages of this design are clear.

The work is tracked independently
The controller processes it asynchronously
Failure, pending, and abort can be handled as separate lifecycles

In other words, a migration is not a "function call" but an orchestration driven by a Kubernetes object.
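To make the "work object" idea concrete, here is a minimal sketch of what such an object looks like when written as a plain Python dict. The `apiVersion`, `kind`, and `spec.vmiName` values follow the KubeVirt API; the helper name `build_migration_manifest` and the names `my-vmi` and `default` are assumptions made only for illustration.

```python
def build_migration_manifest(vmi_name: str, namespace: str) -> dict:
    """Return a VirtualMachineInstanceMigration manifest as a plain dict.

    Creating this object is all the "request" does. The actual move is
    carried out later by controllers that watch for it.
    """
    return {
        "apiVersion": "kubevirt.io/v1",
        "kind": "VirtualMachineInstanceMigration",
        "metadata": {
            # generateName lets the API server pick a unique suffix.
            "generateName": f"{vmi_name}-migration-",
            "namespace": namespace,
        },
        # The spec only names the VMI to migrate; it carries no procedure.
        "spec": {"vmiName": vmi_name},
    }


# Hypothetical VMI name and namespace, for illustration only.
manifest = build_migration_manifest("my-vmi", "default")
```

Note that the object contains no instructions about how to migrate. It only records the intent, which is what allows the controller to track, retry, or abort the work independently.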

What the Migration Controller Watches

The NewController signature shows that the migration controller takes quite a few informers and stores.

VMI informer
Pod informer
Migration informer
Node store
PVC store
Storage class store
Storage profile store
Migration policy store
Resource quota informer
KubeVirt CR store

Why so many? Because a live migration does not succeed just by pairing a source with a target. The controller has to answer questions such as:

Can the target Pod be scheduled?
Is storage visible from the target?
What is the cluster-wide concurrent migration limit?
What timeout and bandwidth does the migration policy require?
Will resource quota block target Pod creation?

In other words, the migration controller is the central arbiter of policy and capacity.

Why a Priority Queue Is Used

Interestingly, the migration controller uses a priority queue rather than a regular workqueue. The code comments indicate the intent: active migrations get higher priority, so that migrations waiting for capacity do not delay the processing of migrations that are already running.

This is an important design point. Once a migration has started, the controller must keep tracking its state, so queuing pending and active migrations with equal weight is not ideal.
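The effect of such a queue can be sketched with Python's `heapq`. This is a simplified model under stated assumptions, not KubeVirt's Go implementation: the class `MigrationQueue`, the two priority levels, and the keys are all hypothetical.

```python
import heapq
import itertools

# Lower number is served first. Two levels are an assumption for this sketch.
ACTIVE, PENDING = 0, 1


class MigrationQueue:
    """Toy priority queue: active migrations are dequeued before pending ones."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps FIFO order among items of equal priority.
        self._counter = itertools.count()

    def add(self, key: str, active: bool) -> None:
        priority = ACTIVE if active else PENDING
        heapq.heappush(self._heap, (priority, next(self._counter), key))

    def get(self) -> str:
        _, _, key = heapq.heappop(self._heap)
        return key

    def __len__(self) -> int:
        return len(self._heap)


queue = MigrationQueue()
queue.add("ns/waiting-1", active=False)
queue.add("ns/waiting-2", active=False)
queue.add("ns/running-1", active=True)  # enqueued last, served first

order = [queue.get() for _ in range(len(queue))]
```

Even though the running migration was enqueued last, it is processed first, while the two waiting migrations keep their relative order.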

How Is the Target Pod Created?

The migration controller also does not assemble raw Pod specs directly; it uses a template service instead. The presence of RenderMigrationManifest in the interface shows that the migration target Pod resembles a regular launcher Pod but has its own rendering path.

So the controller flow is as follows:

Detect migration CR
Check the VMI's migration feasibility
Render the migration launcher manifest if a target Pod is needed
Create the target Pod
Track source and target state for handoff

In other words, the first question in a migration is not how to copy memory but whether a new launcher Pod can be created.
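The flow above can be sketched as a small reconcile function that performs at most one step per call. Everything here is a hypothetical model: `MigrationState`, `reconcile`, the returned step names, and the fake render and create helpers are not KubeVirt identifiers, and the real controller handles many more cases.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MigrationState:
    """Hypothetical in-memory view of one migration; not a KubeVirt type."""
    vmi_migratable: bool
    target_pod: Optional[dict] = None
    handed_off: bool = False


def reconcile(state: MigrationState,
              render_manifest: Callable[[], dict],
              create_pod: Callable[[dict], dict]) -> str:
    """Advance the migration by one step and report what happened."""
    if not state.vmi_migratable:
        return "rejected"
    if state.target_pod is None:
        # Render the migration-specific launcher manifest, then create the Pod.
        state.target_pod = create_pod(render_manifest())
        return "target-pod-created"
    if state.target_pod["status"]["phase"] != "Running":
        return "waiting-for-target-pod"
    if not state.handed_off:
        state.handed_off = True
        return "handed-off"
    return "tracking"


def fake_render() -> dict:
    # Stand-in for the template service; the manifest content is illustrative.
    return {"kind": "Pod", "metadata": {"generateName": "virt-launcher-"}}


def fake_create(manifest: dict) -> dict:
    # Stand-in for the API server: a freshly created Pod starts out Pending.
    pod = dict(manifest)
    pod["status"] = {"phase": "Pending"}
    return pod


state = MigrationState(vmi_migratable=True)
steps = [reconcile(state, fake_render, fake_create)]
steps.append(reconcile(state, fake_render, fake_create))
state.target_pod["status"]["phase"] = "Running"  # simulate the scheduler and kubelet
steps.append(reconcile(state, fake_render, fake_create))
steps.append(reconcile(state, fake_render, fake_create))
```

The point of the sketch is the ordering: nothing related to memory transfer appears until a target Pod exists and is running.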

Why Are There So Many Timeouts?

migration.go defines several timeout constants, including:

Unschedulable pending timeout
Catch-all pending timeout

These exist to distinguish between a target Pod that is simply slow and one that is in a practically impossible state.

For example:

When nodes are temporarily insufficient
When a specific resource will never match
When PVC attach takes a long time

Treating all of these as the same failure makes it hard for operators to identify the cause. So the controller distinguishes between different kinds of pending.
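A minimal sketch of how two such timeouts can separate the cases. The constant names, the numeric values, and the function `classify_pending` are assumptions chosen for illustration; check migration.go for the actual defaults.

```python
# Assumed values for illustration only; not taken from the source.
UNSCHEDULABLE_PENDING_TIMEOUT_S = 5 * 60
CATCH_ALL_PENDING_TIMEOUT_S = 15 * 60


def classify_pending(pending_seconds: float, unschedulable: bool) -> str:
    """Decide what to do with a target Pod that is still Pending.

    A Pod the scheduler has marked unschedulable is given up on sooner,
    while a Pod that is merely slow gets the longer catch-all window.
    """
    if unschedulable and pending_seconds >= UNSCHEDULABLE_PENDING_TIMEOUT_S:
        return "fail: unschedulable"
    if pending_seconds >= CATCH_ALL_PENDING_TIMEOUT_S:
        return "fail: pending too long"
    return "keep waiting"


briefly_unschedulable = classify_pending(60, unschedulable=True)
stuck_unschedulable = classify_pending(400, unschedulable=True)
slow_volume_attach = classify_pending(400, unschedulable=False)
never_started = classify_pending(1000, unschedulable=False)
```

With these assumed values, a Pod that has been unschedulable for 400 seconds fails, while a Pod that has spent the same 400 seconds waiting on a slow volume attach is still given time.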

Where Are Migration Policy and Cluster Config Reflected?

The controller holds migrationPolicyStore and clusterConfig because migrations do not run under the same settings for every VMI.

Between them, the cluster-wide configuration and the migration policies affect the following:

Number of concurrent migrations
Per-node outbound migration count
Bandwidth limits
Progress timeout
Completion timeout
Whether post-copy is allowed
Whether TLS is disabled

In other words, behind the words "perform a migration," there is always a policy layer.
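One way to picture the policy layer is as cluster-wide defaults with per-VMI overrides applied on top. The sketch below assumes exactly that layering; the `MigrationSettings` type, its field names, and all values are hypothetical and cover only a few of the settings listed above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MigrationSettings:
    """Hypothetical subset of migration settings; field names are illustrative."""
    bandwidth_per_migration: str
    completion_timeout_per_gib_s: int
    progress_timeout_s: int
    allow_post_copy: bool


def effective_settings(cluster_defaults: MigrationSettings,
                       policy_overrides: dict) -> MigrationSettings:
    """Overlay the fields a matching policy sets on top of cluster defaults.

    A value of None means "the policy does not set this field".
    """
    overrides = {k: v for k, v in policy_overrides.items() if v is not None}
    return replace(cluster_defaults, **overrides)


# Illustrative values, not KubeVirt defaults.
cluster_defaults = MigrationSettings(
    bandwidth_per_migration="64Mi",
    completion_timeout_per_gib_s=150,
    progress_timeout_s=150,
    allow_post_copy=False,
)
policy = {
    "bandwidth_per_migration": "128Mi",
    "allow_post_copy": True,
    "progress_timeout_s": None,  # left to the cluster default
}
settings = effective_settings(cluster_defaults, policy)
```

The same migration request can therefore run with different bandwidth or timeout behavior depending on which policy applies to the VMI.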

Why Storage and Quota Enter the Migration Control Plane

From an operator's perspective, it is easy to think only about network and CPU, but storage and quota are also important for actually creating the target Pod.

Storage

If the required volumes are not available on the target node, the migration cannot even begin.

Quota

Since a new target launcher Pod must be created, it can be blocked by namespace resource quota.

In other words, the migration control plane has to solve distinctly Kubernetes-style problems before any guest memory is copied.
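The quota concern comes down to simple arithmetic: while a migration is in flight, the source and target launcher Pods exist at the same time, so the namespace needs headroom for both. A minimal sketch, assuming integer units (MiB and millicores) and a hypothetical helper `quota_blocks_pod`:

```python
def quota_blocks_pod(hard: dict, used: dict, pod_request: dict) -> list:
    """Return the resources whose quota the new Pod would exceed.

    All three dicts map a resource name to an integer amount
    (memory in MiB and CPU in millicores in this sketch).
    """
    exceeded = []
    for resource, limit in hard.items():
        if used.get(resource, 0) + pod_request.get(resource, 0) > limit:
            exceeded.append(resource)
    return exceeded


# Illustrative numbers: the source launcher Pod already uses 6 GiB of an 8 GiB quota.
hard = {"requests.memory": 8192, "requests.cpu": 4000}
used = {"requests.memory": 6144, "requests.cpu": 1000}
target_pod_request = {"requests.memory": 4096, "requests.cpu": 1000}

blocked_by = quota_blocks_pod(hard, used, target_pod_request)
```

Here the CPU request fits, but 6144 + 4096 MiB exceeds the 8192 MiB limit, so the target Pod would be refused and the migration would never reach the memory copy phase.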

When Does Handoff Occur?

migration.go also contains structures such as handOffMap. This is a mechanism for managing the moment at which work is handed off to virt-handler on the source side. Simply put, the controller does not hold on to everything forever. After a certain stage, the node-local execution plane takes on a larger role.

The reason for this separation:

The controller is strong at cluster-wide state decisions
Actual migration execution is better handled by virt-handler and the launcher on source and target nodes

In other words, live migration is structured so that the control plane and the execution plane divide responsibility in stages.
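The source only tells us that a structure named handOffMap exists. As a rough sketch of what handoff bookkeeping could look like, assuming it amounts to a lock-protected set of migration keys, consider the following. The class `HandoffTracker` and its methods are hypothetical and make no claim about the real semantics.

```python
import threading


class HandoffTracker:
    """Hypothetical record of which migrations were handed to the node side."""

    def __init__(self):
        self._lock = threading.Lock()
        self._handed_off = set()

    def mark(self, migration_key: str) -> bool:
        """Record a handoff. Return False if it was already recorded."""
        with self._lock:
            if migration_key in self._handed_off:
                return False
            self._handed_off.add(migration_key)
            return True

    def is_handed_off(self, migration_key: str) -> bool:
        with self._lock:
            return migration_key in self._handed_off

    def forget(self, migration_key: str) -> None:
        """Drop the record once the migration object is gone."""
        with self._lock:
            self._handed_off.discard(migration_key)


tracker = HandoffTracker()
first = tracker.mark("default/migration-a")
second = tracker.mark("default/migration-a")  # repeated reconcile of the same key
```

Keeping such a record lets repeated reconcile passes recognize that a handoff already happened instead of triggering it twice.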

Automatic Migration Requests and Dynamic Network Changes

pkg/network/migration/evaluator.go reveals another interesting fact. Hotplugging or unplugging a secondary network, or changing a NetworkAttachmentDefinition (NAD), can set a migration required condition.

This means migration does not only happen because a user says "move it now." It can also be used as a re-placement mechanism when network changes are difficult to safely apply on the current Pod.

In other words, the migration control plane is both a maintenance feature and a configuration convergence mechanism.

Common Misconceptions

Misconception 1: Migration is just source QEMU talking directly to target QEMU

No. Before that come the migration CR, target Pod creation, scheduling, policy, quota, and storage preparation.

Misconception 2: The migration target Pod just reuses the existing Pod

No. A new, separate target launcher Pod may be created.

Misconception 3: Migration failures are mostly data plane issues

No. In real operations, many migrations are blocked in the control plane: a pending or unschedulable target Pod, quota, or a policy mismatch.

What Operators Should Check First

Has the migration CR been created?
Has the target Pod been created?
Is the target Pod pending or running?
What are the migration policy and cluster-wide concurrency limits?
Are namespace quota and storage visibility acceptable?

These steps must pass before the pre-copy and post-copy in the next post become meaningful.
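The checklist is ordered, so it can be read as a first-failure triage. The sketch below assumes the operator has already gathered each answer as a boolean, for example by inspecting the migration object and the target Pod. The function `first_blocker`, the observation keys, and the messages are all hypothetical.

```python
def first_blocker(observations: dict) -> str:
    """Walk the checklist in order and report the first failing check."""
    checks = [
        ("migration_cr_exists", "migration CR was not created"),
        ("target_pod_exists", "target Pod was not created"),
        ("target_pod_running", "target Pod is not running; check why it is pending"),
        ("within_concurrency_limits", "policy or cluster-wide concurrency limit reached"),
        ("quota_and_storage_ok", "namespace quota or storage visibility problem"),
    ]
    for key, message in checks:
        # A missing observation counts as a failed check.
        if not observations.get(key, False):
            return message
    return "control plane looks ready"


stuck = first_blocker({
    "migration_cr_exists": True,
    "target_pod_exists": True,
    "target_pod_running": False,
})
healthy = first_blocker({
    "migration_cr_exists": True,
    "target_pod_exists": True,
    "target_pod_running": True,
    "within_concurrency_limits": True,
    "quota_and_storage_ok": True,
})
```

Reporting only the first failure mirrors how the steps depend on each other: there is no point in checking whether the target Pod is running if it was never created.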

Conclusion

The control plane of KubeVirt live migration starts with migration CR creation and continues with virt-controller checking policy, quota, storage, and scheduling state to prepare the target Pod. In other words, migration is already a Kubernetes orchestration problem before the libvirt data transfer begins. Grasping this perspective is essential for understanding why migrations get stuck at pending in actual operations.

In the next post, we will explain the pre-copy, post-copy, and dirty page models that transfer the actual memory and disk state after this control plane preparation is complete.