[Virtualization] 04. Complete Guide to GPU Virtualization: Passthrough, vGPU, MIG, SR-IOV

Introduction
GPU Passthrough (VFIO-PCI)
vGPU (NVIDIA GRID)
MIG (Multi-Instance GPU)
SR-IOV for GPUs
- PCIe-Level Virtualization
GPU Time-Slicing in Kubernetes
Comprehensive Comparison Table
Decision Guide

Introduction

GPU virtualization has become a critical infrastructure technology due to the explosive growth of AI/ML workloads and cloud computing. Various approaches exist, from exclusively assigning a physical GPU to a single VM to sharing one GPU across multiple VMs or containers.

[GPU Virtualization Technology Spectrum]

Exclusive <----------------------------------------> Maximum Sharing

GPU Passthrough    vGPU (SR-IOV)    MIG    vGPU (mdev)    Time-Slicing
 (1 GPU : 1 VM)   (HW partition)  (spatial) (SW timeslice)  (K8s share)

GPU Passthrough (VFIO-PCI)

Overview

GPU Passthrough directly attaches a physical GPU to a VM -- the simplest and most powerful approach.

+------------------+     +------------------+
|    VM 1          |     |    VM 2          |
|  GPU Driver      |     |  GPU Driver      |
|  (Full Access)   |     |  (Full Access)   |
+--------+---------+     +--------+---------+
         |                         |
    IOMMU/VT-d                IOMMU/VT-d
         |                         |
+--------+---------+     +--------+---------+
|  Physical GPU 1  |     |  Physical GPU 2  |
|  (Exclusive)     |     |  (Exclusive)     |
+------------------+     +------------------+

Characteristics

Aspect	Details
Performance	95-99% of native
GPU Sharing	Not possible (1 GPU = 1 VM)
Supported GPUs	All GPUs (NVIDIA, AMD, Intel)
Licensing	No additional license needed
Requirements	IOMMU (VT-d/AMD-Vi) capable CPU/motherboard
Drivers	Standard GPU driver installed in guest OS
Limitation	Only one VM per GPU

Setup Flow

# 1. Enable VT-d/AMD-Vi (IOMMU) in BIOS

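# 2. Add IOMMU to boot parameters, then regenerate the GRUB config
# /etc/default/grub (Intel shown; AMD hosts use amd_iommu=on, which is often the default):
# GRUB_CMDLINE_LINUX="intel_iommu=on iommu=pt"
update-grub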

# 3. Check IOMMU groups
find /sys/kernel/iommu_groups/ -type l | sort -V

# 4. Bind GPU to vfio-pci driver
# /etc/modprobe.d/vfio.conf:
# options vfio-pci ids=10de:2484,10de:228b

# 5. Rebuild initramfs
update-initramfs -u

# 6. Verify after reboot
lspci -nnk -s 01:00.0
# Kernel driver in use: vfio-pci
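
The vendor:device IDs in step 4 can be read from `lspci -nn` rather than typed by hand. The following is a minimal sketch of that step; `vfio_ids_line` is a hypothetical helper name, and the sample text is illustrative input rather than captured output from a particular host.

```shell
#!/usr/bin/env bash
# Sketch: build the "options vfio-pci ids=..." line from `lspci -nn` output.
# vfio_ids_line is a hypothetical helper, not part of any package.

vfio_ids_line() {
  # Read lspci -nn lines on stdin, pull out every [vendor:device] pair,
  # drop duplicates while keeping the original order, and join with commas.
  local ids
  ids=$(grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | tr -d '[]' \
        | awk '!seen[$0]++' \
        | paste -sd, -)
  [ -n "$ids" ] && printf 'options vfio-pci ids=%s\n' "$ids"
}

# Illustrative input: a GPU and its HDMI audio function on the same slot.
sample='01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3070] [10de:2484] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1)'

printf '%s\n' "$sample" | vfio_ids_line
```

On a real host the same function could be fed with `lspci -nn -s 01:00 | vfio_ids_line`, assuming the GPU sits at bus address 01:00 as in the verification step. Both functions of the card are included because devices in the same IOMMU group generally have to be passed through together.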

vGPU (NVIDIA GRID)

Architecture

vGPU enables sharing a single physical GPU across multiple VMs.

+--------+ +--------+ +--------+ +--------+
| VM 1   | | VM 2   | | VM 3   | | VM 4   |
| vGPU   | | vGPU   | | vGPU   | | vGPU   |
| 4GB    | | 4GB    | | 4GB    | | 4GB    |
+--------+ +--------+ +--------+ +--------+
|            vGPU Manager                  |
|        (Hypervisor Module)               |
+------------------------------------------+
|          Physical GPU (16GB)             |
|             (e.g. NVIDIA T4)             |
+------------------------------------------+

mdev (Mediated Devices) - Pre-Ampere

[Software Time-Slicing]

Time -->  |  VM1  |  VM2  |  VM3  |  VM1  |  VM2  |  VM3  |
          +-------+-------+-------+-------+-------+-------+
Full GPU   used    used    used    used    used    used

- Entire GPU shared via time-division
- vGPU Manager handles context switching
- VRAM statically partitioned
- Compute resources time-shared
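
Because VRAM is split statically, the number of vGPUs a card can host follows from simple division. The sketch below illustrates that arithmetic only; `max_vgpus` is a hypothetical helper, and it assumes every vGPU on the card uses the same framebuffer size and ignores any memory the vGPU software reserves for itself.

```shell
#!/usr/bin/env bash
# Sketch: how many equal-size vGPUs fit when VRAM is statically partitioned.
# max_vgpus is a hypothetical helper used only for illustration.

max_vgpus() {
  local total_gb=$1 profile_gb=$2
  if [ "$profile_gb" -le 0 ]; then
    echo "profile size must be positive" >&2
    return 1
  fi
  # Integer division: leftover memory smaller than one profile stays unused.
  echo $(( total_gb / profile_gb ))
}

max_vgpus 16 4   # the 16GB GPU with 4GB vGPUs from the architecture diagram
max_vgpus 40 6   # 40GB card, 6GB profile: the remaining 4GB cannot host another vGPU
```

Compute time, by contrast, is not divided up front: with N active VMs and an equal-share scheduler, each VM simply gets roughly one slot in every N.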

SR-IOV - Ampere and Later

[Hardware Partitioning - SR-IOV]

+-------------------------------------------+
| Physical Function (PF)                     |
| - GPU management and configuration        |
+-------------------------------------------+
| VF 0        | VF 1        | VF 2          |
| (VM 1)      | (VM 2)      | (VM 3)        |
| Own queues  | Own queues  | Own queues     |
| Own IRQs    | Own IRQs    | Own IRQs       |
+-------------------------------------------+
|          Physical GPU (PCIe)              |
+-------------------------------------------+

- PF (Physical Function): Managed by host
- VF (Virtual Function): Directly assigned to each VM
- IOMMU ensures memory isolation between VFs
- Hardware-level isolation for better stability

vGPU Profile Series

Series	Purpose	VRAM	Typical Use Cases
C-series	Compute	Large	AI/ML training, HPC
Q-series	Quadro	Medium-Large	CAD, 3D rendering, professional graphics
B-series	Desktop	Small-Medium	VDI, general desktop
A-series	App Streaming	Small-Medium	App virtualization, remote workstations

[NVIDIA A100 vGPU Profile Examples]

GPU: NVIDIA A100 40GB

Profile           VRAM    Max Instances
A100-5C           5GB      8  (time-sliced)
A100-10C         10GB      4  (time-sliced)
A100-20C         20GB      2  (time-sliced)
A100-40C         40GB      1  (time-sliced)
A100-1-5C         5GB      7  (MIG-backed)
A100-2-10C       10GB      3  (MIG-backed)

License Requirements:

- NVIDIA vGPU Software License required (vApps, vPC, vCS, vWS)
- License server (DLS or CLS) needed
- License type determined by GPU model and profile

MIG (Multi-Instance GPU)

Overview

MIG is a hardware-level spatial partitioning technology first introduced with NVIDIA A100.

[MIG Architecture - A100 Example]

+-----------------------------------------------------------+
|                    NVIDIA A100 (80GB)                       |
+-----------------------------------------------------------+
| GPC 0-1  | GPC 2-3  | GPC 4   | GPC 5   | GPC 6          |
| MIG 1    | MIG 2    | MIG 3   | MIG 4   | MIG 5          |
| 2g.20gb  | 2g.20gb  | 1g.10gb | 1g.10gb | 1g.10gb        |
+----------+----------+---------+---------+-----------------+
| Memory   | Memory   | Memory  | Memory  | Memory          |
| Ctrl 0-1 | Ctrl 2-3 | Ctrl 4  | Ctrl 5  | Ctrl 6          |
| 20GB HBM | 20GB HBM | 10GB HBM| 10GB HBM| 10GB HBM       |
+----------+----------+---------+---------+-----------------+
| L2 Cache | L2 Cache | L2 $    | L2 $    | L2 $            |
| Slice 0-1| Slice 2-3| Slice 4 | Slice 5 | Slice 6         |
+-----------------------------------------------------------+

Key Characteristics

- Hardware spatial partitioning: up to 7 independent instances per GPU
- Dedicated resources: each instance gets its own SMs, memory controllers, L2 cache, and HBM
- Isolation: instances do not contend for SMs, L2 cache, or memory bandwidth
- Error isolation: errors in one instance do not affect others
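
The 7-instance limit comes from the GPU's seven compute slices: the leading number in a profile name such as `2g.20gb` is the number of slices it consumes. The sketch below adds up those numbers for a proposed layout. `mig_slices_used` and `mig_layout_fits` are hypothetical helpers, and the check is necessary but not sufficient, since real GPUs also restrict where each profile may be placed.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a MIG layout against the 7 compute slices of an A100/H100.
# Both functions are hypothetical helpers; placement rules are NOT modelled.

mig_slices_used() {
  local total=0 p n
  for p in "$@"; do
    n=${p%%g.*}                      # "2g.20gb" -> "2"
    case $n in
      ''|*[!0-9]*) echo "unrecognized profile: $p" >&2; return 2 ;;
    esac
    total=$(( total + n ))
  done
  echo "$total"
}

mig_layout_fits() {
  local used
  used=$(mig_slices_used "$@") || return 2
  [ "$used" -le 7 ]
}

# The layout from the architecture diagram uses all seven slices.
mig_slices_used 2g.20gb 2g.20gb 1g.10gb 1g.10gb 1g.10gb

if mig_layout_fits 4g.40gb 4g.40gb; then
  echo "fits"
else
  echo "does not fit: more than 7 compute slices requested"
fi
```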

Supported GPUs

GPU	Max Instances	HBM	Architecture
A100 40GB	7	40GB HBM2	Ampere
A100 80GB	7	80GB HBM2e	Ampere
A30	4	24GB HBM2	Ampere
H100	7	80GB HBM3	Hopper
H200	7	141GB HBM3e	Hopper

MIG Configuration Example

# Enable MIG mode (may require a GPU reset, or a reboot if the GPU is in use)
sudo nvidia-smi -i 0 -mig 1

# Once MIG mode is active, create MIG instances
# List available profiles
nvidia-smi mig -lgip

# Create two 3g.40gb instances (A100 80GB)
sudo nvidia-smi mig -cgi 9,9 -C

# Verify created instances
nvidia-smi mig -lgi
nvidia-smi mig -lci

# Delete MIG instances
sudo nvidia-smi mig -dci
sudo nvidia-smi mig -dgi

# Disable MIG mode (may require a GPU reset or reboot)
sudo nvidia-smi -i 0 -mig 0
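
Once instances exist, a workload is usually pinned to one of them by its MIG UUID. The sketch below extracts those UUIDs from `nvidia-smi -L` style text. `mig_uuids` is a hypothetical helper, the sample listing is made-up illustrative input, and the pattern assumes the `MIG-<uuid>` format printed by recent drivers.

```shell
#!/usr/bin/env bash
# Sketch: pull MIG device UUIDs out of `nvidia-smi -L` style output.
# mig_uuids is a hypothetical helper; the UUIDs below are placeholders.

mig_uuids() {
  # "MIG 3g.40gb" (space after MIG) is skipped; only "MIG-<hex and dashes>" matches.
  grep -oE 'MIG-[0-9a-fA-F-]+'
}

sample_listing='GPU 0: NVIDIA A100-SXM4-80GB (UUID: GPU-11111111-2222-3333-4444-555555555555)
  MIG 3g.40gb     Device  0: (UUID: MIG-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee)
  MIG 3g.40gb     Device  1: (UUID: MIG-ffffffff-0000-1111-2222-333333333333)'

printf '%s\n' "$sample_listing" | mig_uuids

# On a real host (not run here):
#   CUDA_VISIBLE_DEVICES=$(nvidia-smi -L | mig_uuids | head -n1) ./my_cuda_app
```

`my_cuda_app` stands in for any CUDA program; setting `CUDA_VISIBLE_DEVICES` to a MIG UUID restricts the process to that single instance.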

MIG-backed vGPU

Combining MIG with vGPU provides both hardware isolation and VM support.

+--------+ +--------+ +--------+
| VM 1   | | VM 2   | | VM 3   |
| vGPU   | | vGPU   | | vGPU   |
+--------+ +--------+ +--------+
| vGPU Manager                  |
+------+-------+-------+-------+
| MIG 1| MIG 2 | MIG 3 | MIG 4 |  <-- Hardware isolation
+------+-------+-------+-------+
|        Physical GPU           |
+-------------------------------+

- Between MIG slices: Hardware isolation
- Multiple vGPUs within same MIG: Time-slicing

SR-IOV for GPUs

PCIe-Level Virtualization

[SR-IOV Structure]

PCIe Bus
    |
+-------+------+------+------+
| PF    | VF 0 | VF 1 | VF 2 |
| (Host)| (VM1)| (VM2)| (VM3)|
+-------+------+------+------+
|        Physical GPU        |
+----------------------------+

PF = Physical Function (host management)
VF = Virtual Function (direct VM assignment)

- NVIDIA supports GPU SR-IOV from Ampere onward
- AMD MxGPU is also SR-IOV-based (Radeon Instinct/Pro)
- Each VF is an independent PCIe function, isolated via the IOMMU
- The PF manages the entire GPU and creates/configures VFs
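
On Linux, a PF that exposes SR-IOV carries two standard sysfs attributes, `sriov_totalvfs` (how many VFs the hardware supports) and `sriov_numvfs` (how many are currently enabled). The sketch below only reads them; `sriov_summary` is a hypothetical helper, and the demo runs against a throwaway directory that imitates those two files so it works without a GPU.

```shell
#!/usr/bin/env bash
# Sketch: report SR-IOV capacity of a PCI physical function via sysfs.
# sriov_summary is a hypothetical helper name.

sriov_summary() {
  local dev_dir=$1
  if [ ! -r "$dev_dir/sriov_totalvfs" ]; then
    echo "no SR-IOV capability exposed at $dev_dir" >&2
    return 1
  fi
  printf 'total=%s enabled=%s\n' \
    "$(cat "$dev_dir/sriov_totalvfs")" \
    "$(cat "$dev_dir/sriov_numvfs")"
}

# Demo against a temporary directory that mimics the two sysfs files.
demo_dir=$(mktemp -d)
echo 16 > "$demo_dir/sriov_totalvfs"
echo 4  > "$demo_dir/sriov_numvfs"
sriov_summary "$demo_dir"
rm -rf "$demo_dir"

# On a real host (address is an example):
#   sriov_summary /sys/bus/pci/devices/0000:01:00.0
```

The sketch deliberately stops at reading the counters. For GPUs, VF creation is normally handled by the vendor's host driver tooling rather than by writing to `sriov_numvfs` by hand.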

GPU Time-Slicing in Kubernetes

Time-slicing is the simplest GPU sharing approach: it is implemented purely in software and needs no special hardware features.

[K8s GPU Time-Slicing]

+---Pod A---+ +---Pod B---+ +---Pod C---+
|  Process  | |  Process  | |  Process  |
|  CUDA     | |  CUDA     | |  CUDA     |
+-----------+ +-----------+ +-----------+
|    NVIDIA Device Plugin (time-slicing)  |
+-----------------------------------------+
|          Physical GPU                   |
|     (shared by all processes)           |
+-----------------------------------------+

- NVIDIA GPU Operator TimeSlicing configuration
- No hardware isolation
- Shared memory space (OOM possible)
- MPS (Multi-Process Service) option for better performance

# NVIDIA GPU Operator ConfigMap example
# time-slicing-config
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config
data:
  any: |-
    version: v1
    flags:
      migStrategy: none
    sharing:
      timeSlicing:
        renameByDefault: false
        failRequestsGreaterThanOne: false
        resources:
          - name: nvidia.com/gpu
            replicas: 4
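
With `replicas: 4`, each physical GPU is advertised to the scheduler as four `nvidia.com/gpu` resources. The sketch below shows the resulting arithmetic; `advertised_gpus` is a hypothetical helper, and the node size is an assumed example.

```shell
#!/usr/bin/env bash
# Sketch: how many nvidia.com/gpu resources a node advertises under time-slicing.
# advertised_gpus is a hypothetical helper used only for illustration.

advertised_gpus() {
  local physical=$1 replicas=$2
  echo $(( physical * replicas ))
}

advertised_gpus 2 4   # a node with 2 physical GPUs and replicas: 4

# To compare with what the cluster reports (not run here):
#   kubectl describe node <node-name> | grep nvidia.com/gpu
```

The replica count only changes how many Pods may be scheduled onto a GPU. It does not reserve memory or compute for any of them, which is why the OOM caveat above still applies.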

Comprehensive Comparison Table

Aspect	Passthrough	vGPU (mdev)	vGPU (SR-IOV)	MIG	Time-Slicing
GPU Sharing	No	Yes (SW)	Yes (HW)	Yes (HW)	Yes (SW)
Isolation Level	Full (exclusive)	Time-sliced	PCIe VF	Spatial	None
Memory Isolation	Full	VRAM partition	VRAM partition	HBM partition	Shared
Performance	95-99%	85-95%	88-96%	85-95%	Variable
Max Instances	1	GPU-dependent	VF count	Up to 7	Set by replicas
Supported GPUs	All GPUs	NVIDIA vGPU	NVIDIA Ampere+	A100/H100 etc.	All NVIDIA
Licensing	None	NVIDIA vGPU	NVIDIA vGPU	None	None
Primary Use	Dedicated GPU VM	VDI, multi-user	Enterprise	AI/ML multi-tenancy	K8s dev/test

Decision Guide

[GPU Virtualization Selection Flowchart]

Start: Do you need to share the GPU?
  |
  +-- No --> GPU Passthrough
  |          (Best performance, 1:1 assignment)
  |
  +-- Yes --> Do you need hardware isolation?
               |
               +-- No --> Is this Kubernetes?
               |           |
               |           +-- Yes --> Time-Slicing
               |           |          (Simplest approach)
               |           |
               |           +-- No --> vGPU (mdev)
               |                      (VM-based sharing)
               |
               +-- Yes --> MIG-capable GPU (A100/H100)?
                           |
                           +-- Yes --> MIG or MIG-backed vGPU
                           |          (Best isolation)
                           |
                           +-- No --> vGPU (SR-IOV)
                                      (Ampere+ HW partitioning)
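
The flowchart can also be written as a small function, which is handy when the choice has to be made inside a provisioning script. This is a direct transcription of the branches above and nothing more; `choose_gpu_virt` and its yes/no arguments are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: the selection flowchart as a function.
# Arguments (each "yes" or "no"): share, hw_isolation, kubernetes, mig_capable.
# choose_gpu_virt is a hypothetical helper name.

choose_gpu_virt() {
  local share=$1 hw_isolation=$2 kubernetes=$3 mig_capable=$4
  if [ "$share" != yes ]; then
    echo "GPU Passthrough"
  elif [ "$hw_isolation" != yes ]; then
    if [ "$kubernetes" = yes ]; then
      echo "Time-Slicing"
    else
      echo "vGPU (mdev)"
    fi
  elif [ "$mig_capable" = yes ]; then
    echo "MIG or MIG-backed vGPU"
  else
    echo "vGPU (SR-IOV)"
  fi
}

choose_gpu_virt no  no  no  no    # dedicated GPU per VM
choose_gpu_virt yes no  yes no    # shared dev/test cluster
choose_gpu_virt yes yes no  yes   # multi-tenant A100/H100
```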

Quiz: GPU Virtualization Knowledge Check

Q1. What is the fundamental difference between GPU Passthrough and vGPU?

GPU Passthrough exclusively assigns a physical GPU to a single VM, providing near-native performance. vGPU allows multiple VMs to share one GPU, distributing resources through time-slicing or SR-IOV.

Q2. Why does MIG provide better isolation than other GPU sharing methods?

MIG spatially partitions the GPU at the hardware level. Each instance has dedicated SMs, memory controllers, L2 cache, and HBM, fundamentally eliminating resource contention between instances. Errors are also isolated.

Q3. What are the roles of PF and VF in SR-IOV?

PF (Physical Function) is the PCIe function used by the host to manage the entire GPU. VFs (Virtual Functions) are created from the PF and directly assigned to VMs, providing independent GPU access.

Q4. What is the biggest weakness of Kubernetes Time-Slicing?

There is no hardware isolation, and all Pods share GPU memory. Excessive memory usage by one Pod can cause OOM in others, and security boundaries are unclear.

Q5. What is absolutely required to use vGPU?

An NVIDIA vGPU Software License is required. A license server (DLS or CLS) must be deployed, and vGPU Manager software must be installed on the hypervisor. GPU Passthrough and standalone MIG do not require licensing.