OpenStack & KVM Virtualization Complete Guide 2025: Building and Operating Private Cloud Infrastructure

1. Virtualization Fundamentals

1.1 What Is Virtualization and Why Does It Matter?

Virtualization is the technology of creating multiple isolated virtual environments on top of physical hardware. Born in the 1960s on IBM mainframes, it became the backbone of modern cloud computing with the popularization of x86 virtualization in the 2000s.

Why virtualization matters:

  • Server consolidation: Increase utilization from 10-15% to 60-80%
  • Cost reduction: Dramatically cut hardware, power, cooling, and space costs
  • Agility: Provision new servers in minutes (physical servers take weeks)
  • Isolation: Security isolation between workloads, fault isolation
  • Disaster recovery: Snapshots, live migration for fast recovery

1.2 Type-1 vs Type-2 Hypervisors

A hypervisor is the software layer that creates and manages virtual machines.

Type-1 (Bare-Metal) Hypervisors:

Installed directly on hardware without requiring a host OS. Lower overhead and better performance.

  • KVM: Linux kernel module. Linux itself becomes the hypervisor
  • VMware ESXi: Enterprise standard with the vSphere ecosystem
  • Xen: Original foundation of AWS EC2. Now hosted by the Linux Foundation (Citrix built XenServer on it)
  • Microsoft Hyper-V: Built into Windows Server

Type-2 (Hosted) Hypervisors:

Installed as an application on top of an existing OS. Suitable for development and testing.

  • VirtualBox: Oracle's open-source cross-platform solution
  • VMware Workstation/Fusion: Desktop virtualization
  • Parallels Desktop: macOS only

1.3 Virtualization Approaches Compared

+----------------------------------------------------+
| Virtualization Approaches Comparison               |
+----------------------------------------------------+
| Full Virtualization                                |
|   - No guest OS modification needed                |
|   - Binary translation for privileged instructions |
|   - Performance overhead present                   |
+----------------------------------------------------+
| Para-virtualization                                |
|   - Requires guest OS kernel modification          |
|   - Xen is the primary example                     |
|   - Good performance but limited compatibility     |
+----------------------------------------------------+
| Hardware-Assisted Virtualization                   |
|   - Intel VT-x / AMD-V                             |
|   - No guest modification + high performance       |
|   - KVM leverages this approach                    |
+----------------------------------------------------+

1.4 Hypervisor Comparison Table

| Feature         | KVM                | VMware ESXi               | Xen                          | Hyper-V             |
|-----------------|--------------------|---------------------------|------------------------------|---------------------|
| Type            | Type-1             | Type-1                    | Type-1                       | Type-1              |
| License         | Open source (GPL)  | Commercial (free edition) | Open source (GPL)            | Included in Windows |
| Host OS         | Linux              | VMkernel (proprietary)    | Xen microkernel + Linux dom0 | Windows Server      |
| Performance     | Excellent          | Excellent                 | Excellent                    | Good                |
| GPU passthrough | Supported (VFIO)   | Supported (vGPU)          | Limited                      | Supported (DDA)     |
| Live migration  | Supported          | vMotion                   | Supported                    | Live Migration      |
| Management      | libvirt, OpenStack | vCenter                   | XenCenter                    | SCVMM               |
| Ecosystem       | Vast (Linux)       | Largest enterprise        | Shrinking                    | Windows ecosystem   |
| Edge deployment | Easy               | Heavy                     | Possible                     | Limited             |

1.5 Containers vs Virtual Machines

  Virtual Machines (VMs)            Containers
+------------------+           +------------------+
|   App A | App B  |           | App A  |  App B  |
+------------------+           +------------------+
| Guest  | Guest   |           | Bins/  | Bins/   |
|  OS    |  OS     |           | Libs   | Libs    |
+------------------+           +------------------+
|   Hypervisor     |           | Container Engine  |
+------------------+           +------------------+
|   Host OS        |           |   Host OS         |
+------------------+           +------------------+
|   Hardware       |           |   Hardware        |
+------------------+           +------------------+
| Feature           | Virtual Machines                      | Containers                      |
|-------------------|---------------------------------------|---------------------------------|
| Isolation level   | Strong (hardware-level)               | Weaker (shared kernel)          |
| Startup time      | Tens of seconds to minutes            | Milliseconds to seconds         |
| Resource overhead | High (full OS per VM)                 | Low (shared kernel)             |
| Image size        | GB range                              | MB range                        |
| Density           | 10-50 VMs/server                      | 100-1000 containers/server      |
| Security          | Strong isolation                      | Enhanced with gVisor, Kata      |
| Use cases         | Multi-OS, legacy apps, high isolation | Microservices, CI/CD, scale-out |

When to choose VMs:

  • Running different operating systems (Windows + Linux)
  • Strong security isolation required (multi-tenant)
  • Legacy application operations
  • Kernel-level customization needed

When to choose containers:

  • Many instances on the same OS
  • Fast scaling required
  • CI/CD pipelines
  • Microservices architecture

2. KVM/QEMU Deep Dive

2.1 KVM Architecture

KVM (Kernel-based Virtual Machine) is a hypervisor module included in the Linux kernel since version 2.6.20. It transforms Linux itself into a Type-1 hypervisor.

+---------------------------------------------------+
|       Guest VMs (each VM is a Linux process)      |
|  +-----------+  +-----------+  +-----------+      |
|  | Guest OS  |  | Guest OS  |  | Guest OS  |      |
|  +-----------+  +-----------+  +-----------+      |
|  |   QEMU    |  |   QEMU    |  |   QEMU    |      |
|  |(userspace)|  |(userspace)|  |(userspace)|      |
|  +-----------+  +-----------+  +-----------+      |
+---------------------------------------------------+
|                   Linux Kernel                    |
|  +--------+  +-------+  +--------+  +---------+   |
|  |  KVM   |  | sched |  | memory |  | network |   |
|  | module |  |       |  |  mgmt  |  |  stack  |   |
|  +--------+  +-------+  +--------+  +---------+   |
+---------------------------------------------------+
|                    Hardware                       |
|  +----------+  +-------+  +--------+              |
|  | CPU      |  | RAM   |  | NIC    |              |
|  | VT-x/SVM |  |       |  |        |              |
|  +----------+  +-------+  +--------+              |
+---------------------------------------------------+

How it works:

  1. KVM kernel modules (kvm.ko, kvm-intel.ko or kvm-amd.ko) leverage CPU hardware virtualization extensions (VT-x/AMD-V)
  2. Each VM runs as a regular Linux process (QEMU process)
  3. vCPUs are scheduled as Linux threads
  4. QEMU handles device emulation (disk, network, USB, etc.)

# Check KVM support
grep -E '(vmx|svm)' /proc/cpuinfo

# Load KVM modules
sudo modprobe kvm
sudo modprobe kvm_intel  # Intel CPU
# sudo modprobe kvm_amd  # AMD CPU

# Verify KVM device
ls -la /dev/kvm

2.2 vCPU Pinning and NUMA Topology

In NUMA (Non-Uniform Memory Access) architecture, the physical location of CPUs and memory significantly impacts performance.

# Check NUMA topology
numactl --hardware
# node 0: cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
# node 0: size: 65536 MB
# node 1: cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
# node 1: size: 65536 MB

# Visualize with lstopo
lstopo --of png > topology.png

vCPU Pinning Configuration (libvirt XML):

<vcpu placement='static'>8</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='0'/>
  <vcpupin vcpu='1' cpuset='1'/>
  <vcpupin vcpu='2' cpuset='2'/>
  <vcpupin vcpu='3' cpuset='3'/>
  <vcpupin vcpu='4' cpuset='4'/>
  <vcpupin vcpu='5' cpuset='5'/>
  <vcpupin vcpu='6' cpuset='6'/>
  <vcpupin vcpu='7' cpuset='7'/>
  <emulatorpin cpuset='16-17'/>
</cputune>
<numatune>
  <memory mode='strict' nodeset='0'/>
</numatune>
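
The pin map above is mechanical, so it is often generated rather than typed by hand. A short Python sketch (a hypothetical helper, not part of libvirt or any deployment tool) that emits a <cputune> block pinning vCPU i to host CPU base+i:

```python
# Emit a libvirt <cputune> block that pins vCPU i to host CPU (base_cpu + i).
# Illustrative helper only; Nova and virt-install derive the map from the
# host NUMA topology rather than a fixed offset.
def cputune_xml(vcpus: int, base_cpu: int, emulator_cpus: str) -> str:
    lines = ["<cputune>"]
    for v in range(vcpus):
        lines.append(f"  <vcpupin vcpu='{v}' cpuset='{base_cpu + v}'/>")
    lines.append(f"  <emulatorpin cpuset='{emulator_cpus}'/>")
    lines.append("</cputune>")
    return "\n".join(lines)

print(cputune_xml(8, 0, "16-17"))  # reproduces the pin map above
```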

2.3 virtio: Para-virtualized I/O

virtio provides a standard interface for high-performance I/O between guest and host. Instead of hardware emulation, it uses optimized virtual drivers.

| virtio Device  | Purpose                  | Replaces       |
|----------------|--------------------------|----------------|
| virtio-net     | Networking               | e1000, rtl8139 |
| virtio-blk     | Block storage            | IDE, SATA      |
| virtio-scsi    | SCSI storage             | LSI Logic      |
| virtio-balloon | Memory adjustment        | -              |
| virtio-rng     | Random number generation | -              |
| virtio-gpu     | Graphics                 | QXL, VGA       |
| virtio-fs      | Host file sharing        | 9p             |

# Using virtio with QEMU
qemu-system-x86_64 \
  -drive file=disk.qcow2,if=virtio \
  -netdev tap,id=net0,ifname=tap0,script=no \
  -device virtio-net-pci,netdev=net0 \
  -m 4G \
  -smp 4 \
  -enable-kvm

2.4 Memory Management

Balloon Driver:

Dynamically expands and shrinks memory inside the guest, allowing the host to reclaim memory.

# Adjust balloon size (virsh)
virsh qemu-monitor-command myvm --hmp 'balloon 2048'

KSM (Kernel Same-page Merging):

Shares identical memory pages across VMs to save physical memory.

# Enable KSM
echo 1 > /sys/kernel/mm/ksm/run
echo 1000 > /sys/kernel/mm/ksm/pages_to_scan
echo 20 > /sys/kernel/mm/ksm/sleep_millisecs

# Check KSM status
cat /sys/kernel/mm/ksm/pages_shared
cat /sys/kernel/mm/ksm/pages_sharing
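
Those two counters translate into an estimate of reclaimed memory: pages_shared counts the deduplicated pages KSM keeps, while pages_sharing counts the mappings pointing at them. A minimal sketch of the arithmetic (4 KiB base pages assumed; example numbers are made up):

```python
# Estimate physical memory saved by KSM from its sysfs counters.
# pages_shared  = unique merged pages kept in memory
# pages_sharing = total page mappings deduplicated onto those pages
PAGE_SIZE = 4096  # bytes; standard x86-64 base page

def ksm_saved_bytes(pages_shared: int, pages_sharing: int) -> int:
    # Every mapping beyond the first copy of a shared page is memory saved.
    return (pages_sharing - pages_shared) * PAGE_SIZE

# Example: 100k unique pages backing 500k mappings -> 400k pages reclaimed
saved = ksm_saved_bytes(100_000, 500_000)
print(f"{saved / 2**30:.2f} GiB saved")  # prints "1.53 GiB saved"
```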

Hugepages:

Use 2MB or 1GB large pages instead of the default 4KB pages to reduce TLB misses.

# Configure hugepages
echo 4096 > /proc/sys/vm/nr_hugepages  # 4096 x 2MB = 8GB

# Add to /etc/default/grub
# GRUB_CMDLINE_LINUX="hugepagesz=1G hugepages=32 default_hugepagesz=1G"

<!-- Hugepages in libvirt XML -->
<memoryBacking>
  <hugepages>
    <page size='1' unit='GiB'/>
  </hugepages>
</memoryBacking>
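
The sizing in these snippets is plain arithmetic: reserved memory equals page count times page size. A sketch that reproduces both numbers used above:

```python
# Hugepage sizing arithmetic: nr_hugepages * page_size = reserved RAM.
GIB = 2**30
MIB = 2**20

def pages_needed(vm_memory_bytes: int, page_size_bytes: int) -> int:
    # Round up: a partially used hugepage is still fully reserved.
    return -(-vm_memory_bytes // page_size_bytes)

assert pages_needed(8 * GIB, 2 * MIB) == 4096   # matches nr_hugepages above
assert pages_needed(32 * GIB, 1 * GIB) == 32    # matches the GRUB example
print("8 GiB of guest RAM ->", pages_needed(8 * GIB, 2 * MIB), "x 2MiB pages")
```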

2.5 SR-IOV and GPU Passthrough

SR-IOV (Single Root I/O Virtualization):

Splits a physical NIC into multiple Virtual Functions (VFs) for direct assignment to VMs.

# Create SR-IOV VFs
echo 4 > /sys/class/net/enp3s0f0/device/sriov_numvfs

# List VFs
lspci | grep "Virtual Function"

<!-- Assign VF to VM (libvirt XML) -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
  </source>
</hostdev>

GPU Passthrough (VFIO):

# Enable IOMMU (GRUB)
# Intel: intel_iommu=on iommu=pt
# AMD: amd_iommu=on iommu=pt

# Bind GPU to vfio-pci
echo "10de 2204" > /sys/bus/pci/drivers/vfio-pci/new_id

2.6 Live Migration

Move a running VM to another host with zero downtime.

Pre-copy Migration Flow:
1. Transfer entire memory to destination host
2. Iteratively send dirty pages
3. When dirty pages are small enough, pause VM
4. Transfer remaining dirty pages
5. Resume VM on destination host

# virsh live migration
virsh migrate --live --persistent --undefinesource \
  myvm qemu+ssh://dest-host/system

# Set migration bandwidth limit (MB/s)
virsh migrate-setspeed myvm 100

# Check migration status
virsh domjobinfo myvm
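
Whether the pre-copy loop converges depends on the ratio of the guest's dirty rate to the migration bandwidth. A toy model of the iterative phase (illustrative numbers only, not QEMU's actual algorithm, which adds auto-converge throttling):

```python
# Toy model of pre-copy live migration: each round transfers the pages
# dirtied during the previous round. Converges only if dirty_rate < bandwidth.
def precopy_rounds(memory_gb: float, bandwidth_gbps: float,
                   dirty_rate_gbps: float, stop_mb: float = 100,
                   max_rounds: int = 30):
    remaining_gb = memory_gb  # round 1 copies all guest memory
    for rnd in range(1, max_rounds + 1):
        if remaining_gb * 1024 <= stop_mb:
            return rnd  # small enough: pause the VM and copy the rest
        transfer_s = remaining_gb * 8 / bandwidth_gbps   # seconds this round
        remaining_gb = transfer_s * dirty_rate_gbps / 8  # dirtied meanwhile
    return None  # did not converge; needs auto-converge or post-copy

# 32 GiB VM over 10 Gbps with a 2 Gbps dirty rate converges in a few rounds
print(precopy_rounds(32, 10, 2))
```

If the dirty rate exceeds the bandwidth, remaining memory grows each round and the model never converges, which is exactly why QEMU falls back to CPU throttling or post-copy.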

3. libvirt Management

3.1 libvirt Architecture

libvirt is a unified API and daemon for managing various hypervisors.

+------------------------------------------------------+
|  Management Tools                                    |
|  +-------+ +--------------+ +-----------+ +-------+  |
|  | virsh | | virt-manager | | OpenStack | | oVirt |  |
|  |       | |              | | Nova      | |       |  |
|  +-------+ +--------------+ +-----------+ +-------+  |
+------------------------------------------------------+
|  libvirt API (C, Python, Go, Java bindings)          |
+------------------------------------------------------+
|  libvirtd daemon                                     |
+------------------------------------------------------+
|  Drivers                                             |
|  +-----+ +-----+ +-----+ +------+ +--------+         |
|  | KVM | | Xen | | LXC | | VBox | | VMware |         |
|  +-----+ +-----+ +-----+ +------+ +--------+         |
+------------------------------------------------------+

3.2 Essential virsh Commands

# VM lifecycle management
virsh list --all                    # List all VMs
virsh start myvm                    # Start VM
virsh shutdown myvm                 # Graceful shutdown
virsh destroy myvm                  # Force stop (power off)
virsh reboot myvm                   # Reboot
virsh suspend myvm                  # Pause (preserve memory state)
virsh resume myvm                   # Resume

# VM creation/deletion
virsh define vm.xml                 # Define VM from XML
virsh undefine myvm                 # Remove VM definition
virsh undefine myvm --remove-all-storage  # Remove with storage

# Snapshot management
virsh snapshot-create-as myvm snap1 "First snapshot"
virsh snapshot-list myvm
virsh snapshot-revert myvm snap1
virsh snapshot-delete myvm snap1

# Resource adjustment
virsh setvcpus myvm 4 --live        # Hot-add vCPUs
virsh setmem myvm 8G --live         # Adjust memory

# Console/display access
virsh console myvm                  # Serial console
virsh vncdisplay myvm               # VNC port info

3.3 XML Domain Configuration

<domain type='kvm'>
  <name>web-server-01</name>
  <uuid>auto-generated</uuid>
  <memory unit='GiB'>8</memory>
  <currentMemory unit='GiB'>8</currentMemory>
  <vcpu placement='static'>4</vcpu>

  <os>
    <type arch='x86_64' machine='pc-q35-8.2'>hvm</type>
    <boot dev='hd'/>
  </os>

  <features>
    <acpi/>
    <apic/>
  </features>

  <cpu mode='host-passthrough' check='none' migratable='on'>
    <topology sockets='1' dies='1' cores='4' threads='1'/>
  </cpu>

  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>

    <!-- virtio disk -->
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='writeback' discard='unmap'/>
      <source file='/var/lib/libvirt/images/web-server-01.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>

    <!-- virtio network -->
    <interface type='bridge'>
      <source bridge='br0'/>
      <model type='virtio'/>
    </interface>

    <!-- VNC display -->
    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'/>

    <!-- Serial console -->
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
  </devices>
</domain>

3.4 Storage Pools

# Directory-based storage pool
virsh pool-define-as default dir --target /var/lib/libvirt/images
virsh pool-build default
virsh pool-start default
virsh pool-autostart default

# LVM storage pool
virsh pool-define-as lvm-pool logical \
  --source-name vg_vms --target /dev/vg_vms

# Ceph RBD storage pool
virsh pool-define-as ceph-pool rbd \
  --source-host mon1.example.com \
  --source-name libvirt-pool \
  --auth-type ceph --auth-username libvirt

# Volume management
virsh vol-create-as default myvm.qcow2 50G --format qcow2
virsh vol-list default
virsh vol-delete myvm.qcow2 default

3.5 Network Configuration

# NAT network (default)
virsh net-list --all

# Bridge network setup (netplan example)
# /etc/netplan/01-bridge.yaml
network:
  version: 2
  ethernets:
    enp3s0:
      dhcp4: false
  bridges:
    br0:
      interfaces: [enp3s0]
      addresses: [192.168.1.100/24]
      routes:
        - to: default
          via: 192.168.1.1
      nameservers:
        addresses: [8.8.8.8, 8.8.4.4]

4. OpenStack Architecture (The Big Picture)

4.1 OpenStack Overview

OpenStack manages compute, storage, and network resources in a data center as pools, exposing them to users through self-service APIs. It is essentially a cloud operating system.

+------------------------------------------------------------+
|                    Horizon (Dashboard)                       |
+------------------------------------------------------------+
|                    OpenStack API Layer                       |
+------+-------+--------+--------+-------+--------+----------+
| Key- | Nova  |Neutron | Cinder | Glance| Swift  | Heat     |
|stone | Comp- | Net-   | Block  | Image | Object | Orches-  |
| Auth | ute   | work   | Store  |       | Store  | tration  |
+------+-------+--------+--------+-------+--------+----------+
|  RabbitMQ (Message Queue)  |  MariaDB/Galera (Database)    |
+------------------------------------------------------------+
|               Hypervisor (KVM/QEMU)                         |
+------------------------------------------------------------+
|               Physical Infrastructure                       |
+------------------------------------------------------------+

4.2 Core Services in Detail

Keystone (Identity and Authentication):

Features:
- Token-based authentication (Fernet, JWT)
- RBAC (Role-Based Access Control)
- Multi-tenancy (Project/Domain isolation)
- LDAP/AD integration, SAML/OIDC Federation
- Service catalog (endpoint registry)

# Issue Keystone token
openstack token issue

# Project/user management
openstack project create --domain default myproject
openstack user create --domain default --password-prompt myuser
openstack role add --project myproject --user myuser member

Nova (Compute):

Nova Internal Components:
- nova-api: REST API endpoint
- nova-scheduler: VM placement algorithm (FilterScheduler)
- nova-conductor: DB access proxy (security)
- nova-compute: Hypervisor driver (libvirt)
- Placement: Resource tracking service

# Create instance
openstack server create \
  --flavor m1.large \
  --image ubuntu-22.04 \
  --network internal-net \
  --security-group default \
  --key-name mykey \
  web-server-01

# Flavor management
openstack flavor create --vcpus 4 --ram 8192 --disk 80 m1.large

# Instance management
openstack server list
openstack server show web-server-01
openstack server resize web-server-01 m1.xlarge
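
The FilterScheduler named above works in two phases: filters eliminate hosts that cannot satisfy the request, then weighers rank the survivors. A stripped-down sketch of that pipeline (illustrative, not Nova's actual code):

```python
# Minimal filter-then-weigh scheduling in the style of Nova's FilterScheduler.
from dataclasses import dataclass

@dataclass
class Host:
    name: str
    free_vcpus: int
    free_ram_mb: int

def schedule(hosts, vcpus, ram_mb):
    # Filter phase: keep only hosts that can fit the requested flavor.
    candidates = [h for h in hosts
                  if h.free_vcpus >= vcpus and h.free_ram_mb >= ram_mb]
    # Weigh phase: prefer the host with the most free RAM (RAMWeigher-style).
    return max(candidates, key=lambda h: h.free_ram_mb, default=None)

hosts = [Host("compute1", 4, 8192),
         Host("compute2", 16, 65536),
         Host("compute3", 8, 4096)]
best = schedule(hosts, vcpus=4, ram_mb=8192)
print(best.name)  # compute2 (only fitting host with the most free RAM)
```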

Neutron (Networking):

Neutron Components:
- neutron-server: API server
- ML2 Plugin: Modular network driver
- OVS/OVN Agent: Virtual switch management
- L3 Agent: Routing, NAT
- DHCP Agent: IP allocation
- Metadata Agent: Instance metadata

# Create network
openstack network create internal-net
openstack subnet create --network internal-net \
  --subnet-range 10.0.1.0/24 \
  --gateway 10.0.1.1 \
  --dns-nameserver 8.8.8.8 \
  internal-subnet

# Create router
openstack router create main-router
openstack router set --external-gateway external-net main-router
openstack router add subnet main-router internal-subnet

# Floating IP
openstack floating ip create external-net
openstack server add floating ip web-server-01 203.0.113.10

# Security groups
openstack security group rule create --protocol tcp \
  --dst-port 80:80 --remote-ip 0.0.0.0/0 default
openstack security group rule create --protocol tcp \
  --dst-port 22:22 --remote-ip 10.0.0.0/8 default

Cinder (Block Storage):

# Create and attach volume
openstack volume create --size 100 data-vol
openstack server add volume web-server-01 data-vol

# Snapshots
openstack volume snapshot create --volume data-vol snap-before-upgrade

# Volume types (backend selection)
openstack volume type create --property volume_backend_name=ceph-ssd fast-ssd
openstack volume create --size 50 --type fast-ssd fast-data

Heat (Orchestration):

# HOT (Heat Orchestration Template) example
heat_template_version: 2021-04-16

description: Web server stack

parameters:
  image:
    type: string
    default: ubuntu-22.04
  flavor:
    type: string
    default: m1.large

resources:
  web_server:
    type: OS::Nova::Server
    properties:
      name: web-server
      image: { get_param: image }
      flavor: { get_param: flavor }
      networks:
        - network: internal-net
      security_groups:
        - default

  web_port:
    type: OS::Neutron::Port
    properties:
      network: internal-net

  floating_ip:
    type: OS::Neutron::FloatingIP
    properties:
      floating_network: external-net
      port_id: { get_resource: web_port }

outputs:
  server_ip:
    value: { get_attr: [floating_ip, floating_ip_address] }

4.3 Additional Services

| Service        | Project   | Description          | AWS Equivalent |
|----------------|-----------|----------------------|----------------|
| Compute        | Nova      | VM management        | EC2            |
| Networking     | Neutron   | SDN                  | VPC            |
| Block Storage  | Cinder    | Disks                | EBS            |
| Object Storage | Swift     | Distributed storage  | S3             |
| Image          | Glance    | VM images            | AMI            |
| Identity       | Keystone  | Authentication (IAM) | IAM            |
| Orchestration  | Heat      | Stack management     | CloudFormation |
| Dashboard      | Horizon   | Web UI               | Console        |
| Bare Metal     | Ironic    | Physical servers     | -              |
| DNS            | Designate | DNS management       | Route 53       |
| Load Balancer  | Octavia   | LBaaS                | ELB            |
| File Storage   | Manila    | Shared filesystems   | EFS            |
| Containers     | Magnum    | K8s clusters         | EKS            |

5. OpenStack Deployment

5.1 Deployment Methods Compared

| Method            | Purpose                | Complexity | Production   |
|-------------------|------------------------|------------|--------------|
| DevStack          | Development/testing    | Low        | Not suitable |
| Kolla-Ansible     | Production             | Medium     | Recommended  |
| TripleO           | Large-scale production | High       | Yes          |
| MicroStack (Snap) | Single-node testing    | Very low   | Not suitable |
| OpenStack-Ansible | Production             | Medium     | Yes          |

5.2 Minimum Hardware Requirements

Controller Nodes (3 for HA):
  - CPU: 8+ cores
  - RAM: 32GB+
  - Disk: 500GB SSD (OS + DB)
  - NIC: 2+ (management + service)

Compute Nodes (N nodes):
  - CPU: As many cores as possible (scales with VM count)
  - RAM: 128GB+ (sum of VM memory + overhead)
  - Disk: SSD (local ephemeral)
  - NIC: 2+

Storage Nodes (Ceph OSD, 3+ nodes):
  - CPU: 1 core per OSD
  - RAM: 4GB per OSD
  - Disk: NVMe/SSD (OSD) + SSD (WAL/DB)
  - NIC: 10GbE+
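
The compute-node RAM guidance follows directly from the target VM density. A back-of-envelope sizing sketch (all numbers are assumptions, not requirements):

```python
# Back-of-envelope compute-node sizing: how many VMs fit given total RAM,
# a host reservation, and an optional RAM overcommit ratio.
def max_vms(host_ram_gb, reserved_gb, vm_ram_gb, ram_overcommit=1.0):
    usable = (host_ram_gb - reserved_gb) * ram_overcommit
    return int(usable // vm_ram_gb)

# 128 GiB node, 8 GiB reserved for the host OS/OVS/agents, 8 GiB VMs
print(max_vms(128, 8, 8))       # no overcommit: 15 VMs
print(max_vms(128, 8, 8, 1.2))  # 1.2:1 RAM overcommit: 18 VMs
```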

5.3 Kolla-Ansible Deployment

# 1. Preparation
pip install kolla-ansible

# 2. Inventory setup
cp /usr/share/kolla-ansible/ansible/inventory/multinode .
# Edit inventory file

# 3. Global configuration
# /etc/kolla/globals.yml
kolla_base_distro: "ubuntu"
openstack_release: "2024.2"

kolla_internal_vip_address: "10.0.0.100"
kolla_external_vip_address: "203.0.113.100"

network_interface: "eth0"
neutron_external_interface: "eth1"

enable_cinder: "yes"
enable_cinder_backend_lvm: "no"
enable_cinder_backend_ceph: "yes"

enable_heat: "yes"
enable_horizon: "yes"
enable_neutron_provider_networks: "yes"

ceph_cinder_user: "cinder"
ceph_cinder_pool_name: "volumes"

# 4. Generate passwords
kolla-genpwd

# 5. Bootstrap
kolla-ansible -i multinode bootstrap-servers

# 6. Pre-checks
kolla-ansible -i multinode prechecks

# 7. Deploy
kolla-ansible -i multinode deploy

# 8. Post-deploy (generate OpenRC/clouds.yaml)
kolla-ansible -i multinode post-deploy

# 9. Verify
source /etc/kolla/admin-openrc.sh
openstack service list
openstack compute service list
openstack network agent list

5.4 Network Architecture

+------------------------------------------------------------+
|  External Network (203.0.113.0/24)                          |
|                        |                                    |
|                  [External Router]                           |
|                        |                                    |
|  Management Network (10.0.0.0/24)                           |
|  +----------+  +----------+  +----------+                   |
|  |Controller|  |Controller|  |Controller|                   |
|  |   Node 1 |  |   Node 2 |  |   Node 3 |                   |
|  +----------+  +----------+  +----------+                   |
|        |              |              |                       |
|  Tunnel Network (10.0.1.0/24) - VXLAN                       |
|        |              |              |                       |
|  +----------+  +----------+  +----------+                   |
|  | Compute  |  | Compute  |  | Compute  |                   |
|  |  Node 1  |  |  Node 2  |  |  Node 3  |                   |
|  +----------+  +----------+  +----------+                   |
|        |              |              |                       |
|  Storage Network (10.0.2.0/24) - Ceph                       |
|        |              |              |                       |
|  +----------+  +----------+  +----------+                   |
|  |  Ceph    |  |  Ceph    |  |  Ceph    |                   |
|  |  OSD 1   |  |  OSD 2   |  |  OSD 3   |                   |
|  +----------+  +----------+  +----------+                   |
+------------------------------------------------------------+

6. Networking Deep Dive

6.1 Provider vs Self-service Networks

Provider Networks:

  • Created by administrators, mapped directly to physical networks
  • VLAN-based, uses external routing equipment
  • Simple but limited flexibility

Self-service (Tenant) Networks:

  • Tenants create freely
  • Isolated via VXLAN/GRE tunneling
  • Neutron L3 agent handles routing
  • Floating IPs for external access

6.2 OVS vs OVN

Open vSwitch (OVS):
- Software virtual switch
- Managed by neutron-openvswitch-agent
- Mature and stable
- Performance can degrade at scale

OVN (Open Virtual Network):
- SDN solution built on top of OVS
- Distributed architecture (no central agent needed)
- Native L3/NAT/DHCP/ACL support
- Superior performance at scale
- Recommended backend for modern OpenStack

# Check OVN configuration
ovn-sbctl show
ovn-nbctl show

# List logical switches/routers
ovn-nbctl ls-list
ovn-nbctl lr-list

6.3 VXLAN Tunneling

VXLAN (Virtual Extensible LAN):
- Encapsulates L2 frames over L3 via UDP port 4789
- 24-bit VNI allows ~16 million segments (VLAN: 4,096)
- Ideal for multi-tenant environments

Packet Structure:
+-------+-------+-------+--------+-------+-------+---------+
| Outer | Outer | Outer | VXLAN  | Inner | Inner | Inner   |
| Ether |  IP   |  UDP  | Header | Ether |  IP   | Payload |
+-------+-------+-------+--------+-------+-------+---------+
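
The encapsulation headers are why tunnel networks need MTU headroom. A sketch of the per-packet overhead and the guest MTU it implies (IPv4 outer header, no outer VLAN tag assumed):

```python
# VXLAN encapsulation overhead over IPv4 (bytes), per the packet layout above.
OUTER_ETH = 14  # outer Ethernet header (no VLAN tag assumed)
OUTER_IP4 = 20  # outer IPv4 header
OUTER_UDP = 8   # outer UDP header (destination port 4789)
VXLAN_HDR = 8   # VXLAN header carrying the 24-bit VNI

overhead = OUTER_ETH + OUTER_IP4 + OUTER_UDP + VXLAN_HDR
print(overhead)         # 50 bytes per packet
print(1500 - overhead)  # guest MTU 1450 on a 1500-byte underlay
print(2 ** 24)          # 16,777,216 possible VNIs vs 4,096 VLAN IDs
```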

6.4 DVR (Distributed Virtual Router)

DVR distributes routing to each compute node, eliminating the network node bottleneck.

Without DVR (Centralized):
  VM -> Compute Node -> [Network Node: L3 Agent] -> External

With DVR:
  VM -> Compute Node [Local L3 Agent] -> External
  (East-West traffic also handled locally)

6.5 Security Groups and FWaaS

# Create security group
openstack security group create web-sg
openstack security group rule create \
  --protocol tcp --dst-port 80 --remote-ip 0.0.0.0/0 web-sg
openstack security group rule create \
  --protocol tcp --dst-port 443 --remote-ip 0.0.0.0/0 web-sg
openstack security group rule create \
  --protocol tcp --dst-port 22 --remote-ip 10.0.0.0/8 web-sg

7. Storage Architecture

7.1 Storage Types

+------------------------------------------------------------+
|  Ephemeral Storage                                          |
|  - Deleted when VM is deleted                               |
|  - Local disk (fast but non-persistent)                     |
|  - Can convert to permanent image via snapshot              |
+------------------------------------------------------------+
|  Persistent Block Storage (Cinder)                          |
|  - Independent lifecycle from VM                            |
|  - Can be reattached to different VMs                       |
|  - Supports snapshots, clones, encryption                   |
+------------------------------------------------------------+
|  Object Storage (Swift)                                     |
|  - REST API access                                          |
|  - Stores VM images, backup data                            |
|  - High durability (triple replication)                     |
+------------------------------------------------------------+
|  Shared File System (Manila)                                |
|  - NFS/CIFS shares                                          |
|  - Multiple VMs access simultaneously                       |
+------------------------------------------------------------+

7.2 Ceph Integration

Ceph is the best-integrated distributed storage for OpenStack.

Ceph + OpenStack Mapping:
- Ceph RBD  -> Cinder (block storage)
- Ceph RBD  -> Nova (ephemeral disks, live migration)
- Ceph RBD  -> Glance (image storage)
- Ceph RGW  -> Swift API compatible (object storage)
- CephFS    -> Manila (shared file systems)

# Create Ceph pools for OpenStack
ceph osd pool create volumes 128
ceph osd pool create images 64
ceph osd pool create vms 128
ceph osd pool create backups 64

# Create Ceph auth keys
ceph auth get-or-create client.cinder \
  mon 'profile rbd' \
  osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd pool=images'

ceph auth get-or-create client.glance \
  mon 'profile rbd' \
  osd 'profile rbd pool=images'

7.3 Storage Tiering

| Tier   | Media         | Use Case                        | Latency    |
|--------|---------------|---------------------------------|------------|
| Tier 0 | NVMe SSD      | Databases, high-performance VMs | Sub 0.1 ms |
| Tier 1 | SATA SSD      | General VMs, web servers        | Sub 0.5 ms |
| Tier 2 | HDD (10K RPM) | Archive, backups                | Sub 5 ms   |
| Tier 3 | Object (S3)   | Cold data                       | Tens of ms |

8. Performance Optimization

8.1 CPU Optimization

# CPU pinning + NUMA awareness (nova.conf)
# [DEFAULT]
# reserved_host_cpus = 4           # number of host CPUs reserved for the OS
# cpu_allocation_ratio = 4.0
#
# [compute]
# cpu_dedicated_set = 4-31         # replaces the deprecated vcpu_pin_set
#
# [libvirt]
# cpu_mode = host-passthrough

<!-- High-performance VM: CPU pinning + Hugepages + NUMA -->
<domain type='kvm'>
  <vcpu placement='static'>16</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <!-- ... -->
    <emulatorpin cpuset='0-1'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <memoryBacking>
    <hugepages>
      <page size='1' unit='GiB'/>
    </hugepages>
  </memoryBacking>
  <cpu mode='host-passthrough'>
    <topology sockets='1' dies='1' cores='16' threads='1'/>
    <numa>
      <cell id='0' cpus='0-15' memory='32' unit='GiB'/>
    </numa>
  </cpu>
</domain>

8.2 Network Optimization

Performance ranking (low -> high):
1. virtio-net (software virtualization) - ~10Gbps
2. vhost-net (kernel-mode datapath) - ~15Gbps
3. DPDK (userspace packet processing) - ~40Gbps
4. SR-IOV (hardware passthrough) - near-native (~100Gbps)

8.3 Storage I/O Optimization

# virtio-scsi (multi-queue support)
# Supports more devices than virtio-blk, SCSI command support

# I/O scheduler (none recommended for NVMe)
echo none > /sys/block/nvme0n1/queue/scheduler

<!-- I/O thread separation (libvirt) -->
<iothreads>4</iothreads>
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none' io='native' iothread='1'/>
  <source file='/var/lib/libvirt/images/vm.qcow2'/>
  <target dev='vda' bus='virtio'/>
</disk>

8.4 Overcommit Ratios

| Resource | Default | Recommended (General) | Recommended (High-Perf) |
|----------|---------|-----------------------|-------------------------|
| CPU      | 16:1    | 4:1 to 8:1            | 1:1 (pinned)            |
| RAM      | 1.5:1   | 1:1 to 1.2:1          | 1:1 (hugepages)         |
| Disk     | 1.0:1   | 2:1 (thin)            | 1:1 (thick)             |
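
These ratios translate directly into schedulable capacity; a sketch of the arithmetic (a simplified version of what the Placement service tracks per resource provider):

```python
# Effective schedulable capacity = physical resource * allocation ratio,
# minus host reservations (simplified Placement-style arithmetic).
def effective_capacity(physical, ratio, reserved=0):
    return int(physical * ratio) - reserved

# 32-core node at the Nova default 16:1 CPU overcommit, 4 cores reserved
print(effective_capacity(32, 16.0, 4))  # 508 schedulable vCPUs
# The same node tuned to 4:1 for general workloads
print(effective_capacity(32, 4.0, 4))   # 124 schedulable vCPUs
```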

9. OpenStack vs VMware vs Proxmox

| Feature            | OpenStack                | VMware vSphere         | Proxmox VE                      |
|--------------------|--------------------------|------------------------|---------------------------------|
| License            | Open source (Apache 2.0) | Commercial (expensive) | Open source (AGPL) + commercial |
| Hypervisor         | KVM (default)            | ESXi                   | KVM + LXC                       |
| Installation       | Difficult                | Medium                 | Very easy                       |
| Learning curve     | Very steep               | Steep                  | Gentle                          |
| Scalability        | Thousands of nodes       | Hundreds of nodes      | Tens of nodes                   |
| API/automation     | Powerful REST API        | vSphere API, PowerCLI  | REST API, CLI                   |
| Multi-tenancy      | Strong (Project/Domain)  | Limited (vCenter)      | Basic (Pool/Permission)         |
| SDN                | Neutron (OVS/OVN)        | NSX                    | Linux Bridge/OVS                |
| Storage            | Cinder + Ceph/NFS/iSCSI  | vSAN, VMFS             | ZFS, Ceph, LVM                  |
| HA                 | Pacemaker, HAProxy       | vSphere HA, FT         | Built-in HA                     |
| Container support  | Magnum (K8s)             | Tanzu                  | Built-in LXC                    |
| Commercial support | Red Hat, Canonical, SUSE | VMware (Broadcom)      | Proxmox                         |
| TCO                | Low (ops staff needed)   | Very high              | Very low                        |
| Best for           | Large-scale clouds       | Enterprise             | Small-medium                    |

Selection Guide:

  • OpenStack: Large-scale private/public clouds, telcos, research institutions, when AWS-like self-service is needed
  • VMware: Existing enterprise environments, Windows-centric, vendor support required, budget available
  • Proxmox: SMBs, home labs, simple virtualization, rapid adoption

10. Production Best Practices

10.1 High Availability (HA)

Control Plane HA:
- Minimum 3 controller nodes
- HAProxy + Keepalived (VIP)
- Pacemaker/Corosync (cluster management)
- Galera Cluster (MariaDB synchronous replication)
- RabbitMQ mirrored queues

Data Plane HA:
- Ceph (triple replication, automatic node failure recovery)
- Neutron DVR (eliminates network node SPOF)
- Nova instance HA (Masakari)

# Check HAProxy status
echo "show stat" | socat stdio /var/run/haproxy/admin.sock

# Galera cluster status
mysql -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
mysql -e "SHOW STATUS LIKE 'wsrep_cluster_status';"

# RabbitMQ cluster status
rabbitmqctl cluster_status

10.2 Monitoring

# Prometheus + OpenStack Exporter configuration
# prometheus.yml
scrape_configs:
  - job_name: 'openstack'
    static_configs:
      - targets: ['openstack-exporter:9198']
    metrics_path: '/metrics'

  - job_name: 'ceph'
    static_configs:
      - targets: ['ceph-mgr:9283']

  - job_name: 'libvirt'
    static_configs:
      - targets: ['compute1:9177', 'compute2:9177']

Key Monitoring Metrics:

| Category | Metric                       | Threshold                 |
|----------|------------------------------|---------------------------|
| Nova     | Active instance count        | 80% of capacity           |
| Nova     | Scheduling failure rate      | Alert at 5%+              |
| Neutron  | Floating IP utilization      | Alert at 90%              |
| Cinder   | Volume creation failure rate | Alert at 1%+              |
| Ceph     | OSD status                   | Alert on 1+ down OSD      |
| Ceph     | Utilization                  | Warning 70%, Critical 85% |
| RabbitMQ | Queue length                 | Alert at 1000+            |
| MariaDB  | Replication lag              | Alert at 1s+              |
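
Thresholds like these are typically encoded as Prometheus alerting rules. A sketch for the Ceph rows (metric names assume the Ceph mgr Prometheus module; adjust to your exporter versions):

```yaml
# alert-rules.yml -- example rules for the thresholds above.
# ceph_cluster_total_used_bytes / ceph_cluster_total_bytes and ceph_osd_up
# are exposed by the Ceph mgr Prometheus module.
groups:
  - name: ceph-capacity
    rules:
      - alert: CephUtilizationWarning
        expr: ceph_cluster_total_used_bytes / ceph_cluster_total_bytes > 0.70
        for: 15m
        labels:
          severity: warning
      - alert: CephUtilizationCritical
        expr: ceph_cluster_total_used_bytes / ceph_cluster_total_bytes > 0.85
        for: 5m
        labels:
          severity: critical
      - alert: CephOsdDown
        expr: ceph_osd_up == 0
        for: 5m
        labels:
          severity: critical
```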

10.3 Backup Strategies

# OpenStack database backup
mysqldump --all-databases --single-transaction \
  --routines --triggers > openstack-db-backup.sql

# Cinder volume backup (Ceph)
rbd snap create volumes/volume-UUID@backup-20260325
rbd export volumes/volume-UUID@backup-20260325 volume-backup.raw

# Glance image backup
openstack image save --file ubuntu-backup.qcow2 ubuntu-22.04

# Nova instance snapshot
openstack server image create --name "web-server-backup" web-server-01

10.4 Upgrade Strategy

Rolling Upgrade Procedure:
1. Upgrade controller nodes one at a time
   - Minimize service disruption
   - Leverage API backward compatibility
2. Run database migrations
3. Upgrade compute nodes one at a time
   - Live migrate VMs to create empty nodes for upgrade
4. Upgrade Neutron agents
5. Verify all services

11. Quiz

Q1. What type of hypervisor is KVM?

Answer: Type-1 (Bare-Metal) Hypervisor

KVM operates as a Linux kernel module, transforming the Linux kernel itself into a hypervisor. Since it is embedded in the host OS (Linux) that runs directly on hardware, it is classified as Type-1. It leverages Intel VT-x/AMD-V hardware virtualization extensions for high performance.

Q2. Which OpenStack service manages virtual networks, subnets, and routers?

Answer: Neutron

Neutron is the OpenStack Networking-as-a-Service component. It consists of the ML2 plugin, OVS/OVN agents, L3 agent, DHCP agent, and more. It manages virtual networks, subnets, routers, security groups, and Floating IPs. It is the equivalent of AWS VPC.

Q3. What is the key advantage of VXLAN tunneling over VLAN?

Answer: Segment count expansion (from 4,096 to approximately 16 million)

VLAN uses a 12-bit VLAN ID supporting a maximum of 4,096 segments. VXLAN uses a 24-bit VNI (Virtual Network Identifier), supporting approximately 16 million network segments. This is essential for large-scale multi-tenant cloud environments.

Q4. What does KSM (Kernel Same-page Merging) do?

Answer: Shares identical memory pages across VMs to reduce physical memory usage

KSM is a Linux kernel feature that merges identical memory pages from multiple VMs into a single physical page (Copy-on-Write). The more VMs running the same OS image, the greater the benefit. However, there is scanning overhead, so proper tuning is required.

Q5. Why is Kolla-Ansible the preferred method for OpenStack production deployment?

Answer: Containerized service deployment provides management ease, reproducibility, and upgrade convenience

Kolla-Ansible packages each OpenStack service as a Docker container for deployment. This enables environment isolation (no dependency conflicts), fast rollback, reproducible deployments, and independent per-service upgrades. Being Ansible-based makes automation and repeated deployments straightforward, with production HA configuration supported out of the box.


12. References

  1. OpenStack Official Documentation - docs.openstack.org
  2. KVM Official Site - linux-kvm.org
  3. libvirt Official Documentation - libvirt.org/docs.html
  4. QEMU Official Documentation - qemu.org/documentation
  5. Kolla-Ansible Documentation - docs.openstack.org/kolla-ansible/latest
  6. Ceph Official Documentation - docs.ceph.com
  7. Open vSwitch Documentation - docs.openvswitch.org
  8. OVN Architecture - ovn.org
  9. Red Hat OpenStack Platform - access.redhat.com/documentation/en-us/red_hat_openstack_platform
  10. Proxmox VE Documentation - pve.proxmox.com/wiki/Main_Page
  11. VMware vSphere Documentation - docs.vmware.com
  12. OpenStack Foundation (OpenInfra) - openinfra.dev
  13. DPDK Documentation - doc.dpdk.org
  14. SR-IOV Guide - kernel.org/doc/html/latest/PCI/sriov-howto.html