- Author: Youngju Kim (@fjvbn20031)
containerd Networking and Storage
containerd does not implement networking and storage directly but integrates with external plugins through standard interfaces. This post analyzes network configuration via CNI, namespace management, volume mounts, device access, and security module integration.
1. CNI Integration
1.1 CNI Overview
Container Network Interface (CNI) is the standard interface for container networking. containerd calls CNI plugins to configure networks.
CNI call flow:
kubelet -> containerd (CRI RunPodSandbox)
|
v
Create network namespace
|
v
Call CNI plugin
(ADD command)
|
v
IP allocation, routing setup, interface creation
|
v
Return result to containerd
1.2 CNI Configuration
CNI configuration file location:
Config directory: /etc/cni/net.d/
Binary directory: /opt/cni/bin/
containerd CNI configuration (config.toml):
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"
max_conf_num = 1
1.3 CNI Plugin Chain
Network configuration is defined as a plugin chain, e.g. in a 10-calico.conflist file:
1. Main plugin (calico, cilium, flannel, etc.):
- Create network interface
- IP allocation (IPAM)
- Routing rule setup
2. Meta plugin (bandwidth, portmap, etc.):
- Bandwidth limiting
- Port mapping
- Firewall rules
Execution order:
ADD: Main -> Meta plugins (forward)
DEL: Meta -> Main plugins (reverse)
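The plugin chain above is typically written as a `.conflist` file. A trimmed sketch of a Calico-style configuration (field values are illustrative, not a complete Calico config):

```json
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "ipam": { "type": "calico-ipam" }
    },
    {
      "type": "portmap",
      "capabilities": { "portMappings": true }
    },
    {
      "type": "bandwidth",
      "capabilities": { "bandwidth": true }
    }
  ]
}
```

On ADD the plugins run top to bottom; on DEL the same list runs bottom to top.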
1.4 CNI Call Details
CNI ADD execution detail:
1. containerd determines network namespace path
/var/run/netns/cni-abc123
2. Set CNI environment variables:
CNI_COMMAND=ADD
CNI_CONTAINERID=abc123
CNI_NETNS=/var/run/netns/cni-abc123
CNI_IFNAME=eth0
CNI_PATH=/opt/cni/bin
3. Execute CNI plugin binary
Pass config JSON via stdin
4. Plugin returns result via stdout:
- Assigned IP address
- Gateway address
- DNS configuration
- Routing information
5. containerd stores the result
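The steps above can be sketched in Go. This is a minimal illustration of the exec-based CNI protocol, not containerd's actual code (which uses the libcni library); the plugin path and the config JSON are assumptions for illustration:

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// cniEnv builds the environment variables a runtime passes to a
// CNI plugin for an ADD operation (per the CNI spec).
func cniEnv(containerID, netnsPath string) []string {
	return []string{
		"CNI_COMMAND=ADD",
		"CNI_CONTAINERID=" + containerID,
		"CNI_NETNS=" + netnsPath,
		"CNI_IFNAME=eth0",
		"CNI_PATH=/opt/cni/bin",
	}
}

func main() {
	env := cniEnv("abc123", "/var/run/netns/cni-abc123")

	// A real runtime would now execute the plugin binary with this
	// environment and write the network config JSON to its stdin.
	// We only assemble the command here; the binary path is illustrative.
	cmd := exec.Command("/opt/cni/bin/bridge")
	cmd.Env = env
	cmd.Stdin = strings.NewReader(
		`{"cniVersion":"1.0.0","name":"mynet","type":"bridge"}`)

	// The plugin would reply on stdout with the assigned IP, gateway,
	// DNS, and routes as a JSON result.
	fmt.Println(strings.Join(cmd.Env, "\n"))
}
```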
2. Network Namespaces
2.1 Namespace Creation
Pod network namespace:
During Pod Sandbox creation:
1. Create new network namespace with unshare(CLONE_NEWNET)
2. Persist via bind mount at /var/run/netns/
3. Execute CNI plugins in this namespace
4. All containers in the Pod share this namespace
Namespace sharing:
Pause container holds the network namespace
App containers join the same namespace
-> Containers in Pod can communicate via localhost
2.2 Namespace Cleanup
Namespace cleanup:
During Pod deletion:
1. CNI DEL command releases network resources
- Return IP address
- Delete interface
- Remove routing rules
2. Unmount bind mount from /var/run/netns/
3. Network namespace automatically deleted
3. Volume Mounts
3.1 Mount Types
containerd manages volumes through mount configuration in the OCI spec:
Mount types:
1. bind mount:
- Mount host file/directory into container
- Host and container share the same data
- Used for ConfigMap, Secret, emptyDir, etc.
2. tmpfs mount:
- Memory-based filesystem
- Data lost on container termination
- Used for /dev/shm, /run, etc.
3. Special filesystems:
- proc: /proc
- sysfs: /sys
- cgroup: /sys/fs/cgroup
- devpts: /dev/pts
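These mount types all end up as entries in the OCI runtime spec's `mounts` array. A hypothetical fragment (paths are made up; the ConfigMap source path follows kubelet's usual layout):

```json
"mounts": [
  {
    "destination": "/etc/app-config",
    "type": "bind",
    "source": "/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~configmap/app-config",
    "options": ["rbind", "ro"]
  },
  {
    "destination": "/dev/shm",
    "type": "tmpfs",
    "source": "shm",
    "options": ["nosuid", "noexec", "nodev", "mode=1777", "size=65536k"]
  }
]
```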
3.2 Mount Propagation
Mount propagation options:
1. private:
- No mount event propagation
- Default
2. rprivate:
- Recursive private
3. shared:
- Bidirectional mount event propagation
- Mount on host -> visible in container
- Mount in container -> visible on host
4. rshared:
- Recursive shared
5. slave:
- Host -> container unidirectional propagation
- Useful for volume plugins
6. rslave:
- Recursive slave
Kubernetes usage:
- Controlled via MountPropagation field
- CSI drivers typically use Bidirectional (mapped to rshared)
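Propagation is expressed as a mount option in the OCI spec. A hypothetical CSI-style bind mount where mounts made by the container must become visible on the host:

```json
{
  "destination": "/var/lib/kubelet/pods",
  "type": "bind",
  "source": "/var/lib/kubelet/pods",
  "options": ["rbind", "rshared"]
}
```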
3.3 CRI Volume Processing
Volume processing via CRI:
kubelet adds mounts to OCI spec:
1. emptyDir:
- kubelet creates directory on host
- Passed to container as bind mount
2. hostPath:
- Direct bind mount of host path
3. ConfigMap/Secret:
- kubelet creates data on tmpfs
- Passed to container as bind mount
4. PersistentVolumeClaim:
- kubelet mounts volume via CSI driver
- Mounted path passed as bind mount
containerd's role:
- Reflect kubelet-prepared mount info in OCI spec
- runc performs the actual mount
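For reference, the kubelet-side input that triggers case 1 is an ordinary Pod manifest (names and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo          # hypothetical Pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    emptyDir: {}             # kubelet creates a host dir, containerd bind-mounts it
```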
4. Device Access
4.1 Device Mapping
Device access mechanism:
OCI spec devices section:
linux:
  devices:
    - path: "/dev/nvidia0"
      type: "c"
      major: 195
      minor: 0
      fileMode: 438
      uid: 0
      gid: 0
Cgroup device access control:
linux:
  resources:
    devices:
      - allow: true
        type: "c"
        major: 195
        access: "rwm"
4.2 GPU Support
GPU access (NVIDIA):
NVIDIA Container Toolkit integration:
1. nvidia-container-runtime-hook:
- Operates as OCI runtime hook
- Runs before container start
- Mounts NVIDIA driver libraries into container
- Adds GPU device nodes to container
2. CDI (Container Device Interface):
- Device vendor-neutral standard
- Define device specs in /etc/cdi/
- containerd reads CDI specs and reflects in OCI spec
CDI spec example:
cdiVersion: "0.5.0"
kind: "nvidia.com/gpu"
devices:
  - name: "0"
    containerEdits:
      deviceNodes:
        - path: "/dev/nvidia0"
      mounts:
        - hostPath: "/usr/lib/x86_64-linux-gnu/libnvidia-ml.so"
          containerPath: "/usr/lib/x86_64-linux-gnu/libnvidia-ml.so"
4.3 Other Devices
Other device access:
1. FPGA:
- Expose FPGA devices via CDI specs
- Vendor-specific device plugins
2. InfiniBand/RDMA:
- Map /dev/infiniband/* devices
- Share network device namespace
3. Serial/USB:
- Direct host device mapping
- Privileged mode or explicit device allowlist
5. SELinux Integration
5.1 SELinux Context
SELinux container security:
SELinux settings in the OCI spec (the process label lives under process.selinuxLabel, the file label under linux.mountLabel):
process:
  selinuxLabel: "system_u:system_r:container_t:s0:c1,c2"
linux:
  mountLabel: "system_u:object_r:container_file_t:s0:c1,c2"
Components:
- user: system_u
- role: system_r (process) / object_r (file)
- type: container_t (process) / container_file_t (file)
- level: s0:c1,c2 (MCS category)
MCS (Multi-Category Security):
- Assigns unique categories to each container
- Prevents access to other containers' files
- Isolation between host and container
5.2 SELinux Processing Flow
SELinux application:
1. kubelet determines Pod SELinux options
- securityContext.seLinuxOptions
- Automatic MCS label assignment
2. Passed to containerd via CRI
- processLabel: process security context
- mountLabel: file security context
3. containerd reflects in OCI spec
4. runc applies at execution:
- Apply SELinux label to process
- Apply SELinux label to rootfs
- Apply SELinux label to mounts
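The kubelet-side input in step 1 looks like this (values illustrative; omitting seLinuxOptions entirely lets the runtime assign MCS categories automatically):

```yaml
securityContext:
  seLinuxOptions:
    user: system_u
    role: system_r
    type: container_t
    level: "s0:c1,c2"
```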
6. AppArmor Integration
6.1 AppArmor Profiles
AppArmor container security:
Default profile: cri-containerd.apparmor.d
Key rules:
- Filesystem access restrictions
deny /proc/kcore r,
deny /sys/firmware/** r,
- Network access control
- Capability restrictions
- Mount operation restrictions
Profile application:
OCI spec:
process:
  apparmorProfile: "cri-containerd.apparmor.d"
6.2 Custom Profiles
Custom AppArmor profiles:
1. Install profile on host:
Place profile file in /etc/apparmor.d/
apparmor_parser -r /etc/apparmor.d/my-profile
2. Specify in Pod:
annotations:
  container.apparmor.security.beta.kubernetes.io/app: localhost/my-profile
3. containerd reflects in OCI spec:
process:
  apparmorProfile: "my-profile"
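A minimal my-profile in the spirit of the rules in 6.1 might look like the following. This is a sketch, not a hardened production profile:

```
#include <tunables/global>

profile my-profile flags=(attach_disconnected) {
  #include <abstractions/base>

  file,
  network,
  capability,

  deny /proc/kcore r,
  deny /sys/firmware/** r,
  deny mount,
}
```

Load it with apparmor_parser as in step 1, then reference it from the Pod annotation.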
7. Seccomp Integration
7.1 Seccomp Profiles
Seccomp (Secure Computing):
Define allowed/blocked system calls:
Default action: SCMP_ACT_ERRNO (deny)
Allowed system calls example:
- read, write, open, close
- mmap, mprotect, munmap
- socket, connect, accept
- ...
Blocked system calls example:
- mount, umount (prevent container escape)
- reboot
- kexec_load
- ptrace (in some environments)
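In profile form this looks like the following heavily trimmed sketch (a real default profile allows several hundred syscalls):

```json
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": ["read", "write", "openat", "close", "fstat", "exit_group"],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
```

Everything not explicitly listed fails with an errno, which is what blocks mount, reboot, and kexec_load.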
7.2 Seccomp Application
Seccomp profile application:
1. Kubernetes SecurityContext:
securityContext:
seccompProfile:
type: RuntimeDefault
2. RuntimeDefault profile:
- containerd/runc default Seccomp profile
- Blocks dangerous system calls
- Suitable for most workloads
3. Custom profile:
securityContext:
seccompProfile:
type: Localhost
localhostProfile: "profiles/my-seccomp.json"
8. Summary
containerd networking and storage follows a delegation model through standard interfaces. Network configuration via CNI, mount management via OCI spec, device access via CDI, and security isolation via SELinux/AppArmor/Seccomp are the key pillars. This standards-based design allows containerd to flexibly integrate with various networking solutions and security modules.