eBPF Fundamentals — Programs, Maps, and the World of the Verifier
Author: Youngju Kim (@fjvbn20031)
Introduction
Imagine a requirement lands on your desk: "I want to see, in real time, which files a particular process is opening on a production server." Traditionally you had two options: write a kernel module and load it, or attach a ptrace-based tool such as strace to the process. The former means accepting the risk of a kernel panic; the latter can slow the target process down by an order of magnitude or more. Neither is an easy choice in production.
eBPF (extended Berkeley Packet Filter) solves this dilemma at its root. Without recompiling the kernel or loading a module, you can attach verified, safe programs to almost any point inside the kernel and run them. Today, CNIs like Cilium, load balancers like Katran, security tools like Falco and Tetragon, and countless observability agents all run on top of eBPF.
In this post we will walk through the three core building blocks of eBPF — programs, maps, and the verifier — and then write a first program end to end using libbpf and CO-RE. This is the first article in a series that continues with an observability deep dive and a runtime security deep dive.
How eBPF Changed the Kernel
The era of kernel modules and its limits
The traditional way to extend Linux kernel behavior was the loadable kernel module (LKM). Modules are powerful but come with fatal drawbacks.
| Aspect | Kernel module (LKM) | eBPF program |
|---|---|---|
| Safety | A single bug can panic the kernel | Verifier proves safety before load |
| Isolation | Full access to kernel memory | Restricted access via helper functions only |
| Compatibility | Recompile for every kernel version | CO-RE lets one binary run across kernels |
| Deployment | Module signing and policy hurdles | Loaded via one syscall, controlled by capabilities |
| Termination | Infinite loops possible | Program termination must be statically proven |
| Learning cost | Broad kernel-internal API knowledge | A restricted C subset and a limited helper API |
The core idea of eBPF is: "the kernel verifies the code it is about to run, before running it." Just as JavaScript made the web programmable inside the browser sandbox, eBPF made the kernel programmable. The analogy "JavaScript for the kernel" is used for good reason.
From cBPF to eBPF
The original BPF was a small virtual machine designed in 1992 for tcpdump packet filtering (classic BPF, cBPF). With the extended eBPF introduced in kernel 3.18 (2014), the following changed:
- Registers grew from 2 to 11 (R0 through R10) and were widened to 64 bits
- Maps were introduced as persistent data structures shared between kernel and user space
- Helper function calls gave safe access to kernel functionality
- A JIT compiler delivered near-native performance
- The scope expanded beyond networking into tracing, security, and general-purpose execution
The Architecture at a Glance
The journey of an eBPF program from source code to execution inside the kernel looks like this.
User space
+--------------------------------------------------------------+
|  C source (.bpf.c)                                           |
|     |  clang -target bpf -g (with BTF)                       |
|     v                                                        |
|  ELF object (.bpf.o)                                         |
|     |                                                        |
|     v                                                        |
|  Loader (libbpf / cilium-ebpf / aya / bpftool)               |
|     |  applies CO-RE relocations, calls bpf() syscall        |
+-----|--------------------------------------------------------+
      v
Kernel space
+--------------------------------------------------------------+
|  Verifier ──> safety analysis (pointers, bounds, halting)    |
|     |  pass                                                  |
|     v                                                        |
|  JIT compiler ──> translates to native machine code          |
|     |                                                        |
|     v                                                        |
|  Attach to hook                                              |
|     ├── kprobe / kretprobe (kernel function entry/return)    |
|     ├── tracepoint (stable kernel events)                    |
|     ├── XDP (right after NIC driver receive)                 |
|     ├── tc (traffic control ingress/egress)                  |
|     ├── LSM (security hooks)                                 |
|     └── cgroup (socket/syscall control)                      |
|                                                              |
|  Maps <────> data shared with user space                     |
+--------------------------------------------------------------+
The flow in one sentence: code written in restricted C is compiled by clang into BPF bytecode, a loader hands it to the kernel via the bpf() syscall, the verifier proves it safe, the JIT translates it to native code, and it is attached to the designated hook. The program and user space exchange data through maps.
A Map of Program Types
eBPF programs are categorized by where they attach (the hook), and each type sees a different context and a different set of allowed helpers. The types you will meet most often in practice:
| Program type | Attach point | Primary use | Notes |
|---|---|---|---|
| kprobe / kretprobe | Almost any kernel function entry/return | Tracing, debugging | Flexible, but functions can be renamed, inlined, or removed across kernel versions |
| tracepoint | Predefined static kernel events | Stable tracing | Relatively stable ABI, the recommended starting point |
| fentry / fexit | Kernel function entry/return (BTF-based) | High-performance tracing | Lower overhead than kprobe, kernel 5.5 and later |
| XDP | Earliest point of the NIC receive path | DDoS mitigation, load balancing | Runs before sk_buff allocation, extremely fast |
| tc (clsact) | Traffic control ingress/egress | Packet mangling, policy | Handles egress too, full sk_buff access |
| LSM (BPF LSM) | Linux Security Module hooks | Runtime security policy | Can deny operations, kernel 5.7 and later |
| cgroup family | Per-cgroup socket/syscall hooks | Per-container network policy | Pairs naturally with containers |
| uprobe / USDT | User-space functions / static probes | Application tracing | Trace library functions and app internals |
| perf_event | Timer/PMU events | CPU profiling | Sampling-based analysis |
A quick selection guide: prefer tracepoints when stability matters, kprobe or fentry when you need a precise spot inside the kernel, XDP when packets must be handled as early as possible, and LSM when the goal is policy enforcement rather than observation.
Maps: The Bridge Between Kernel and User Space
Maps are key-value data structures where eBPF programs keep state and exchange data with user space. The map type you choose shapes both the performance and the architecture of your program.
| Map type | Structure | Typical use |
|---|---|---|
| BPF_MAP_TYPE_HASH | Hash table | Arbitrary-key lookups: per-PID stats, connection tracking |
| BPF_MAP_TYPE_ARRAY | Fixed-size array | Configuration values, index-based counters |
| BPF_MAP_TYPE_PERCPU_HASH | Per-CPU hash | Contention-free high-frequency counting |
| BPF_MAP_TYPE_PERCPU_ARRAY | Per-CPU array | Histogram buckets, hot-path stats |
| BPF_MAP_TYPE_LRU_HASH | LRU hash | Auto-evicts old entries when full |
| BPF_MAP_TYPE_RINGBUF | Ring buffer (MPSC) | Kernel-to-user event streaming (5.8 and later, recommended) |
| BPF_MAP_TYPE_PERF_EVENT_ARRAY | Per-CPU perf buffers | Event delivery on older kernels |
| BPF_MAP_TYPE_LPM_TRIE | Longest-prefix match | IP CIDR matching |
| BPF_MAP_TYPE_PROG_ARRAY | Program array | Program chaining via tail calls |
| BPF_MAP_TYPE_SK_STORAGE | Socket-local storage | Per-socket metadata |
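To make the longest-prefix-match rule behind BPF_MAP_TYPE_LPM_TRIE concrete, here is a small user-space sketch. It is an illustration only: it swaps the kernel's trie for a linear scan over a table, and the names `struct cidr_rule` and `lpm_lookup` are made up for this example. Addresses are IPv4 in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* One rule: network address (host byte order), prefix length, verdict. */
struct cidr_rule {
    uint32_t addr;
    uint8_t  prefix_len;   /* expected range: 0..32 */
    int      verdict;
};

/* Return the verdict of the most specific rule that contains `ip`,
 * or -1 if no rule matches. */
int lpm_lookup(const struct cidr_rule *rules, size_t n, uint32_t ip)
{
    int best_len = -1;
    int verdict = -1;

    for (size_t i = 0; i < n; i++) {
        uint8_t len = rules[i].prefix_len;
        /* a /0 prefix matches everything; avoid shifting by 32 bits */
        uint32_t mask = len == 0 ? 0 : ~(uint32_t)0 << (32 - len);

        if ((ip & mask) == (rules[i].addr & mask) && (int)len > best_len) {
            best_len = len;
            verdict = rules[i].verdict;
        }
    }
    return verdict;
}
```

With rules for 10.0.0.0/8 and 10.1.0.0/16, the address 10.1.2.3 matches both, and the /16 rule wins because it is the longer prefix.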
Two practical tips worth emphasizing:
- Prefer the ring buffer for event delivery. Unlike the perf buffer, it preserves event ordering across CPUs, uses memory more efficiently, and has a simpler user-space API.
- Use per-CPU maps for hot-path counters. When multiple CPUs update a regular hash map concurrently, atomic operations get expensive; per-CPU maps give each CPU an independent slot with zero contention. Aggregation happens in user space at read time.
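The read-time aggregation for per-CPU maps can be sketched in plain user-space C. This is a model rather than libbpf code: it assumes the lookup has already filled an array with one value per possible CPU, and the function name `sum_percpu` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the per-CPU slots of one counter into a single total.
 * `vals` holds one value per possible CPU; `ncpus` is that CPU count. */
uint64_t sum_percpu(const uint64_t *vals, size_t ncpus)
{
    uint64_t total = 0;

    for (size_t i = 0; i < ncpus; i++)
        total += vals[i];
    return total;
}
```

The kernel side never pays for synchronization; the only cost is this short loop each time user space reads the counter.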
In the modern libbpf style, maps are declared with BTF-based section annotations:
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, __u32);   /* PID */
    __type(value, __u64); /* call count */
} call_count SEC(".maps");
The Verifier: Gatekeeper of the Kernel
The verifier is the component responsible for eBPF safety. At load time it statically analyzes every execution path and proves:
- The program always terminates (no unbounded loops)
- Every memory access stays within verified bounds
- No uninitialized register is ever read
- Only helpers permitted for this program type are called
- Pointer arithmetic never escapes a safe range
Bounds checking: how the verifier sees the world
The verifier tracks the possible value range of every register. The canonical pattern in an XDP program that reads packet data looks like this:
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
SEC("xdp")
int xdp_prog(struct xdp_md *ctx)
{
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    struct ethhdr *eth = data;
    /* without this bounds check, the verifier rejects the load */
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;
    /* only after the check is eth->h_proto access allowed */
    if (eth->h_proto == bpf_htons(ETH_P_IP))
        return XDP_DROP; /* demo only: this drops every IPv4 frame */
    return XDP_PASS;
}
Once you bounds-check with an if statement, the verifier "learns" that the pointer is safe on the path where the check passed. Because compiler optimizations can remove or reorder such checks, keep bounds-checking code simple and direct.
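The same check-before-read discipline can be tried out in ordinary user-space C. The sketch below is a model, not an XDP program: `read_ethertype` is a hypothetical helper, and the buffer pointers stand in for the data and data_end fields of the XDP context.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN    14       /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define ETHERTYPE_IPV4 0x0800

/* Return the EtherType in host order, or -1 if the buffer is too short.
 * The length test comes first, so no byte is read before it is proven
 * to lie inside [data, data_end). */
int read_ethertype(const uint8_t *data, const uint8_t *data_end)
{
    if ((size_t)(data_end - data) < ETH_HDR_LEN)
        return -1;                  /* header does not fit: no access */

    /* EtherType is big-endian on the wire, at byte offset 12 */
    return (data[12] << 8) | data[13];
}
```

A 13-byte frame is rejected before any read happens, which is the behavior the verifier insists on in the kernel.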
Loop constraints
Early eBPF did not allow loops at all. Restrictions have been relaxed in stages:
| Technique | Kernel version | Description |
|---|---|---|
| pragma unroll | All versions | Unrolls the loop at compile time, fixed trip count required |
| Bounded loops | 5.3 and later | Loops allowed when the verifier can prove termination |
| bpf_loop helper | 5.17 and later | Callback-based loop, good for large iteration counts |
| Open-coded iterators | 6.4 and later | Iterator-based loops such as the bpf_for macro |
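What "the verifier can prove termination" means in practice is easiest to see with a constant trip-count cap. The following is a plain-C sketch of that shape, not BPF code; `MAX_SCAN` and `bounded_len` are names invented for the example, and the cap of 64 is arbitrary.

```c
#include <stddef.h>

#define MAX_SCAN 64   /* compile-time cap on the trip count (arbitrary) */

/* Length of a string, scanning at most MAX_SCAN bytes and never past
 * `size`. The loop bound is a constant, so the trip count is finite
 * no matter what the input looks like. */
size_t bounded_len(const char *buf, size_t size)
{
    size_t i;

    for (i = 0; i < MAX_SCAN; i++) {
        if (i >= size || buf[i] == '\0')
            break;
    }
    return i;
}
```

The data-dependent exit conditions live inside the loop body, while the loop header itself stays trivially bounded.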
Common verification failures and fixes
| Error message (gist) | Cause | Fix |
|---|---|---|
| invalid mem access | Pointer dereference without bounds check | Add an explicit bounds check before access |
| unbounded loop detected | Loop whose termination cannot be proven | Cap the iteration count, or use bpf_loop |
| BPF program is too large | Instruction limit exceeded | Split logic, use tail calls; limit is one million since 5.2 |
| stack limit exceeded | More than 512 bytes of stack | Use a per-CPU array as scratch space for large structs |
| R1 type=ctx expected=fp | Misuse of the context pointer | Access context fields only in the sanctioned way |
| helper call is not allowed | Helper forbidden for this program type | Check which helpers your program type may call |
Verifier logs are famously long and cryptic, but if you follow the register state dump just before the failure point, you can almost always find the cause. We cover enabling verbose loader logs in the debugging section.
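For array indexing, the "add an explicit bounds check" fix from the table usually takes the shape below. This is a user-space sketch with hypothetical names (`NBUCKETS`, `hist_add`); the point is only that the index is clamped immediately before it is used.

```c
#include <stdint.h>

#define NBUCKETS 16   /* number of histogram slots (arbitrary) */

/* Increment the histogram slot for `idx`.
 * Clamping right before the access makes the valid range
 * [0, NBUCKETS - 1] explicit on every path. */
void hist_add(uint64_t *hist, uint32_t idx)
{
    if (idx >= NBUCKETS)
        idx = NBUCKETS - 1;   /* out-of-range values land in the last slot */
    hist[idx]++;
}
```

Keeping the check and the access adjacent leaves the compiler little room to separate them.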
First Program in Practice: libbpf + CO-RE
Time to build something real. The goal is a mini execsnoop: "trace every process executed system-wide (execve) and print its PID, parent PID, and command name."
What are CO-RE, vmlinux.h, and BTF?
The old bcc approach required installing clang and kernel headers on the target machine and compiling at runtime. CO-RE (Compile Once, Run Everywhere) eliminates that.
- BTF (BPF Type Format): a compact format in which the kernel embeds its own type information. If the file /sys/kernel/btf/vmlinux exists, your kernel has BTF enabled.
- vmlinux.h: a single header generated from kernel BTF containing every kernel type definition. It lets you reference kernel structs without any kernel-headers package.
- CO-RE relocations: at compile time, struct field accesses are recorded as relocation metadata; at load time, libbpf compares them against the running kernel BTF and fixes up field offsets. One binary therefore runs unchanged across kernels with different struct layouts.
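The idea behind CO-RE relocations can be modeled in a few lines of user-space C. This is a simulation, not what libbpf actually executes: `struct task_v1` and `struct task_v2` are invented layouts standing in for the same kernel struct on two kernel versions, and `read_u32_at` is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two hypothetical layouts of the "same" struct on different kernels */
struct task_v1 { uint32_t flags; uint32_t tgid; };
struct task_v2 { uint64_t state; uint32_t flags; uint32_t tgid; };

/* Read a 32-bit field at an offset that is only known at "load time",
 * the way a relocated access uses the running kernel's field offset. */
uint32_t read_u32_at(const void *base, size_t offset)
{
    uint32_t v;

    memcpy(&v, (const uint8_t *)base + offset, sizeof(v));
    return v;
}
```

The reading code is identical for both layouts; only the offset passed in changes, which is the role the loader plays when it patches field offsets from the running kernel's BTF.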
Generate vmlinux.h with:
bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
Kernel-side code (execsnoop.bpf.c)
// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>
#define TASK_COMM_LEN 16
struct event {
    __u32 pid;
    __u32 ppid;
    char comm[TASK_COMM_LEN];
};
/* ring buffer for kernel -> user space event delivery */
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024);
} events SEC(".maps");
SEC("tracepoint/syscalls/sys_enter_execve")
int handle_execve(struct trace_event_raw_sys_enter *ctx)
{
    struct event *e;
    struct task_struct *task;
    /* reserve space for the event in the ring buffer */
    e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (!e)
        return 0;
    e->pid = bpf_get_current_pid_tgid() >> 32;
    /* read the parent PID from task_struct with a CO-RE macro */
    task = (struct task_struct *)bpf_get_current_task();
    e->ppid = BPF_CORE_READ(task, real_parent, tgid);
    bpf_get_current_comm(&e->comm, sizeof(e->comm));
    bpf_ringbuf_submit(e, 0);
    return 0;
}
char LICENSE[] SEC("license") = "GPL";
Key points:
- The SEC macro declares the program type and attach point through the ELF section name.
- BPF_CORE_READ is the safe, CO-RE-relocated member access macro. It follows a nested pointer chain (task → real_parent → tgid) in one expression.
- The license declaration is mandatory. Without a GPL-compatible license, many helpers are unavailable.
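The `>> 32` in the program above deserves a note: bpf_get_current_pid_tgid() packs two IDs into one 64-bit value. A user-space sketch of the unpacking follows; the helper names `tgid_of` and `tid_of` are hypothetical.

```c
#include <stdint.h>

/* Upper 32 bits: tgid, which is the process ID that user space sees. */
uint32_t tgid_of(uint64_t pid_tgid)
{
    return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: the kernel's per-thread ID. */
uint32_t tid_of(uint64_t pid_tgid)
{
    return (uint32_t)pid_tgid;
}
```

Forgetting the shift is a common slip: the truncated value is the thread ID, which differs from the process ID in any multi-threaded program.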
User-side loader (execsnoop.c)
During the build, bpftool generates a skeleton header (execsnoop.skel.h) from the .bpf.o file. The skeleton wraps load, attach, and teardown in type-safe functions.
// SPDX-License-Identifier: GPL-2.0
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <bpf/libbpf.h>
#include "execsnoop.skel.h"
/* must match struct event in execsnoop.bpf.c */
struct event {
    __u32 pid;
    __u32 ppid;
    char comm[16];
};
static volatile sig_atomic_t exiting = 0;
static void sig_handler(int sig)
{
    (void)sig;
    exiting = 1;
}
/* ring buffer callback: invoked once per submitted event */
static int handle_event(void *ctx, void *data, size_t len)
{
    const struct event *e = data;
    printf("%-8u %-8u %-16s\n", e->pid, e->ppid, e->comm);
    return 0;
}
int main(void)
{
    struct execsnoop_bpf *skel;
    struct ring_buffer *rb = NULL; /* NULL so cleanup is safe on early exit */
    int err;
    signal(SIGINT, sig_handler);
    signal(SIGTERM, sig_handler);
    skel = execsnoop_bpf__open_and_load();
    if (!skel) {
        fprintf(stderr, "failed to load BPF skeleton\n");
        return 1;
    }
    err = execsnoop_bpf__attach(skel);
    if (err) {
        fprintf(stderr, "failed to attach BPF program: %d\n", err);
        goto cleanup;
    }
    rb = ring_buffer__new(bpf_map__fd(skel->maps.events),
                          handle_event, NULL, NULL);
    if (!rb) {
        fprintf(stderr, "failed to create ring buffer\n");
        err = -1;
        goto cleanup;
    }
    printf("%-8s %-8s %-16s\n", "PID", "PPID", "COMM");
    while (!exiting) {
        err = ring_buffer__poll(rb, 100 /* ms */);
        if (err == -EINTR) { /* interrupted by a signal: exit cleanly */
            err = 0;
            break;
        }
        if (err < 0)
            break;
    }
cleanup:
    ring_buffer__free(rb);
    execsnoop_bpf__destroy(skel);
    return err < 0 ? 1 : 0;
}
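The callback in the loader trusts that every record is a full struct event. A slightly more defensive variant checks the length first; the sketch below is plain C with a hypothetical helper name, `decode_event`, and is independent of libbpf so it can be exercised on its own.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct event {
    uint32_t pid;
    uint32_t ppid;
    char comm[16];
};

/* Copy a raw ring-buffer record into `out`.
 * Returns 0 on success, -1 if the record is shorter than struct event. */
int decode_event(const void *data, size_t len, struct event *out)
{
    if (len < sizeof(*out))
        return -1;
    memcpy(out, data, sizeof(*out));
    out->comm[sizeof(out->comm) - 1] = '\0';   /* force termination */
    return 0;
}
```

This matters mostly when the kernel-side and user-side struct definitions drift apart during development.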
Build and run
# 1. Generate vmlinux.h
bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
# 2. Compile the kernel side (-g is required: emits BTF/CO-RE info)
clang -O2 -g -target bpf -D__TARGET_ARCH_x86 \
-c execsnoop.bpf.c -o execsnoop.bpf.o
# 3. Generate the skeleton header
bpftool gen skeleton execsnoop.bpf.o > execsnoop.skel.h
# 4. Compile and link the user side
clang -O2 -o execsnoop execsnoop.c -lbpf -lelf -lz
# 5. Run (requires root or the appropriate capabilities)
sudo ./execsnoop
Run ls or date in another terminal and you will see the PID, PPID, and command name appear immediately. The remarkable part: this tiny program observes every execve on the system with near-zero overhead. That is the power of eBPF.
bpftool: The Swiss Army Knife of eBPF
bpftool is the official CLI maintained in the kernel tree and is indispensable for inspecting loaded programs and maps.
# List all loaded programs
sudo bpftool prog show
# Disassemble the JIT output of a specific program
sudo bpftool prog dump jited id 42
# Inspect the translated (post-verifier) bytecode
sudo bpftool prog dump xlated id 42
# List maps and dump their contents
sudo bpftool map show
sudo bpftool map dump id 17
# Probe which eBPF features this system supports
sudo bpftool feature probe
# Show programs attached to the network (XDP, tc)
sudo bpftool net show
# Pin a program to bpffs so it survives loader exit
sudo bpftool prog pin id 42 /sys/fs/bpf/myprog
The feature probe deserves special mention: it shows at a glance which program types, map types, and helpers the kernel supports, which makes it the first command worth running on any new server.
Choosing a Language and Framework
eBPF itself is a kernel technology; the language you develop in is a separate decision.
| Framework | Language | Strengths | Weaknesses | Best fit |
|---|---|---|---|---|
| libbpf + C | C | Official in the kernel tree, newest features, CO-RE standard | C productivity, manual memory management | System tools, maximum performance and latest features |
| cilium/ebpf | Go | Go ecosystem integration, pure-Go loader | Kernel side still written in C | Kubernetes tooling, Go-based agents |
| aya | Rust | Rust on both kernel and user side, memory safety | Younger ecosystem | Rust teams, safety-focused projects |
| bcc | Python + C | Rich examples, fast prototyping | Runtime compilation, heavy dependencies | Learning, one-off analysis |
| bpftrace | Dedicated DSL | Instant analysis via one-liners | Limited for complex logic | Ad hoc production troubleshooting |
The recommended path is clear: start operational analysis with bpftrace, and build productized tooling with libbpf (C), cilium/ebpf (Go), or aya (Rust). The runtime-compilation model of bcc is no longer recommended for new projects now that CO-RE is ubiquitous.
Kernel Versions and Feature Matrix
What your target kernels support is the first thing to confirm at design time. The table lists the major upstream milestones; because distributions backport features, the reliable way to confirm support on a given machine is bpftool feature probe.
| Kernel version | Major features |
|---|---|
| 3.18 | eBPF syscall introduced |
| 4.1 to 4.7 | Broader kprobe, tc, tracepoint attachment support |
| 4.8 | XDP introduced |
| 4.18 | BTF begins to land |
| 5.2 | Instruction limit raised to one million, global variables |
| 5.3 | Bounded loops allowed |
| 5.5 | fentry / fexit (BTF-based trampolines) |
| 5.6 | struct_ops |
| 5.7 | BPF LSM |
| 5.8 | Ring buffer map, CAP_BPF capability split |
| 5.10 | Sleepable BPF programs |
| 5.17 | bpf_loop helper |
| 6.x | Open-coded iterators, more kfuncs, arenas, ongoing expansion |
As a practical baseline, RHEL 9 or Ubuntu 22.04 LTS and later comfortably cover ringbuf, CO-RE, and fentry. On mainstream distribution kernels in 2026 (5.14 and later), every example in this post works.
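If an agent wants a cheap first filter before it attempts to load anything, it can compare the running kernel's release string against the table. The sketch below uses hypothetical helper names and takes the release string as a parameter (on Linux it would come from uname). Treat the result as a hint only, since backports mean a lower version number does not prove a feature is missing.

```c
#include <stdio.h>

/* Parse "major.minor" from a release string such as "5.15.0-91-generic".
 * Returns 0 on success, -1 if the string does not start with two numbers. */
int parse_release(const char *release, int *major, int *minor)
{
    return sscanf(release, "%d.%d", major, minor) == 2 ? 0 : -1;
}

/* True if (major, minor) is at least (want_major, want_minor). */
int version_at_least(int major, int minor, int want_major, int want_minor)
{
    return major > want_major ||
           (major == want_major && minor >= want_minor);
}
```

For example, a ring buffer requirement would be expressed as version_at_least(major, minor, 5, 8).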
Debugging Tips
- Read the verifier log carefully. You can raise the verbosity through a libbpf environment variable or in code.
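/* register a print callback before loading; libbpf_print_fn is your own
   function (not part of libbpf) and receives every message together with
   its level, including LIBBPF_DEBUG */
libbpf_set_print(libbpf_print_fn);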
- Use bpf_printk for printf-style debugging on the kernel side. Read the output from trace_pipe.
bpf_printk("pid=%d comm=%s", pid, comm);
sudo cat /sys/kernel/debug/tracing/trace_pipe
- When compiler optimizations remove your bounds checks, work around it with volatile or the barrier_var macro.
- Compare the xlated dump with your source. The bpftool prog dump xlated output shows the instructions the verifier actually saw, often revealing the gap between "the code you wrote" and "the code that was verified."
- Start small and grow incrementally. Do not try to push a large program through verification at once; load a minimal version first, then add logic.
Production Considerations
Overhead
eBPF is light, but not free. Most of the cost scales with how frequently the hook fires.
- Attaching a kprobe to a function called millions of times per second (for example, a scheduler hot path) can accumulate visible overhead. fentry is cheaper than kprobe, so prefer fentry when available.
- To reduce map update cost, use per-CPU maps and an aggregate-then-deliver pattern (aggregate in the kernel, have user space read only periodically).
- At points where events can flood, sample or filter on the kernel side to cut the volume sent through the ring buffer in the first place.
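Kernel-side sampling can be as simple as a counter and a modulo. The following is a user-space model with a hypothetical function name, `sample_one_in`; in a real program the counter would have to live in a map (a per-CPU array is one option), which is an assumption of this sketch rather than something shown here.

```c
#include <stdint.h>

/* Return 1 for one call out of every `n`, 0 otherwise.
 * `counter` is state owned by the caller and advanced on each call. */
int sample_one_in(uint64_t *counter, uint32_t n)
{
    if (n == 0)
        return 0;               /* treat 0 as "sampling disabled" */
    return ((*counter)++ % n) == 0;
}
```

With n set to 100, only one event in a hundred reaches the ring buffer, so the consumer's load drops by the same factor.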
Capabilities: CAP_BPF and friends
Since kernel 5.8, eBPF privileges have been split out of CAP_SYS_ADMIN.
| Capability | Scope |
|---|---|
| CAP_BPF | Basic bpf() syscall use (map creation, some program loads) |
| CAP_PERFMON | Attaching tracing programs, reading kernel memory |
| CAP_NET_ADMIN | Attaching network programs such as XDP and tc |
| CAP_SYS_ADMIN | Legacy catch-all that includes the above (avoid) |
When deploying an observability agent, the best practice is least privilege with CAP_BPF plus CAP_PERFMON, adding CAP_NET_ADMIN only when network programs are needed. Also note that unprivileged BPF is disabled on most distributions for security reasons (the sysctl kernel.unprivileged_bpf_disabled), and it should stay that way.
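An agent can verify its own privileges at startup instead of failing later with an opaque load error. The sketch below tests bits in an effective-capability mask; it assumes the caller has already parsed the hex CapEff value from /proc/<pid>/status, and the `_NR` constants and function names are local to this example (the numbers themselves are the ones defined in linux/capability.h).

```c
#include <stdint.h>

/* Capability numbers as defined in <linux/capability.h> */
#define CAP_NET_ADMIN_NR 12
#define CAP_SYS_ADMIN_NR 21
#define CAP_PERFMON_NR   38
#define CAP_BPF_NR       39

/* Test one bit of an effective-capability mask. */
int has_cap(uint64_t cap_eff, int cap_nr)
{
    return (int)((cap_eff >> cap_nr) & 1);
}

/* Least-privilege check for a tracing agent: CAP_BPF plus CAP_PERFMON. */
int can_trace(uint64_t cap_eff)
{
    return has_cap(cap_eff, CAP_BPF_NR) && has_cap(cap_eff, CAP_PERFMON_NR);
}
```

A process holding only CAP_BPF fails the can_trace check, which matches the split described in the table.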
Lifecycle management
- Programs and maps are freed when the last file descriptor reference disappears. To keep them alive independently of the loader process, pin them to bpffs.
- Thanks to CO-RE, kernel upgrades rarely require a rebuild, but kprobe target functions can vanish or be renamed, so build fallback logic for attachment failures into your agent.
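Fallback logic for attachment failures can be kept generic. The sketch below tries a list of candidate symbols in order; everything in it is hypothetical, including the `attach_fn` callback type, the symbol names, and `fake_attach`, which is a stand-in used only to exercise the fallback path.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical attach callback: 0 on success, negative on failure. */
typedef int (*attach_fn)(const char *symbol);

/* Try each candidate symbol in order; return the index of the first one
 * that attaches, or -1 if none do. */
int attach_first(const char *const *candidates, size_t n, attach_fn attach)
{
    for (size_t i = 0; i < n; i++) {
        if (attach(candidates[i]) == 0)
            return (int)i;
    }
    return -1;
}

/* Stand-in for a real attach call: pretends that only the symbol
 * "example_fn_old" exists on this kernel. */
int fake_attach(const char *symbol)
{
    return strcmp(symbol, "example_fn_old") == 0 ? 0 : -1;
}
```

Returning the index lets the agent log which variant it ended up using, which helps when a kernel update changes behavior.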
A Learning Roadmap
- Stage 1 — start as a user: build intuition by using bpftrace one-liners and bcc tools (execsnoop, opensnoop, biolatency) to analyze real machines.
- Stage 2 — read: go through the What is eBPF material on ebpf.io and the kernel BPF documentation to form the big picture of program types and maps.
- Stage 3 — first program: write a libbpf program with the tracepoint plus ringbuf combination, like the execsnoop in this post. The libbpf-bootstrap repository is an excellent starting template.
- Stage 4 — branch out: depending on your interests, expand into XDP (networking), uprobe/USDT (applications), or LSM (security).
- Stage 5 — productionize: cover the CO-RE compatibility matrix, capability design, overhead measurement, and multi-kernel testing in CI.
Pitfalls and Anti-Patterns
- Anti-pattern 1: depending on kprobes attached to unstable kernel-internal functions. A minor kernel update can inline or rename the function and silently break your tool. Defend with tracepoints, or fentry plus a BTF existence check.
- Anti-pattern 2: ignoring undersized maps. When a hash map fills up, updates fail and data is lost. Use an LRU map or keep a separate failure counter.
- Anti-pattern 3: slow ring buffer consumers. If the user-space consumer lags, reserve calls fail and events vanish. Always maintain a drop counter and monitor it.
- Anti-pattern 4: contorting code just to pass verification. Patterns that "trick" the verifier break when the kernel or compiler version changes. Explicit bounds checks and simple control flow are the real answer.
- Anti-pattern 5: missing license declaration. Without a GPL declaration, many helpers are blocked. If you deliberately choose a non-GPL license, check the list of available helpers first.
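The drop counter recommended in anti-pattern 3 is easy to reason about with a toy model. The code below is not a real ring buffer: it only tracks how many slots are in use, with a tiny hypothetical capacity (`RB_SLOTS`) so the overflow path is easy to trigger, and all names are invented for the example.

```c
#include <stdint.h>

#define RB_SLOTS 4   /* tiny capacity so overflow is easy to hit */

struct ringbuf_model {
    uint32_t used;      /* slots reserved and not yet consumed */
    uint64_t dropped;   /* events lost because reserve failed */
};

/* Producer side: returns 0 if a slot was reserved, -1 (and counts a
 * drop) if the buffer is full, mirroring a failed reserve call. */
int rb_reserve(struct ringbuf_model *rb)
{
    if (rb->used == RB_SLOTS) {
        rb->dropped++;
        return -1;
    }
    rb->used++;
    return 0;
}

/* Consumer side: frees up to `n` slots. */
void rb_consume(struct ringbuf_model *rb, uint32_t n)
{
    rb->used -= n < rb->used ? n : rb->used;
}
```

Six reserve attempts against four slots leave a drop count of two; exporting that number as a metric is what turns silent loss into something you can alert on.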
Closing Thoughts
eBPF turned something that would have been considered impossible a decade ago — "changing kernel behavior without changing the kernel" — into an everyday practice. The essence comes down to three things: programs that attach to hooks, maps that connect the data, and the verifier that guarantees safety. Once you understand how these three relate, you can reason about the internals of any eBPF tool you encounter.
In the next post we will build on this foundation and open the black box of a live system with bpftrace and the BCC tools, and after that we will look at building runtime security with Tetragon, Falco, and BPF LSM.
References
- ebpf.io — What is eBPF?
- Linux kernel BPF documentation
- BPF verifier documentation (kernel.org)
- BTF (BPF Type Format) documentation (kernel.org)
- libbpf repository (GitHub)
- libbpf-bootstrap — libbpf example templates
- BPF CO-RE reference guide (Andrii Nakryiko)
- bpftool documentation (kernel.org)
- cilium/ebpf — Go eBPF library
- aya — Rust eBPF library
- BCC (BPF Compiler Collection)
- Brendan Gregg — eBPF tracing resources