eBPF Fundamentals — Programs, Maps, and the World of the Verifier

Introduction
How eBPF Changed the Kernel
- The era of kernel modules and its limits
- From cBPF to eBPF
The Architecture at a Glance
A Map of Program Types
Maps: The Bridge Between Kernel and User Space
The Verifier: Gatekeeper of the Kernel
First Program in Practice: libbpf + CO-RE
bpftool: The Swiss Army Knife of eBPF
Choosing a Language and Framework
Kernel Versions and Feature Matrix
Debugging Tips
Production Considerations
A Learning Roadmap
Pitfalls and Anti-Patterns
Closing Thoughts
References

Introduction

Imagine a requirement lands on your desk: "I want to see, in real time, which files a particular process is opening on a production server." Traditionally you had two options. Write a kernel module and load it, or attach a ptrace-based tool such as strace to the process. The former means accepting the risk of a kernel panic; the latter can slow the target process down by an order of magnitude or more. Neither is an easy choice in production.

eBPF (extended Berkeley Packet Filter) solves this dilemma at its root. Without recompiling the kernel or loading a module, you can attach verified, safe programs to almost any point inside the kernel and run them. Today, CNIs like Cilium, load balancers like Katran, security tools like Falco and Tetragon, and countless observability agents all run on top of eBPF.

In this post we will walk through the three core building blocks of eBPF — programs, maps, and the verifier — and then write a first program end to end using libbpf and CO-RE. This is the first article in a series that continues with an observability deep dive and a runtime security deep dive.

How eBPF Changed the Kernel

The era of kernel modules and its limits

The traditional way to extend Linux kernel behavior was the loadable kernel module (LKM). Modules are powerful but come with serious drawbacks, summarized in the comparison below.

Aspect	Kernel module (LKM)	eBPF program
Safety	A single bug can panic the kernel	Verifier proves safety before load
Isolation	Full access to kernel memory	Restricted access via helper functions only
Compatibility	Recompile for every kernel version	CO-RE lets one binary run across kernels
Deployment	Module signing and policy hurdles	Loaded via one syscall, controlled by capabilities
Termination	Infinite loops possible	Program termination must be statically proven
Learning cost	Broad kernel-internal API knowledge	A restricted C subset and a limited helper API

The core idea of eBPF is: "the kernel verifies the code it is about to run, before running it." Just as JavaScript made the web programmable inside the browser sandbox, eBPF made the kernel programmable. The analogy "JavaScript for the kernel" is used for good reason.

From cBPF to eBPF

The original BPF was a small virtual machine designed in 1992 for tcpdump packet filtering (classic BPF, cBPF). With eBPF, whose bpf() syscall landed in kernel 3.18 (2014), the following changed:

Registers grew from 2 to 11 (R0 through R10) and were widened to 64 bits
Maps were introduced as persistent data structures shared between kernel and user space
Helper function calls gave safe access to kernel functionality
A JIT compiler delivered near-native performance
The scope expanded beyond networking into tracing, security, and general-purpose execution

The Architecture at a Glance

The journey of an eBPF program from source code to execution inside the kernel looks like this.

User space
+--------------------------------------------------------------+
|  C source (.bpf.c)                                           |
|     |  clang -target bpf -g (with BTF)                       |
|     v                                                        |
|  ELF object (.bpf.o)                                         |
|     |                                                        |
|     v                                                        |
|  Loader (libbpf / cilium-ebpf / aya / bpftool)               |
|     |  applies CO-RE relocations, calls bpf() syscall        |
+-----|--------------------------------------------------------+
      v
Kernel space
+--------------------------------------------------------------+
|  Verifier ──> safety analysis (pointers, bounds, halting)    |
|     |  pass                                                  |
|     v                                                        |
|  JIT compiler ──> translates to native machine code          |
|     |                                                        |
|     v                                                        |
|  Attach to hook                                              |
|   ├── kprobe / kretprobe (kernel function entry/return)      |
|   ├── tracepoint (stable kernel events)                      |
|   ├── XDP (right after NIC driver receive)                   |
|   ├── tc (traffic control ingress/egress)                    |
|   ├── LSM (security hooks)                                   |
|   └── cgroup (socket/syscall control)                        |
|                                                              |
|  Maps <────> data shared with user space                     |
+--------------------------------------------------------------+

The flow in one sentence: code written in restricted C is compiled by clang into BPF bytecode, a loader hands it to the kernel via the bpf() syscall, the verifier proves it safe, the JIT translates it to native code, and it is attached to the designated hook. The program and user space exchange data through maps.

A Map of Program Types

eBPF programs are categorized by where they attach (the hook), and each type sees a different context and a different set of allowed helpers. The types you will meet most often in practice:

Program type	Attach point	Primary use	Notes
kprobe / kretprobe	Any kernel function entry/return	Tracing, debugging	Flexible, but functions can change across kernel versions
tracepoint	Predefined static kernel events	Stable tracing	Relatively stable ABI, the recommended starting point
fentry / fexit	Kernel function entry/return (BTF-based)	High-performance tracing	Lower overhead than kprobe, kernel 5.5 and later
XDP	Earliest point of the NIC receive path	DDoS mitigation, load balancing	Runs before sk_buff allocation, extremely fast
tc (clsact)	Traffic control ingress/egress	Packet mangling, policy	Handles egress too, full sk_buff access
LSM (BPF LSM)	Linux Security Module hooks	Runtime security policy	Can deny operations, kernel 5.7 and later
cgroup family	Per-cgroup socket/syscall hooks	Per-container network policy	Pairs naturally with containers
uprobe / USDT	User-space functions / static probes	Application tracing	Trace library functions and app internals
perf_event	Timer/PMU events	CPU profiling	Sampling-based analysis

A quick selection guide: prefer tracepoints when stability matters, kprobe or fentry when you need a precise spot inside the kernel, XDP when packets must be handled as early as possible, and LSM when the goal is policy enforcement rather than observation.

Maps: The Bridge Between Kernel and User Space

Maps are key-value data structures where eBPF programs keep state and exchange data with user space. The map type you choose shapes both the performance and the architecture of your program.

Map type	Structure	Typical use
BPF_MAP_TYPE_HASH	Hash table	Arbitrary-key lookups: per-PID stats, connection tracking
BPF_MAP_TYPE_ARRAY	Fixed-size array	Configuration values, index-based counters
BPF_MAP_TYPE_PERCPU_HASH	Per-CPU hash	Contention-free high-frequency counting
BPF_MAP_TYPE_PERCPU_ARRAY	Per-CPU array	Histogram buckets, hot-path stats
BPF_MAP_TYPE_LRU_HASH	LRU hash	Auto-evicts old entries when full
BPF_MAP_TYPE_RINGBUF	Ring buffer (MPSC)	Kernel-to-user event streaming (5.8 and later, recommended)
BPF_MAP_TYPE_PERF_EVENT_ARRAY	Per-CPU perf buffers	Event delivery on older kernels
BPF_MAP_TYPE_LPM_TRIE	Longest-prefix match	IP CIDR matching
BPF_MAP_TYPE_PROG_ARRAY	Program array	Program chaining via tail calls
BPF_MAP_TYPE_SK_STORAGE	Socket-local storage	Per-socket metadata

Two practical tips worth emphasizing:

Prefer the ring buffer for event delivery. Unlike the perf buffer, it preserves event ordering across CPUs, uses memory more efficiently, and has a simpler user-space API.
Use per-CPU maps for hot-path counters. When multiple CPUs update a regular hash map concurrently, atomic operations get expensive; per-CPU maps give each CPU an independent slot with zero contention. Aggregation happens in user space at read time.
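The "aggregate in user space at read time" step can be sketched in plain C. A user-space lookup on a per-CPU map fills a buffer with one value per possible CPU, and the reader adds them up. The function name sum_percpu_u64 is hypothetical, and the sketch assumes the buffer has already been filled by the lookup:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum the per-CPU slots of a single map entry.
 *
 * values: buffer filled by a per-CPU map lookup, one 64-bit slot per CPU
 * ncpus:  number of slots in the buffer (assumed to equal the number of
 *         possible CPUs reported by the loader library)
 */
static uint64_t sum_percpu_u64(const uint64_t *values, size_t ncpus)
{
    uint64_t total = 0;

    for (size_t i = 0; i < ncpus; i++)
        total += values[i];
    return total;
}
```

With libbpf, the slot count would typically come from libbpf_num_possible_cpus() and the buffer from bpf_map_lookup_elem() on the map file descriptor. The kernel side never needs an atomic add, because each CPU only touches its own slot.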

In the modern libbpf style, maps are declared as BTF-annotated structs placed in the .maps ELF section:

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, __u32);     /* PID */
    __type(value, __u64);   /* call count */
} call_count SEC(".maps");

The Verifier: Gatekeeper of the Kernel

The verifier is the component responsible for eBPF safety. At load time it statically analyzes every execution path and proves:

The program always terminates (no unbounded loops)
Every memory access stays within verified bounds
No uninitialized register is ever read
Only helpers permitted for this program type are called
Pointer arithmetic never escapes a safe range

Bounds checking: how the verifier sees the world

The verifier tracks the possible value range of every register. The canonical pattern in an XDP program that reads packet data looks like this:

SEC("xdp")
int xdp_prog(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    struct ethhdr *eth = data;

    /* without this bounds check, the verifier rejects the load */
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;

    /* only after the check is eth->h_proto access allowed */
    if (eth->h_proto == bpf_htons(ETH_P_IP))
        return XDP_DROP;

    return XDP_PASS;
}

Once an if statement compares the pointer against data_end, the verifier "learns" that accesses up to the checked offset are safe on that branch. Because compiler optimizations can remove or reorder such checks, keep bounds-checking code simple and direct.
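Every additional header needs its own check before it is read. The following is a user-space sketch of that discipline, not XDP code: it walks a raw frame with the same data / data_end pair and refuses to read anything it has not checked first. The names classify_frame and VERDICT_* are hypothetical stand-ins, and dropping UDP is an arbitrary policy chosen for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for XDP_PASS / XDP_DROP. */
enum verdict { VERDICT_PASS = 0, VERDICT_DROP = 1 };

#define ETH_HDR_LEN     14      /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define ETHERTYPE_IPV4  0x0800
#define IPV4_MIN_HDR    20      /* IPv4 header without options */
#define PROTO_UDP       17

/* Drop IPv4/UDP frames, pass everything else. */
static enum verdict classify_frame(const uint8_t *data, const uint8_t *data_end)
{
    const uint8_t *ip;
    uint16_t ethertype;

    /* Check 1: the whole Ethernet header must lie inside the buffer. */
    if ((size_t)(data_end - data) < ETH_HDR_LEN)
        return VERDICT_PASS;

    /* EtherType is stored in network byte order at bytes 12..13. */
    ethertype = (uint16_t)((data[12] << 8) | data[13]);
    if (ethertype != ETHERTYPE_IPV4)
        return VERDICT_PASS;

    /* Check 2: the IPv4 header needs a separate check of its own. */
    ip = data + ETH_HDR_LEN;
    if ((size_t)(data_end - ip) < IPV4_MIN_HDR)
        return VERDICT_PASS;

    /* Byte 9 of the IPv4 header is the protocol field. */
    return ip[9] == PROTO_UDP ? VERDICT_DROP : VERDICT_PASS;
}
```

In BPF C the same two checks would be written in the pointer form shown in xdp_prog, for example comparing (void *)(iph + 1) against data_end. The point of the sketch is the ordering: check, then read, once per header.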

Loop constraints

Early eBPF did not allow loops at all. Restrictions have been relaxed in stages:

Technique	Kernel version	Description
pragma unroll	All versions	Unrolls the loop at compile time, fixed trip count required
Bounded loops	5.3 and later	Loops allowed when the verifier can prove termination
bpf_loop helper	5.17 and later	Callback-based loop, good for large iteration counts
Open-coded iterators	6.4 and later	Iterator-based loops such as the bpf_for macro
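The loop shape that bounded-loop verification tends to accept is a constant upper bound in the loop header with data-dependent exits inside the body. Below is a plain C sketch of that shape; MAX_SCAN and find_slash_bounded are hypothetical names, and nothing here involves the verifier itself:

```c
#include <stddef.h>

#define MAX_SCAN 64   /* hypothetical compile-time cap on iterations */

/*
 * Return the index of the first '/' within the first MAX_SCAN bytes of buf,
 * or -1 if there is none in that window. The loop runs at most MAX_SCAN
 * times no matter what len says, so termination is obvious from the header.
 */
static int find_slash_bounded(const char *buf, size_t len)
{
    for (int i = 0; i < MAX_SCAN; i++) {
        if ((size_t)i >= len)   /* data-dependent early exit */
            break;
        if (buf[i] == '/')
            return i;
    }
    return -1;
}
```

The trade-off is explicit: input beyond the cap is simply not examined, so the cap has to be chosen with the data in mind.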

Common verification failures and fixes

Error message (gist)	Cause	Fix
invalid mem access	Pointer dereference without bounds check	Add an explicit bounds check before access
unbounded loop detected	Loop whose termination cannot be proven	Cap the iteration count, or use bpf_loop
BPF program is too large	Instruction limit exceeded	Split logic, use tail calls; the verifier processes at most one million instructions since 5.2
stack limit exceeded	More than 512 bytes of stack	Use a per-CPU array as scratch space for large structs
R1 type=ctx expected=fp	A helper that expects a stack pointer received the context pointer	Copy the data into a stack variable and pass a pointer to that instead
helper call is not allowed	Helper forbidden for this program type	Check which helpers your program type may call
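The stack limit in particular can be caught before the verifier ever sees the program. One possible guard, sketched here with hypothetical names (BPF_STACK_LIMIT, small_event, path_event), is a compile-time assertion on any struct that is meant to live on the stack:

```c
#include <stddef.h>

#define BPF_STACK_LIMIT 512   /* bytes of stack available to one BPF program */

/* Small enough to be a local variable. */
struct small_event {
    unsigned int pid;
    unsigned int ppid;
    char comm[16];
};

/* Far too large for the stack: belongs in a map used as scratch space. */
struct path_event {
    unsigned int pid;
    char path[4096];
};

/* Fails the build if someone grows small_event past the limit. */
_Static_assert(sizeof(struct small_event) <= BPF_STACK_LIMIT,
               "small_event no longer fits on the BPF stack");

/* Runtime form of the same check, handy in tests. */
static int fits_on_bpf_stack(size_t size)
{
    return size <= BPF_STACK_LIMIT;
}
```

The guard is conservative, since the limit covers all locals of the program together and not just one struct, but it turns the most common cause of the error into a build failure with a readable message.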

Verifier logs are famously long and cryptic, but if you follow the register state dump just before the failure point, you can almost always find the cause. We cover enabling verbose loader logs in the debugging section.

First Program in Practice: libbpf + CO-RE

Time to build something real. The goal is a mini execsnoop: "trace every process executed system-wide (execve) and print its PID, parent PID, and command name."

What are CO-RE, vmlinux.h, and BTF?

The old bcc approach required installing clang and kernel headers on the target machine and compiling at runtime. CO-RE (Compile Once, Run Everywhere) eliminates that.

BTF (BPF Type Format): a compact format in which the kernel embeds its own type information. If the file /sys/kernel/btf/vmlinux exists, your kernel has BTF enabled.
vmlinux.h: a single header generated from kernel BTF containing every kernel type definition. It lets you reference kernel structs without any kernel-headers package.
CO-RE relocations: at compile time, struct field accesses are recorded as relocation metadata; at load time, libbpf compares them against the running kernel BTF and fixes up field offsets. One binary therefore runs unchanged across kernels with different struct layouts.
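To make the relocation idea concrete, here is a toy model in plain C. It is not how libbpf is implemented; it only shows the principle that the program records which field it wants and the offset is filled in at load time from the target's type information. All names (task_v1, task_v2, field_reloc, field_info) are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two hypothetical layouts of the "same" struct on different kernels. */
struct task_v1 { uint32_t state; uint32_t tgid; };
struct task_v2 { uint32_t state; uint64_t flags; uint32_t tgid; };

/* What the compiled program carries: a field name, offset still unknown. */
struct field_reloc {
    const char *field;
    size_t offset;
};

/* What the running kernel's type information provides. */
struct field_info {
    const char *name;
    size_t offset;
};

/* "Load time": patch the offset by looking the field up by name. */
static int resolve_reloc(struct field_reloc *r,
                         const struct field_info *info, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(info[i].name, r->field) == 0) {
            r->offset = info[i].offset;
            return 0;
        }
    }
    return -1;   /* field does not exist on this kernel */
}

/* "Run time": read through the patched offset. */
static uint32_t read_u32_field(const void *base, const struct field_reloc *r)
{
    uint32_t v;

    memcpy(&v, (const char *)base + r->offset, sizeof(v));
    return v;
}
```

The same field_reloc for "tgid" resolves to different offsets against the two layouts, yet the reading code never changes. A failed lookup corresponds to the case where a field has been removed or renamed, which a real loader has to report or work around.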

Generate vmlinux.h with:

bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

Kernel-side code (execsnoop.bpf.c)

// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>

#define TASK_COMM_LEN 16

struct event {
    __u32 pid;
    __u32 ppid;
    char comm[TASK_COMM_LEN];
};

/* ring buffer for kernel -> user space event delivery */
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024);
} events SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_execve")
int handle_execve(struct trace_event_raw_sys_enter *ctx)
{
    struct event *e;
    struct task_struct *task;

    /* reserve space for the event in the ring buffer */
    e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (!e)
        return 0;

    e->pid = bpf_get_current_pid_tgid() >> 32;

    /* read the parent PID from task_struct with a CO-RE macro */
    task = (struct task_struct *)bpf_get_current_task();
    e->ppid = BPF_CORE_READ(task, real_parent, tgid);

    bpf_get_current_comm(&e->comm, sizeof(e->comm));

    bpf_ringbuf_submit(e, 0);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";

Key points:

The SEC macro declares the program type and attach point through the ELF section name.
BPF_CORE_READ is the safe, CO-RE-relocated member access macro. It follows a nested pointer chain (task → real_parent → tgid) in one expression.
The license declaration matters. Helpers marked GPL-only are rejected by the kernel unless the program declares a GPL-compatible license, so omitting it makes many helpers unavailable.
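One detail in handle_execve that is easy to get wrong is the shift on the return value of bpf_get_current_pid_tgid. The helper packs two IDs into one 64-bit value: the thread group ID (what user space calls the PID) in the upper 32 bits and the thread ID in the lower 32 bits. A small sketch of the unpacking, with hypothetical helper names:

```c
#include <stdint.h>

/* Upper 32 bits: thread group ID, i.e. the PID as seen by ps and top. */
static uint32_t tgid_of(uint64_t pid_tgid)
{
    return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: ID of the individual thread. */
static uint32_t tid_of(uint64_t pid_tgid)
{
    return (uint32_t)pid_tgid;
}
```

Forgetting the shift yields the thread ID instead, which matches the PID only for single-threaded processes and is therefore a bug that can survive casual testing.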

User-side loader (execsnoop.c)

During the build, bpftool generates a skeleton header (execsnoop.skel.h) from the .bpf.o file. The skeleton wraps load, attach, and teardown in type-safe functions.

// SPDX-License-Identifier: GPL-2.0
#include <errno.h>
#include <stdio.h>
#include <signal.h>
#include <bpf/libbpf.h>
#include "execsnoop.skel.h"

struct event {
    __u32 pid;
    __u32 ppid;
    char comm[16];
};

static volatile sig_atomic_t exiting = 0;

static void sig_handler(int sig) { exiting = 1; }

static int handle_event(void *ctx, void *data, size_t len)
{
    const struct event *e = data;
    printf("%-8u %-8u %-16s\n", e->pid, e->ppid, e->comm);
    return 0;
}

int main(void)
{
    struct execsnoop_bpf *skel;
    struct ring_buffer *rb = NULL;  /* NULL so cleanup is safe if attach fails */
    int err;

    signal(SIGINT, sig_handler);
    signal(SIGTERM, sig_handler);

    skel = execsnoop_bpf__open_and_load();
    if (!skel) {
        fprintf(stderr, "failed to load BPF skeleton\n");
        return 1;
    }

    err = execsnoop_bpf__attach(skel);
    if (err) {
        fprintf(stderr, "failed to attach BPF program: %d\n", err);
        goto cleanup;
    }

    rb = ring_buffer__new(bpf_map__fd(skel->maps.events),
                          handle_event, NULL, NULL);
    if (!rb) {
        err = -1;
        goto cleanup;
    }

    printf("%-8s %-8s %-16s\n", "PID", "PPID", "COMM");
    while (!exiting) {
        err = ring_buffer__poll(rb, 100 /* ms */);
        if (err == -EINTR) { err = 0; break; }
        if (err < 0) break;
    }

cleanup:
    ring_buffer__free(rb);
    execsnoop_bpf__destroy(skel);
    return err < 0 ? 1 : 0;
}

Build and run

# 1. Generate vmlinux.h
bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

# 2. Compile the kernel side (-g is required: emits BTF/CO-RE info)
clang -O2 -g -target bpf -D__TARGET_ARCH_x86 \
      -c execsnoop.bpf.c -o execsnoop.bpf.o

# 3. Generate the skeleton header
bpftool gen skeleton execsnoop.bpf.o > execsnoop.skel.h

# 4. Compile and link the user side
clang -O2 -o execsnoop execsnoop.c -lbpf -lelf -lz

# 5. Run (requires root or the appropriate capabilities)
sudo ./execsnoop

Run ls or date in another terminal and you will see the PID, PPID, and command name appear immediately. The remarkable part: this tiny program observes every execve on the system, and because execve fires relatively rarely, the added overhead is negligible on typical workloads. That is the power of eBPF.

bpftool: The Swiss Army Knife of eBPF

bpftool is the official CLI maintained in the kernel tree and is indispensable for inspecting loaded programs and maps.

# List all loaded programs
sudo bpftool prog show

# Disassemble the JIT output of a specific program
sudo bpftool prog dump jited id 42

# Inspect the translated (post-verifier) bytecode
sudo bpftool prog dump xlated id 42

# List maps and dump their contents
sudo bpftool map show
sudo bpftool map dump id 17

# Probe which eBPF features this system supports
sudo bpftool feature probe

# Show programs attached to the network (XDP, tc)
sudo bpftool net show

# Pin a program to bpffs so it survives loader exit
sudo bpftool prog pin id 42 /sys/fs/bpf/myprog

The feature probe deserves special mention: it shows at a glance which program types, map types, and helpers the kernel supports, which makes it the first command worth running on any new server.

Choosing a Language and Framework

eBPF itself is a kernel technology; the language you develop in is a separate decision.

Framework	Language	Strengths	Weaknesses	Best fit
libbpf + C	C	Official in the kernel tree, newest features, CO-RE standard	C productivity, manual memory management	System tools, maximum performance and latest features
cilium/ebpf	Go	Go ecosystem integration, pure-Go loader	Kernel side still written in C	Kubernetes tooling, Go-based agents
aya	Rust	Rust on both kernel and user side, memory safety	Younger ecosystem	Rust teams, safety-focused projects
bcc	Python + C	Rich examples, fast prototyping	Runtime compilation, heavy dependencies	Learning, one-off analysis
bpftrace	Dedicated DSL	Instant analysis via one-liners	Limited for complex logic	Ad hoc production troubleshooting

The recommended path is clear: start operational analysis with bpftrace, and build productized tooling with libbpf (C), cilium/ebpf (Go), or aya (Rust). The runtime-compilation model of bcc is no longer recommended for new projects now that CO-RE is ubiquitous.

Kernel Versions and Feature Matrix

What your target kernels support is the first thing to confirm at design time. The major upstream milestones are listed below. Distributions backport features, so the reliable way to confirm support on a given machine is bpftool feature probe.

Kernel version	Major features
3.18	eBPF syscall introduced
4.1 to 4.7	Broader kprobe, tc, tracepoint attachment support
4.8	XDP introduced
4.18	BTF begins to land
5.2	Instruction limit raised to one million, global variables
5.3	Bounded loops allowed
5.5	fentry / fexit (BTF-based trampolines)
5.6	struct_ops
5.7	BPF LSM
5.8	Ring buffer map, CAP_BPF capability split
5.10	Sleepable BPF programs
5.17	bpf_loop helper
6.x	Open-coded iterators, more kfuncs, arenas, ongoing expansion

As a practical baseline, RHEL 9 or Ubuntu 22.04 LTS and later comfortably cover ringbuf, CO-RE, and fentry. On mainstream distribution kernels in 2026 (5.14 and later), every example in this post works.
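If an agent wants a quick sanity check against the table above, one option is to compare the kernel release string with a minimum version. The sketch below uses hypothetical function names and assumes the release string comes from uname(2). It is only a heuristic, since backports mean a lower version number can still carry a feature:

```c
#include <stdio.h>

/* Parse "major.minor" from a release string such as "5.15.0-91-generic". */
static int parse_kernel_release(const char *release, int *major, int *minor)
{
    if (sscanf(release, "%d.%d", major, minor) != 2)
        return -1;
    return 0;
}

/* Return 1 if release is at least want_major.want_minor, 0 otherwise. */
static int kernel_at_least(const char *release, int want_major, int want_minor)
{
    int major, minor;

    if (parse_kernel_release(release, &major, &minor) != 0)
        return 0;   /* unparsable: treat as unsupported */
    return major > want_major ||
           (major == want_major && minor >= want_minor);
}
```

A call such as kernel_at_least(release, 5, 8) would gate the ring buffer path. Treat a negative answer as "probe further" rather than "definitely unsupported".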

Debugging Tips

Read the verifier log carefully. You can raise the verbosity through a libbpf environment variable or in code.

/* libbpf_print_fn is a callback you write yourself (level, format, va_list);
 * install it before loading so that LIBBPF_DEBUG messages are printed too */
libbpf_set_print(libbpf_print_fn);

Use bpf_printk for printf-style debugging on the kernel side. Read the output from trace_pipe.

bpf_printk("pid=%d comm=%s", pid, comm);

sudo cat /sys/kernel/debug/tracing/trace_pipe

When compiler optimizations remove your bounds checks, work around it with volatile or the barrier_var macro.
Compare the xlated dump with your source. The bpftool prog dump xlated output shows the instructions the verifier actually saw, often revealing the gap between "the code you wrote" and "the code that was verified."
Start small and grow incrementally. Do not try to push a large program through verification at once; load a minimal version first, then add logic.

Production Considerations

Overhead

eBPF is light, but not free. Most of the cost scales with how frequently the hook fires.

Attaching a kprobe to a function called millions of times per second (for example, a scheduler hot path) can accumulate visible overhead. fentry is cheaper than kprobe, so prefer fentry when available.
To reduce map update cost, use per-CPU maps and an aggregate-then-deliver pattern (aggregate in the kernel, have user space read only periodically).
At points where events can flood, sample or filter on the kernel side to cut the volume sent through the ring buffer in the first place.
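A common form of in-kernel aggregation is a power-of-two histogram: instead of sending every latency sample to user space, the program increments one of a few dozen buckets. Here is a plain C sketch of the bucketing logic, with hypothetical names and an arbitrary bucket count:

```c
#include <stdint.h>

#define NUM_BUCKETS 32   /* hypothetical: bucket i covers [2^i, 2^(i+1)) */

/*
 * Bucket index for a sample: floor(log2(v)) for v >= 1, and 0 for v == 0,
 * clamped to the last bucket. The loop is bounded by NUM_BUCKETS.
 */
static unsigned int log2_bucket(uint64_t v)
{
    unsigned int b = 0;

    while (v > 1 && b < NUM_BUCKETS - 1) {
        v >>= 1;
        b++;
    }
    return b;
}

/* Record one sample; in a BPF program hist would be a per-CPU array map. */
static void hist_add(uint64_t hist[NUM_BUCKETS], uint64_t value)
{
    hist[log2_bucket(value)]++;
}
```

User space then reads 32 counters every few seconds instead of consuming one event per sample, which is where most of the savings come from.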

Capabilities: CAP_BPF and friends

Since kernel 5.8, eBPF privileges have been split out of CAP_SYS_ADMIN.

Capability	Scope
CAP_BPF	Basic bpf() syscall use (map creation, some program loads)
CAP_PERFMON	Attaching tracing programs, reading kernel memory
CAP_NET_ADMIN	Attaching network programs such as XDP and tc
CAP_SYS_ADMIN	Legacy catch-all that includes the above (avoid)

When deploying an observability agent, the best practice is least privilege with CAP_BPF plus CAP_PERFMON, adding CAP_NET_ADMIN only when network programs are needed. Also note that unprivileged BPF is disabled on most distributions for security reasons (the sysctl kernel.unprivileged_bpf_disabled), and it should stay that way.
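An agent can verify at startup that it holds exactly the capabilities it expects. One way is to read the CapEff line from /proc/self/status and test individual bits. The sketch below covers only the parsing and bit test; the function names are hypothetical, and the bit numbers are the values defined in linux/capability.h:

```c
#include <stdint.h>
#include <stdio.h>

/* Capability bit numbers as defined in linux/capability.h. */
#define CAP_NET_ADMIN_BIT 12
#define CAP_SYS_ADMIN_BIT 21
#define CAP_PERFMON_BIT   38
#define CAP_BPF_BIT       39

/* Return 1 if the given capability bit is set in the effective set. */
static int has_cap(uint64_t effective, int cap_bit)
{
    return (int)((effective >> cap_bit) & 1);
}

/* Parse a "CapEff:\t<hex>" line from /proc/<pid>/status. */
static int parse_capeff_line(const char *line, uint64_t *out)
{
    unsigned long long v;

    if (sscanf(line, "CapEff: %llx", &v) != 1)
        return -1;
    *out = v;
    return 0;
}
```

Logging a warning when CAP_SYS_ADMIN_BIT is set is a cheap way to notice that a deployment has drifted away from least privilege.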

Lifecycle management

Programs and maps are freed when the last file descriptor reference disappears. To keep them alive independently of the loader process, pin them to bpffs.
Thanks to CO-RE, kernel upgrades rarely require a rebuild, but kprobe target functions can vanish or be renamed, so build fallback logic for attachment failures into your agent.

A Learning Roadmap

Stage 1 — start as a user: build intuition by using bpftrace one-liners and bcc tools (execsnoop, opensnoop, biolatency) to analyze real machines.
Stage 2 — read: go through the What is eBPF material on ebpf.io and the kernel BPF documentation to form the big picture of program types and maps.
Stage 3 — first program: write a libbpf program with the tracepoint plus ringbuf combination, like the execsnoop in this post. The libbpf-bootstrap repository is an excellent starting template.
Stage 4 — branch out: depending on your interests, expand into XDP (networking), uprobe/USDT (applications), or LSM (security).
Stage 5 — productionize: cover the CO-RE compatibility matrix, capability design, overhead measurement, and multi-kernel testing in CI.

Pitfalls and Anti-Patterns

Anti-pattern 1: depending on kprobes attached to unstable kernel-internal functions. A minor kernel update can inline or rename the function and silently break your tool. Defend with tracepoints, or fentry plus a BTF existence check.
Anti-pattern 2: ignoring undersized maps. When a hash map fills up, updates fail and data is lost. Use an LRU map or keep a separate failure counter.
Anti-pattern 3: slow ring buffer consumers. If the user-space consumer lags, reserve calls fail and events vanish. Always maintain a drop counter and monitor it.
Anti-pattern 4: contorting code just to pass verification. Patterns that "trick" the verifier break when the kernel or compiler version changes. Explicit bounds checks and simple control flow are the real answer.
Anti-pattern 5: missing license declaration. Without a GPL declaration, many helpers are blocked. If you deliberately choose a non-GPL license, check the list of available helpers first.
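The drop counter from anti-pattern 3 is worth seeing in miniature. The toy ring below is a user-space model with hypothetical names, not the kernel ring buffer. It shows the behavior to design for: when the consumer lags, the producer fails fast and counts the loss instead of blocking.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 4   /* hypothetical, tiny on purpose */

struct toy_ring {
    uint32_t slots[RING_SLOTS];
    size_t head;        /* total events written */
    size_t tail;        /* total events read */
    uint64_t dropped;   /* events lost because the ring was full */
};

/* Producer side: 0 on success, -1 (and one more drop) when full. */
static int toy_ring_push(struct toy_ring *r, uint32_t event)
{
    if (r->head - r->tail == RING_SLOTS) {
        r->dropped++;
        return -1;
    }
    r->slots[r->head % RING_SLOTS] = event;
    r->head++;
    return 0;
}

/* Consumer side: 0 on success, -1 when the ring is empty. */
static int toy_ring_pop(struct toy_ring *r, uint32_t *event)
{
    if (r->head == r->tail)
        return -1;
    *event = r->slots[r->tail % RING_SLOTS];
    r->tail++;
    return 0;
}
```

In a real program the failing call is bpf_ringbuf_reserve returning NULL, and the counter would live in a map (a per-CPU array works well) that user space exports as a metric.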

Closing Thoughts

eBPF turned something that would have been considered impossible a decade ago — "changing kernel behavior without changing the kernel" — into an everyday practice. The essence comes down to three things: programs that attach to hooks, maps that connect the data, and the verifier that guarantees safety. Once you understand how these three relate, you can reason about the internals of any eBPF tool you encounter.

In the next post we will build on this foundation and open the black box of a live system with bpftrace and the bcc tools. After that we will look at building runtime security with Tetragon, Falco, and BPF LSM.